llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Justin Holewinski	dbdc639118	[NVPTX] Add missing patterns for div.approx with immediate denominator llvm-svn: 199746	2014-01-21 14:40:05 +00:00
Saleem Abdulrasool	aae6a31cf3	tools: support decoding ARM EHABI opcodes in readobj Add support to llvm-readobj to decode the actual opcodes. The ARM EHABI opcodes are a variable length instruction set that describe the operations required for properly unwinding stack frames. The primary motivation for this change is to ease the creation of tests for the ARM EHABI object emission as well as the unwinding directive handling in the ARM IAS. Thanks to Logan Chien for an extra test case! llvm-svn: 199708	2014-01-21 02:33:15 +00:00
Saleem Abdulrasool	d7349ac01d	ARM IAS: add support for .unwind_raw directive This implements the unwind_raw directive for the ARM IAS. The unwind_raw directive takes the form of a stack offset value followed by one or more bytes representing the opcodes to be emitted. The opcode emitted will be interpreted as if it were assembled by the opcode assembler via the standard unwinding directives. Thanks to Logan Chien for an extra test! llvm-svn: 199707	2014-01-21 02:33:10 +00:00
Saleem Abdulrasool	4a5175ebb3	ARM IAS: support .personalityindex The .personalityindex directive is equivalent to the .personality directive with the ARM EABI personality with the specific index (0, 1, 2). Both of these directives indicate personality routines, so enhance the personality directive handling to take into account personalityindex. Bonus fix: flush the UnwindContext at the beginning of a new function. Thanks to Logan Chien for additional tests! llvm-svn: 199706	2014-01-21 02:33:02 +00:00
Kevin Qin	d925e0a953	[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT. It was committed as r199628 but reverted in r199631 because it caused regression test failures. That happened because I committed an old version of the patch. Sorry for the mistake. llvm-svn: 199704	2014-01-21 01:48:52 +00:00
Andrea Di Biagio	5b5e7e3cb5	[X86] Teach how to combine a vselect into a movss/movsd Add target specific rules for combining vselect dag nodes into movss/movsd when possible. If the vector type of the vselect dag node in input is either MVT::v4i32 or MVT::v4f32, then try to fold according to rules: 1) fold (vselect (build_vector (0, -1, -1, -1)), A, B) -> (movss A, B) 2) fold (vselect (build_vector (-1, 0, 0, 0)), A, B) -> (movss B, A) If the vector type of the vselect dag node in input is either MVT::v2i64 or MVT::v2f64 (and we have SSE2), then try to fold according to rules: 3) fold (vselect (build_vector (0, -1)), A, B) -> (movsd A, B) 4) fold (vselect (build_vector (-1, 0)), A, B) -> (movsd B, A) llvm-svn: 199683	2014-01-20 19:35:22 +00:00
Adrian Prantl	7a4be154a2	Debug info: On ARM ensure that all __TEXT sections come before the optional DWARF sections, so compiling with -g does not result in different code being generated for PC-relative loads. This is reapplying a diet r197922 (__TEXT-only). llvm-svn: 199681	2014-01-20 19:15:59 +00:00
Adrian Prantl	ac36339a73	Revert "Debug info: On ARM ensure that the data sections come before the" Cut back on the cargo cult. The order of __DATA sections doesn't affect generated code. This reverts commit r197922. llvm-svn: 199680	2014-01-20 19:15:55 +00:00
James Molloy	bd56139ee7	Remove the useless pseudo instructions VDUPfdf and VDUPfqf, replacing them with patterns to match VDUPLN. llvm-svn: 199675	2014-01-20 17:14:48 +00:00
Hal Finkel	0fab5a5e7b	Fix misched-aa-colored.ll to require asserts (trying again) Perhaps it needs to be in caps. llvm-svn: 199661	2014-01-20 14:15:28 +00:00
Hal Finkel	a7712939fe	Fix misched-aa-colored.ll to require asserts. -misched=shuffle is NDEBUG only. Maybe we should change that. llvm-svn: 199659	2014-01-20 14:09:34 +00:00
Hal Finkel	fe3a3ecfad	Update IR when merging slots in stack coloring The way that stack coloring updated MMOs when merging stack slots, while correct, is suboptimal, and is incompatible with the use of AA during instruction scheduling. The solution, which involves the use of const_cast (and more importantly, updating the IR from within an MI-level pass), obviously requires some explanation: When the stack coloring pass was originally committed, the code in ScheduleDAGInstrs::buildSchedGraph tracked possible alias sets by using GetUnderlyingObject, and all load/store and store/store memory control dependencies were added between SUs at the object level (where only one object, that returned by GetUnderlyingObject, was used to identify the object associated with each MMO). When stack coloring merged stack slots, it would replace MMOs derived from the remapped alloca with the alloca with which the remapped alloca was being replaced. Because ScheduleDAGInstrs only used single objects, and tracked alias sets at the object level, this was a fine solution. In r169744, (Andy and) I updated the code in ScheduleDAGInstrs to use GetUnderlyingObjects, and track alias sets using, potentially, multiple underlying objects for each MMO. This was done, primarily, to provide the ability to look through PHIs, and provide better scheduling for induction-variable-dependent loads and stores inside loops. At this point, the MMO-updating code in stack coloring became suboptimal, because it would clear the MMOs for (i.e. completely pessimize) all instructions for which r169744 might help in scheduling. Updating the IR directly is the simplest fix for this (and the one with, by far, the least compile-time impact), but others are possible (we could give each MMO a small vector of potential values, or make use of a remapping table, constructed from MFI, inside ScheduleDAGInstrs). 
Unfortunately, replacing all MMO values derived from the remapped alloca with the base replacement alloca fundamentally breaks our ability to use AA during instruction scheduling (which is critical to performance on some targets). The reason is that the original MMO might have had an offset (either constant or dynamic) from the base remapped alloca, and that offset is not present in the updated MMO. One possible way around this would be to use GetPointerBaseWithConstantOffset, and update not only the MMO's value, but also its offset based on the original offset. Unfortunately, this solution would only handle constant offsets, and for safety (because AA is not completely restricted to deducing relationships with constant offsets), we would need to clear all MMOs without constant offsets over the entire function. This would be an even worse pessimization than the current single-object restriction. Any other solution would involve passing around a vector of remapped allocas, and teaching AA to use it, introducing additional complexity and overhead into AA. Instead, when remapping an alloca, we replace all IR uses of that alloca as well (optionally inserting a bitcast as necessary). This is even more efficient than the old MMO-updating code in the stack coloring pass (because it removes the need to call GetUnderlyingObject on all MMO values), removes the single-object pessimization in the default configuration, and enables the correct use of AA during instruction scheduling (all without any additional overhead). LLVM now no longer miscompiles itself on x86_64 when using -enable-misched -enable-aa-sched-mi -misched-bottomup=0 -misched-topdown=0 -misched=shuffle! Fixed PR18497. Because the alloca replacement is now done at the IR level, unless the MMO directly refers to the remapped alloca, the change cannot be seen at the MI level. As a result, there is no good way to fix test/CodeGen/X86/pr14090.ll. llvm-svn: 199658	2014-01-20 14:03:16 +00:00
David Woodhouse	d30df6b04f	[x86] Fix disassembly of MOV16ao16 et al. The addition of IC_OPSIZE_ADSIZE in r198759 wasn't quite complete. It also turns out to have been unnecessary. The disassembler handles the AdSize prefix for itself, and doesn't care about the difference between (e.g.) MOV8ao8 and MOV8ao8_16 definitions. So just let them coexist and don't worry about it. llvm-svn: 199654	2014-01-20 12:02:53 +00:00
David Woodhouse	2992f8669c	[x86] Fix 16-bit disassembly of JCXZ/JECXZ llvm-svn: 199653	2014-01-20 12:02:48 +00:00
David Woodhouse	316c7ec362	[x86] Rename MOVSD/STOSD/LODSD/OUTSD to MOVSL/STOSL/LODSL/OUTSL The disassembler has a special case for 'L' vs. 'W' in its heuristic for checking for 32-bit and 16-bit equivalents. We could expand the heuristic, but better just to be consistent in using the 'L' suffix. llvm-svn: 199652	2014-01-20 12:02:44 +00:00
David Woodhouse	1ae3cd66f2	[x86] Fix disassembly of callw instruction Not quite sure why this was marked isAsmParserOnly, but it means that the disassembler can't see it either. llvm-svn: 199651	2014-01-20 12:02:40 +00:00
David Woodhouse	40ce5ad1c0	[x86] Fix 16-bit handling of OpSize bit When disassembling in 16-bit mode the meaning of the OpSize bit is inverted. Instructions found in the IC_OPSIZE context will actually not have the 0x66 prefix, and instructions in the IC context will have the 0x66 prefix. Make use of the existing special-case handling for the 0x66 prefix being in the wrong place, to cope with this. llvm-svn: 199650	2014-01-20 12:02:35 +00:00
David Woodhouse	302d198064	[x86] Support i386---code16 triple for emitting 16-bit code llvm-svn: 199648	2014-01-20 12:02:25 +00:00
Chandler Carruth	003ef14be1	[PM] Wire up the Verifier for the new pass manager and connect it to the various opt verifier commandline options. Mostly mechanical wiring of the verifier to the new pass manager. Exercises one of the more unusual aspects of it -- a pass can be either a module or function pass interchangeably. If this is ever problematic, we can make things more constrained, but for things like the verifier where there is an "obvious" applicability at both levels, it seems convenient. This is the next-to-last piece of basic functionality left to make the opt commandline driving of the new pass manager minimally functional for testing and further development. There is still a lot to be done there (notably the factoring into .def files to kill the current boilerplate code) but it is relatively uninteresting. The only interesting bit left for minimal functionality is supporting the registration of analyses. I'm planning on doing that on top of the .def file switch mostly because the boilerplate for the analyses would be significantly worse. llvm-svn: 199646	2014-01-20 11:34:08 +00:00
Kai Nacke	b191003529	ARM: add tlsldo relocation Add support for the symbol(tlsldo) relocation. This is required in order to solve PR18554. Reviewed by R. Golin, A. Korobeynikov. llvm-svn: 199644	2014-01-20 11:00:40 +00:00
Artyom Skrobov	264da37809	[ARM] Do not generate Tag_DIV_use=AllowDIVExt when hardware div is non-optional: it should have the default value of AllowDIVIfExists llvm-svn: 199638	2014-01-20 10:18:42 +00:00
Chandler Carruth	f3546bc541	Revert r199628: "[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT." This test fails the newly added regression tests. llvm-svn: 199631	2014-01-20 08:18:01 +00:00
Owen Anderson	e0205fdcd8	Fix all the remaining lost-fast-math-flags bugs I've been able to find. The most important of these are cases in the generic logic for combining BinaryOperators. This logic hadn't been updated to handle FastMathFlags, and it took me a while to detect it because it doesn't show up in a simple search for CreateFAdd. llvm-svn: 199629	2014-01-20 07:44:53 +00:00
Kevin Qin	a2c8e30bce	[AArch64 NEON] Fix a bug caused by undef lane when generating VEXT. llvm-svn: 199628	2014-01-20 07:32:26 +00:00
Kevin Qin	a1ebedbe48	[AArch64 NEON] Accept both #0.0 and #0 for comparing with floating point zero in asm parser. For FCMEQ, FCMGE, FCMGT, FCMLE and FCMLT, floating point zero will be printed as #0.0 instead of #0. To support legacy code that uses #0, the asm parser now accepts both #0.0 and #0. llvm-svn: 199621	2014-01-20 02:14:05 +00:00
Benjamin Kramer	813eb189fa	InstCombine: Modernize a bunch of cast combines. Also make them vector-aware. llvm-svn: 199608	2014-01-19 20:05:13 +00:00
Benjamin Kramer	47d4c4c113	InstCombine: Replace a hand-rolled version of isKnownToBeAPowerOfTwo with the real thing. llvm-svn: 199604	2014-01-19 16:48:41 +00:00
Benjamin Kramer	0de38fdc6a	InstCombine: Teach most integer add/sub/mul/div combines how to deal with vectors. llvm-svn: 199602	2014-01-19 15:24:22 +00:00
Benjamin Kramer	b864b5d907	InstCombine: Refactor fmul/fdiv combines to handle vectors. llvm-svn: 199598	2014-01-19 13:36:27 +00:00
Chandler Carruth	8b7504e0a3	Fix a really nasty SROA bug with how we handled out-of-bounds memcpy intrinsics. Reported on the list by Evan with a couple of attempts to fix, but it took a while to dig down to the root cause. There are two overlapping bugs here, both centering around the circumstance of discovering a memcpy operand which is known to be completely outside the bounds of the alloca. First, we need to kill the other side of the memcpy if it was added to this alloca. Otherwise we'll factor it into our slicing and try to rewrite it even though we know for a fact that it is dead. This is made more tricky because we can visit the sides in either order. So we have to both kill the other side and skip instructions marked as dead. The latter really should be goodness in every case, but here is a matter of correctness. Second, we need to actually remove the uses of the alloca by the memcpy when queuing it for later deletion. Otherwise it may still be using the alloca when we go to promote it (if the rewrite re-uses the existing alloca instruction). Do this by factoring out the use-clobbering used when nixing a Phi argument and re-using it across the operands of a to-be-deleted instruction. llvm-svn: 199590	2014-01-19 12:16:54 +00:00
Saleem Abdulrasool	b7bd80577f	ARM ELF: ensure that the tag types are corrected Ensure that the tag types are reflected on a replacement. This is particularly important for the compatibility tag which has multiple representations where the last definition wins. llvm-svn: 199577	2014-01-19 08:25:41 +00:00
Saleem Abdulrasool	c5281b558c	ARM: update build attributes for ABI r2.09 Update names for the names as per the current ABI errata. Mark deprecated tags as such. llvm-svn: 199576	2014-01-19 08:25:35 +00:00
Arnold Schwaighofer	2c67b7dc58	LoopVectorizer: A reduction that has multiple uses of the reduction value is not a reduction. Really. Under certain circumstances (the use list of an instruction has to be set up right - hence the extra pass in the test case) we would not recognize when a value in a potential reduction cycle was used multiple times by the reduction cycle. Fixes PR18526. radar://15851149 llvm-svn: 199570	2014-01-19 03:18:31 +00:00
Chandler Carruth	608f08d699	[PM] Make the verifier work independently of any pass manager. This makes the 'verifyFunction' and 'verifyModule' functions totally independent operations on the LLVM IR. It also cleans up their API a bit by lifting the abort behavior into their clients and just using an optional raw_ostream parameter to control printing. The implementation of the verifier is now just an InstVisitor with no multiple inheritance. It also is significantly more const-correct, and hides the const violations internally. The two layers that force us to break const correctness are building a DomTree and dispatching through the InstVisitor. A new VerifierPass is used to implement the legacy pass manager interface in terms of the other pieces. The error messages produced may be slightly different now, and we may have slightly different short circuiting behavior with different usage models of the verifier, but generally everything works equivalently and this unblocks wiring the verifier up to the new pass manager. llvm-svn: 199569	2014-01-19 02:22:18 +00:00
Nick Lewycky	f31f7a5863	Don't refuse to transform constexpr(call(arg, ...)) to call(constexpr(arg), ...)) just because the function has multiple return values even if their return types are the same. Patch by Eduard Burtescu! llvm-svn: 199564	2014-01-18 22:47:12 +00:00
Benjamin Kramer	1f0b6cda68	ARM: Let the assembler reject v5 instructions in v4 mode. PR18524. llvm-svn: 199559	2014-01-18 19:03:19 +00:00
NAKAMURA Takumi	718dd2ef23	[CMake] Add llvm-tblgen to dependencies of check-llvm. llvm-tblgen is not built when external LLVM_TABLEGEN is specified. Even then, llvm-tblgen should be built for testing tblgen itself. llvm-svn: 199558	2014-01-18 19:01:08 +00:00
Benjamin Kramer	ace2801d74	InstCombine: Make the (fmul X, -1.0) -> (fsub -0.0, X) transform handle vectors too. PR18532. llvm-svn: 199553	2014-01-18 16:43:14 +00:00
Adrian Prantl	b65add1299	Debug info (LTO): Move the creation of accessibility flags to getOrCreateSubprogramDIE to avoid attributes being added twice when DIEs are merged. rdar://problem/15842330. llvm-svn: 199536	2014-01-18 02:12:00 +00:00
Owen Anderson	8750294bae	Fix more instances of dropped fast math flags when optimizing FADD instructions. All found by inspection (aka grep). llvm-svn: 199528	2014-01-18 00:48:14 +00:00
Reid Kleckner	395f9ebf2a	Add an inalloca flag to allocas Summary: The only current use of this flag is to mark the alloca as dynamic, even if its in the entry block. The stack adjustment for the alloca can never be folded into the prologue because the call may clear it and it has to be allocated at the top of the stack. Reviewers: majnemer CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2571 llvm-svn: 199525	2014-01-17 23:58:17 +00:00
Rui Ueyama	035830cabc	llvm-objdump/COFF: Print ordinal base number. llvm-svn: 199518	2014-01-17 22:02:24 +00:00
Juergen Ributzka	4f480d4b83	Add two new calling conventions for runtime calls This patch adds two new target-independent calling conventions for runtime calls - PreserveMost and PreserveAll. The target-specific implementation for X86-64 is defined as following: - Arguments are passed as for the default C calling convention - The same applies for the return value(s) - PreserveMost preserves all GPRs - except R11 - PreserveAll preserves all GPRs and all XMMs/YMMs - except R11 Reviewed by Lang and Philip llvm-svn: 199508	2014-01-17 19:47:03 +00:00
Daniel Sanders	32197355d2	[mips][msa] Correct pattern for LSA Summary: $rs and $rt were the wrong way round in the .td and the testcase wasn't strict enough to detect the mistake. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D2554 llvm-svn: 199498	2014-01-17 15:40:05 +00:00
Renato Golin	035f4db32b	Add MLA alias for ARMv4 support. Fix MLA defs to use register class GPRnopc. Add encoding tests for multiply instructions. (Alias for MUL/SMLAL/UMLAL added by r199026.) Patch by Zhaoshi. llvm-svn: 199491	2014-01-17 13:53:08 +00:00
Kostya Serebryany	88b5111b60	[asan] extend asan-coverage (still experimental).
- Add a mode for collecting per-block coverage (-asan-coverage=2). So far the implementation is naive (all blocks are instrumented); the performance overhead on top of asan could be as high as 30%.
- Make sure the one-time calls to __sanitizer_cov are moved to the function bottom, which in turn required copying the original debug info into the call insn.
Here is the performance data on SPEC 2006 (train data, comparing asan with asan-coverage={0,1,2}):
benchmark, asan+cov0, asan+cov1, diff 0-1, asan+cov2, diff 0-2, diff 1-2
400.perlbench, 65.60, 65.80, 1.00, 76.20, 1.16, 1.16
401.bzip2, 65.10, 65.50, 1.01, 75.90, 1.17, 1.16
403.gcc, 1.64, 1.69, 1.03, 2.04, 1.24, 1.21
429.mcf, 21.90, 22.60, 1.03, 23.20, 1.06, 1.03
445.gobmk, 166.00, 169.00, 1.02, 205.00, 1.23, 1.21
456.hmmer, 88.30, 87.90, 1.00, 91.00, 1.03, 1.04
458.sjeng, 210.00, 222.00, 1.06, 258.00, 1.23, 1.16
462.libquantum, 1.73, 1.75, 1.01, 2.11, 1.22, 1.21
464.h264ref, 147.00, 152.00, 1.03, 160.00, 1.09, 1.05
471.omnetpp, 115.00, 116.00, 1.01, 140.00, 1.22, 1.21
473.astar, 133.00, 131.00, 0.98, 142.00, 1.07, 1.08
483.xalancbmk, 118.00, 120.00, 1.02, 154.00, 1.31, 1.28
433.milc, 19.80, 20.00, 1.01, 20.10, 1.02, 1.01
444.namd, 16.20, 16.20, 1.00, 17.60, 1.09, 1.09
447.dealII, 41.80, 42.20, 1.01, 43.50, 1.04, 1.03
450.soplex, 7.51, 7.82, 1.04, 8.25, 1.10, 1.05
453.povray, 14.00, 14.40, 1.03, 15.80, 1.13, 1.10
470.lbm, 33.30, 34.10, 1.02, 34.10, 1.02, 1.00
482.sphinx3, 12.40, 12.30, 0.99, 13.00, 1.05, 1.06
llvm-svn: 199488	2014-01-17 11:00:30 +00:00
Kevin Qin	e739fc1b8e	[AArch64 NEON] Expand vector for UDIV/SDIV/UREM/SREM/FREM as neon doesn't support these operations. llvm-svn: 199485	2014-01-17 09:54:30 +00:00
Craig Topper	913806d6aa	Teach x86 asm parser to handle 'opaque ptr' in Intel syntax. llvm-svn: 199477	2014-01-17 07:44:10 +00:00
Craig Topper	5f334ea607	Teach X86 asm parser to understand 'ZMMWORD PTR' in Intel syntax. llvm-svn: 199476	2014-01-17 07:37:39 +00:00
Craig Topper	ad05ca604d	Fix intel syntax for 64-bit version of FXSAVE/FXRSTOR to use '64' suffix instead of 'q' llvm-svn: 199474	2014-01-17 07:25:39 +00:00
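
The InstCombine entry for r199553 in the log rewrites (fmul X, -1.0) as (fsub -0.0, X), subtracting from negative zero rather than positive zero. The following is a minimal sketch of why the sign of that zero matters, assuming IEEE-754 doubles with round-to-nearest as provided by Python floats. The helper names are hypothetical illustrations and are not LLVM APIs.

```python
import math

def fmul_neg_one(x):
    # Models (fmul X, -1.0).
    return x * -1.0

def fsub_from_neg_zero(x):
    # Models (fsub -0.0, X), the form the transform produces.
    return -0.0 - x

def fsub_from_pos_zero(x):
    # Models (fsub 0.0, X), shown only for contrast.
    return 0.0 - x

def same_float(a, b):
    # True when both values are NaN, or when they are equal
    # and also carry the same sign bit (so +0.0 != -0.0 here).
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return a == b and math.copysign(1.0, a) == math.copysign(1.0, b)

if __name__ == "__main__":
    for x in (0.0, -0.0, 1.5, -2.0, math.inf, -math.inf):
        print(x, fmul_neg_one(x), fsub_from_neg_zero(x), fsub_from_pos_zero(x))
```

In this sketch the two zero choices differ only for X = +0.0: multiplying by -1.0 gives -0.0, and subtracting from +0.0 gives +0.0, whereas subtracting from -0.0 preserves the sign.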

22430 Commits