llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00

Author	SHA1	Message	Date
Sanjay Patel	3a15ccb43d	added comment (using freshly updated update_llc_test_checks.py) llvm-svn: 253935	2015-11-23 23:22:05 +00:00
Sanjay Patel	415fbe0230	[x86] add test to show suboptimal codegen (PR25554) llvm-svn: 253934	2015-11-23 23:18:20 +00:00
Andy Ayers	5fba38a9da	findDeadCallerSavedReg needs to pay attention to calling convention Caller saved regs differ between SysV and Win64. Use the tail call available set to scavenge from. Refactor register info to create new helper to get at tail call GPRs. Added a new test case for windows. Fixed up a number of X64 tests since now RCX is preferred over RDX on SysV. Differential Revision: http://reviews.llvm.org/D14878 llvm-svn: 253927	2015-11-23 22:17:44 +00:00
James Y Knight	f0ca422f64	Make utils/update_llc_test_checks.py note that the assertions are autogenerated. Also update existing test cases which appear to be generated by it and weren't modified (other than addition of the header) by rerunning it. llvm-svn: 253917	2015-11-23 21:33:58 +00:00
Simon Pilgrim	5ff8bf1540	[X86][FMA] Regenerate tests. Fixes some broken checks. llvm-svn: 253830	2015-11-22 19:05:53 +00:00
Simon Pilgrim	97bc638a02	[X86][AVX] Added load splat tests. Placeholder for upcoming patch for PR23022. llvm-svn: 253824	2015-11-22 16:52:16 +00:00
Elena Demikhovsky	678dc46339	AVX-512: Optimized INSERT_SUBVECTOR for i1 vector types INSERT_SUBVECTOR for i1 vectors may be done with shifts, when we insert into the lower part, or into the upper part, or into an all-zero vector. CONCAT_VECTORS uses INSERT_SUBVECTOR. Differential Revision: http://reviews.llvm.org/D14815 llvm-svn: 253819	2015-11-22 13:57:38 +00:00
Simon Pilgrim	37096d919a	[MachineInstrBuilder] Support for adding a ConstantPoolIndex MO with an additional offset. MachineInstrBuilder::addDisp can already add an immediate or global address MO with an adjusted offset, this patch adds support for constant pool indices as well. All remaining MO types still assert - there are a number of other types that could support adjusted offsets but I have no test cases at this time. Required to fix a regression in D13988 found by Mikael Holmén during stress testing (test case attached). Differential Revision: http://reviews.llvm.org/D14867 llvm-svn: 253795	2015-11-21 21:42:26 +00:00
Simon Pilgrim	24d9446a79	[X86][SSE] Added SSE2 PSUBUS tests llvm-svn: 253783	2015-11-21 13:57:22 +00:00
Simon Pilgrim	1732480b50	[X86][SSE] Regenerate TRUNC-SEXT tests Tidied up triple and regenerate tests using update_llc_test_checks.py llvm-svn: 253782	2015-11-21 13:32:29 +00:00
Simon Pilgrim	163ad0f8b7	[X86][SSE] Regenerate MINMAX tests Tidied up triple and regenerate tests using update_llc_test_checks.py llvm-svn: 253781	2015-11-21 13:29:42 +00:00
Simon Pilgrim	4533bbec83	[X86][SSE] Regenerate PSUBUS tests Tidied up triple and regenerate tests using update_llc_test_checks.py llvm-svn: 253780	2015-11-21 13:25:50 +00:00
Simon Pilgrim	6bc8690714	[X86][AVX] Regenerate AVX splat tests Tidied up triple and regenerate tests using update_llc_test_checks.py llvm-svn: 253778	2015-11-21 13:23:14 +00:00
Simon Pilgrim	bbc4bbde20	[X86][AVX512] Added AVX512 VMOVLHPS/VMOVHLPS shuffle decode comments. llvm-svn: 253777	2015-11-21 13:04:42 +00:00
Simon Pilgrim	948540dab2	[X86][SSE] Legal XMM Register Class ordering for SSE1 It turns out we have a number of places that just grab the first type attached to a register class for various reasons. This is fine unless for some reason that type isn't legal on the current target, such as for SSE1 which doesn't support v16i8/v8i16/v4i32/v2i64 - all of which were included before v4f32 in the class. Given that this is such a rare situation I've just re-ordered the types and placed the float types first. Fix for PR16133 Differential Revision: http://reviews.llvm.org/D14787 llvm-svn: 253773	2015-11-21 12:38:34 +00:00
Reid Kleckner	ba8ad3f697	[WinEH] Disable most forms of demotion Now that the register allocator knows about the barriers on funclet entry and exit, testing has shown that this is unnecessary. We still demote PHIs on unsplittable blocks due to the differences between the IR CFG and the Machine CFG. llvm-svn: 253619	2015-11-19 23:23:33 +00:00
Simon Pilgrim	764ff90848	[X86][SSE4A] Fix issue with EXTRQI shuffles not starting at the correct start index. Found during stress testing. llvm-svn: 253611	2015-11-19 22:13:56 +00:00
Sanjay Patel	938dbcf1dc	[CGP] despeculate expensive cttz/ctlz intrinsics This is another step towards allowing SimplifyCFG to speculate harder, but then have CGP clean things up if the target doesn't like it. Previous patches in this series: http://reviews.llvm.org/D12882 http://reviews.llvm.org/D13297 D13297 should catch most expensive ops, but speculation of cttz/ctlz requires special handling because of weirdness in the intrinsic definition for handling a zero input (that definition can probably be blamed on x86). For example, if we have the usual speculated-by-select expensive op pattern like this: %tobool = icmp eq i64 %A, 0 %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true %cond = select i1 %tobool, i64 64, i64 %0 ret i64 %cond There's an instcombine that will turn it into: %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 false) ; is_zero_undef == false This CGP patch is looking for that case and despeculating it back into: entry: %tobool = icmp eq i64 %A, 0 br i1 %tobool, label %cond.end, label %cond.true cond.true: %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true) ; is_zero_undef == true br label %cond.end cond.end: %cond = phi i64 [ %0, %cond.true ], [ 64, %entry ] ret i64 %cond This unfortunately may lead to poorer codegen (see the changes in the existing x86 test), but if we increase speculation in SimplifyCFG (the next step in this patch series), then we should avoid those kinds of cases in the first place. The need for this patch was originally mentioned here: http://reviews.llvm.org/D7506 with follow-up here: http://reviews.llvm.org/D7554 Differential Revision: http://reviews.llvm.org/D14630 llvm-svn: 253573	2015-11-19 16:37:10 +00:00
Hans Wennborg	1ead7346cd	X86: More efficient legalization of wide integer compares In particular, this makes the code for 64-bit compares on 32-bit targets much more efficient. Example: define i32 @test_slt(i64 %a, i64 %b) { entry: %cmp = icmp slt i64 %a, %b br i1 %cmp, label %bb1, label %bb2 bb1: ret i32 1 bb2: ret i32 2 } Before this patch: test_slt: movl 4(%esp), %eax movl 8(%esp), %ecx cmpl 12(%esp), %eax setae %al cmpl 16(%esp), %ecx setge %cl je .LBB2_2 movb %cl, %al .LBB2_2: testb %al, %al jne .LBB2_4 movl $1, %eax retl .LBB2_4: movl $2, %eax retl After this patch: test_slt: movl 4(%esp), %eax movl 8(%esp), %ecx cmpl 12(%esp), %eax sbbl 16(%esp), %ecx jge .LBB1_2 movl $1, %eax retl .LBB1_2: movl $2, %eax retl Differential Revision: http://reviews.llvm.org/D14496 llvm-svn: 253572	2015-11-19 16:35:08 +00:00
Elena Demikhovsky	6aa44f30d0	AVX-512: Fixed COPY_TO_REGCLASS for mask registers Copying one mask register to another under BW should be done with kmovq instruction, otherwise we can lose some bits. Copying 8 bits under DQ may be done with kmovb. Differential Revision: http://reviews.llvm.org/D14812 llvm-svn: 253563	2015-11-19 13:13:00 +00:00
Simon Pilgrim	015080a582	[X86][AVX] Fix lowering of X86ISD::VZEXT_MOVL for 128-bit -> 256-bit extension The lowering patterns for X86ISD::VZEXT_MOVL for 128-bit to 256-bit vectors were just copying the lower xmm instead of actually masking off the first scalar using a blend. Fix for PR25320. Differential Revision: http://reviews.llvm.org/D14151 llvm-svn: 253561	2015-11-19 12:18:37 +00:00
Igor Breger	0a68600909	AVX512: Implemented encoding, intrinsics and DAG lowering for VMOVDDUP instructions. Differential Revision: http://reviews.llvm.org/D14702 llvm-svn: 253548	2015-11-19 08:26:56 +00:00
Elena Demikhovsky	fea4d52acf	Pointers in Masked Load, Store, Gather, Scatter intrinsics The masked intrinsics support all integer and floating point data types. I added the pointer type to this list. Added tests for CodeGen and for Loop Vectorizer. Updated the Language Reference. Differential Revision: http://reviews.llvm.org/D14150 llvm-svn: 253544	2015-11-19 07:17:16 +00:00
Pete Cooper	b753649d63	Revert "Change memcpy/memset/memmove to have dest and source alignments." This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543	2015-11-19 05:56:52 +00:00
Quentin Colombet	2d801721a4	[X86] Enable shrink-wrapping by default. Differential Revision: http://reviews.llvm.org/D14156 rdar://problem/21118279 llvm-svn: 253528	2015-11-19 00:38:00 +00:00
Pete Cooper	aca4c5cdc6	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
Simon Pilgrim	8ddf1acb2a	[DAGCombiner] Vector constant folding for comparisons This patch adds support for vector constant folding of integer/float comparisons. This requires FoldConstantVectorArithmetic to support scalar constant operands (in this case ISD::CONDCODE). In future we should be able to support other scalar constant types as necessary (and possibly start calling FoldConstantVectorArithmetic for all node creations) Differential Revision: http://reviews.llvm.org/D14683 llvm-svn: 253504	2015-11-18 21:17:19 +00:00
Asaf Badouh	e49f73285d	[X86][AVX512CD] add mask broadcast intrinsics Differential Revision: http://reviews.llvm.org/D14573 llvm-svn: 253450	2015-11-18 09:42:45 +00:00
Simon Pilgrim	3efea43ca6	[X86][AVX] Added 256-bit shuffle splat tests. llvm-svn: 253449	2015-11-18 09:39:38 +00:00
Rafael Espindola	2c21fe4650	Stop producing .data.rel sections. If a section is rw, it is irrelevant if the dynamic linker will write to it or not. It looks like llvm implemented this because gcc was doing it. It looks like gcc implemented this in the hope that it would put all the relocated items close together and speed up the dynamic linker. There are two problems with this: * It doesn't work. Both bfd and gold will map .data.rel to .data and concatenate the input sections in the order they are seen. * If we want a feature like that, it can be implemented directly in the linker since it knows where the dynamic relocations are. llvm-svn: 253436	2015-11-18 06:02:15 +00:00
Cong Hou	f11a4c60a1	Improving edge probabilities computation when choosing the best successor in machine block placement. When looking for the best successor from the outer loop for a block belonging to an inner loop, the edge probability computation can be improved so that edges in the inner loop are ignored. For example, suppose we are building chains for the non-loop part of the following code, and looking for B1's best successor. Assume the true body is very hot, then B3 should be the best candidate. However, because of the existence of the back edge from B1 to B0, the probability from B1 to B3 can be very small, preventing B3 from being its successor. In this patch, when computing the probability of the edge from B1 to B3, the weight on the back edge B1->B0 is ignored, so that B1->B3 will have 100% probability. if (...) do { B0; ... // some branches B1; } while(...); else B2; B3; Differential revision: http://reviews.llvm.org/D10825 llvm-svn: 253414	2015-11-18 00:52:52 +00:00
Simon Pilgrim	1145b63f7f	[X86][AVX512] Added AVX512 SHUFP/VPERMILP shuffle decode comments. llvm-svn: 253396	2015-11-17 23:29:49 +00:00
Simon Pilgrim	c048839544	[X86][AVX512] Added support for AVX512 UNPCK shuffle decode comments. llvm-svn: 253391	2015-11-17 22:35:45 +00:00
Simon Pilgrim	f43318e7eb	[X86][SSE] Share AVX1/AVX2 shuffle tests with AVX512 where possible llvm-svn: 253379	2015-11-17 21:19:45 +00:00
Reid Kleckner	00daa6cd20	[WinEH] Move WinEHFuncInfo from MachineModuleInfo to MachineFunction Summary: Now that there is a one-to-one mapping from MachineFunction to WinEHFuncInfo, we don't need to use a DenseMap to select the right WinEHFuncInfo for the current funclet. The main challenge here is that X86WinEHStatePass is an IR pass that doesn't have access to the MachineFunction. I gave it its own WinEHFuncInfo object that it uses to calculate state numbers, which it then throws away. As long as nobody creates or removes EH pads between this pass and SDAG construction, we will get the same state numbers. The other thing X86WinEHStatePass does is to mark the EH registration node. Instead of communicating which alloca was the registration through WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic. This intrinsic generates no code and simply marks the alloca in use. Reviewers: JCTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14668 llvm-svn: 253378	2015-11-17 21:10:25 +00:00
Pat Gavlin	189930fea1	Lower statepoints with multi-def targets. Statepoint lowering currently expects that the target method of a statepoint only defines a single value. This precludes using statepoints with ABIs that return values in multiple registers (e.g. the SysV AMD64 ABI). This change adds support for lowering statepoints with multi-def targets. llvm-svn: 253339	2015-11-17 16:04:21 +00:00
Dan Gohman	3ee3e60f77	Use TargetRegisterInfo for printing MachineOperand register comments Several places in AsmPrinter.cpp print comments describing MachineOperand registers using MCRegisterInfo, which uses MCOperand-oriented names. This doesn't work for targets that use virtual registers exclusively, as WebAssembly does, since virtual registers are represented and printed differently. This patch preserves what seems to be the spirit of r229978, avoiding the use of TM.getSubtargetImpl(), while still using MachineOperand-oriented printing for MachineOperands. Differential Revision: http://reviews.llvm.org/D14709 llvm-svn: 253338	2015-11-17 16:01:28 +00:00
Igor Breger	f0cc44d6b9	AVX512 : regenerate the test file against trunk. Differential Revision: http://reviews.llvm.org/D14742 llvm-svn: 253321	2015-11-17 08:03:43 +00:00
Rafael Espindola	47008fdea7	Drop prelink support. The way prelink used to work was * The compiler decides if a given section only has relocations that are known to point to the same DSO. If so, it names it .data.rel.ro.local<something>. * The static linker puts all of these together. * The prelinker program assigns addresses to each library and resolves the local relocations. There are many problems with this: * It is incompatible with address space randomization. * The information passed by the compiler is redundant. The linker knows if a given relocation is in the same DSO or not. It could sort by that if so desired. * There are newer ways of speeding up DSO (gnu hash for example). * Even if we want to implement this again in the compiler, the previous implementation is pretty broken. It talks about relocations that are "resolved by the static linker". If they are resolved, there are none left for the prelinker. What one needs to track is if an expression will require only dynamic relocations that point to the same DSO. At this point it looks like the prelinker is an historical curiosity. For example, fedora has retired it because it failed to build for two releases (http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f) This patch removes support for it. That is, it stops printing the ".local" sections. llvm-svn: 253280	2015-11-17 00:51:23 +00:00
Reid Kleckner	d9dbfe296b	[WinEH] Don't let UnwindHelp alias the return address On top of that, don't bother allocating and initializing UnwindHelp if we don't have any funclets. Currently we always use RBP as our frame pointer when funclets are present, so this change makes it impossible to come here without any fixed stack objects. Fixes PR25533. llvm-svn: 253245	2015-11-16 18:47:25 +00:00
Igor Breger	06ae954df6	AVX512: Implemented encoding and intrinsics for VMOVSHDUP/VMOVSLDUP instructions. Differential Revision: http://reviews.llvm.org/D14322 llvm-svn: 253185	2015-11-16 07:22:00 +00:00
Igor Breger	02e6595c76	Revert r253160. It caused a layering violation. Reproducible with BUILD_SHARED_LIBS=ON. llvm-svn: 253163	2015-11-15 12:19:11 +00:00
Igor Breger	3ec0d86d6a	AVX512: Implemented encoding and intrinsics for VMOVSHDUP/VMOVSLDUP instructions. Differential Revision: http://reviews.llvm.org/D14322 llvm-svn: 253160	2015-11-15 07:23:13 +00:00
Simon Pilgrim	82c137eccb	[X86][SSE] Fixed arch/triple and regenerated results. Tidyup before diffs from new patch. llvm-svn: 253144	2015-11-14 20:42:01 +00:00
Simon Pilgrim	0cf8ee9f6e	[X86][SSE] Added extra vector truncation tests Baseline comparison to D14588 llvm-svn: 253132	2015-11-14 15:23:59 +00:00
Quentin Colombet	a869c64da5	[ShrinkWrapping] Disable the optimization for functions with sanitize like attribute. Even if the target supports shrink-wrapping, the prologue and epilogue must not move because a crash can happen anywhere and sanitizers need to be able to unwind from the PC of the crash. llvm-svn: 253116	2015-11-14 01:55:17 +00:00
Reid Kleckner	21fb9398ce	[WinEH] Fix ESP management with 32-bit __CxxFrameHandler3 The C++ EH personality automatically restores ESP from the C++ EH registration node after a catchret. I mistakenly thought it was like SEH, which does not restore ESP. It makes sense for C++ EH to differ from SEH here because SEH does not use funclets for catches, and does not allow catching inside of finally. C++ EH may need to unwind through multiple catch funclets and eventually catchret to some outer funclet. Therefore, the runtime has to keep track of which ESP to use with catchret, rather than having the compiler reload it manually. llvm-svn: 253084	2015-11-13 21:27:00 +00:00
Cong Hou	2d7895e79a	[X86][SSE] Combine UNPCKL with vector_shuffle into UNPCKH to save one instruction for sext from v16i8 to v16i16 and v8i16 to v8i32. This patch is enabling combining UNPCKL with vector_shuffle that moves the upper half of a vector into the lower half, into a UNPCKH instruction. For example: t2: v16i8 = vector_shuffle<8,9,10,11,12,13,14,15,u,u,u,u,u,u,u,u> t1, undef:v16i8 t3: v16i8 = X86ISD::UNPCKL undef:v16i8, t2 will be combined to: t3: v16i8 = X86ISD::UNPCKH undef:v16i8, t1 Differential revision: http://reviews.llvm.org/D14399 llvm-svn: 253067	2015-11-13 19:47:43 +00:00
Reid Kleckner	031d7d6009	Add missing triple to WinEH test case llvm-svn: 253062	2015-11-13 19:11:12 +00:00
Reid Kleckner	1584024037	[WinEH] Make UnwindHelp a fixed stack object allocated after XMM CSRs Now the offset of UnwindHelp in our EH tables and the offset that we store to in the prologue agree. llvm-svn: 253059	2015-11-13 19:06:01 +00:00
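
The cmpl/sbbl/jge sequence quoted in commit 1ead7346cd (r253572) can be checked with a small model. The following is only a sketch in Python, not LLVM code: the function names are hypothetical, and the flag handling is an assumption based on the usual x86 semantics (cmp sets the carry flag on an unsigned borrow, sbb subtracts with that borrow, and jge is taken when the sign flag equals the overflow flag).

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def split64(x):
    """Split a 64-bit value into (low, high) unsigned 32-bit halves."""
    x &= MASK64
    return x & MASK32, (x >> 32) & MASK32


def to_signed32(x):
    """Reinterpret an unsigned 32-bit value as a signed one."""
    return x - (1 << 32) if x & 0x80000000 else x


def slt64_via_cmp_sbb(a, b):
    """Model `icmp slt i64 a, b` on a 32-bit target (hypothetical helper).

    Mirrors: cmpl b_lo, a_lo ; sbbl b_hi, a_hi ; jge not_less
    """
    a_lo, a_hi = split64(a)
    b_lo, b_hi = split64(b)

    # cmpl: computes a_lo - b_lo; only the carry (borrow) flag is needed.
    carry = 1 if a_lo < b_lo else 0

    # sbbl: computes a_hi - b_hi - carry and sets the sign/overflow flags.
    wrapped = (a_hi - b_hi - carry) & MASK32
    exact = to_signed32(a_hi) - to_signed32(b_hi) - carry
    sign = (wrapped >> 31) & 1
    overflow = 1 if exact != to_signed32(wrapped) else 0

    # jge is taken when SF == OF, so "less than" is SF != OF.
    return sign != overflow
```

The point of the sketch is that one borrow bit from the low halves is all the high-half subtraction needs, which is why the two setcc instructions and the extra branch in the "before" sequence are unnecessary.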


6683 Commits