llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 00:12:50 +01:00

Author	SHA1	Message	Date
Jakob Stoklund Olesen	914857b29a	Remove the -live-regunits command line option. Register allocators depend on it being permanently enabled now. llvm-svn: 158873	2012-06-20 23:31:34 +00:00
Jakob Stoklund Olesen	6d2db5c3d9	Only update regunit live ranges that have been precomputed. Regunit live ranges are computed on demand, so when mi-sched calls handleMove, some regunits may not have live ranges yet. That makes updating them easier: Just skip the non-existing ranges. They will be computed correctly from the rescheduled machine code when they are needed. llvm-svn: 158831	2012-06-20 18:00:57 +00:00
Craig Topper	d63e429d68	Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit vector. Original patch by Elena Demikhovsky. Tweaked by me to allow possibility of covering more cases. llvm-svn: 158792	2012-06-20 05:39:26 +00:00
Rafael Espindola	0f267bbe04	really add a triple :-( llvm-svn: 158696	2012-06-19 02:17:35 +00:00
Rafael Espindola	1cc3be37a0	Add a triple to the test. llvm-svn: 158695	2012-06-19 01:42:34 +00:00
Rafael Espindola	38c45a939d	Move the support for using .init_array from ARM to the generic TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM, on X86 it is not easy to find out if .init_array should be used or not, so the decision is made via TargetOptions and defaults to off. Add a command line option to llc that enables it. llvm-svn: 158692	2012-06-19 00:48:28 +00:00
Chandler Carruth	1c3df655ea	Add a regression test for the bug exposed by r158087, which has been temporarily reverted. This test is annoyingly overspecified, but I don't know of another way to thoroughly test the saving and restoring of the registers. While this will have to be adjusted even with the issue fixed in order to re-apply r158087, those adjustments should very clearly indicate that it is still correct (%esp getting restored prior to pops), whereas without it, this case can easily slip under the radar. Still, any suggestions for improvements are very welcome. All credit to Matt Beaumont-Gay for reducing this out of an insane Address Sanitizer crash to a reasonably small seg-faulting C program when built with -mstackrealign. I just reduced it to IR, which was much simpler. =] llvm-svn: 158656	2012-06-18 09:15:04 +00:00
Chandler Carruth	d2716ae111	Temporarily revert r158087. This patch causes problems when both dynamic stack realignment and dynamic allocas combine in the same function. With this patch, we no longer build the epilog correctly, and silently restore registers from the wrong position in the stack. Thanks to Matt for tracking this down, and getting at least an initial test case to Chad. I'm going to try to check a variation of that test case in so we can easily track the fixes required. llvm-svn: 158654	2012-06-18 07:03:12 +00:00
Craig Topper	b2299168d3	Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them. llvm-svn: 158396	2012-06-13 07:18:53 +00:00
Craig Topper	ad5e38e410	Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument. llvm-svn: 158278	2012-06-09 16:46:13 +00:00
Jakob Stoklund Olesen	ce0f9aef12	Don't run RAFast in the optimizing regalloc pipeline. The fast register allocator is not supposed to work in the optimizing pipeline. It doesn't make sense to compute live intervals, run full copy coalescing, and then run RAFast. Fast register allocation in the optimizing pipeline is better done by RABasic. llvm-svn: 158242	2012-06-08 23:15:12 +00:00
Manman Ren	40901656e6	Test case for r158160 llvm-svn: 158218	2012-06-08 18:42:37 +00:00
Manman Ren	f51a6d5fae	X86: optimize generated code for integer ABS This patch will generate the following for integer ABS: movl %edi, %eax negl %eax cmovll %edi, %eax INSTEAD OF movl %edi, %ecx sarl $31, %ecx leal (%rdi,%rcx), %eax xorl %ecx, %eax There exists a target-independent DAG combine for integer ABS, which converts integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. This is implemented in PerformXorCombine. rdar://10695237 llvm-svn: 158175	2012-06-07 22:39:10 +00:00
Rafael Espindola	4c9d611360	Use a base register instead of an index register with the local dynamic model. Fixes pr13048. llvm-svn: 158158	2012-06-07 18:39:19 +00:00
Manman Ren	1d91fc3342	X86: replace SUB with CMP if possible This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 llvm-svn: 158126	2012-06-07 00:42:47 +00:00
Manman Ren	f591de61da	Revert r157755. The commit is intended to fix rdar://11540023. It is implemented as part of peephole optimization. We can actually implement this in the SelectionDAG lowering phase. llvm-svn: 158122	2012-06-06 23:53:03 +00:00
Chad Rosier	5a354cd5e8	Add support for dynamic stack realignment in the presence of dynamic allocas on X86. rdar://11496434 llvm-svn: 158087	2012-06-06 17:37:40 +00:00
Nadav Rotem	6969a673e9	Remove the "-promote-elements" flag. This flag is now enabled by default. llvm-svn: 157925	2012-06-04 11:27:21 +00:00
Craig Topper	5837bcfc02	Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang. llvm-svn: 157903	2012-06-03 18:58:46 +00:00
Craig Topper	8d3031fa46	Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit. llvm-svn: 157898	2012-06-03 07:26:46 +00:00
Manman Ren	c3a6de9953	Revert r157831 llvm-svn: 157896	2012-06-03 03:14:24 +00:00
Craig Topper	685b86b007	Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior. llvm-svn: 157895	2012-06-03 01:40:43 +00:00
Manman Ren	74ccc117d5	X86: peephole optimization to remove cmp instruction This patch will optimize the following: sub r1, r3 cmp r3, r1 or cmp r1, r3 bge L1 TO sub r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can eliminate the "cmp" instruction. llvm-svn: 157831	2012-06-01 19:49:33 +00:00
Chris Lattner	773aa116f2	testcase for PR13006, thanks to Duncan for filing it. llvm-svn: 157824	2012-06-01 18:19:46 +00:00
Hans Wennborg	4344ad4a86	Implement the local-dynamic TLS model for x86 (PR3985) This implements codegen support for accesses to thread-local variables using the local-dynamic model, and adds a clean-up pass so that the base address for the TLS block can be re-used between local-dynamic access on an execution path. llvm-svn: 157818	2012-06-01 16:27:21 +00:00
Craig Topper	02e5a00a70	Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enablle intrinsic use at least though. llvm-svn: 157804	2012-06-01 06:07:48 +00:00
Chris Lattner	8b71fa09ec	enhance the logic for looking through tailcalls to look through transparent casts in multiple-return value scenarios, like what happens on X86-64 when returning small structs. llvm-svn: 157800	2012-06-01 05:29:15 +00:00
Chris Lattner	9f7a47d886	enhance getNoopInput to know about vector<->vector bitcasts of legal types, as well as int<->ptr casts. This allows us to tailcall functions with some trivial casts between the call and return (i.e. because the return types disagree). llvm-svn: 157798	2012-06-01 05:16:33 +00:00
Chris Lattner	748ecee801	add some simple 64-bit tail call tests. llvm-svn: 157797	2012-06-01 05:03:31 +00:00
Chris Lattner	f4a7be0c55	merge some tests. llvm-svn: 157795	2012-06-01 05:00:54 +00:00
Chris Lattner	39d2bf08f8	rename test llvm-svn: 157794	2012-06-01 04:58:50 +00:00
Manman Ren	82e2c9debf	X86: replace SUB with CMP if possible This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 llvm-svn: 157755	2012-05-31 17:20:29 +00:00
Elena Demikhovsky	194da7364d	Added FMA3 Intel instructions. I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks. I added tests for GodeGen and intrinsics. I did not change llvm.fma.f32/64 - it may be done later. llvm-svn: 157737	2012-05-31 09:20:20 +00:00
Craig Topper	c6bd90e646	Add intrinsic for pclmulqdq instruction. llvm-svn: 157731	2012-05-31 04:37:40 +00:00
Jakob Stoklund Olesen	8a558631c0	Prioritize smaller register classes for urgent evictions. It helps compile exotic inline asm. In the test case, normal GR32 virtual registers use up eax-edx so the final GR32_ABCD live range has no registers left. Since all the live ranges were tiny, we had no way of prioritizing the smaller register class. This patch allows tiny unspillable live ranges to be evicted by tiny unspillable live ranges from a smaller register class. <rdar://problem/11542429> llvm-svn: 157715	2012-05-30 21:46:58 +00:00
Chris Lattner	1d39403e7b	it's pointed out that R11 can be used for magic things, and doing things just for 64-bit registers is silly. Just optimize 3 more. llvm-svn: 157699	2012-05-30 18:08:02 +00:00
Chris Lattner	505bcf70a0	Extend the (abi-irrelevant) return convention to be able to return more than two values in integer registers. This is already supported by the fastcc convention, but it doesn't hurt to support it in the standard conventions as well. In cases where we can cheat at the calling convention, this allows us to avoid returning things through memory in more cases. llvm-svn: 157698	2012-05-30 17:50:14 +00:00
Benjamin Kramer	0c823ae0ed	Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions. This required light surgery on the assembler and disassembler because the instructions use an uncommon encoding. They are the only two instructions in x86 that use register operands and two immediates. llvm-svn: 157634	2012-05-29 19:05:25 +00:00
Chris Lattner	3ae033bbe9	These tests used intrinsics with the wrong prototype. They weren't caught because the old verifier just checked that something "was a pointer", but not that the pointee was correct. llvm-svn: 157544	2012-05-27 19:35:41 +00:00
Benjamin Kramer	07f18e78ba	SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights. SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move the most likely condition to the front so it is checked first and the others can be skipped. This is currently not as effective as it could be because SimplifyCFG destroys profiling metadata when merging branches and switches. Merging branch weight metadata is tricky though. This code touches at most 3 cases so I didn't use a proper sorting algorithm. llvm-svn: 157521	2012-05-26 20:01:32 +00:00
NAKAMURA Takumi	ad733c5014	test/CodeGen/X86/bigstructret.ll: Suppress one test. It is msvc-incompatible. (compatible to mingw32 and netbsd, though) llvm-svn: 157474	2012-05-25 15:40:54 +00:00
NAKAMURA Takumi	93dbb2e681	test/CodeGen/X86/bigstructret.ll: Relax stack offsets for hosts of stack-align=8, eg. win32 and netbsd. llvm-svn: 157471	2012-05-25 15:12:21 +00:00
Eli Friedman	d89582030a	Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process. llvm-svn: 157446	2012-05-25 00:09:29 +00:00
David Blaikie	74e7769ec9	Fix for CHECK-NOT misspelling. Patch by Nicklas Bo Jensen. llvm-svn: 157421	2012-05-24 22:08:29 +00:00
Jakob Stoklund Olesen	ce44a7a9ae	Correctly deal with identity copies in RegisterCoalescer. Now that the coalescer keeps live intervals and machine code in sync at all times, it needs to deal with identity copies differently. When merging two virtual registers, all identity copies are removed right away. This means that other identity copies must come from somewhere else, and they are going to have a value number. Deal with such copies by merging the value numbers before erasing the copy instruction. Otherwise, we leave dangling value numbers in the live interval. This fixes PR12927. llvm-svn: 157340	2012-05-23 20:21:06 +00:00
Nuno Lopes	944814b41a	revert my previous patches that introduced an additional parameter to the objectsize intrinsic. After a lot of discussion, we realized it's not the best option for run-time bounds checking llvm-svn: 157255	2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen	151b044b75	Only erase virtregs with no uses left. Also make sure registers aren't erased twice if the dead def mentions the register twice. This fixes PR12911. llvm-svn: 157254	2012-05-22 14:52:12 +00:00
Craig Topper	57d54b1831	Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces. llvm-svn: 157175	2012-05-21 06:40:16 +00:00
Peter Collingbourne	c9e3c75c4d	When legalising shifts, do not pre-build a list of operands which may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve the other operands when calling UpdateNodeOperands. Fixes PR12889. llvm-svn: 157162	2012-05-20 18:36:15 +00:00
Jakob Stoklund Olesen	ed19f92618	Properly constrain register classes for sub-registers. Not all GR64 registers have sub_8bit sub-registers. llvm-svn: 157150	2012-05-20 06:38:37 +00:00

1 2 3 4 5 ...

3381 Commits