llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 14:02:52 +02:00

Author	SHA1	Message	Date
Jakob Stoklund Olesen	83d6dda738	Delete stale comment. llvm-svn: 144542	2011-11-14 18:03:05 +00:00
Chandler Carruth	462bb16130	Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on the sum of the edge weights not overflowing uint32, and crashed when they did. This is generally safe as BranchProbabilityInfo tries to provide this guarantee. However, the CFG can get modified during codegen in a way that grows the sum of the edge weights. This doesn't seem unreasonable (imagine just adding more blocks all with the default weight of 16), but it is hard to come up with a case that actually triggers 32-bit overflow. Fortunately, the single-source GCC build is good at this. The solution isn't very pretty, but it's no worse than the previous code. We're already summing all of the edge weights on each query, we can sum them, check for an overflow, compute a scale, and sum them again. I've included a greatly reduced test case out of the GCC source that triggers it. It's a pretty lame test, as it clearly is just barely triggering the overflow. I'd like to have something that is much more definitive, but I don't understand the fundamental pattern that triggers an explosion in the edge weight sums. The buggy code is duplicated within this file. I'll collapse them into a single implementation in a subsequent commit. llvm-svn: 144526	2011-11-14 08:50:16 +00:00
Chad Rosier	0e5094ca87	Add support for ARM halfword load/stores and signed byte loads with negative offsets. rdar://10412592 llvm-svn: 144518	2011-11-14 04:09:28 +00:00
Chandler Carruth	b7f21af176	Teach machine block placement to cope with unnatural loops. These don't get loop info structures associated with them, and so we need some way to make forward progress selecting and placing basic blocks. The technique used here is pretty brutal -- it just scans the list of blocks looking for the first unplaced candidate. It keeps placing blocks like this until the CFG becomes tractable. The cost is somewhat unfortunate, it requires allocating a vector of all basic block pointers eagerly. I have some ideas about how to simplify and optimize this, but I'm trying to get the logic correct first. Thanks to Benjamin Kramer for the reduced test case out of GCC. Sadly there are other bugs that GCC is tickling that I'm reducing and working on now. llvm-svn: 144516	2011-11-14 00:00:35 +00:00
Chandler Carruth	e67c92282f	Rewrite #3 of machine block placement. This is based somewhat on the second algorithm, but only loosely. It is more heavily based on the last discussion I had with Andy. It continues to walk from the inner-most loop outward, but there is a key difference. With this algorithm we ensure that as we visit each loop, the entire loop is merged into a single chain. At the end, the entire function is treated as a "loop", and merged into a single chain. This chain forms the desired sequence of blocks within the function. Switching to a single algorithm removes my biggest problem with the previous approaches -- they had different behavior depending on which system triggered the layout. Now there is exactly one algorithm and one basis for the decision making. The other key difference is how the chain is formed. This is based heavily on the idea Andy mentioned of keeping a worklist of blocks that are viable layout successors based on the CFG. Having this set allows us to consistently select the best layout successor for each block. It is expensive though. The code here remains very rough. There is a lot that needs to be done to clean up the code, and to make the runtime cost of this pass much lower. Very much WIP, but this was a giant chunk of code and I'd rather folks see it sooner than later. Everything remains behind a flag of course. I've added a couple of tests to exercise the issues that this iteration was motivated by: loop structure preservation. I've also fixed one test that was exhibiting the broken behavior of the previous version. llvm-svn: 144495	2011-11-13 11:20:44 +00:00
Chad Rosier	58ab241006	The order in which the predicate is added differs between Thumb and ARM mode. Fix predicate when in ARM mode and restore SelectIntrinsicCall. llvm-svn: 144494	2011-11-13 09:44:21 +00:00
Chad Rosier	8cfccc356e	Temporarily disable SelectIntrinsicCall when in ARM mode. This is causing failures. llvm-svn: 144492	2011-11-13 05:14:43 +00:00
Chad Rosier	acd199b5a4	Add support for emitting both signed- and zero-extend loads. Fix SimplifyAddress to handle either a 12-bit unsigned offset or the ARM +/-imm8 offsets (addressing mode 3). This enables a load followed by an integer extend to be folded into a single load. For example: ldrb r1, [r0]; uxtb r2, r1; mov r3, r2 => ldrb r1, [r0]; mov r3, r1 llvm-svn: 144488	2011-11-13 02:23:59 +00:00
Jakob Stoklund Olesen	3eaaa93104	Remove the -color-ss-with-regs option. It was off by default. The new register allocators don't have the problems that made it necessary to reallocate registers during stack slot coloring. llvm-svn: 144481	2011-11-13 00:31:23 +00:00
Jakob Stoklund Olesen	d0ddec5771	Delete the 'standard' spiller which used the old spilling framework. The current register allocators all use the inline spiller. llvm-svn: 144477	2011-11-12 23:29:02 +00:00
Jakob Stoklund Olesen	bb527a67c0	Remove histogram tests. Counting the number of occurrences of each opcode is not a useful test. llvm-svn: 144474	2011-11-12 22:39:40 +00:00
Jakob Stoklund Olesen	9195bec6e7	RAGreedy is better about hinting now. Or maybe we are just getting lucky. llvm-svn: 144473	2011-11-12 22:39:37 +00:00
Jakob Stoklund Olesen	4aa9c6888f	Linear scan is going away. llvm-svn: 144472	2011-11-12 22:39:34 +00:00
Jakob Stoklund Olesen	e1b1bbb882	XFAIL test that depends on linear scan to remove dead code. Filed PR11364 to track the problem. Should the register allocator eliminate dead code? llvm-svn: 144471	2011-11-12 22:39:30 +00:00
Jakob Stoklund Olesen	43b7a3871b	Remove obsolete test. This test was committed with a bugfix to RemoveCopyByCommutingDef, but that optimization is no longer triggered by this test. llvm-svn: 144470	2011-11-12 22:39:27 +00:00
Jakob Stoklund Olesen	6a290484cb	Remove obsolete test. This test is for a very specific LocalRewriter bug. LocalRewriter is going away. llvm-svn: 144469	2011-11-12 22:39:24 +00:00
Jakob Stoklund Olesen	005eabf28a	Remove obsolete test. I don't think this test does what it was supposed to do, and LocalRewriter is going away anyway. llvm-svn: 144463	2011-11-12 20:37:57 +00:00
Jakob Stoklund Olesen	c11d7a9b4d	Eliminate more linear scan tests. llvm-svn: 144462	2011-11-12 20:35:26 +00:00
Jakob Stoklund Olesen	0fe59856fd	Switch a couple -O0 tests to RABasic. llvm-svn: 144461	2011-11-12 20:11:04 +00:00
Jakob Stoklund Olesen	94ce588b20	Switch a few tests off linearscan. llvm-svn: 144460	2011-11-12 19:53:52 +00:00
Jakob Stoklund Olesen	f8fed2a3a7	Delete old test of a VirtRegRewriter feature. This test doesn't expose the issue with RAGreedy. I filed PR11363 to track the missing InlineSpiller feature. llvm-svn: 144459	2011-11-12 19:53:48 +00:00
Jakob Stoklund Olesen	49118cf9a5	Remove old test that doesn't make sense. The test is checking that the output doesn't contain any 'mov ' strings. It does contain movl, though. llvm-svn: 144458	2011-11-12 19:53:45 +00:00
Craig Topper	0458cdf64a	Add more AVX2 shift lowering support. Move AVX2 variable shift to use patterns instead of custom lowering code. llvm-svn: 144457	2011-11-12 09:58:49 +00:00
Eli Friedman	8563e57e38	Don't try to form pre/post-indexed loads/stores until after LegalizeDAG runs. Fixes PR11029. llvm-svn: 144438	2011-11-12 00:35:34 +00:00
Chad Rosier	a2a0fbeded	Add support in fast-isel for selecting memset/memcpy/memmove intrinsics. llvm-svn: 144426	2011-11-11 23:31:03 +00:00
Chad Rosier	88ab27405f	Loosen test by using REs. Approved by Devang. llvm-svn: 144425	2011-11-11 23:25:38 +00:00
Andrew Trick	6ff75a5d8d	Preserve MachineMemOperands in ARMLoadStoreOptimizer. Fixes PR8113. llvm-svn: 144409	2011-11-11 22:18:09 +00:00
Dan Bailey	ad6c209a79	allow non-device function calls in PTX when natively handling device-side printf llvm-svn: 144388	2011-11-11 14:45:12 +00:00
Craig Topper	50df7c3842	Add lowering for AVX2 shift instructions. llvm-svn: 144380	2011-11-11 07:39:23 +00:00
Chad Rosier	feb72bfc08	Add support for using immediates with select instructions. rdar://10412592 llvm-svn: 144376	2011-11-11 06:20:39 +00:00
Eli Friedman	285b451941	Make sure to expand SIGN_EXTEND_INREG for NEON vectors. PR11319, round 3. llvm-svn: 144361	2011-11-11 03:16:38 +00:00
Chad Rosier	ac92994773	Add support for using MVN to materialize negative constants. rdar://10412592 llvm-svn: 144348	2011-11-11 00:36:21 +00:00
Chad Rosier	7b7dced006	When in ARM mode, LDRH/STRH require special handling of negative offsets. For correctness, disable this for now. rdar://10418009 llvm-svn: 144316	2011-11-10 21:09:49 +00:00
NAKAMURA Takumi	ea14fd81c6	test/CodeGen/X86/lsr-loop-exit-cond.ll: Try to appease linux and freebsd bots by specifying an explicit -mtriple=x86_64-darwin. I guess it expects -relocation-model=pic. llvm-svn: 144290	2011-11-10 14:18:59 +00:00
Evan Cheng	4760ff0763	Use a bigger hammer to fix PR11314 by disabling the "forcing two-address instruction lower optimization" in the pre-RA scheduler. The optimization, or rather the hack, was done before the MI use-list was available. Now we should be able to implement it in a better way, perhaps in the two-address pass until a MI scheduler is available. Now that the scheduler has to backtrack to handle call sequences, adding artificial scheduling constraints is just not safe. Furthermore, the hack is not taking all the other scheduling decisions into consideration, so it's just as likely to pessimize code. So I view disabling this optimization as goodness regardless of PR11314. llvm-svn: 144267	2011-11-10 07:43:16 +00:00
Chad Rosier	69cdae5eb9	For immediate encodings of icmp, zero or sign extend first. Then determine if the value is negative and flip the sign accordingly. rdar://10422026 llvm-svn: 144258	2011-11-10 01:30:39 +00:00
Jakob Stoklund Olesen	bc48cd34b6	Strip old implicit operands after foldMemoryOperand. The TII.foldMemoryOperand hook preserves implicit operands from the original instruction. This is not what we want when those implicit operands refer to the register being spilled. Implicit operands referring to other registers are preserved. This fixes PR11347. llvm-svn: 144247	2011-11-10 00:17:03 +00:00
Eli Friedman	c93f8aa514	Make sure we correctly unroll conversions between v2f64 and v2i32 on ARM. llvm-svn: 144241	2011-11-09 23:36:02 +00:00
Eli Friedman	b01f15653c	Add check so we don't try to perform an impossible transformation. Fixes issue from PR11319. llvm-svn: 144216	2011-11-09 22:25:12 +00:00
Nadav Rotem	ddc6bfa543	AVX2: Add patterns for variable shift operations llvm-svn: 144212	2011-11-09 21:22:13 +00:00
Chad Rosier	228dc76221	Use REs to remove dependencies on the register allocation order. llvm-svn: 144209	2011-11-09 20:06:13 +00:00
Duncan Sands	2934a0eaeb	Speculatively revert commit 144124 (djg) in the hope that the 32 bit dragonegg self-host buildbot will recover (it is complaining about object files differing between different build stages). Original commit message: Add a hack to the scheduler to disable pseudo-two-address dependencies in basic blocks containing calls. This works around a problem in which these artificial dependencies can get tied up in calling sequence scheduling in a way that makes the graph unschedulable with the current approach of using artificial physical register dependencies for calling sequences. This fixes PR11314. llvm-svn: 144188	2011-11-09 14:20:48 +00:00
Nadav Rotem	e66a72a2c4	Add AVX2 support for vselect of v32i8 llvm-svn: 144187	2011-11-09 13:21:28 +00:00
Craig Topper	432dd8d623	Enable execution dependency fix pass for YMM registers when AVX2 is enabled. Add AVX2 logical operations to list of replaceable instructions. llvm-svn: 144179	2011-11-09 09:37:21 +00:00
Craig Topper	7ff77dc2b1	Add instruction selection for AVX2 integer comparisons. llvm-svn: 144176	2011-11-09 08:06:13 +00:00
Craig Topper	d82abb7156	Add AVX2 instruction lowering for add, sub, and mul. llvm-svn: 144174	2011-11-09 07:28:55 +00:00
Chad Rosier	e32fed6868	Add support for encoding immediates in icmp and fcmp. Hopefully, this will remove a fair number of unnecessary materialized constants. rdar://10412592 llvm-svn: 144163	2011-11-09 03:22:02 +00:00
Jakob Stoklund Olesen	1239fed1e2	Collapse DomainValues across loop back-edges. During the initial RPO traversal of the basic blocks, remember the ones that are incomplete because of back-edges from predecessors that haven't been visited yet. After the initial RPO, revisit all those loop headers so the incoming DomainValues on the back-edges can be properly collapsed. This will properly fix execution domains on software pipelined code, like the included test case. llvm-svn: 144151	2011-11-09 01:06:56 +00:00
Dan Gohman	b6cf7c4e94	Add a hack to the scheduler to disable pseudo-two-address dependencies in basic blocks containing calls. This works around a problem in which these artificial dependencies can get tied up in calling sequence scheduling in a way that makes the graph unschedulable with the current approach of using artificial physical register dependencies for calling sequences. This fixes PR11314. llvm-svn: 144124	2011-11-08 21:29:06 +00:00
Evan Cheng	08e61752f2	Add workaround for Cortex-M3 errata 602117 by replacing ldrd x, y, [x] with ldm or ldr pairs. llvm-svn: 144123	2011-11-08 21:21:09 +00:00
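
The overflow fix described in the 462bb16130 entry above (sum the edge weights, check for overflow, compute a scale, sum them again) can be sketched roughly as follows. This is a minimal standalone illustration and not the code from the commit: the names ScaledSum and sumEdgeWeights are hypothetical, and the weights are passed as a plain vector rather than read from a MachineBasicBlock's successor list.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical result type (not an LLVM type): the 32-bit sum plus the
// divisor that was applied to every weight to make the sum fit.
struct ScaledSum {
  uint32_t Sum;
  uint32_t Scale;
};

// Hypothetical helper: sums edge weights in 64 bits and, if the total does
// not fit in 32 bits, divides each weight by a common scale and sums again.
inline ScaledSum sumEdgeWeights(const std::vector<uint32_t> &Weights) {
  const uint64_t Max = std::numeric_limits<uint32_t>::max();

  // First pass: accumulate in 64 bits so the addition itself cannot wrap
  // for any realistic number of edges.
  uint64_t Sum = 0;
  for (uint32_t W : Weights)
    Sum += W;

  // Common case: the sum already fits, so no scaling is needed.
  if (Sum <= Max)
    return {static_cast<uint32_t>(Sum), 1};

  // Pick the smallest scale that guarantees the rescaled sum fits.
  uint32_t Scale = static_cast<uint32_t>(Sum / Max) + 1;

  // Second pass: sum the scaled-down weights.
  Sum = 0;
  for (uint32_t W : Weights)
    Sum += W / Scale;

  assert(Sum <= Max && "scaled sum must fit in 32 bits");
  return {static_cast<uint32_t>(Sum), Scale};
}
```

A caller computing a branch probability would then divide the individual edge weight by the returned Scale before comparing it against Sum, so that numerator and denominator stay consistent.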


5278 Commits