llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00

Author	SHA1	Message	Date
Chris Lattner	f9d9976374	fix an oversight which caused us to compile the testcase (and other less trivial things) into a dummy lea. Before we generated: _test: ## @test movq _G@GOTPCREL(%rip), %rax leaq (%rax), %rax ret now we produce: _test: ## @test movq _G@GOTPCREL(%rip), %rax ret This is part of rdar://9289558 llvm-svn: 129662	2011-04-17 17:12:08 +00:00
Chris Lattner	5e00f501ff	Fix rdar://9289512 - not folding load into compare at -O0 The basic issue here is that bottom-up isel is matching the branch and compare, and was failing to fold the load into the branch/compare combo. Fixing this (by allowing folding into any instruction of a sequence that is selected) allows us to produce things like: cmpb $0, 52(%rax) je LBB4_2 instead of: movb 52(%rax), %cl cmpb $0, %cl je LBB4_2 This makes the generated -O0 code run a bit faster, but also speeds up compile time by putting less pressure on the register allocator and generating less code. This was one of the biggest classes of missing load folding. Implementing this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm) line count. llvm-svn: 129656	2011-04-17 06:35:44 +00:00
Chris Lattner	1fe5f78b7e	split a complex predicate out to a helper function. Simplify two for loops, which don't need to check for falling off the end of a block and end of phi nodes, since terminators are never phis. llvm-svn: 129655	2011-04-17 06:03:19 +00:00
Chris Lattner	cb194276e0	fix rdar://9289583 - fast isel should handle non-canonical commutative binops allowing us to fold the immediate into the 'and' in this case: int test1(int i) { return 8&i; } llvm-svn: 129653	2011-04-17 01:16:47 +00:00
Eli Friedman	2798137293	PR9055: extend the fix to PR4050 (r70179) to apply to zext and anyext. Returning a new node makes the code try to replace the old node, which in the included testcase is killed by CSE. llvm-svn: 129650	2011-04-16 23:25:34 +00:00
Evan Cheng	b720f37282	Fix divmod libcall lowering. Convert to {S\|U}DIVREM first and then expand the node to a libcall. rdar://9280991 llvm-svn: 129633	2011-04-16 03:08:26 +00:00
Chris Lattner	0304b82f80	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
Owen Anderson	0ce6c0f86e	Fix another instance of the DAG combiner not using the correct type for the RHS of a shift. llvm-svn: 129522	2011-04-14 17:30:49 +00:00
Andrew Trick	e89c19ab7b	In the pre-RA scheduler, maintain cmp+br proximity. This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet. The primary motivation is to generate code closer to what people expect and rule out missed opportunity from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108. Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion llvm-svn: 129508	2011-04-14 05:15:06 +00:00
Chris Lattner	d4ba43dc76	sink a call into its only use. llvm-svn: 129503	2011-04-14 04:12:47 +00:00
Owen Anderson	d98929ed6c	During post-legalization DAG combining, be careful to only create shifts where the RHS is of the legal type for the new operation. llvm-svn: 129484	2011-04-13 23:22:23 +00:00
Andrew Trick	916e01c917	Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency. Additional fixes: Do something reasonable for subtargets with generic itineraries by handle node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero cycle stage or it is a TokenFactor chain. Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the ndoe latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129421	2011-04-13 00:38:32 +00:00
Andrew Trick	d83e7b6a5d	Revert 129383. It causes some targets to hit a scheduler assert. llvm-svn: 129385	2011-04-12 20:14:07 +00:00
Andrew Trick	1e0821075d	PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency. UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make these heuristic adjustments to node latency work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129383	2011-04-12 19:54:36 +00:00
Jay Foad	0d5ca4cf44	Don't include Operator.h from InstrTypes.h. llvm-svn: 129271	2011-04-11 09:35:34 +00:00
Chris Lattner	e8dfbaef19	Avoid excess precision issues that lead to generating host-compiler-specific code. Switch lowering probably shouldn't be using FP for this. This resolves PR9581. llvm-svn: 129199	2011-04-09 06:57:13 +00:00
Chris Lattner	badb8ca63c	have dag combine zap "store undef", which can be formed during call lowering with undef arguments. llvm-svn: 129185	2011-04-09 02:32:02 +00:00
Evan Cheng	bc053100af	Change -arm-trap-func= into a non-arm specific option. Now Intrinsic::trap is lowered into a call to the specified trap function at sdisel time. llvm-svn: 129152	2011-04-08 21:37:21 +00:00
Andrew Trick	36a1759769	Added a check in the preRA scheduler for potential interference on a induction variable. The preRA scheduler is unaware of induction vars, so we look for potential "virtual register cycles" instead. Fixes <rdar://problem/8946719> Bad scheduling prevents coalescing llvm-svn: 129100	2011-04-07 19:54:57 +00:00
Bill Wendling	a8db395dc1	Revamp the SjLj "dispatch setup" intrinsic. It needed to be moved closer to the setjmp statement, because the code directly after the setjmp needs to know about values that are on the stack. Also, the 'bitcast' of the function context was causing a dead load. This wouldn't be too horrible, except that at -O0 it wasn't optimized out, and because it wasn't using the correct base pointer (if there is a VLA), it would try to access a value from a garbage address. <rdar://problem/9130540> llvm-svn: 128873	2011-04-05 01:37:43 +00:00
Stuart Hastings	1635b37415	Revert 123704; it broke threaded LLVM. llvm-svn: 128868	2011-04-05 00:37:28 +00:00
Cameron Zwarich	2748634089	Add a RemoveFromWorklist method to DCI. This is needed to do some complicated transformations in target-specific DAG combines without causing DAGCombiner to delete the same node twice. If you know of a better way to avoid this (see my next patch for an example), please let me know. llvm-svn: 128758	2011-04-02 02:40:26 +00:00
Evan Cheng	28382f9178	Add comments. llvm-svn: 128730	2011-04-01 19:57:01 +00:00
Evan Cheng	13c73e4836	Assign node order numbers to results of call instruction lowering. This should improve src line debug info when sdisel is used. rdar://9199118 llvm-svn: 128728	2011-04-01 19:42:22 +00:00
Evan Cheng	39574b2766	Issue libcalls __udivmodi4 / __divmodi4 for div / rem pairs. rdar://8911343 llvm-svn: 128696	2011-04-01 00:42:02 +00:00
Benjamin Kramer	f9e1ba7398	Turn SelectionDAGBuilder::GetRegistersForValue into a local function. It couldn't be used outside of the file because SDISelAsmOperandInfo is local to SelectionDAGBuilder.cpp. Making it a static function avoids a weird linkage dance. llvm-svn: 128342	2011-03-26 16:35:10 +00:00
Andrew Trick	651a3701f9	Fix for -pre-RA-sched=source. Yet another case of unchecked NULL node (for physreg copy). May fix PR9509. llvm-svn: 128266	2011-03-25 06:40:55 +00:00
Eli Friedman	76fcfaab12	PR9535: add support for splitting and scalarizing vector ISD::FP_ROUND. Also cleaning up some duplicated code while I'm here. llvm-svn: 128176	2011-03-23 22:18:48 +00:00
Andrew Trick	b702dae9b2	Ensure that def-side physreg copies are scheduled above any other uses so the scheduler can't create new interferences on the copies themselves. Prior to this fix the scheduler could get stuck in a loop creating copies. Fixes PR9509. llvm-svn: 128164	2011-03-23 20:42:39 +00:00
Andrew Trick	ca42e62048	whitespace llvm-svn: 128163	2011-03-23 20:40:18 +00:00
Andrew Trick	d9c599d01c	Added block number and name to isel debug output. I'm tired of doing this manually for each checkout. If anyone knows a better way debug isel for non-trivial tests feel free to revert and let me know how to do it. llvm-svn: 128132	2011-03-23 01:38:28 +00:00
Eric Christopher	44e3d3c26e	Grammar-o. llvm-svn: 128004	2011-03-21 18:06:21 +00:00
Nadav Rotem	92561196b7	Add support for legalizing UINT_TO_FP of vectors on platforms which do not have native support for this operation (such as X86). The legalized code uses two vector INT_TO_FP operations and is faster than scalarizing. llvm-svn: 127951	2011-03-19 13:09:10 +00:00
Benjamin Kramer	52ffb6ea96	BuildUDIV: If the divisor is even we can simplify the fixup of the multiplied value by introducing an early shift. This allows us to compile "unsigned foo(unsigned x) { return x/28; }" into shrl $2, %edi imulq $613566757, %rdi, %rax shrq $32, %rax ret instead of movl %edi, %eax imulq $613566757, %rax, %rcx shrq $32, %rcx subl %ecx, %eax shrl %eax addl %ecx, %eax shrl $4, %eax on x86_64 llvm-svn: 127829	2011-03-17 20:39:14 +00:00
Cameron Zwarich	cea63dc052	Move more logic into getTypeForExtArgOrReturn. llvm-svn: 127809	2011-03-17 14:53:37 +00:00
Cameron Zwarich	a5746339cc	Rename getTypeForExtendedInteger() to getTypeForExtArgOrReturn(). llvm-svn: 127807	2011-03-17 14:21:56 +00:00
Cameron Zwarich	2bb1e45ea3	The x86-64 ABI says that a bool is only guaranteed to be sign-extended to a byte rather than an int. Thankfully, this only causes LLVM to miss optimizations, not generate incorrect code. This just fixes the zext at the return. We still insert an i32 ZextAssert when reading a function's arguments, but it is followed by a truncate and another i8 ZextAssert so it is not optimized. llvm-svn: 127766	2011-03-16 22:20:18 +00:00
Cameron Zwarich	860d06739b	Don't recompute something that we already have in a local variable. llvm-svn: 127764	2011-03-16 22:20:07 +00:00
Evan Cheng	bac3e87eaa	sext(undef) = 0, because the top bits will all be the same. zext(undef) = 0, because the top bits will be zero. llvm-svn: 127649	2011-03-15 02:22:10 +00:00
Evan Cheng	50f2d406ec	BIT_CONVERT has been renamed to BITCAST. llvm-svn: 127600	2011-03-14 18:19:52 +00:00
Evan Cheng	cb70b9e80b	Minor optimization. sign-ext/anyext of undef is still undef. llvm-svn: 127598	2011-03-14 18:15:55 +00:00
Owen Anderson	78afadfa5d	Teach FastISel to support register-immediate-immediate instructions. llvm-svn: 127496	2011-03-11 21:33:55 +00:00
Andrew Trick	6aa37a4a2b	Replace -dag-chain-limit flag with constant. It has survived a release cycle without being touched, so no longer needs to pollute the hidden-help text. llvm-svn: 127468	2011-03-11 17:46:59 +00:00
Evan Cheng	d5d2d4a158	Avoid replacing the value of a directly stored load with the stored value if the load is indexed. rdar://9117613. llvm-svn: 127440	2011-03-11 00:48:56 +00:00
Evan Cheng	a3a7a7e364	Re-commit 127368 and 127371. They are exonerated. llvm-svn: 127380	2011-03-10 00:16:32 +00:00
Evan Cheng	d7a2008a55	Revert 127368 and 127371 for now. llvm-svn: 127376	2011-03-09 23:53:17 +00:00
Evan Cheng	b717770dfe	Change the definition of TargetRegisterInfo::getCrossCopyRegClass to be more flexible. If it returns a register class that's different from the input, then that's the register class used for cross-register class copies. If it returns a register class that's the same as the input, then no cross- register class copies are needed (normal copies would do). If it returns null, then it's not at all possible to copy registers of the specified register class. llvm-svn: 127368	2011-03-09 22:47:38 +00:00
Andrew Trick	e529ddb2d7	Improve pre-RA-sched register pressure tracking for duplicate operands. This helps cases like 2008-07-19-movups-spills.ll, but doesn't have an obvious impact on benchmarks llvm-svn: 127347	2011-03-09 19:12:43 +00:00
Benjamin Kramer	4cf03850a2	Fix typo, make helper static. llvm-svn: 127335	2011-03-09 16:19:12 +00:00
Eric Christopher	d3c7e834e8	Fix some latent bugs if the nodes are unschedulable. We'd gotten away with this before since none of the register tracking or nightly tests had unschedulable nodes. This should probably be refixed with a special default Node that just returns some "don't touch me" values. Fixes PR9427 llvm-svn: 127263	2011-03-08 19:35:47 +00:00

1 2 3 4 5 ...

4888 Commits