llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00

Author	SHA1	Message	Date
Evan Cheng	b7d41a6680	Mark MOV8mr_NOREX and MOV8rm_NOREX as mayStore / mayLoad respectively. llvm-svn: 70461	2009-04-30 00:58:57 +00:00
Chris Lattner	794fb5b4b3	fix a regression handling indirect results: these need to be considered memory operands otherwise the writebacks get lost when the inline asm doesn't otherwise have side effects. This fixes rdar://6839427, though clang really shouldn't generate these anymore. llvm-svn: 70455	2009-04-30 00:48:50 +00:00
Nate Begeman	b407809122	Fix infinite recursion in the C++ code which handles movddup by making it unnecessary. llvm-svn: 70425	2009-04-29 22:47:44 +00:00
Evan Cheng	62fdc300dd	spillPhysRegAroundRegDefsUses() may have invalidated iterators stored in fixed_ IntervalPtrs. Reset them. llvm-svn: 70378	2009-04-29 07:16:34 +00:00
Chris Lattner	e1eefefdc3	Disable the load-shrinking optimization from looking at anything larger than 64-bits, avoiding a crash. This should really be fixed to use APInts, though type legalization happens to help us out and we get good code on the attached testcase at least. This fixes rdar://6836460 llvm-svn: 70360	2009-04-29 03:45:07 +00:00
Bill Wendling	7546bed590	Second attempt: Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to use the old behavior, the flag is -O0. This change allows for finer-grained control over which optimizations are run at different -O levels. Most of this work was pretty mechanical. The majority of the fixes came from verifying that a "fast" variable wasn't used anymore. The JIT still uses a "Fast" flag. I'll change the JIT with a follow-up patch. llvm-svn: 70343	2009-04-29 00:15:41 +00:00
Anton Korobeynikov	1799ac4b55	Properly print 'P' modifier on inline asm memory operands. This should fix PR3379 and PR4064. Patch inspired by Edwin Török! llvm-svn: 70328	2009-04-28 21:49:33 +00:00
Evan Cheng	754a0d2f9e	Fix PR4034. Bug in LiveInterval::join when it's compacting new valno's. llvm-svn: 70291	2009-04-28 06:24:09 +00:00
Evan Cheng	8a9736a26c	Fix for PR4051. When 2address pass delete an instruction, update kill info when necessary. llvm-svn: 70279	2009-04-28 02:12:36 +00:00
Bill Wendling	ef47ace92f	r70270 isn't ready yet. Back this out. Sorry for the noise. llvm-svn: 70275	2009-04-28 01:04:53 +00:00
Bill Wendling	2799e916c3	Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to use the old behavior, the flag is -O0. This change allows for finer-grained control over which optimizations are run at different -O levels. Most of this work was pretty mechanical. The majority of the fixes came from verifying that a "fast" variable wasn't used anymore. The JIT still uses a "Fast" flag. I'm not 100% sure if it's necessary to change it there... llvm-svn: 70270	2009-04-28 00:21:31 +00:00
Evan Cheng	c315cf24e3	Fix PR4076. Correctly create live interval of physical register with two-address update. llvm-svn: 70245	2009-04-27 20:42:46 +00:00
Dan Gohman	e1a532cb4f	Permit ChangeCompareStride to rewrite a comparison when the factor between the comparison's iv stride and the candidate stride is exactly -1. llvm-svn: 70244	2009-04-27 20:35:32 +00:00
Dan Gohman	ff30ebd710	Teach getZeroExtendExpr and getSignExtendExpr to use trip-count information to simplify [sz]ext({a,+,b}) to {zext(a),+,[zs]ext(b)}, as appropriate. These functions and the trip count code each call into the other, so this requires careful handling to avoid infinite recursion. During the initial trip count computation, conservative SCEVs are used, which are subsequently discarded once the trip count is actually known. Among other benefits, this change lets LSR automatically eliminate some unnecessary zext-inreg and sext-inreg operation where the operand is an induction variable. llvm-svn: 70241	2009-04-27 20:16:15 +00:00
Nate Begeman	7902a2344d	Revert accidental testcase reduction llvm-svn: 70226	2009-04-27 18:42:40 +00:00
Nate Begeman	9d121924fd	2nd attempt, fixing SSE4.1 issues and implementing feedback from duncan. PR2957 ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes as the shuffle mask. A value of -1 represents UNDEF. In addition to eliminating the creation of illegal BUILD_VECTORS just to represent shuffle masks, we are better about canonicalizing the shuffle mask, resulting in substantially better code for some classes of shuffles. llvm-svn: 70225	2009-04-27 18:41:29 +00:00
Evan Cheng	43fc90ae59	Fix PR4056. It's possible a physical register def is dead if its implicit use is deleted by two-address pass. llvm-svn: 70213	2009-04-27 17:36:47 +00:00
Dan Gohman	4aeebb184b	Fix the syntax for a PR number in a test. llvm-svn: 70208	2009-04-27 15:08:34 +00:00
Dan Gohman	744f455d55	When transforming sext(trunc(load(x))) into sext(smaller load(x)), the trunc is directly replaced with the smaller load, so don't try to create a new sext node. This fixes PR4050. llvm-svn: 70179	2009-04-27 02:00:55 +00:00
Evan Cheng	696a04eba2	Do not share a single unknown val# for all the live ranges merged into a physical sub-register live interval. When coalescer is merging in clobbered virtaul register live interval into a physical register live interval, give each virtual register val# a separate val# in the physical register live interval. Otherwise, the coalescer would have lost track of the definitions information it needs to make correct coalescing decisions. llvm-svn: 70026	2009-04-25 09:25:19 +00:00
Dale Johannesen	493c3bcdc0	Fix PR 4057, a crash doing float->char const folding. This particular one is undefined behavior (although this isn't related to the crash), so it will no longer do it at compile time, which seems better. llvm-svn: 69990	2009-04-24 21:34:13 +00:00
Rafael Espindola	4e7a0bf1f1	Fix PR 4004 by including the call to __tls_get_addr in X86tlsaddr. This is not very elegant, but neither is the tls specification :-( llvm-svn: 69968	2009-04-24 12:59:40 +00:00
Rafael Espindola	0b1037ad26	Revert 69952. Causes testsuite failures on linux x86-64. llvm-svn: 69967	2009-04-24 12:40:33 +00:00
Nate Begeman	c1a09c7dfa	PR2957 ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes as the shuffle mask. A value of -1 represents UNDEF. In addition to eliminating the creation of illegal BUILD_VECTORS just to represent shuffle masks, we are better about canonicalizing the shuffle mask, resulting in substantially better code for some classes of shuffles. A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next. llvm-svn: 69952	2009-04-24 03:42:54 +00:00
Dan Gohman	3499a53e1d	Explicitly pass -tailcallopt=false to these tests so that they work as intended no matter what the default setting of that option is. llvm-svn: 69911	2009-04-23 19:39:41 +00:00
Evan Cheng	a36c6c6819	It has finally happened. Spiller is now using live interval info. This fixes a very subtle bug. vr defined by an implicit_def is allowed overlap with any register since it doesn't actually modify anything. However, if it's used as a two-address use, its live range can be extended and it can be spilled. The spiller must take care not to emit a reload for the vn number that's defined by the implicit_def. This is both a correctness and performance issue. llvm-svn: 69743	2009-04-21 22:46:52 +00:00
Evan Cheng	c248188b46	Added a linearscan register allocation optimization. When the register allocator spill an interval with multiple uses in the same basic block, it creates a different virtual register for each of the reloads. e.g. %reg1498<def> = MOV32rm %reg1024, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0] %reg1506<def> = MOV32rm %reg1024, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0] %reg1486<def> = MOV32rr %reg1506 %reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead> %reg1510<def> = MOV32rm %reg1024, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0] => %reg1498<def> = MOV32rm %reg2036, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0] %reg1506<def> = MOV32rm %reg2037, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0] %reg1486<def> = MOV32rr %reg1506 %reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead> %reg1510<def> = MOV32rm %reg2038, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0] From linearscan's point of view, each of reg2036, 2037, and 2038 are separate registers, each is "killed" after a single use. The reloaded register is available and it's often clobbered right away. e.g. In thise case reg1498 is allocated EAX while reg2036 is allocated RAX. This means we end up with multiple reloads from the same stack slot in the same basic block. Now linearscan recognize there are other reloads from same SS in the same BB. So it'll "downgrade" RAX (and its aliases) after reg2036 is allocated until the next reload (reg2037) is done. This greatly increase the likihood reloads from SS are reused. This speeds up sha1 from OpenSSL by 5.8%. It is also an across the board win for SPEC2000 and 2006. llvm-svn: 69585	2009-04-20 08:01:12 +00:00
Chris Lattner	13a0dd0288	testcase for PR3898 llvm-svn: 69473	2009-04-18 20:49:22 +00:00
Duncan Sands	d2ba02aa87	Don't try to make BUILD_VECTOR operands have the same type as the vector element type: allow them to be of a wider integer type than the element type all the way through the system, and not just as far as LegalizeDAG. This should be safe because it used to be this way (the old type legalizer would produce such nodes), so backends should be able to handle it. In fact only targets which have legal vector types with an illegal promoted element type will ever see this (eg: <4 x i16> on ppc). This fixes a regression with the new type legalizer (vec_splat.ll). Also, treat SCALAR_TO_VECTOR the same as BUILD_VECTOR. After all, it is just a special case of BUILD_VECTOR. llvm-svn: 69467	2009-04-18 20:16:54 +00:00
Dale Johannesen	8a4446429e	Adjust XFAIL syntax, maybe that will help. The other way worked for me... llvm-svn: 69414	2009-04-18 02:01:23 +00:00
Dale Johannesen	05d46aca49	patch 69408 breaks this by removing the opportunity for the optimization it's testing to kick in (although it improves the code, getting rid of all spills). I don't understand the optimization well enough to rescue the test, so XFAILing. llvm-svn: 69409	2009-04-18 00:11:50 +00:00
Bob Wilson	b3e4773035	Rename file to have the correct suffix. llvm-svn: 69380	2009-04-17 20:40:20 +00:00
Bob Wilson	b8756b00cd	Use CallConvLower.h and TableGen descriptions of the calling conventions for ARM. Patch by Sandeep Patel. llvm-svn: 69371	2009-04-17 19:07:39 +00:00
Rafael Espindola	d74132e2c5	For general dynamic TLS access we must use leaq foo@TLSGD(%rip), %rdi as part of the instruction sequence. Using a register other than %rdi and then copying it to %rdi is not valid. llvm-svn: 69350	2009-04-17 14:35:58 +00:00
Evan Cheng	2d5be54315	Teach spiller to unfold instructions which modref spill slot when a scratch register is available and when it's profitable. e.g. xorq %r12<kill>, %r13 addq %rax, -184(%rbp) addq %r13, -184(%rbp) ==> xorq %r12<kill>, %r13 movq -184(%rbp), %r12 addq %rax, %r12 addq %r13, %r12 movq %r12, -184(%rbp) Two more instructions, but fewer memory accesses. It can also open up opportunities for more optimizations. llvm-svn: 69341	2009-04-17 01:29:40 +00:00
Rafael Espindola	a07d1c3103	fix PR3995. A scale must be 1, 2, 4 or 8. llvm-svn: 69284	2009-04-16 12:34:53 +00:00
Dan Gohman	98aa1d9693	Expand GEPs in ScalarEvolution expressions. SCEV expressions can now have pointer types, though in contrast to C pointer types, SCEV addition is never implicitly scaled. This not only eliminates the need for special code like IndVars' EliminatePointerRecurrence and LSR's own GEP expansion code, it also does a better job because it lets the normal optimizations handle pointer expressions just like integer expressions. Also, since LLVM IR GEPs can't directly index into multi-dimensional VLAs, moving the GEP analysis out of client code and into the SCEV framework makes it easier for clients to handle multi-dimensional VLAs the same way as other arrays. Some existing regression tests show improved optimization. test/CodeGen/ARM/2007-03-13-InstrSched.ll in particular improved to the point where if-conversion started kicking in; I turned it off for this test to preserve the intent of the test. llvm-svn: 69258	2009-04-16 03:18:22 +00:00
Dan Gohman	e1c4d4c5be	Fix the RUN lines so that this test actually tests. llvm-svn: 69096	2009-04-14 22:50:17 +00:00
Dan Gohman	365c457893	For the h-register addressing-mode trick, use the correct value for any non-address uses of the address value. This fixes 186.crafty. llvm-svn: 69094	2009-04-14 22:45:05 +00:00
Dan Gohman	3c19cf07d9	When the result of an EXTRACT_SUBREG, INSERT_SUBREG, or SUBREG_TO_REG operator is used by a CopyToReg to export the value to a different block, don't reuse the CopyToReg's register for the subreg operation result if the register isn't precisely the right class for the subreg operation. Also, rename the h-registers.ll test, now that there are more than one. llvm-svn: 69087	2009-04-14 22:17:14 +00:00
Evan Cheng	b64f2c1b08	Some of GR8_NOREX registers are only available in 64-bit mode. llvm-svn: 69049	2009-04-14 16:57:43 +00:00
Dale Johannesen	862ade6f10	Use the output of the asm so the optimizer won't delete it. llvm-svn: 69018	2009-04-14 01:51:40 +00:00
Evan Cheng	9f44d3148c	Fix PR3934 part 2. findOnlyInterestingUse() was not setting IsCopy and IsDstPhys which are returned by value and used by callee. This happened to work on the earlier test cases because of a logic error in the caller side. llvm-svn: 69006	2009-04-14 00:32:25 +00:00
Evan Cheng	fa48d5c8d0	PR3934: Fix a bogus two-address pass assertion. llvm-svn: 68979	2009-04-13 20:04:24 +00:00
Dan Gohman	be7227005f	Implement x86 h-register extract support. - Add patterns for h-register extract, which avoids a shift and mask, and in some cases a temporary register. - Add address-mode matching for turning (X>>(8-n))&(255<<n), where n is a valid address-mode scale value, into an h-register extract and a scaled-offset address. - Replace X86's MOV32to32_ and related instructions with the new target-independent COPY_TO_SUBREG instruction. On x86-64 there are complicated constraints on h registers, and CodeGen doesn't currently provide a high-level way to express all of them, so they are handled with a bunch of special code. This code currently only supports extracts where the result is used by a zero-extend or a store, though these are fairly common. These transformations are not always beneficial; since there are only 4 h registers, they sometimes require extra move instructions, and this sometimes increases register pressure because it can force out values that would otherwise be in one of those registers. However, this appears to be relatively uncommon. llvm-svn: 68962	2009-04-13 16:09:41 +00:00
Rafael Espindola	72347bffce	X86-64 TLS support for local exec and initial exec. llvm-svn: 68947	2009-04-13 13:02:49 +00:00
Chris Lattner	c1bfdc9bb2	Add a new "available_externally" linkage type. This is intended to support C99 inline, GNU extern inline, etc. Related bugzilla's include PR3517, PR3100, & PR2933. Nothing uses this yet, but it appears to work. llvm-svn: 68940	2009-04-13 05:44:34 +00:00
Rafael Espindola	ad8137187c	In X86DAGToDAGISel::MatchWrapper, if base or index are set, avoid matching only if symbolic addresses are RIP relatives. llvm-svn: 68924	2009-04-12 23:00:38 +00:00
Rafael Espindola	412b15f4ed	Add tests for the parts of X86-64 TLS that are already implemented. llvm-svn: 68901	2009-04-12 10:43:41 +00:00
Chris Lattner	6d6cf3ff4a	fix a cross-block fastisel crash handling overflow intrinsics. See comment for details. This fixes rdar://6772169 llvm-svn: 68890	2009-04-12 07:51:14 +00:00

1 2 3 4 5 ...

1685 Commits