llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 22:12:47 +01:00

Author	SHA1	Message	Date
Evan Cheng	5d841097a9	Change allowsUnalignedMemoryAccesses to take type argument since some targets support unaligned mem access only for certain types. (Should it be size instead?) ARM v7 supports unaligned access for i16 and i32, some v6 variants support it as well. llvm-svn: 79127	2009-08-15 19:23:44 +00:00
Dan Gohman	d69323d37a	On x86-64, for a varargs function, don't store the xmm registers to the register save area if %al is 0. This avoids touching xmm registers when they aren't actually used. llvm-svn: 79061	2009-08-15 01:38:56 +00:00
Anton Korobeynikov	933d8e1118	Properly handle indirect win64 args when they're passed in memory llvm-svn: 79009	2009-08-14 18:19:10 +00:00
Owen Anderson	9df206d02d	Push LLVMContexts through the IntegerType APIs. llvm-svn: 78948	2009-08-13 21:58:54 +00:00
Owen Anderson	75ebfc8728	Fix warnings. llvm-svn: 78725	2009-08-11 21:59:30 +00:00
Owen Anderson	48f2f0ae72	Split EVT into MVT and EVT, the former representing _just_ a primitive type, while the latter is capable of representing either a primitive or an extended type. llvm-svn: 78713	2009-08-11 20:47:22 +00:00
Owen Anderson	b4bce99769	Rename MVT to EVT, in preparation for splitting SimpleValueType out into its own struct type. llvm-svn: 78610	2009-08-10 22:56:29 +00:00
Owen Anderson	30bf6c8dab	SimpleValueType-ify a few more methods on TargetLowering. llvm-svn: 78595	2009-08-10 20:46:15 +00:00
Owen Anderson	cf56d576eb	Continue the SimpleValueType-ification. llvm-svn: 78593	2009-08-10 20:18:46 +00:00
Owen Anderson	dcb47bda67	Start moving TargetLowering away from using full MVTs and towards SimpleValueType, which will simplify the privatization of IntegerType in the future. llvm-svn: 78584	2009-08-10 18:56:59 +00:00
Anton Korobeynikov	8e6a142223	Better handle kernel code model. Also, generalize the things and fix one subtle bug with small code model. llvm-svn: 78255	2009-08-05 23:01:26 +00:00
Dan Gohman	5d566d918b	Major calling convention code refactoring. Instead of awkwardly encoding calling-convention information with ISD::CALL, ISD::FORMAL_ARGUMENTS, ISD::RET, and ISD::ARG_FLAGS nodes, TargetLowering provides three virtual functions for targets to override: LowerFormalArguments, LowerCall, and LowerRet, which replace the custom lowering done on the special nodes. They provide the same information, but in a more immediately usable format. This also reworks much of the target-independent tail call logic. The decision of whether or not to perform a tail call is now cleanly split between target-independent portions, and the target dependent portion in IsEligibleForTailCallOptimization. This also synchronizes all in-tree targets, to help enable future refactoring and feature work. llvm-svn: 78142	2009-08-05 01:29:28 +00:00
Anton Korobeynikov	b33dbbe7fd	Perform bitconvert to proper type llvm-svn: 77965	2009-08-03 08:14:14 +00:00
Anton Korobeynikov	3a8e354d47	Add 'Indirect' LocInfo class and use to pass __m128 on win64. Also minor fixes here and there (mostly __m64). llvm-svn: 77964	2009-08-03 08:13:56 +00:00
Anton Korobeynikov	00018fb248	Cleanup Darwin MMX calling conv stuff - make the stuff more generic. This also fixes a subtle bug, when 6th v1i64 argument passed wrongly. llvm-svn: 77963	2009-08-03 08:13:24 +00:00
Anton Korobeynikov	0bac80c138	Unbreak Win64 CC. Step one: honour register save area, fix some alignment and provide a different set of call-clobbered registers. llvm-svn: 77962	2009-08-03 08:12:53 +00:00
Rafael Espindola	08c8a9e6d5	Remove a bitcast that was a no-op. Thanks to Eli Friedman for noticing it. llvm-svn: 77942	2009-08-03 03:00:05 +00:00
Rafael Espindola	daefe7aa54	Use movq to move 64 bits in and out of mmx registers. Fixes PR4669 llvm-svn: 77940	2009-08-03 02:45:34 +00:00
Dan Gohman	abd57d8aec	Minor code cleanups. llvm-svn: 77795	2009-08-01 19:14:37 +00:00
Chris Lattner	c156a00641	refactor section construction in TLOF to be through an explicit initialize method, which can be called when an MCContext is available. llvm-svn: 77687	2009-07-31 17:42:42 +00:00
Dan Gohman	0a16a3ee84	Rename GRAD to GR32_AD, to follow the naming convention of other classes. And define its SubRegClassList. llvm-svn: 77601	2009-07-30 17:02:08 +00:00
Evan Cheng	148032a1a2	Optimize some common usage patterns of atomic built-ins __sync_add_and_fetch() and __sync_sub_and_fetch. When the return value is not used (i.e. only care about the value in the memory), x86 does not have to use add to implement these. Instead, it can use add, sub, inc, dec instructions with the "lock" prefix. This is currently implemented using a bit of instruction selection trick. The issue is the target independent pattern produces one output and a chain and we want to map it into one that just output a chain. The current trick is to select it into a merge_values with the first definition being an implicit_def. The proper solution is to add new ISD opcodes for the no-output variant. DAG combiner can then transform the node before it gets to target node selection. Problem #2 is we are adding a whole bunch of x86 atomic instructions when in fact these instructions are identical to the non-lock versions. We need a way to add target specific information to target nodes and have this information carried over to machine instructions. Asm printer (or JIT) can use this information to add the "lock" prefix. llvm-svn: 77582	2009-07-30 08:33:02 +00:00
Eric Christopher	c9c896290e	Add llvm_unreachable for ... unreachable code! llvm-svn: 77480	2009-07-29 18:14:04 +00:00
Chris Lattner	a54286efc5	whitespace cleanup. llvm-svn: 77438	2009-07-29 05:48:09 +00:00
Eric Christopher	b64d6c8efc	Fix comment. llvm-svn: 77415	2009-07-29 01:01:19 +00:00
Eric Christopher	c7b97d1f03	Add support for gcc __builtin_ia32_ptest{z,c,nzc} intrinsics. Lower to ptest instruction plus setcc. Revamp ptest instruction. Add test. llvm-svn: 77407	2009-07-29 00:28:05 +00:00
Owen Anderson	390e9778d4	Return ConstantVector to 2.5 API. llvm-svn: 77366	2009-07-28 21:19:26 +00:00
Chris Lattner	c74586940a	the apple "ld_classic" linker doesn't support .literal16 in 32-bit mode, and "ld64" (the default linker) falls back to it in -static mode. llvm-svn: 77334	2009-07-28 17:50:28 +00:00
Chris Lattner	55461787cc	Rip all of the global variable lowering logic out of TargetAsmInfo. Since it is highly specific to the object file that will be generated in the end, this introduces a new TargetLoweringObjectFile interface that is implemented for each of ELF/MachO/COFF/Alpha/PIC16 and XCore. Though this is still a brutal and ugly refactoring, this is a major step towards goodness. This patch also: 1. fixes a bunch of dangling pointer problems in the PIC16 backend. 2. disables the TargetLowering copy ctor which PIC16 was accidentally using. 3. gets us closer to xcore having its own crazy target section flags and pic16 not having to shadow sections with its own objects. 4. fixes weirdness where ELF targets would set CStringSection but not CStringSection_. Factor the code better. 5. fixes some bugs in string lowering on ELF targets. llvm-svn: 77294	2009-07-28 03:13:23 +00:00
Owen Anderson	256c2c250e	Move ConstantFP construction back to the 2.5-ish API. llvm-svn: 77247	2009-07-27 20:59:43 +00:00
Owen Anderson	cc33e89571	Revert the ConstantInt constructors back to their 2.5 forms where possible, thanks to contexts-on-types. More to come. llvm-svn: 77011	2009-07-24 23:12:02 +00:00
Eric Christopher	c205a8da9d	Update insertps handling based on feedback. Move to a v4f32 style to support vector arguments and scalar arguments correctly. Update lowering and fix comment to refer to pinsr* instead of insertps. llvm-svn: 76921	2009-07-24 00:33:09 +00:00
Eli Friedman	2b4857cdff	Add support for MMX VSETCC. llvm-svn: 76713	2009-07-22 01:06:52 +00:00
Owen Anderson	cc287b28c9	Get rid of the Pass+Context magic. llvm-svn: 76702	2009-07-22 00:24:57 +00:00
Eli Friedman	45160af6bd	Remove shift amount flavor. It isn't actually complete enough to be useful, and it's currently unused. (Some issues: it isn't actually rich enough to capture the semantics on many architectures, and semantics can vary depending on the type being shifted.) llvm-svn: 76633	2009-07-21 20:12:16 +00:00
Dale Johannesen	8b0ece80d9	revert 76503 while I figure out what's going on llvm-svn: 76517	2009-07-21 00:12:29 +00:00
Dale Johannesen	ee3f2d6dc3	Make sure a global matching asm 'i' constraint gets its flags set properly. (hasMemory is clearly irrelevant when matching 'i', I don't understand what this was supposed to be doing.) gcc.apple/asm-block-25.c (test passed before by accident, but generated code was wrong) llvm-svn: 76503	2009-07-20 23:39:13 +00:00
Chris Lattner	72b24cbbf6	Copy ExpandInlineAsm to TargetLowering from TargetAsmInfo. llvm-svn: 76441	2009-07-20 17:51:36 +00:00
Evan Cheng	67ccedff04	Fix x86 inline asm 'q' constraint support. In 32-bit mode, it's just like 'Q', i.e. EAX, EDX, ECX, EBX. In 64-bit mode, it just means all the i64r registers. Yeah, that makes sense. llvm-svn: 76248	2009-07-17 22:13:25 +00:00
Owen Anderson	13080d27c5	Move a few more convenience factory functions from Constant to LLVMContext. llvm-svn: 75840	2009-07-15 21:51:10 +00:00
Torok Edwin	f955a6ef49	llvm_unreachable->llvm_unreachable(0), LLVM_UNREACHABLE->llvm_unreachable. This adds location info for all llvm_unreachable calls (which is a macro now) in !NDEBUG builds. In NDEBUG builds location info and the message is off (it only prints "UNREACHABLE executed"). llvm-svn: 75640	2009-07-14 16:55:14 +00:00
Chris Lattner	496f872969	Fix PR4533, which is about buggy codegen in x86-64 -static mode. Basically, using: lea symbol(%rip), %rax is not valid in -static mode, because the current RIP may not be within 32-bits of "symbol" when an app is built partially pic and partially static. The fix for this is to compile it to: lea symbol, %rax It would be better to codegen this as: movq $symbol, %rax but that will come next. The hard part of fixing this bug was fixing abi-isel, which was actively testing for the wrong behavior. Also, the RUN lines are completely impossible to understand what they are testing. To help with this, convert the -static x86-64 codegen tests to use filecheck. This is much more stable and makes it more clear what the codegen is expected to be. llvm-svn: 75382	2009-07-11 20:29:19 +00:00
Torok Edwin	ae8a3ff177	assert(0) -> LLVM_UNREACHABLE. Make llvm_unreachable take an optional string, thus moving the cerr<< out of line. LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for NDEBUG builds. llvm-svn: 75379	2009-07-11 20:10:48 +00:00
Chris Lattner	478fc8442b	remove the now-dead TM argument to these methods. llvm-svn: 75276	2009-07-10 21:00:45 +00:00
Chris Lattner	9deef50410	add a couple of predicates to test for "stub style pic in PIC mode" and "stub style pic in dynamic-no-pic" mode. llvm-svn: 75273	2009-07-10 20:47:30 +00:00
Chris Lattner	7c038a2b3c	eliminate GVRequiresRegister, replacing it with predicates we need for other purposes. llvm-svn: 75243	2009-07-10 07:38:24 +00:00
Chris Lattner	e4e0c73ed0	change a bunch of logic in LowerGlobalAddress to leverage the work done in ClassifyGlobalReference instead of reconstructing the info awkwardly. llvm-svn: 75240	2009-07-10 07:34:39 +00:00
Chris Lattner	0cae8c7845	move some classification logic around. Now GVRequiresExtraLoad is just a trivial wrapper around "ClassifyGlobalReference", which stole a ton of logic from LowerGlobalAddress. llvm-svn: 75237	2009-07-10 07:20:05 +00:00
Chris Lattner	4e8de888f2	change isGlobalStubReference to take target flags instead of a MachineOperand. llvm-svn: 75236	2009-07-10 06:29:59 +00:00
Chris Lattner	41fccd30b7	GVRequiresExtraLoad is now never used for calls, simplify it based on this. llvm-svn: 75232	2009-07-10 05:52:02 +00:00
Chris Lattner	832a724072	actually, just eliminate PCRelGVRequiresExtraLoad. It makes the code more complex and slow than just directly testing what we care about. llvm-svn: 75231	2009-07-10 05:48:03 +00:00
Chris Lattner	2161376696	There is only one case where GVRequiresExtraLoad returns true for calls: split its handling out to PCRelGVRequiresExtraLoad, and simplify code based on this. llvm-svn: 75230	2009-07-10 05:45:15 +00:00
Chris Lattner	2e5e403f53	the "isDirectCall" operand of GVRequiresRegister is always false, eliminate it. llvm-svn: 75229	2009-07-10 05:37:11 +00:00
Owen Anderson	8970999512	Thread LLVMContext through MVT and related parts of SDISel. llvm-svn: 75153	2009-07-09 17:57:24 +00:00
Chris Lattner	7fcfc81604	simplify this logic a bit more. llvm-svn: 75118	2009-07-09 07:02:30 +00:00
Chris Lattner	1614fd5095	move reasoning about darwin $non_lazy_ptr stubs from asmprinter into isel. llvm-svn: 75117	2009-07-09 06:59:17 +00:00
Chris Lattner	f6ad5e86c4	make isel use MO_PIC_BASE_OFFSET when lowering globalvalues on darwin in pic mode, instead of having asmprinter just "know" to print them. llvm-svn: 75109	2009-07-09 05:47:33 +00:00
Chris Lattner	f42a8c82d9	make isel decide whether to emit $stub's on darwin instead of asmprinter. llvm-svn: 75107	2009-07-09 05:27:35 +00:00
Chris Lattner	06266970b0	Make isel determine where to emit PLT-relative calls instead of having asmprinter do it. llvm-svn: 75104	2009-07-09 05:02:21 +00:00
Chris Lattner	76adfe755d	simplify some code based on the fact that picstyles != none are only valid in pic or dynamic-no-pic mode. Also, x86-64 never used picstylegot. llvm-svn: 75101	2009-07-09 04:39:06 +00:00
Chris Lattner	0ee57926e4	all this logic always returns true because GOT mode is never active in x86-64 mode. Simplify it away, someone should evaluate this. llvm-svn: 75100	2009-07-09 04:27:47 +00:00
Chris Lattner	f7ea4f5067	isPICStyleRIPRel() and friends are never true in -static mode. Simplify code based on this. llvm-svn: 75099	2009-07-09 04:24:46 +00:00
Chris Lattner	cd52f7f20e	When in -static mode, force the PIC style to none. Doing this requires fixing code which conflated RIPRel PIC with x86-64. Fix these to just check for X86-64 directly. llvm-svn: 75092	2009-07-09 03:15:51 +00:00
Chris Lattner	fb40a495b0	merge two identical functions and simplify things that are GOT specific llvm-svn: 75091	2009-07-09 02:55:47 +00:00
Chris Lattner	47173f26e4	hoist check for IsTailCall to callers. Eliminate redundant check for x86-64: GOT-style PIC is never used on x86-64. llvm-svn: 75090	2009-07-09 02:46:53 +00:00
Chris Lattner	255e408e78	change a few methods to be static functions. llvm-svn: 75089	2009-07-09 02:44:11 +00:00
Chris Lattner	5cdf9d71f5	move handling of dllimport linkage in isel, not in asmprinter. llvm-svn: 75086	2009-07-09 00:58:53 +00:00
Torok Edwin	358888da3a	Implement changes from Chris's feedback. Finish converting lib/Target. llvm-svn: 75043	2009-07-08 20:53:28 +00:00
Torok Edwin	980729667e	Convert more abort() calls to llvm_report_error(). Also remove trailing semicolon. llvm-svn: 75027	2009-07-08 19:04:27 +00:00
Torok Edwin	ad3be984b7	Start converting to new error handling API. cerr+abort -> llvm_report_error assert(0)+abort -> LLVM_UNREACHABLE (assert(0)+llvm_unreachable-> abort() included) llvm-svn: 75018	2009-07-08 18:01:40 +00:00
Dale Johannesen	5487047295	Don't accept globals as matching 'i' constraint in PIC modes (in accordance with existing comment). gcc.apple/asm-block-25.c llvm-svn: 74886	2009-07-07 00:18:49 +00:00
Tilmann Scheller	cea3c16aa5	Add NumFixedArgs attribute to CallSDNode which indicates the number of fixed arguments in a vararg call. With the SVR4 ABI on PowerPC, vector arguments for vararg calls are passed differently depending on whether they are a fixed or a variable argument. Variable vector arguments always go into memory, fixed vector arguments are put into vector registers. If there are no free vector registers available, fixed vector arguments are put on the stack. The NumFixedArgs attribute allows to decide for an argument in a vararg call whether it belongs to the fixed or variable portion of the parameter list. llvm-svn: 74764	2009-07-03 06:44:53 +00:00
Bill Wendling	fdd5badace	Update comments to make it clear that the function alignment is the Log2 of the bytes and not bytes. llvm-svn: 74624	2009-07-01 18:50:55 +00:00
Bill Wendling	c0fb316bd3	Add an "alignment" field to the MachineFunction object. It makes more sense to have the alignment be calculated up front, and have the back-ends obey whatever alignment is decided upon. This allows for future work that would allow for precise no-op placement and the like. llvm-svn: 74564	2009-06-30 22:38:32 +00:00
David Greene	0bf8cb7487	Add a 256-bit register class and YMM registers. llvm-svn: 74469	2009-06-29 22:50:51 +00:00
Owen Anderson	d0e12300d9	Add a target-specific DAG combine on X86 to fold the common pattern of fence-atomic-fence down to just the atomic op. This is possible thanks to X86's relatively strong memory model, which guarantees that locked instructions (which are used to implement atomics) are implicit fences. llvm-svn: 74435	2009-06-29 18:04:45 +00:00
David Greene	21d2c76116	Add more vector ValueTypes for AVX and other extended vector instruction sets. llvm-svn: 74427	2009-06-29 16:47:10 +00:00
Chris Lattner	9571347ce0	pull @GOT, @GOTOFF, @GOTPCREL handling into isel from the asmprinter. llvm-svn: 74378	2009-06-27 05:39:56 +00:00
Chris Lattner	19eb0dad26	Reimplement rip-relative addressing in the X86-64 backend. The new implementation primarily differs from the former in that the asmprinter doesn't make a zillion decisions about whether or not something will be RIP relative or not. Instead, those decisions are made by isel lowering and propagated through to the asm printer. To achieve this, we: 1. Represent RIP relative addresses by setting the base of the X86 addr mode to X86::RIP. 2. When ISel Lowering decides that it is safe to use RIP, it lowers to X86ISD::WrapperRIP. When it is unsafe to use RIP, it lowers to X86ISD::Wrapper as before. 3. This removes isRIPRel from X86ISelAddressMode, representing it with a basereg of RIP instead. 4. The addressing mode matching logic in isel is greatly simplified. 5. The asmprinter is greatly simplified, notably the "NotRIPRel" predicate passed through various printoperand routines is gone now. 6. The various symbol printing routines in asmprinter now no longer infer when to emit (%rip), they just print the symbol. I think this is a big improvement over the previous situation. It does have two small caveats though: 1. I implemented a horrible "no-rip" modifier for the inline asm "P" constraint modifier. This is a short term hack, there is a much better, but more involved, solution. 2. I had to xfail an -aggressive-remat testcase because it isn't handling the use of RIP in the constant-pool reading instruction. This specific test is easy to fix without -aggressive-remat, which I intend to do next. llvm-svn: 74372	2009-06-27 04:16:01 +00:00
Chris Lattner	1f3d17f45d	Move all the TLS processing logic into isel, don't do it in asmprinter at all. llvm-svn: 74327	2009-06-26 21:20:29 +00:00
Chris Lattner	0a0494b4f9	move magic for PIC constantpool references from asmprinter to isel. llvm-svn: 74313	2009-06-26 19:22:52 +00:00
Chris Lattner	05eb63598b	start adding logic in isel to determine asm printer semantics, step N of M. llvm-svn: 74246	2009-06-26 00:43:52 +00:00
Chris Lattner	e358de060d	indentation fix llvm-svn: 73840	2009-06-21 02:22:34 +00:00
Eli Friedman	c80a4f18de	Misc accumulated tweaks to legalization logic for various targets. llvm-svn: 73476	2009-06-16 06:40:59 +00:00
Chris Lattner	eb664fc504	I got J and K backward, many thanks to Eli for spotting this! llvm-svn: 73372	2009-06-15 04:39:05 +00:00
Chris Lattner	e427a956ca	implement support for the 'K' asm constraint, PR4347 llvm-svn: 73366	2009-06-15 04:01:39 +00:00
Arnold Schwaighofer	780e3addf8	Fix Bug 4278: X86-64 with -tailcallopt calling convention out of sync with regular cc. The only difference between the tail call cc and the normal cc was that one parameter register - R9 - was reserved for calling functions through a function pointer. After time the tail call cc has gotten out of sync with the regular cc. We can use R11 which is also caller saved but not used as parameter register for potential function pointers and remove the special tail call cc on x86-64. llvm-svn: 73233	2009-06-12 16:26:57 +00:00
Anton Korobeynikov	1447d902e3	Silence a warning llvm-svn: 73152	2009-06-09 23:00:39 +00:00
Eli Friedman	1609a6524f	Get rid of some unnecessary code. llvm-svn: 73017	2009-06-07 07:28:45 +00:00
Eli Friedman	d4b463b0dc	Slightly generalize the code that handles shuffles of consecutive loads on x86 to handle more cases. Fix a bug in said code that would cause it to read past the end of an object. Rewrite the code in SelectionDAGLegalize::ExpandBUILD_VECTOR to be a bit more general. Remove PerformBuildVectorCombine, which is no longer necessary with these changes. In addition to simplifying the code, with this change, we can now catch a few more cases of consecutive loads. llvm-svn: 73012	2009-06-07 06:52:44 +00:00
Eli Friedman	4395222136	Avoid crashing on a variable-index insertelement with element type i16. llvm-svn: 72991	2009-06-06 06:32:50 +00:00
Eli Friedman	e546f94ef5	Get rid of some bogus patterns for X86vzmovl. Don't create VZEXT_MOVL nodes for vectors with an i16 element type. Add an optimization for building a vector which is all zeros/undef except for the bottom element, where the bottom element is an i8 or i16. llvm-svn: 72988	2009-06-06 06:05:10 +00:00
Eli Friedman	05eef883e8	PR2598: make sure to expand illegal forms of integer/floating-point conversions for x86, like <2 x i32> -> <2 x float> and <4 x i16> -> <4 x float>. llvm-svn: 72983	2009-06-06 03:57:58 +00:00
Devang Patel	8d170194e8	Add new function attribute - noimplicitfloat Update code generator to use this attribute and remove NoImplicitFloat target option. Update llc to set this attribute when -no-implicit-float command line option is used. llvm-svn: 72959	2009-06-05 21:57:13 +00:00
Nate Begeman	058d4eeccf	Adapt the x86 build_vector dagcombine to the current state of the legalizer. build vectors with i64 elements will only appear on 32b x86 before legalize. Since vector widening occurs during legalize, and produces i64 build_vector elements, the dag combiner is never run on these before legalize splits them into 32b elements. Teach the build_vector dag combine in x86 back end to recognize consecutive loads producing the low part of the vector. Convert the two uses of TLI's consecutive load recognizer to pass LoadSDNodes since that was required implicitly. Add a testcase for the transform. Old: subl $28, %esp movl 32(%esp), %eax movl 4(%eax), %ecx movl %ecx, 4(%esp) movl (%eax), %eax movl %eax, (%esp) movaps (%esp), %xmm0 pmovzxwd %xmm0, %xmm0 movl 36(%esp), %eax movaps %xmm0, (%eax) addl $28, %esp ret New: movl 4(%esp), %eax pmovzxwd (%eax), %xmm0 movl 8(%esp), %eax movaps %xmm0, (%eax) ret llvm-svn: 72957	2009-06-05 21:37:30 +00:00
Devang Patel	d0745140a3	Evan thinks NoImplicitFloat check is not required here. llvm-svn: 72954	2009-06-05 18:48:29 +00:00
Dan Gohman	273546fbdc	Remove unnecessary #includes. llvm-svn: 72782	2009-06-03 16:47:12 +00:00
Dale Johannesen	8b6ee9e312	Revert 72707 and 72709, for the moment. llvm-svn: 72712	2009-06-02 03:12:52 +00:00
Dale Johannesen	c08669561e	Make the implicit inputs and outputs of target-independent ADDC/ADDE use MVT::i1 (later, whatever it gets legalized to) instead of MVT::Flag. Remove CARRY_FALSE in favor of 0; adjust all target-independent code to use this format. Most targets will still produce a Flag-setting target-dependent version when selection is done. X86 is converted to use i32 instead, which means TableGen needs to produce different code in xxxGenDAGISel.inc. This keys off the new supportsHasI1 bit in xxxInstrInfo, currently set only for X86; in principle this is temporary and should go away when all other targets have been converted. All relevant X86 instruction patterns are modified to represent setting and using EFLAGS explicitly. The same can be done on other targets. The immediate behavior change is that an ADC/ADD pair are no longer tightly coupled in the X86 scheduler; they can be separated by instructions that don't clobber the flags (MOV). I will soon add some peephole optimizations based on using other instructions that set the flags to feed into ADC. llvm-svn: 72707	2009-06-01 23:27:20 +00:00
Bill Wendling	8235a05c1a	Untabification. llvm-svn: 72604	2009-05-30 01:09:53 +00:00
Evan Cheng	40810c4d1b	Added an optimization that narrows load / op / store sequences where the 'op' is a bit twiddling instruction and its second operand is an immediate. If the bits that are touched by 'op' can be handled with a narrower instruction, reduce the width of the load and store as well. This happens a lot with bitfield manipulation code. e.g. orl $65536, 8(%rax) => orb $1, 10(%rax) Since narrowing is not always a win, e.g. i32 -> i16 is a loss on x86, dag combiner consults with the target before performing the optimization. llvm-svn: 72507	2009-05-28 00:35:15 +00:00
Eli Friedman	9a87deee7e	Get rid of some dead code. llvm-svn: 72494	2009-05-27 20:39:00 +00:00
Eli Friedman	b8c9f7ee35	Don't abuse the quirky behavior of LegalizeDAG for XINT_TO_FP and FP_TO_XINT. Necessary for some cleanups I'm working on. Updated from the previous version (r72431) to fix a bug and make some things a bit clearer. llvm-svn: 72445	2009-05-27 00:47:34 +00:00
Daniel Dunbar	75f52bda74	Back out r72431, it is causing a number of compilation crashes with clang. llvm-svn: 72436	2009-05-26 21:27:02 +00:00
Eli Friedman	f7d0c01ed6	Don't abuse the quirky behavior of LegalizeDAG for XINT_TO_FP and FP_TO_XINT. Necessary for some cleanups I'm working on. llvm-svn: 72431	2009-05-26 19:18:56 +00:00
Eli Friedman	f4d25bb2b6	Make the X86 backend mark EXTRACT_SUBVECTOR as Expand, at least for the moment. llvm-svn: 72350	2009-05-23 22:44:52 +00:00
Eli Friedman	d877b76d14	Make the x86 backend custom-lower UINT_TO_FP and FP_TO_UINT on 32-bit systems instead of attempting to promote them to a 64-bit SINT_TO_FP or FP_TO_SINT. This is in preparation for removing the type legalization code from LegalizeDAG: once type legalization is gone from LegalizeDAG, it won't be able to handle the i64 operand/result correctly. This isn't quite ideal, but I don't think any other operation for any target ends up in this situation, so treating this case specially seems reasonable. llvm-svn: 72324	2009-05-23 09:59:16 +00:00
Evan Cheng	9bd08f0cde	Run code placement optimization for targets that want it (arm and x86 for now). llvm-svn: 71726	2009-05-13 21:42:09 +00:00
Chris Lattner	7b2dabcac9	Fix PR4152: asm constraint validation happens before dag combine, so we need to work a bit to combine things like (x+c1+c2) into x+c3. llvm-svn: 71232	2009-05-08 18:23:14 +00:00
Nate Begeman	b407809122	Fix infinite recursion in the C++ code which handles movddup by making it unnecessary. llvm-svn: 70425	2009-04-29 22:47:44 +00:00
Nate Begeman	414534b3eb	Implement review feedback for vector shuffle work. llvm-svn: 70372	2009-04-29 05:20:52 +00:00
Nate Begeman	9d121924fd	2nd attempt, fixing SSE4.1 issues and implementing feedback from duncan. PR2957 ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes as the shuffle mask. A value of -1 represents UNDEF. In addition to eliminating the creation of illegal BUILD_VECTORS just to represent shuffle masks, we are better about canonicalizing the shuffle mask, resulting in substantially better code for some classes of shuffles. llvm-svn: 70225	2009-04-27 18:41:29 +00:00
Rafael Espindola	4e7a0bf1f1	Fix PR 4004 by including the call to __tls_get_addr in X86tlsaddr. This is not very elegant, but neither is the tls specification :-( llvm-svn: 69968	2009-04-24 12:59:40 +00:00
Rafael Espindola	0b1037ad26	Revert 69952. Causes testsuite failures on linux x86-64. llvm-svn: 69967	2009-04-24 12:40:33 +00:00
Nate Begeman	c1a09c7dfa	PR2957 ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes as the shuffle mask. A value of -1 represents UNDEF. In addition to eliminating the creation of illegal BUILD_VECTORS just to represent shuffle masks, we are better about canonicalizing the shuffle mask, resulting in substantially better code for some classes of shuffles. A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next. llvm-svn: 69952	2009-04-24 03:42:54 +00:00
Duncan Sands	58c9c564a9	Get rid of what looks like a copy-and-pasted typo. Spotted by gcc-4.5. llvm-svn: 69673	2009-04-21 09:44:39 +00:00
Bob Wilson	f7e9ff1d28	Move duplicated AddLiveIn function from X86 and ARM backends to be a method in the MachineFunction class, renaming it to addLiveIn for consistency with the same method in MachineBasicBlock. Thanks to Anton for suggesting this. llvm-svn: 69615	2009-04-20 18:36:57 +00:00
Rafael Espindola	d74132e2c5	For general dynamic TLS access we must use leaq foo@TLSGD(%rip), %rdi as part of the instruction sequence. Using a register other than %rdi and then copying it to %rdi is not valid. llvm-svn: 69350	2009-04-17 14:35:58 +00:00
Rafael Espindola	72347bffce	X86-64 TLS support for local exec and initial exec. llvm-svn: 68947	2009-04-13 13:02:49 +00:00
Dan Gohman	8121b3f88d	Remove the obsolete SelectionDAG::getNodeValueTypes and simplify code that uses it by using SelectionDAG::getVTList instead. llvm-svn: 68744	2009-04-09 23:54:40 +00:00
Dan Gohman	6cb1387261	Fix grammaros in comments. llvm-svn: 68666	2009-04-09 02:06:09 +00:00
Rafael Espindola	7eb72dc5f2	Re-apply 68552. Tested by bootstrapping llvm-gcc and using that to build llvm. llvm-svn: 68645	2009-04-08 21:14:34 +00:00
Rafael Espindola	d4563305fd	Avoid a hard coded constant. llvm-svn: 68603	2009-04-08 08:09:33 +00:00
Dan Gohman	c9ce27d6b7	Implement support for modeling implicit-zero-extension on x86-64 with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG instructions), and teach the DAGCombiner to take advantage of this on targets which support it. This eliminates many redundant zero-extension operations on x86-64. This adds a new TargetLowering hook, isZExtFree. It's similar to isTruncateFree, except it only applies to actual definitions, and not no-op truncates which may not zero the high bits. Also, this adds a new optimization to SimplifyDemandedBits: transform operations like x+y into (zext (add (trunc x), (trunc y))) on targets where all the casts are no-ops. In contexts where the high part of the add is explicitly masked off, this allows the mask operation to be eliminated. Fix the DAGCombiner to avoid undoing these transformations to eliminate casts on targets where the casts are no-ops. Also, this adds a new two-address lowering heuristic. Since two-address lowering runs before coalescing, it helps to be able to look through copies when deciding whether commuting and/or three-address conversion are profitable. Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle the case that a clobber range extended both before and beyond an existing live range. In that case, multiple live ranges need to be added. This was exposed by the new subreg coalescing code. Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the spiller behavior it was looking for no longer occurs with the new instruction selection. llvm-svn: 68576	2009-04-08 00:15:30 +00:00
Bill Wendling	6e702cf68c	Temporarily revert r68552. This was causing a failure in the self-hosting LLVM builds. --- Reverse-merging (from foreign repository) r68552 into '.': U test/CodeGen/X86/tls8.ll U test/CodeGen/X86/tls10.ll U test/CodeGen/X86/tls2.ll U test/CodeGen/X86/tls6.ll U lib/Target/X86/X86Instr64bit.td U lib/Target/X86/X86InstrSSE.td U lib/Target/X86/X86InstrInfo.td U lib/Target/X86/X86RegisterInfo.cpp U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86CodeEmitter.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86InstrInfo.h U lib/Target/X86/X86ISelDAGToDAG.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h U lib/Target/X86/X86ISelLowering.h U lib/Target/X86/X86InstrInfo.cpp U lib/Target/X86/X86InstrBuilder.h U lib/Target/X86/X86RegisterInfo.td llvm-svn: 68560	2009-04-07 22:35:25 +00:00
Rafael Espindola	0324937229	Reduce code duplication on the TLS implementation. This introduces a small regression on the generated code quality in the case we are just computing addresses, not loading values. Will work on it and on X86-64 support. llvm-svn: 68552	2009-04-07 21:37:46 +00:00
Mon P Wang	f829fb5cab	Added a x86 dag combine to increase the chances to use a movq for v2i64 on x86-32. llvm-svn: 68368	2009-04-03 02:43:30 +00:00
Chris Lattner	f1719bf7b5	silence warning in release-asserts build. llvm-svn: 68253	2009-04-01 22:14:45 +00:00
Evan Cheng	44fdb5d570	i128 shift libcalls are not available on x86. llvm-svn: 68133	2009-03-31 19:38:51 +00:00
Evan Cheng	3e30bcbd69	When optimizing a mul by immediate into two, the resulting mul's should get a x86 specific node to avoid dag combiner from hacking on them further. llvm-svn: 68066	2009-03-30 21:36:47 +00:00
Rafael Espindola	37522e768a	Have only one definition of X86AddrNumOperands. llvm-svn: 67949	2009-03-28 18:55:31 +00:00
Evan Cheng	a15fdaa292	Optimize some 64-bit multiplication by constants into two lea's or one lea + shl since imulq is slow (latency 5). e.g. x * 40 => shlq $3, %rdi leaq (%rdi,%rdi,4), %rax This has the added benefit of allowing more multiply to be folded into addressing mode. e.g. a * 24 + b => leaq (%rdi,%rdi,2), %rax leaq (%rsi,%rax,8), %rax llvm-svn: 67917	2009-03-28 05:57:29 +00:00
Rafael Espindola	38604d9598	I am trying to add a segment to the X86 addresses matching to improve TLS support (see http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090309/075220.html), but that code is VERY brittle. This patch just makes it a bit more resistant. llvm-svn: 67843	2009-03-27 15:26:30 +00:00
Evan Cheng	ab6e38c88d	-no-implicit-float means explicit fp operations are legal. llvm-svn: 67784	2009-03-26 23:06:32 +00:00
Bill Wendling	f4247ff478	Pull transform from target-dependent code into target-independent code. llvm-svn: 67742	2009-03-26 06:14:09 +00:00
Bill Wendling	f79eccc675	Match this pattern so that we can generate simpler code: %a = ... %b = and i32 %a, 2 %c = srl i32 %b, 1 %d = br i32 %c, into %a = ... %b = and %a, 2 %c = X86ISD::CMP %b, 0 %d = X86ISD::BRCOND %c ... This applies only when the AND constant value has one bit set and the SRL constant is equal to the log2 of the AND constant. The back-end is smart enough to convert the result into a TEST/JMP sequence. llvm-svn: 67728	2009-03-26 01:47:50 +00:00
Bill Wendling	2fe64f48aa	These instructions have special lowering that may lower them to SSE instructions. Prevent that if we don't want implicit uses of SSE. llvm-svn: 66877	2009-03-13 08:41:47 +00:00
Evan Cheng	f9951d1557	Fix some significant problems with constant pools that resulted in unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues. 1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants. 2. MachineConstantPool alignment field is also a log2 value. 3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values. 4. Constant pool entry offsets are computed when they are created. However, asm printer group them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries. 5. Asm printer uses expensive data structure multimap to track constant pool entries by sections. 6. Asm printer iterate over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic. Solutions: 1. ConstantPoolSDNode alignment field is changed to keep non-log2 value. 2. MachineConstantPool alignment field is also changed to keep non-log2 value. 3. Functions that create ConstantPool nodes are passing in non-log2 alignments. 4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT. 5. Asm printer uses cheaper data structure to group constant pool entries. 6. Asm printer compute entry offsets after grouping is done. 7. Change JIT code to compute entry offsets on the fly. llvm-svn: 66875	2009-03-13 07:51:59 +00:00
Chris Lattner	cbbdd230dd	generalize the previous code to use the full generality of LEA for i32/i64 expressions (we could also do i16 on cpus where i16 lea is fast, but I didn't add this). On the example, we now generate: _test: movl 4(%esp), %eax cmpl $42, (%eax) setl %al movzbl %al, %eax leal 4(%eax,%eax,8), %eax ret instead of: _test: movl 4(%esp), %eax cmpl $41, (%eax) movl $4, %ecx movl $13, %eax cmovg %ecx, %eax ret llvm-svn: 66869	2009-03-13 05:53:31 +00:00
Chris Lattner	878d951f8f	optimize the case of cond ? 42 : 41 and friends. This compiles the example to: _test: movl 4(%esp), %eax cmpl $41, (%eax) setg %al movzbl %al, %eax orl $4294967294, %eax ret instead of: movl 4(%esp), %eax cmpl $41, (%eax) movl $4294967294, %ecx movl $4294967295, %eax cmova %ecx, %eax ret which is smaller in code size and faster. rdar://6668608 llvm-svn: 66868	2009-03-13 05:22:11 +00:00
Chris Lattner	26a971c4ec	Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))" related transformations out of target-specific dag combine into the ARM backend. These were added by Evan in r37685 with no testcases and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll). Add some simple X86-specific (for now) DAG combines that turn things like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently with the recently added cp constant select optimization, but is a very general xform. For example, we now compile the second example in const-select.ll to: _test: movsd LCPI2_0, %xmm0 ucomisd 8(%esp), %xmm0 seta %al movzbl %al, %eax movl 4(%esp), %ecx movsbl (%ecx,%eax,4), %eax ret instead of: _test: movl 4(%esp), %eax leal 4(%eax), %ecx movsd LCPI2_0, %xmm0 ucomisd 8(%esp), %xmm0 cmovbe %eax, %ecx movsbl (%ecx), %eax ret This passes multisource and dejagnu. llvm-svn: 66779	2009-03-12 06:52:53 +00:00
Evan Cheng	46e903d2f6	On x86, if the only use of a i64 load is a i64 store, generate a pair of double load and store instead. llvm-svn: 66776	2009-03-12 05:59:15 +00:00
Bill Wendling	fca05e3a5c	Add a -no-implicit-float flag. This acts like -soft-float, but may generate floating point instructions that are explicitly specified by the user. llvm-svn: 66719	2009-03-11 22:30:01 +00:00
Mon P Wang	287e422039	For yonah, fix a vector shuffle case for v16i8 where we didn't properly clear some bits. llvm-svn: 66684	2009-03-11 18:47:57 +00:00
Mon P Wang	2867737ad2	Fixed a v8i16 shuffle case that should generate a pshufb instead of a pshuflw/hw. llvm-svn: 66645	2009-03-11 06:35:11 +00:00
Chris Lattner	eb9327f335	formatting change, reduce indentation. No functionality change. llvm-svn: 66642	2009-03-11 05:48:52 +00:00
Dan Gohman	b9c32f1aca	Arithmetic instructions don't set the EFLAGS bits OF and CF the same way the "test" instruction does in overflow cases, so eliminating the test is only safe when those bits aren't needed, as is the case for COND_E and COND_NE, or if it can be proven that no overflow will occur. For now, just restrict the optimization to COND_E and COND_NE and don't do any overflow analysis. llvm-svn: 66318	2009-03-07 01:58:32 +00:00
Dan Gohman	1e9db7c1a1	When creating X86ISD::INC and X86ISD::DEC nodes, only add one operand. The extra operand didn't appear to cause any trouble, but it was erroneous regardless. llvm-svn: 66206	2009-03-05 21:29:28 +00:00
Dan Gohman	f6f684b206	Fix the "test" optimization to recognize "dec" as an add of negative one, as subtracts of immediates are canonicalized to adds. llvm-svn: 66180	2009-03-05 19:32:48 +00:00
Dan Gohman	31fb085c2e	Re-apply 66008, now that the unfoldMemoryOperand bug is fixed. llvm-svn: 66058	2009-03-04 19:44:21 +00:00

1130 Commits