llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 22:42:52 +01:00

Author	SHA1	Message	Date
Stuart Hastings	4f72cea26a	Recognize another euphemism for MOVDQ2Q. llvm-svn: 72808	2009-06-03 21:39:14 +00:00
Evan Cheng	b71402d6ae	For Darwin / x86_64, override -relocation-model=static to pic if the output is assembly since Darwin assembler does not really support -static codegen. I view this as a temporary workaround until the assembler / linker changes. llvm-svn: 72806	2009-06-03 21:13:54 +00:00
Dan Gohman	691dd710e9	Remove the redundant TM member from X86DAGToDAGISel; replace it with an accessor method which simply casts the parent class SelectionDAGISel's TM to the target-specific type. llvm-svn: 72801	2009-06-03 20:20:00 +00:00
Dan Gohman	273546fbdc	Remove unnecessary #includes. llvm-svn: 72782	2009-06-03 16:47:12 +00:00
Duncan Sands	ab3e57c63f	Avoid a warning "'U' might be used uninitialized in this function" when using a not-too-smart compiler. llvm-svn: 72768	2009-06-03 12:05:18 +00:00
Dan Gohman	609f627ed7	Revert r72734. The Darwin assembler doesn't support the static relocation model on x86-64. Higher level logic should override the relocation model to PIC on x86_64-apple-darwin. llvm-svn: 72746	2009-06-03 00:37:20 +00:00
Evan Cheng	7e66d61bec	On Darwin x86_64 small code model doesn't guarantee code address fits in 32-bit. llvm-svn: 72734	2009-06-02 20:09:31 +00:00
Dale Johannesen	8b6ee9e312	Revert 72707 and 72709, for the moment. llvm-svn: 72712	2009-06-02 03:12:52 +00:00
Dale Johannesen	fe3b3add52	Add missing file. llvm-svn: 72709	2009-06-01 23:48:58 +00:00
Dale Johannesen	c08669561e	Make the implicit inputs and outputs of target-independent ADDC/ADDE use MVT::i1 (later, whatever it gets legalized to) instead of MVT::Flag. Remove CARRY_FALSE in favor of 0; adjust all target-independent code to use this format. Most targets will still produce a Flag-setting target-dependent version when selection is done. X86 is converted to use i32 instead, which means TableGen needs to produce different code in xxxGenDAGISel.inc. This keys off the new supportsHasI1 bit in xxxInstrInfo, currently set only for X86; in principle this is temporary and should go away when all other targets have been converted. All relevant X86 instruction patterns are modified to represent setting and using EFLAGS explicitly. The same can be done on other targets. The immediate behavior change is that an ADC/ADD pair are no longer tightly coupled in the X86 scheduler; they can be separated by instructions that don't clobber the flags (MOV). I will soon add some peephole optimizations based on using other instructions that set the flags to feed into ADC. llvm-svn: 72707	2009-06-01 23:27:20 +00:00
Bruno Cardoso Lopes	7765059062	Fix new CodeEmitter stuff to follow LLVM coding style. Patch by Aaron Gray llvm-svn: 72697	2009-06-01 19:57:37 +00:00
Dan Gohman	b38d9b6a57	Fix a grammaro and clarify a comment. llvm-svn: 72668	2009-05-31 17:52:18 +00:00
Bruno Cardoso Lopes	4da7e7af43	First patch in the direction of splitting MachineCodeEmitter in two subclasses: JITCodeEmitter and ObjectCodeEmitter. No functional changes yet. Patch by Aaron Gray llvm-svn: 72631	2009-05-30 20:51:52 +00:00
Evan Cheng	2d198e1bc2	(i64 (zext (srl GR32 8))) -> movzbl AH is not safe since srl 8 only clears the top 8 bits. llvm-svn: 72618	2009-05-30 08:43:27 +00:00
Bill Wendling	8235a05c1a	Untabification. llvm-svn: 72604	2009-05-30 01:09:53 +00:00
Evan Cheng	550fc9ba9f	More h-registers tricks: folding zext nodes. llvm-svn: 72558	2009-05-29 01:44:43 +00:00
Bill Wendling	dd6cbdb28c	The MONITOR and MWAIT instructions have insufficient information for decoding. Essentially, they both map to the same column in the "opcode extensions for one- and two-byte opcodes" table in the x86 manual. The RawFrm complicates decoding this. Instead, use opcode 0x01, prefix 0x01, and form MRM1r. Then have the code emitter special case these, a la [SML]FENCE. llvm-svn: 72556	2009-05-28 23:40:46 +00:00
Evan Cheng	3d35b7e54c	Fix MOVMSKPDrr encoding. llvm-svn: 72535	2009-05-28 18:55:28 +00:00
Evan Cheng	99643b717c	Fix PSIGND encoding bug. Patch by Sean Callanan. llvm-svn: 72534	2009-05-28 18:48:53 +00:00
Bill Wendling	8a8d271d29	"The instructions MMX_PSADBWrm and MMX_PSADBWrr have opcode 0b11100000 (e0), but the Intel manual (screenshot) says it should be 0b11110110 (f6). The existing encoding causes a disassembly conflict with MMX_PAVGBrm, which really should be 0f e0." Patch by Sean Callanan! llvm-svn: 72508	2009-05-28 02:04:00 +00:00
Evan Cheng	40810c4d1b	Added an optimization that narrows load / op / store sequences when the 'op' is a bit-twiddling instruction and its second operand is an immediate. If the bits touched by 'op' can be handled by a narrower instruction, reduce the width of the load and store as well. This happens a lot with bitfield manipulation code. e.g. orl $65536, 8(%rax) => orb $1, 10(%rax) Since narrowing is not always a win, e.g. i32 -> i16 is a loss on x86, dag combiner consults with the target before performing the optimization. llvm-svn: 72507	2009-05-28 00:35:15 +00:00
Eli Friedman	9a87deee7e	Get rid of some dead code. llvm-svn: 72494	2009-05-27 20:39:00 +00:00
Evan Cheng	9fd570a8a1	Fix sfence jit encoding. Patch by Sean Callanan. llvm-svn: 72488	2009-05-27 18:38:01 +00:00
Eli Friedman	b8c9f7ee35	Don't abuse the quirky behavior of LegalizeDAG for XINT_TO_FP and FP_TO_XINT. Necessary for some cleanups I'm working on. Updated from the previous version (r72431) to fix a bug and make some things a bit clearer. llvm-svn: 72445	2009-05-27 00:47:34 +00:00
Daniel Dunbar	75f52bda74	Back out r72431, it is causing a number of compilation crashes with clang. llvm-svn: 72436	2009-05-26 21:27:02 +00:00
Stefanus Du Toit	031dcf315f	Update CPU capabilities for AMD machines - added processors k8-sse3, opteron-sse3, athlon64-sse3, amdfam10, and barcelona with appropriate sse3/4a levels - added FeatureSSE4A for amdfam10 processors in X86Subtarget: - added hasSSE4A - updated AutoDetectSubtargetFeatures to detect SSE4A - updated GetCurrentX86CPU to detect family 15 with sse3 as k8-sse3 and family 10h as amdfam10 New processor names match those used by gcc. Patch by Paul Redmond! llvm-svn: 72434	2009-05-26 21:04:35 +00:00
Eli Friedman	f7d0c01ed6	Don't abuse the quirky behavior of LegalizeDAG for XINT_TO_FP and FP_TO_XINT. Necessary for some cleanups I'm working on. llvm-svn: 72431	2009-05-26 19:18:56 +00:00
Chris Lattner	636bd70540	add some late optimizations that GCC does. It thinks these are a win even on Core2, not just AMD processors which was a surprise to me. llvm-svn: 72396	2009-05-25 20:28:19 +00:00
Chris Lattner	60098b3e93	we should eventually add -march=atom and the new atom movbe instruction. llvm-svn: 72387	2009-05-25 16:34:44 +00:00
Eli Friedman	f4d25bb2b6	Make the X86 backend mark EXTRACT_SUBVECTOR as Expand, at least for the moment. llvm-svn: 72350	2009-05-23 22:44:52 +00:00
Anton Korobeynikov	34fc85e2ee	Propagate CPU string out of SubtargetFeatures llvm-svn: 72335	2009-05-23 19:50:50 +00:00
Eli Friedman	d877b76d14	Make the x86 backend custom-lower UINT_TO_FP and FP_TO_UINT on 32-bit systems instead of attempting to promote them to a 64-bit SINT_TO_FP or FP_TO_SINT. This is in preparation for removing the type legalization code from LegalizeDAG: once type legalization is gone from LegalizeDAG, it won't be able to handle the i64 operand/result correctly. This isn't quite ideal, but I don't think any other operation for any target ends up in this situation, so treating this case specially seems reasonable. llvm-svn: 72324	2009-05-23 09:59:16 +00:00
Evan Cheng	e17c02e328	Try again. Allow call to immediate address for ELF or when in static relocation mode. llvm-svn: 72160	2009-05-20 04:53:57 +00:00
Evan Cheng	8a4887572e	Cannot use immediate as call absolute target in PIC mode. llvm-svn: 72154	2009-05-20 01:11:00 +00:00
Dale Johannesen	a0756109d8	Add OpSize to 16-bit ADC and SBB. llvm-svn: 72045	2009-05-18 21:41:59 +00:00
Dale Johannesen	6efc155312	Fill in the missing patterns for ADC and SBB. Some comment cleanup. llvm-svn: 72022	2009-05-18 17:44:15 +00:00
Mike Stump	a25bd435de	Reflow to fit 80-col. llvm-svn: 71813	2009-05-14 23:23:37 +00:00
Mike Stump	cd9198dd91	Reflow to fit 80-col. llvm-svn: 71812	2009-05-14 23:22:47 +00:00
Evan Cheng	9bd08f0cde	Run code placement optimization for targets that want it (arm and x86 for now). llvm-svn: 71726	2009-05-13 21:42:09 +00:00
Bill Wendling	e421c8f63d	Change MachineInstrBuilder::addReg() to take a flag instead of a list of booleans. This gives a better indication of what the "addReg()" is doing. Remembering what all of those booleans mean isn't easy, especially if you aren't spending all of your time in that code. I took Jakob's suggestion and made it illegal to pass in "true" for the flag. This should hopefully prevent any unintended misuse of this (by reverting to the old way of using addReg()). llvm-svn: 71722	2009-05-13 21:33:08 +00:00
Bill Wendling	a6f172b0f2	More MSVC fixes -- class/struct conflicts. llvm-svn: 71601	2009-05-12 21:55:29 +00:00
Evan Cheng	7f78e63cf3	80 col violations. llvm-svn: 71582	2009-05-12 20:17:52 +00:00
Evan Cheng	96cd1decc6	Avoid unneeded SIB byte encoding. Patch by Zoltan Varga. llvm-svn: 71520	2009-05-12 00:07:35 +00:00
Dan Gohman	0edabc8a6f	Convert a subtract into a negate and an add when it helps x86 address folding. llvm-svn: 71446	2009-05-11 18:02:53 +00:00
Duncan Sands	f7af13b2d4	Rename PaddedSize to AllocSize, in the hope that this will make it more obvious what it represents, and stop it being confused with the StoreSize. llvm-svn: 71349	2009-05-09 07:06:46 +00:00
Anton Korobeynikov	b3dc881070	Factor out cycle-finder code and make it generic. llvm-svn: 71241	2009-05-08 18:51:58 +00:00
Chris Lattner	7b2dabcac9	Fix PR4152: asm constraint validation happens before dag combine, so we need to work a bit to combine things like (x+c1+c2) into x+c3. llvm-svn: 71232	2009-05-08 18:23:14 +00:00
Evan Cheng	2a1d20b0fb	Optimize code placement in loop to eliminate unconditional branches or move unconditional branch to the outside of the loop. e.g. /// A: /// ... /// <fallthrough to B> /// /// B: --> loop header /// ... /// jcc <cond> C, [exit] /// /// C: /// ... /// jmp B /// /// ==> /// /// A: /// ... /// jmp B /// /// C: --> new loop header /// ... /// <fallthrough to B> /// /// B: /// ... /// jcc <cond> C, [exit] llvm-svn: 71209	2009-05-08 06:34:09 +00:00
Dale Johannesen	b2cf4944c2	Use X86AddrNumOperands instead of magic constant one more place. This fixes a bunch of x86-64 JIT regressions. (Introduced when the value of the magic constant changed in 68645. At the time apparently nobody noticed; failures were hidden in 70343-70439 by an unrelated bug, so showed up again as "new" failures in 70440.) llvm-svn: 71106	2009-05-06 19:04:30 +00:00
Chris Lattner	5cc9a36d1c	Add basic support for code generation of addrspace(257) -> FS relative on x86. Patch by Zoltan Varga! llvm-svn: 70992	2009-05-05 18:52:19 +00:00
Evan Cheng	138eed76c7	Revert part of 70929 that has to do with determining whether a SIB byte is needed. It causes a lot of x86_64 JIT failures. llvm-svn: 70986	2009-05-05 18:18:57 +00:00
Evan Cheng	6843d3293b	- Avoid the longer SIB encoding on x86_64 when it's not needed. - Synchronize instruction length computation code in X86InstrInfo with code in X86CodeEmitter.cpp Patch by Zoltan Varga. llvm-svn: 70929	2009-05-04 22:49:16 +00:00
Dan Gohman	2973567a95	X86FastISel doesn't support the -tailcallopt ABI. llvm-svn: 70902	2009-05-04 19:50:33 +00:00
Argyrios Kyrtzidis	123a0fb56f	Fix compilation for some targets other than x86. llvm-svn: 70522	2009-04-30 23:50:26 +00:00
Argyrios Kyrtzidis	9956976b76	Make DebugLoc independent of DwarfWriter. -Replace DebugLocTuple's Source ID with CompileUnit's GlobalVariable* -Remove DwarfWriter::getOrCreateSourceID -Make necessary changes for the above (fix callsites, etc.) llvm-svn: 70520	2009-04-30 23:22:31 +00:00
Dan Gohman	8a0e27efb2	Set mayLoad on MOVZX32_NOREXrm8 too. llvm-svn: 70466	2009-04-30 03:11:48 +00:00
Evan Cheng	b7d41a6680	Mark MOV8mr_NOREX and MOV8rm_NOREX as mayStore / mayLoad respectively. llvm-svn: 70461	2009-04-30 00:58:57 +00:00
Bill Wendling	40a162f75f	Instead of passing in an unsigned value for the optimization level, use an enum, which better identifies what the optimization is doing. And is more flexible for future uses. llvm-svn: 70440	2009-04-29 23:29:43 +00:00
Nate Begeman	b407809122	Fix infinite recursion in the C++ code which handles movddup by making it unnecessary. llvm-svn: 70425	2009-04-29 22:47:44 +00:00
Nate Begeman	414534b3eb	Implement review feedback for vector shuffle work. llvm-svn: 70372	2009-04-29 05:20:52 +00:00
Bill Wendling	7546bed590	Second attempt: Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to use the old behavior, the flag is -O0. This change allows for finer-grained control over which optimizations are run at different -O levels. Most of this work was pretty mechanical. The majority of the fixes came from verifying that a "fast" variable wasn't used anymore. The JIT still uses a "Fast" flag. I'll change the JIT with a follow-up patch. llvm-svn: 70343	2009-04-29 00:15:41 +00:00
Anton Korobeynikov	1799ac4b55	Properly print 'P' modifier on inline asm memory operands. This should fix PR3379 and PR4064. Patch inspired by Edwin Török! llvm-svn: 70328	2009-04-28 21:49:33 +00:00
Bill Wendling	ef47ace92f	r70270 isn't ready yet. Back this out. Sorry for the noise. llvm-svn: 70275	2009-04-28 01:04:53 +00:00
Bill Wendling	2799e916c3	Massive check in. This changes the "-fast" flag to "-O#" in llc. If you want to use the old behavior, the flag is -O0. This change allows for finer-grained control over which optimizations are run at different -O levels. Most of this work was pretty mechanical. The majority of the fixes came from verifying that a "fast" variable wasn't used anymore. The JIT still uses a "Fast" flag. I'm not 100% sure if it's necessary to change it there... llvm-svn: 70270	2009-04-28 00:21:31 +00:00
Nate Begeman	9d121924fd	2nd attempt, fixing SSE4.1 issues and implementing feedback from duncan. PR2957 ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes as the shuffle mask. A value of -1 represents UNDEF. In addition to eliminating the creation of illegal BUILD_VECTORS just to represent shuffle masks, we are better about canonicalizing the shuffle mask, resulting in substantially better code for some classes of shuffles. llvm-svn: 70225	2009-04-27 18:41:29 +00:00
Dan Gohman	a241dec2fc	Rename GR8_ABCD to GR8_ABCD_L and create GR8_ABCD_H, and use these to precisely describe the h-register subreg register classes. Thanks to Jakob Stoklund Olesen for spotting this and for the initial patch! Also, make getStoreRegOpcode and getLoadRegOpcode aware of the needs of h registers. llvm-svn: 70211	2009-04-27 16:41:36 +00:00
Dan Gohman	180fa04e35	Rename GR8_, GR16_, GR32_, and GR64_ to GR8_ABCD, GR16_ABCD, GR32_ABCD, and GR64_ABCD, respectively, to help describe them. llvm-svn: 70210	2009-04-27 16:33:14 +00:00
Dan Gohman	885b9c3688	Break up long multi-mnemonic strings into separate lines for readability. llvm-svn: 70209	2009-04-27 15:13:28 +00:00
Mon P Wang	904d654436	Revised 68749 to allow matching of load/stores for address spaces < 256. llvm-svn: 70197	2009-04-27 07:22:10 +00:00
Chris Lattner	b47e34ac59	add support for detecting processor features on win64, patch by Nicolas Capens! llvm-svn: 70057	2009-04-25 18:27:23 +00:00
Rafael Espindola	4e7a0bf1f1	Fix PR 4004 by including the call to __tls_get_addr in X86tlsaddr. This is not very elegant, but neither is the tls specification :-( llvm-svn: 69968	2009-04-24 12:59:40 +00:00
Rafael Espindola	0b1037ad26	Revert 69952. Causes testsuite failures on linux x86-64. llvm-svn: 69967	2009-04-24 12:40:33 +00:00
Nate Begeman	c1a09c7dfa	PR2957 ISD::VECTOR_SHUFFLE now stores an array of integers representing the shuffle mask internal to the node, rather than taking a BUILD_VECTOR of ConstantSDNodes as the shuffle mask. A value of -1 represents UNDEF. In addition to eliminating the creation of illegal BUILD_VECTORS just to represent shuffle masks, we are better about canonicalizing the shuffle mask, resulting in substantially better code for some classes of shuffles. A clean up of x86 shuffle code, and some canonicalizing in DAGCombiner is next. llvm-svn: 69952	2009-04-24 03:42:54 +00:00
Dan Gohman	6a4629e856	Add support for printing MO_ExternalSymbol operands in memory operand tuples. This doesn't ever come up in normal code however. llvm-svn: 69848	2009-04-23 00:57:37 +00:00
Duncan Sands	58c9c564a9	Get rid of what looks like a copy-and-pasted typo. Spotted by gcc-4.5. llvm-svn: 69673	2009-04-21 09:44:39 +00:00
Rafael Espindola	5adc7ad39e	TLS_addr64 and TLS_addr32 define RDI and EAX. They don't use them. This fixes PR4002. llvm-svn: 69672	2009-04-21 08:22:09 +00:00
Dan Gohman	de72d5129b	Make X86's copyRegToReg able to handle copies to and from subclasses. This makes the extra copyRegToReg calls in ScheduleDAGSDNodesEmit.cpp unnecessary. Derived from a patch by Jakob Stoklund Olesen. llvm-svn: 69635	2009-04-20 22:54:34 +00:00
Bob Wilson	f7e9ff1d28	Move duplicated AddLiveIn function from X86 and ARM backends to be a method in the MachineFunction class, renaming it to addLiveIn for consistency with the same method in MachineBasicBlock. Thanks to Anton for suggesting this. llvm-svn: 69615	2009-04-20 18:36:57 +00:00
Mon P Wang	4db825e615	Fixed a few 64 bit cases in X86InstrInfo::commuteInstruction llvm-svn: 69417	2009-04-18 05:16:01 +00:00
Bill Wendling	0476a0acf3	Recommit r69335 and r69336. These were not causing problems. llvm-svn: 69394	2009-04-17 22:40:38 +00:00
Rafael Espindola	d74132e2c5	For general dynamic TLS access we must use leaq foo@TLSGD(%rip), %rdi as part of the instruction sequence. Using a register other than %rdi and then copying it to %rdi is not valid. llvm-svn: 69350	2009-04-17 14:35:58 +00:00
Bill Wendling	073e1c91dd	Revert r69335 and r69336. They were causing build failures. llvm-svn: 69347	2009-04-17 04:19:22 +00:00
Dan Gohman	d254f36e54	MOV8rr_NOREX is a "Move" instruction. This doesn't currently matter, because this instruction isn't generated until after things that care. llvm-svn: 69336	2009-04-17 00:45:17 +00:00
Dan Gohman	2349973ff3	Don't use MOV8rr_NOREX on x86-32. It doesn't actually hurt anything at present, but it's inconsistent. llvm-svn: 69335	2009-04-17 00:43:09 +00:00
Rafael Espindola	a07d1c3103	fix PR3995. A scale must be 1, 2, 4 or 8. llvm-svn: 69284	2009-04-16 12:34:53 +00:00
Dan Gohman	38bc0faa22	Fix 80-column violations. llvm-svn: 69204	2009-04-15 19:48:57 +00:00
Dan Gohman	a2ec3156eb	Add a folding table entry for MOV8rr_NOREX. llvm-svn: 69203	2009-04-15 19:48:28 +00:00
Dan Gohman	2b965abea4	Fix X86MachineFunctionInfo's doxygen comment. llvm-svn: 69127	2009-04-15 01:20:18 +00:00
Dan Gohman	56227ee26e	Do for GR16_NOREX what r69049 did for GR8_NOREX, to avoid trouble with the local register allocator. llvm-svn: 69115	2009-04-15 00:10:16 +00:00
Dan Gohman	a1fe2a3741	Add a new MOV8rr_NOREX, and make X86's copyRegToReg use it when either the source or destination is a physical h register. This fixes sqlite3 with the post-RA scheduler enabled. llvm-svn: 69111	2009-04-15 00:04:23 +00:00
Dan Gohman	1e76e65007	GR8_NOREX can contain the H registers, since they don't require REX prefixes. llvm-svn: 69108	2009-04-15 00:00:48 +00:00
Dan Gohman	365c457893	For the h-register addressing-mode trick, use the correct value for any non-address uses of the address value. This fixes 186.crafty. llvm-svn: 69094	2009-04-14 22:45:05 +00:00
Evan Cheng	b64f2c1b08	Some of GR8_NOREX registers are only available in 64-bit mode. llvm-svn: 69049	2009-04-14 16:57:43 +00:00
Dan Gohman	8393d29bc8	Rename COPY_TO_SUBCLASS to COPY_TO_REGCLASS, and generalize it accordingly. Thanks to Jakob Stoklund Olesen for pointing out how this might be useful. llvm-svn: 68986	2009-04-13 21:06:25 +00:00
Devang Patel	ad7f61c279	Reapply 68847. Now debug_inlined section is covered by TAI->doesDwarfUsesInlineInfoSection(), which is false by default. llvm-svn: 68964	2009-04-13 17:02:03 +00:00
Dan Gohman	be7227005f	Implement x86 h-register extract support. - Add patterns for h-register extract, which avoids a shift and mask, and in some cases a temporary register. - Add address-mode matching for turning (X>>(8-n))&(255<<n), where n is a valid address-mode scale value, into an h-register extract and a scaled-offset address. - Replace X86's MOV32to32_ and related instructions with the new target-independent COPY_TO_SUBREG instruction. On x86-64 there are complicated constraints on h registers, and CodeGen doesn't currently provide a high-level way to express all of them, so they are handled with a bunch of special code. This code currently only supports extracts where the result is used by a zero-extend or a store, though these are fairly common. These transformations are not always beneficial; since there are only 4 h registers, they sometimes require extra move instructions, and this sometimes increases register pressure because it can force out values that would otherwise be in one of those registers. However, this appears to be relatively uncommon. llvm-svn: 68962	2009-04-13 16:09:41 +00:00
Dan Gohman	f117bbdbcd	Remove x86's special-case handling for ISD::TRUNCATE and ISD::SIGN_EXTEND_INREG. Tablegen-generated code can handle these cases, and the scheduling issues observed earlier appear to be resolved now. llvm-svn: 68959	2009-04-13 15:29:31 +00:00
Dan Gohman	6e6f9e3a4f	Fix copy+pastos in comments. llvm-svn: 68958	2009-04-13 15:28:29 +00:00
Dan Gohman	ac6a439313	List the l registers before h registers, for consistency. llvm-svn: 68954	2009-04-13 15:18:42 +00:00
Dan Gohman	e1db797df3	Use X86::SUBREG_8BIT instead of hard-coding the equivalent constant. llvm-svn: 68951	2009-04-13 15:14:03 +00:00
Dan Gohman	a68d99c707	Add a comment about MOVSX64rr8. llvm-svn: 68950	2009-04-13 15:13:28 +00:00
Dan Gohman	65bafadd2b	Fix another hard-coded constant to use X86AddrNumOperands. This unbreaks the JIT on x86-64. llvm-svn: 68948	2009-04-13 15:04:25 +00:00
Rafael Espindola	72347bffce	X86-64 TLS support for local exec and initial exec. llvm-svn: 68947	2009-04-13 13:02:49 +00:00
Rafael Espindola	ad8137187c	In X86DAGToDAGISel::MatchWrapper, if base or index are set, avoid matching only if symbolic addresses are RIP relatives. llvm-svn: 68924	2009-04-12 23:00:38 +00:00
Rafael Espindola	2b0a01bda9	refactor some code into X86DAGToDAGISel::MatchWrapper llvm-svn: 68915	2009-04-12 21:55:03 +00:00
Chris Lattner	6d6cf3ff4a	fix a cross-block fastisel crash handling overflow intrinsics. See comment for details. This fixes rdar://6772169 llvm-svn: 68890	2009-04-12 07:51:14 +00:00
Chris Lattner	6a9e77c980	simplify code by using IntrinsicInst. llvm-svn: 68887	2009-04-12 07:36:01 +00:00
Chris Lattner	da05d37aa1	Add new TargetInstrDesc::hasImplicitUseOfPhysReg and hasImplicitDefOfPhysReg methods. Use them to remove a loop in X86 fast isel. llvm-svn: 68886	2009-04-12 07:26:51 +00:00
Dan Gohman	ac11c8d30f	Revert r68847. It breaks the build on non-Darwin targets, with this message from the assembler: Error: unknown pseudo-op: `.debug_inlined' llvm-svn: 68863	2009-04-11 15:57:04 +00:00
Devang Patel	6f907173e0	Keep track of inlined functions and their locations. This information is collected when nested llvm.dbg.func.start intrinsics are seen. (Right now, inliner removes nested llvm.dbg.func.start intrinsics during inlining.) Create debug_inlined dwarf section using this information. This info is used by gdb, at least on Darwin, to enable better experience debugging inlined functions. See DwarfWriter.cpp for more information on structure of debug_inlined section. llvm-svn: 68847	2009-04-11 00:16:47 +00:00
Rafael Espindola	88986ef511	Don't fold a load if the other operand is a TLS address. With this we generate movl %gs:0, %eax leal i@NTPOFF(%eax), %eax instead of movl $i@NTPOFF, %eax addl %gs:0, %eax llvm-svn: 68778	2009-04-10 10:09:34 +00:00
Chris Lattner	26aee059ba	a few fixes to "addrspace(256) is reference offset of GS segment register". It turns out that there are still several problems with this, will file a bugzilla. llvm-svn: 68749	2009-04-10 00:16:23 +00:00
Dan Gohman	8121b3f88d	Remove the obsolete SelectionDAG::getNodeValueTypes and simplify code that uses it by using SelectionDAG::getVTList instead. llvm-svn: 68744	2009-04-09 23:54:40 +00:00
Chris Lattner	e0d0edaf3f	Fix code size computation on x86-64, patch by Zoltan Varga! llvm-svn: 68690	2009-04-09 06:10:51 +00:00
Dan Gohman	6cb1387261	Fix grammaros in comments. llvm-svn: 68666	2009-04-09 02:06:09 +00:00
Rafael Espindola	7eb72dc5f2	Re-apply 68552. Tested by bootstrapping llvm-gcc and using that to build llvm. llvm-svn: 68645	2009-04-08 21:14:34 +00:00
Rafael Espindola	d4563305fd	Avoid a hard coded constant. llvm-svn: 68603	2009-04-08 08:09:33 +00:00
Dan Gohman	c9ce27d6b7	Implement support for modeling implicit-zero-extension on x86-64 with SUBREG_TO_REG, teach SimpleRegisterCoalescing to coalesce SUBREG_TO_REG instructions (which are similar to INSERT_SUBREG instructions), and teach the DAGCombiner to take advantage of this on targets which support it. This eliminates many redundant zero-extension operations on x86-64. This adds a new TargetLowering hook, isZExtFree. It's similar to isTruncateFree, except it only applies to actual definitions, and not no-op truncates which may not zero the high bits. Also, this adds a new optimization to SimplifyDemandedBits: transform operations like x+y into (zext (add (trunc x), (trunc y))) on targets where all the casts are no-ops. In contexts where the high part of the add is explicitly masked off, this allows the mask operation to be eliminated. Fix the DAGCombiner to avoid undoing these transformations to eliminate casts on targets where the casts are no-ops. Also, this adds a new two-address lowering heuristic. Since two-address lowering runs before coalescing, it helps to be able to look through copies when deciding whether commuting and/or three-address conversion are profitable. Also, fix a bug in LiveInterval::MergeInClobberRanges. It didn't handle the case that a clobber range extended both before and beyond an existing live range. In that case, multiple live ranges need to be added. This was exposed by the new subreg coalescing code. Remove 2008-05-06-SpillerBug.ll. It was bugpoint-reduced, and the spiller behavior it was looking for no longer occurs with the new instruction selection. llvm-svn: 68576	2009-04-08 00:15:30 +00:00
Bill Wendling	6e702cf68c	Temporarily revert r68552. This was causing a failure in the self-hosting LLVM builds. --- Reverse-merging (from foreign repository) r68552 into '.': U test/CodeGen/X86/tls8.ll U test/CodeGen/X86/tls10.ll U test/CodeGen/X86/tls2.ll U test/CodeGen/X86/tls6.ll U lib/Target/X86/X86Instr64bit.td U lib/Target/X86/X86InstrSSE.td U lib/Target/X86/X86InstrInfo.td U lib/Target/X86/X86RegisterInfo.cpp U lib/Target/X86/X86ISelLowering.cpp U lib/Target/X86/X86CodeEmitter.cpp U lib/Target/X86/X86FastISel.cpp U lib/Target/X86/X86InstrInfo.h U lib/Target/X86/X86ISelDAGToDAG.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.h U lib/Target/X86/AsmPrinter/X86IntelAsmPrinter.h U lib/Target/X86/X86ISelLowering.h U lib/Target/X86/X86InstrInfo.cpp U lib/Target/X86/X86InstrBuilder.h U lib/Target/X86/X86RegisterInfo.td llvm-svn: 68560	2009-04-07 22:35:25 +00:00
Rafael Espindola	0324937229	Reduce code duplication on the TLS implementation. This introduces a small regression on the generated code quality in the case we are just computing addresses, not loading values. Will work on it and on X86-64 support. llvm-svn: 68552	2009-04-07 21:37:46 +00:00
Mon P Wang	f829fb5cab	Added a x86 dag combine to increase the chances to use a movq for v2i64 on x86-32. llvm-svn: 68368	2009-04-03 02:43:30 +00:00
Chris Lattner	f1719bf7b5	silence warning in release-asserts build. llvm-svn: 68253	2009-04-01 22:14:45 +00:00
Evan Cheng	44fdb5d570	i128 shift libcalls are not available on x86. llvm-svn: 68133	2009-03-31 19:38:51 +00:00
Dan Gohman	86e4d0130c	Reapply 68073, with fixes. EH Landing-pad basic blocks are not entered via fall-through. Don't miss fallthroughs from blocks terminated by conditional branches. Also, move isOnlyReachableByFallthrough out of line. llvm-svn: 68129	2009-03-31 18:39:13 +00:00
Rafael Espindola	3d866ac20c	remove unused arguments. llvm-svn: 68109	2009-03-31 16:16:57 +00:00
Bill Wendling	28fad6fcc1	Really temporarily revert r68073. llvm-svn: 68100	2009-03-31 08:42:40 +00:00
Bill Wendling	1c40c8c242	Oy! When reverting r68073, I added in experimental code. Sorry... llvm-svn: 68099	2009-03-31 08:41:31 +00:00
Bill Wendling	4706abded2	Revert r68073. It's causing a failure in the Apple-style builds. llvm-svn: 68092	2009-03-31 08:26:26 +00:00
Evan Cheng	5c02e62620	X86 address mode isel tweak. If the base of the address is also used by a CopyToReg (i.e. it's likely live-out), do not fold the sub-expressions into the addressing mode to avoid computing the address twice. The CopyToReg use will be isel'ed to a LEA, re-use it for address instead. This is not yet enabled. llvm-svn: 68082	2009-03-31 01:13:53 +00:00
Dan Gohman	29694088d3	Except in asm-verbose mode, avoid printing labels for blocks that are only reachable via fall-through edges. This dramatically reduces the number of labels printed, and thus also the number of labels the assembler must parse and remember. llvm-svn: 68073	2009-03-30 22:55:17 +00:00
Evan Cheng	3e30bcbd69	When optimizing a mul by immediate into two, the resulting mul's should get a x86 specific node to prevent dag combiner from hacking on them further. llvm-svn: 68066	2009-03-30 21:36:47 +00:00
Anton Korobeynikov	0404baca28	Do not propagate ELF-specific stuff (data.rel) into other targets. This simplifies code and also ensures correctness. llvm-svn: 68032	2009-03-30 15:27:43 +00:00
Anton Korobeynikov	2ea565a37b	Add data.rel stuff llvm-svn: 68031	2009-03-30 15:27:03 +00:00
Rafael Espindola	34f59009d1	Use array_lengthof llvm-svn: 67950	2009-03-28 19:02:18 +00:00
Rafael Espindola	37522e768a	Have only one definition of X86AddrNumOperands. llvm-svn: 67949	2009-03-28 18:55:31 +00:00
Rafael Espindola	884992a7e9	Make code a bit less brittle by not hardcoding the number of operands in an address in so many places. llvm-svn: 67945	2009-03-28 17:03:24 +00:00
Evan Cheng	a15fdaa292	Optimize some 64-bit multiplication by constants into two lea's or one lea + shl since imulq is slow (latency 5). e.g. x * 40 => shlq $3, %rdi leaq (%rdi,%rdi,4), %rax This has the added benefit of allowing more multiply to be folded into addressing mode. e.g. a * 24 + b => leaq (%rdi,%rdi,2), %rax leaq (%rsi,%rax,8), %rax llvm-svn: 67917	2009-03-28 05:57:29 +00:00
Rafael Espindola	83ee99d7b0	Avoid hardcoding that X86 addresses have 4 operands. llvm-svn: 67848	2009-03-27 15:57:50 +00:00
Rafael Espindola	7c113e5354	Use less hard coded constants to make the code less brittle. llvm-svn: 67846	2009-03-27 15:45:05 +00:00
Rafael Espindola	38604d9598	I am trying to add a segment to the X86 addresses matching to improve TLS support (see http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20090309/075220.html), but that code is VERY brittle. This patch just makes it a bit more resistant. llvm-svn: 67843	2009-03-27 15:26:30 +00:00
Evan Cheng	ab6e38c88d	-no-implicit-float means explicit fp operations are legal. llvm-svn: 67784	2009-03-26 23:06:32 +00:00
Bill Wendling	f4247ff478	Pull transform from target-dependent code into target-independent code. llvm-svn: 67742	2009-03-26 06:14:09 +00:00
Bill Wendling	f79eccc675	Match this pattern so that we can generate simpler code: %a = ... %b = and i32 %a, 2 %c = srl i32 %b, 1 %d = br i32 %c, into %a = ... %b = and %a, 2 %c = X86ISD::CMP %b, 0 %d = X86ISD::BRCOND %c ... This applies only when the AND constant value has one bit set and the SRL constant is equal to the log2 of the AND constant. The back-end is smart enough to convert the result into a TEST/JMP sequence. llvm-svn: 67728	2009-03-26 01:47:50 +00:00
Bill Wendling	40ac545f38	Doxygen-ify comments. llvm-svn: 67727	2009-03-26 01:46:56 +00:00
Evan Cheng	3a7489a4cc	CodeGen still defaults to non-verbose asm, but llc now overrides it and defaults to verbose. llvm-svn: 67668	2009-03-25 01:47:28 +00:00
Evan Cheng	6ff8cea903	Don't print global names twice with -asm-verbose. llvm-svn: 67667	2009-03-25 01:08:42 +00:00
Dan Gohman	7a9e8cbf79	I was convinced that it's ok to allow a second i8 return value to be returned in DL. LLVM's multiple-return-value support is not ABI-conforming; front-ends that wish to have code emitted that conforms to an ABI are currently expected to make arrangements for this on their own rather than assuming that multiple-return-values will automatically do the right thing. This commit doesn't fundamentally change this situation. llvm-svn: 67588	2009-03-24 01:04:34 +00:00
Evan Cheng	b3196f1298	Do not emit comments unless -asm-verbose. llvm-svn: 67580	2009-03-24 00:17:40 +00:00
Dan Gohman	e9cf3083d2	Correct some comments. Operand numbers start at 0. llvm-svn: 67518	2009-03-23 15:40:10 +00:00
Evan Cheng	2ec94dd447	Model inline asm constraint which ties an input to an output register as machine operand TIED_TO constraint. This eliminates the need to pre-allocate registers for these. This also allows the register allocator to eliminate the unneeded copies. llvm-svn: 67512	2009-03-23 08:01:15 +00:00
Dan Gohman	745c0acc79	Fix a grammaro in a comment that Bill noticed. llvm-svn: 67507	2009-03-23 05:02:44 +00:00
Dan Gohman	16b4a33039	Add comments explaining why there's only one register for i8 return values. llvm-svn: 67502	2009-03-23 04:28:24 +00:00
Nick Lewycky	a0dcd7e173	Remove strange extra semicolons. llvm-svn: 67287	2009-03-19 05:51:39 +00:00
Chris Lattner	205380a4e4	Disable the "call to immediate" optimization on x86-64. It is not safe in general because the immediate could be an arbitrary value that does not fit in a 32-bit pcrel displacement. Conservatively fall back to loading the value into a register and calling through it. We still do the optzn on X86-32. llvm-svn: 67142	2009-03-18 00:43:52 +00:00
Dan Gohman	f6c57d0fe7	Recognize bswapl as bswap too. llvm-svn: 67072	2009-03-17 02:45:40 +00:00
Dan Gohman	4efda2b52b	Recognize "bswapq" as an alternate spelling for the bswap instruction. llvm-svn: 67071	2009-03-17 02:17:27 +00:00
Dan Gohman	fd6debff99	Use %rip-relative addressing on x86-64 whenever practical, as it has a smaller encoding than absolute addressing. llvm-svn: 67002	2009-03-14 02:33:41 +00:00
Dan Gohman	e7495ef7aa	Don't forego folding of loads into 64-bit adds when the other operand is a signed 32-bit immediate. Unlike with the 8-bit signed immediate case, it isn't actually smaller to fold a 32-bit signed immediate instead of a load. In fact, it's larger in the case of 32-bit unsigned immediates, because they can be materialized with movl instead of movq. llvm-svn: 67001	2009-03-14 02:07:16 +00:00
Dan Gohman	fa0a3504ba	Improve FastISel's handling of truncates to i1, and implement ptrtoint and inttoptr in X86FastISel. These casts aren't always handled in the generic FastISel code because X86 sometimes needs custom code to do truncation and zero-extension. llvm-svn: 66988	2009-03-13 23:53:06 +00:00
Dan Gohman	790659c0d6	Fix FastISel's assumption that i1 values are always zero-extended by inserting explicit zero extensions where necessary. Included is a testcase where SelectionDAG produces a virtual register holding an i1 value which FastISel previously mistakenly assumed to be zero-extended. llvm-svn: 66941	2009-03-13 20:42:20 +00:00
Rafael Espindola	aadb9af093	add 8 and 16 bit TLS moves. add a fixme note on how to remove code duplication. llvm-svn: 66932	2009-03-13 19:39:55 +00:00
Rafael Espindola	ff17d02271	Improve sext and zext of TLS variables. llvm-svn: 66922	2009-03-13 18:37:06 +00:00
Chris Lattner	63569fa327	generalize this code so that fast isel handles integer truncates to i1, which codegen to the same thing as integer truncates to i8 (the top bits are just undefined). This implements rdar://6667338 llvm-svn: 66902	2009-03-13 16:36:42 +00:00
Bill Wendling	2fe64f48aa	These instructions have special lowering that may lower them to SSE instructions. Prevent that if we don't want implicit uses of SSE. llvm-svn: 66877	2009-03-13 08:41:47 +00:00
Evan Cheng	f9951d1557	Fix some significant problems with constant pools that resulted in unnecessary paddings between constant pool entries, larger than necessary alignments (e.g. 8 byte alignment for .literal4 sections), and potentially other issues. 1. ConstantPoolSDNode alignment field is log2 value of the alignment requirement. This is not consistent with other SDNode variants. 2. MachineConstantPool alignment field is also a log2 value. 3. However, some places are creating ConstantPoolSDNode with alignment value rather than log2 values. This creates entries with artificially large alignments, e.g. 256 for SSE vector values. 4. Constant pool entry offsets are computed when they are created. However, asm printer groups them by sections. That means the offsets are no longer valid. However, asm printer uses them to determine size of padding between entries. 5. Asm printer uses an expensive data structure, multimap, to track constant pool entries by sections. 6. Asm printer iterates over SmallPtrSet when it's emitting constant pool entries. This is non-deterministic. Solutions: 1. ConstantPoolSDNode alignment field is changed to keep non-log2 value. 2. MachineConstantPool alignment field is also changed to keep non-log2 value. 3. Functions that create ConstantPool nodes are passing in non-log2 alignments. 4. MachineConstantPoolEntry no longer keeps an offset field. It's replaced with an alignment field. Offsets are not computed when constant pool entries are created. They are computed on the fly in asm printer and JIT. 5. Asm printer uses cheaper data structure to group constant pool entries. 6. Asm printer computes entry offsets after grouping is done. 7. Change JIT code to compute entry offsets on the fly. llvm-svn: 66875	2009-03-13 07:51:59 +00:00
Chris Lattner	cbbdd230dd	generalize the previous code to use the full generality of LEA for i32/i64 expressions (we could also do i16 on cpus where i16 lea is fast, but I didn't add this). On the example, we now generate: _test: movl 4(%esp), %eax cmpl $42, (%eax) setl %al movzbl %al, %eax leal 4(%eax,%eax,8), %eax ret instead of: _test: movl 4(%esp), %eax cmpl $41, (%eax) movl $4, %ecx movl $13, %eax cmovg %ecx, %eax ret llvm-svn: 66869	2009-03-13 05:53:31 +00:00
Chris Lattner	878d951f8f	optimize the case of cond ? 42 : 41 and friends. This compiles the example to: _test: movl 4(%esp), %eax cmpl $41, (%eax) setg %al movzbl %al, %eax orl $4294967294, %eax ret instead of: movl 4(%esp), %eax cmpl $41, (%eax) movl $4294967294, %ecx movl $4294967295, %eax cmova %ecx, %eax ret which is smaller in code size and faster. rdar://6668608 llvm-svn: 66868	2009-03-13 05:22:11 +00:00
Dan Gohman	37d843c129	Enhance address-mode folding of ISD::ADD to handle cases where the operands can't both be fully folded at the same time. For example, in the included testcase, a global variable is being added with an add of two values. The global variable wants RIP-relative addressing, so it can't share the address with another base register, but it's still possible to fold the initial add. llvm-svn: 66865	2009-03-13 02:25:09 +00:00
Evan Cheng	d112c41d95	Re-apply 66024 with fixes: 1. Fixed indirect call to immediate address assembly. 2. Fixed JIT encoding by making the address pc-relative. llvm-svn: 66803	2009-03-12 18:15:39 +00:00
Chris Lattner	26a971c4ec	Move 3 "(add (select cc, 0, c), x) -> (select cc, x, (add, x, c))" related transformations out of target-specific dag combine into the ARM backend. These were added by Evan in r37685 with no testcases and only seems to help ARM (e.g. test/CodeGen/ARM/select_xform.ll). Add some simple X86-specific (for now) DAG combines that turn things like cond ? 8 : 0 -> (zext(cond) << 3). This happens frequently with the recently added cp constant select optimization, but is a very general xform. For example, we now compile the second example in const-select.ll to: _test: movsd LCPI2_0, %xmm0 ucomisd 8(%esp), %xmm0 seta %al movzbl %al, %eax movl 4(%esp), %ecx movsbl (%ecx,%eax,4), %eax ret instead of: _test: movl 4(%esp), %eax leal 4(%eax), %ecx movsd LCPI2_0, %xmm0 ucomisd 8(%esp), %xmm0 cmovbe %eax, %ecx movsbl (%ecx), %eax ret This passes multisource and dejagnu. llvm-svn: 66779	2009-03-12 06:52:53 +00:00
Chris Lattner	b904fac1b8	improve comment. llvm-svn: 66778	2009-03-12 06:46:02 +00:00
Evan Cheng	46e903d2f6	On x86, if the only use of a i64 load is a i64 store, generate a pair of double load and store instead. llvm-svn: 66776	2009-03-12 05:59:15 +00:00
Dan Gohman	d30e108f0e	Revert r66024. The JIT encoding for CALLpcrel32 is wrong -- see PR3773, and the assembly text output uses an indirect call ("call *") instead of a direct call. llvm-svn: 66735	2009-03-11 23:01:47 +00:00
Rafael Espindola	a8fe373200	optimize i8 and i16 tls values. llvm-svn: 66725	2009-03-11 22:40:04 +00:00
Bill Wendling	fca05e3a5c	Add a -no-implicit-float flag. This acts like -soft-float, but may generate floating point instructions that are explicitly specified by the user. llvm-svn: 66719	2009-03-11 22:30:01 +00:00
Duncan Sands	b27c523449	It makes no sense to have a ODR version of common linkage, so remove it. llvm-svn: 66690	2009-03-11 20:14:15 +00:00
Mon P Wang	287e422039	For yonah, fix a vector shuffle case for v16i8 where we didn't properly clear some bits. llvm-svn: 66684	2009-03-11 18:47:57 +00:00
Mon P Wang	2867737ad2	Fixed a v8i16 shuffle case that should generate a pshufb instead of a pshuflw/hw. llvm-svn: 66645	2009-03-11 06:35:11 +00:00
Chris Lattner	eb9327f335	formatting change, reduce indentation. No functionality change. llvm-svn: 66642	2009-03-11 05:48:52 +00:00
Dan Gohman	e15d8f03c3	Add more information to the EFLAGS note. llvm-svn: 66515	2009-03-10 00:26:23 +00:00
Dan Gohman	995c3dd344	Add a note about EFLAGS optimization. llvm-svn: 66508	2009-03-09 23:47:02 +00:00
Chris Lattner	b89dbcd448	do not export all the X86FastISel symbols, ever. llvm-svn: 66382	2009-03-08 18:44:31 +00:00
Chris Lattner	3342ba06d4	add a note. llvm-svn: 66360	2009-03-08 03:04:26 +00:00
Chris Lattner	8ace06fdda	add a note. llvm-svn: 66359	2009-03-08 01:54:43 +00:00
Duncan Sands	5ab54d488f	Introduce new linkage types linkonce_odr, weak_odr, common_odr and extern_weak_odr. These are the same as the non-odr versions, except that they indicate that the global will only be overridden by an equivalent global. In C, a function with weak linkage can be overridden by a function which behaves completely differently. This means that IP passes have to skip weak functions, since any deductions made from the function definition might be wrong, since the definition could be replaced by something completely different at link time. This is not allowed in C++, thanks to the ODR (One-Definition-Rule): if a function is replaced by another at link-time, then the new function must be the same as the original function. If a language knows that a function or other global can only be overridden by an equivalent global, it can give it the weak_odr linkage type, and the optimizers will understand that it is alright to make deductions based on the function body. The code generators on the other hand map weak and weak_odr linkage to the same thing. llvm-svn: 66339	2009-03-07 15:45:40 +00:00
Dan Gohman	b9c32f1aca	Arithmetic instructions don't set EFLAGS bits OF and CF the same way the "test" instruction does in overflow cases, so eliminating the test is only safe when those bits aren't needed, as is the case for COND_E and COND_NE, or if it can be proven that no overflow will occur. For now, just restrict the optimization to COND_E and COND_NE and don't do any overflow analysis. llvm-svn: 66318	2009-03-07 01:58:32 +00:00
Dan Gohman	f9599e6c5f	Don't use plain INC32 and DEC32 on x86-64; it needs INC64_32r and INC64_16r, because these instructions are encoded differently on x86-64. This fixes JIT regressions on x86-64 in kimwitu++ and others. llvm-svn: 66207	2009-03-05 21:32:23 +00:00
Dan Gohman	1e9db7c1a1	When creating X86ISD::INC and X86ISD::DEC nodes, only add one operand. The extra operand didn't appear to cause any trouble, but it was erroneous regardless. llvm-svn: 66206	2009-03-05 21:29:28 +00:00
Dan Gohman	f6f684b206	Fix the "test" optimization to recognize "dec" as an add of negative one, as subtracts of immediates are canonicalized to adds. llvm-svn: 66180	2009-03-05 19:32:48 +00:00
Dan Gohman	31fb085c2e	Re-apply 66008, now that the unfoldMemoryOperand bug is fixed. llvm-svn: 66058	2009-03-04 19:44:21 +00:00
Dan Gohman	f41e54c5af	Correct this comment. llvm-svn: 66057	2009-03-04 19:24:25 +00:00
Dan Gohman	04453ca36c	When using MachineInstr operand indices on SDNodes, the number of MachineInstr def operands must be subtracted out. This bug was uncovered by the recent x86 EFLAGS optimization. Before that, the only instructions that ever needed unfolding were things like CMP32rm, where NumDefs is zero. llvm-svn: 66056	2009-03-04 19:23:38 +00:00
Evan Cheng	7d9019d0f3	Fix PR3666: isel calls to constant addresses. llvm-svn: 66024	2009-03-04 06:48:53 +00:00
Dan Gohman	6831e2c2a6	Revert r66004 for now; it's causing a variety of test failures. llvm-svn: 66008	2009-03-04 03:54:19 +00:00
Dan Gohman	c6c669cc1e	Teach the x86 backend to eliminate "test" instructions by using the EFLAGS result from add, sub, inc, and dec instructions in simple cases. llvm-svn: 66004	2009-03-04 02:33:24 +00:00
Evan Cheng	db402a7a49	Fix PR3701. 1. X86 target renamed eflags register to flags. This matches what llvm-gcc generates so codegen knows flags register is being clobbered by inline asm. 2. BURR scheduler should also check if inline asm nodes can clobber "live" physical registers. Previously it was only checking target nodes with implicit defs. llvm-svn: 65996	2009-03-04 01:41:49 +00:00
Dan Gohman	3c6c7754b2	Add '(implicit EFLAGS)' for AND, OR, XOR, NEG, INC, and DEC instructions. These aren't used yet. llvm-svn: 65965	2009-03-03 19:53:46 +00:00
Dan Gohman	51d4e8db6a	Fix a bunch of Doxygen syntax issues. Escape special characters, and put @file directives on their own comment line. llvm-svn: 65920	2009-03-03 02:55:14 +00:00
Mon P Wang	0258a27c5a	Added another darwin subtarget llvm-svn: 65662	2009-02-28 00:25:30 +00:00
Rafael Espindola	880e63bf01	Refactor TLS code and add some tests. The tests and expected results are: pic \| declaration \| linkage \| visibility \| !pic \| declaration \| external \| default \| tls1.ll tls2.ll \| local exec pic \| declaration \| external \| default \| tls1-pic.ll tls2-pic.ll \| general dynamic !pic \| !declaration \| external \| default \| tls3.ll tls4.ll \| initial exec pic \| !declaration \| external \| default \| tls3-pic.ll tls4-pic.ll \| general dynamic !pic \| declaration \| external \| hidden \| tls7.ll tls8.ll \| local exec pic \| declaration \| external \| hidden \| X \| local dynamic !pic \| !declaration \| external \| hidden \| tls9.ll tls10.ll \| local exec pic \| !declaration \| external \| hidden \| X \| local dynamic !pic \| declaration \| internal \| default \| tls5.ll tls6.ll \| local exec pic \| declaration \| internal \| default \| X \| local dynamic The ones marked with an X have not been implemented since local dynamic is not implemented. llvm-svn: 65632	2009-02-27 13:37:18 +00:00
Evan Cheng	4014a9a5b8	ADDS{D\|S}rr_Int and MULS{D\|S}rr_Int are not commutable. The users of these intrinsics expect the high bits will not be modified. llvm-svn: 65499	2009-02-26 03:12:02 +00:00
Evan Cheng	ec34226c2b	Revert BuildVectorSDNode related patches: 65426, 65427, and 65296. llvm-svn: 65482	2009-02-25 22:49:59 +00:00
Bill Wendling	9d4eb136da	Overhaul my earlier submission due to feedback. It's a large patch, but most of them are generic changes. - Use the "fast" flag that's already being passed into the asm printers instead of shoving it into the DwarfWriter. - Instead of calling "MI->getParent()->getParent()" for every MI, set the machine function when calling "runOnMachineFunction" in the asm printers. llvm-svn: 65379	2009-02-24 08:30:20 +00:00
Dan Gohman	3766eeea36	Fast-isel can't do TLS yet, so it should fall back to SDISel if it sees TLS addresses. llvm-svn: 65341	2009-02-23 22:03:08 +00:00
Evan Cheng	dd139e795c	Only v1i16 (i.e. _m64) is returned via RAX / RDX. llvm-svn: 65313	2009-02-23 09:03:22 +00:00
Nate Begeman	e0093d2501	Generate better code for v8i16 shuffles on SSE2 Generate better code for v16i8 shuffles on SSE2 (avoids stack) Generate pshufb for v8i16 and v16i8 shuffles on SSSE3 where it is fewer uops. Document the shuffle matching logic and add some FIXMEs for later further cleanups. New tests that test the above. Examples: New: _shuf2: pextrw $7, %xmm0, %eax punpcklqdq %xmm1, %xmm0 pshuflw $128, %xmm0, %xmm0 pinsrw $2, %eax, %xmm0 Old: _shuf2: pextrw $2, %xmm0, %eax pextrw $7, %xmm0, %ecx pinsrw $2, %ecx, %xmm0 pinsrw $3, %eax, %xmm0 movd %xmm1, %eax pinsrw $4, %eax, %xmm0 ret ========= New: _shuf4: punpcklqdq %xmm1, %xmm0 pshufb LCPI1_0, %xmm0 Old: _shuf4: pextrw $3, %xmm0, %eax movsd %xmm1, %xmm0 pextrw $3, %xmm1, %ecx pinsrw $4, %ecx, %xmm0 pinsrw $5, %eax, %xmm0 ======== New: _shuf1: pushl %ebx pushl %edi pushl %esi pextrw $1, %xmm0, %eax rolw $8, %ax movd %xmm0, %ecx rolw $8, %cx pextrw $5, %xmm0, %edx pextrw $4, %xmm0, %esi pextrw $3, %xmm0, %edi pextrw $2, %xmm0, %ebx movaps %xmm0, %xmm1 pinsrw $0, %ecx, %xmm1 pinsrw $1, %eax, %xmm1 rolw $8, %bx pinsrw $2, %ebx, %xmm1 rolw $8, %di pinsrw $3, %edi, %xmm1 rolw $8, %si pinsrw $4, %esi, %xmm1 rolw $8, %dx pinsrw $5, %edx, %xmm1 pextrw $7, %xmm0, %eax rolw $8, %ax movaps %xmm1, %xmm0 pinsrw $7, %eax, %xmm0 popl %esi popl %edi popl %ebx ret Old: _shuf1: subl $252, %esp movaps %xmm0, (%esp) movaps %xmm0, 16(%esp) movaps %xmm0, 32(%esp) movaps %xmm0, 48(%esp) movaps %xmm0, 64(%esp) movaps %xmm0, 80(%esp) movaps %xmm0, 96(%esp) movaps %xmm0, 224(%esp) movaps %xmm0, 208(%esp) movaps %xmm0, 192(%esp) movaps %xmm0, 176(%esp) movaps %xmm0, 160(%esp) movaps %xmm0, 144(%esp) movaps %xmm0, 128(%esp) movaps %xmm0, 112(%esp) movzbl 14(%esp), %eax movd %eax, %xmm1 movzbl 22(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm1, %xmm2 movzbl 42(%esp), %eax movd %eax, %xmm1 movzbl 50(%esp), %eax movd %eax, %xmm3 punpcklbw %xmm1, %xmm3 punpcklbw %xmm2, %xmm3 movzbl 77(%esp), %eax movd %eax, %xmm1 movzbl 84(%esp), %eax movd 
%eax, %xmm2 punpcklbw %xmm1, %xmm2 movzbl 104(%esp), %eax movd %eax, %xmm1 punpcklbw %xmm1, %xmm0 punpcklbw %xmm2, %xmm0 movaps %xmm0, %xmm1 punpcklbw %xmm3, %xmm1 movzbl 127(%esp), %eax movd %eax, %xmm0 movzbl 135(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm0, %xmm2 movzbl 155(%esp), %eax movd %eax, %xmm0 movzbl 163(%esp), %eax movd %eax, %xmm3 punpcklbw %xmm0, %xmm3 punpcklbw %xmm2, %xmm3 movzbl 188(%esp), %eax movd %eax, %xmm0 movzbl 197(%esp), %eax movd %eax, %xmm2 punpcklbw %xmm0, %xmm2 movzbl 217(%esp), %eax movd %eax, %xmm4 movzbl 225(%esp), %eax movd %eax, %xmm0 punpcklbw %xmm4, %xmm0 punpcklbw %xmm2, %xmm0 punpcklbw %xmm3, %xmm0 punpcklbw %xmm1, %xmm0 addl $252, %esp ret llvm-svn: 65311	2009-02-23 08:49:38 +00:00
Scott Michel	3f8637305f	Introduce the BuildVectorSDNode class that encapsulates the ISD::BUILD_VECTOR instruction. The class also consolidates the code for detecting constant splats that's shared across PowerPC and the CellSPU backends (and might be useful for other backends.) Also introduces SelectionDAG::getBUILD_VECTOR() for generating new BUILD_VECTOR nodes. llvm-svn: 65296	2009-02-22 23:36:09 +00:00
Evan Cheng	3ea8bd42f3	Add a note. llvm-svn: 65275	2009-02-22 08:13:45 +00:00
Evan Cheng	4385f393f7	Be bug compatible with gcc by returning MMX values in RAX. llvm-svn: 65274	2009-02-22 08:05:12 +00:00
Evan Cheng	9d9688ec15	Do not consider MMX_MOVD64rr a move instruction. The source register is in GR32, the destination is VR64. They are not compatible. llvm-svn: 65273	2009-02-22 08:04:23 +00:00
Anton Korobeynikov	5df82e3e25	Drop bunch of half-working stuff in the ext_weak linkage support. Now we're using one gross, but quite robust hack :) (previous ones did not work, for example, when ext_weak symbol was used deep inside constant expression in the initializer). The proper fix of this problem will require some quite huge asmprinter changes and that's why was postponed. This fixes PR3629 by the way :) llvm-svn: 65230	2009-02-21 11:53:32 +00:00
Bill Wendling	bfc216c45a	Make sure this doesn't access .end() too. llvm-svn: 65213	2009-02-21 01:11:36 +00:00
Bill Wendling	96430050a5	Make sure we don't dereference the .end() of the container. llvm-svn: 65211	2009-02-21 01:07:26 +00:00
Bill Wendling	66c3ffa2de	Propagate more debug loc infos. This also includes some code cleaning. llvm-svn: 65207	2009-02-21 00:43:56 +00:00
Bill Wendling	09289bc433	We need to propagate the debug location information even when dealing with the prologue/epilogue. llvm-svn: 65206	2009-02-21 00:32:08 +00:00
Evan Cheng	c40c3e28f7	Support return of MMX values in 64-bit mode. llvm-svn: 65152	2009-02-20 20:43:02 +00:00
Bill Wendling	306b992133	Put code that generates debug labels into TableGen so that it can be used by everyone. llvm-svn: 64978	2009-02-18 23:12:06 +00:00
Nate Begeman	5e78e558ff	Add support to the JIT for true non-lazy operation. When a call is made to a function that has not been JIT'd yet, the callee is put on a list of pending functions to JIT. The call is directed through a stub, which is updated with the address of the function after it has been JIT'd. A new interface for allocating and updating empty stubs is provided. Add support for removing the ModuleProvider the JIT was created with, which would otherwise invalidate the JIT's PassManager, which is initialized with the ModuleProvider's Module. Add support under a new ExecutionEngine flag for emitting the information necessary to update Function and GlobalVariable stubs after JITing them, by recording the address of the stub and the name of the GlobalValue. This allows code to be copied from one address space to another, where libraries may live at different virtual addresses, and have the stubs updated with their new correct target addresses. llvm-svn: 64906	2009-02-18 08:31:02 +00:00
Dan Gohman	9c258bd2ec	Factor out the code to add a MachineOperand to a MachineInstrBuilder. llvm-svn: 64891	2009-02-18 05:45:50 +00:00
Evan Cheng	bd63a1f40d	GV with null value initializer shouldn't go to BSS if it's meant for a mergeable strings section. Currently it only checks for Darwin. Someone else please check if it should apply to other targets as well. llvm-svn: 64877	2009-02-18 02:19:52 +00:00
Scott Michel	4c5fa6c982	Remove trailing whitespace to reduce later commit patch noise. (Note: Eventually, commits like this will be handled via a pre-commit hook that does this automagically, as well as expand tabs to spaces and look for 80-col violations.) llvm-svn: 64827	2009-02-17 22:15:04 +00:00
Chris Lattner	3b67310686	add a horrible note llvm-svn: 64719	2009-02-17 01:16:14 +00:00
Bill Wendling	266c8bc98f	--- Merging (from foreign repository) r64714 into '.': U include/llvm/CodeGen/DebugLoc.h U lib/CodeGen/SelectionDAG/LegalizeDAG.cpp U lib/CodeGen/SelectionDAG/SelectionDAGBuild.cpp U lib/Target/X86/AsmPrinter/X86ATTAsmPrinter.cpp Enable debug location generation at -Os. This goes with the reapplication of the r63639 patch. llvm-svn: 64715	2009-02-17 01:04:54 +00:00
Dan Gohman	856736a187	MachineLICM now handles these cases. llvm-svn: 64620	2009-02-15 23:24:52 +00:00
Dan Gohman	684359ea96	The x86-64 red zone is now being used. llvm-svn: 64535	2009-02-14 03:30:05 +00:00
Evan Cheng	9041a71923	Teach x86 target -soft-float. llvm-svn: 64496	2009-02-13 22:36:38 +00:00
Dale Johannesen	560b03bbcd	Remove non-DebugLoc versions of BuildMI from X86. There were some that might even matter in X86FastISel. llvm-svn: 64437	2009-02-13 02:33:27 +00:00
Bill Wendling	40e4b271af	Revert this. It was breaking stuff. llvm-svn: 64428	2009-02-13 02:16:35 +00:00
Bill Wendling	83b6edd760	Turn off the old way of handling debug information in the code generator. Use the new way, where all of the information is passed on SDNodes and machine instructions. llvm-svn: 64427	2009-02-13 02:01:04 +00:00
Dale Johannesen	5a21722625	Eliminate a couple of non-DebugLoc BuildMI variants. Modify callers. llvm-svn: 64409	2009-02-12 23:08:38 +00:00
Dale Johannesen	47321cf01f	Arrange to print constants that match "n" and "i" constraints in inline asm as signed (what gcc does). Add partial support for x86-specific "e" and "Z" constraints, with appropriate signedness for printing. llvm-svn: 64400	2009-02-12 20:58:09 +00:00
Chris Lattner	1174b80823	fix the X86 backend to just drop llvm.declare nodes for VLAs instead of leaving them in the DAG and then getting selection errors. This is a fix for PR3538. llvm-svn: 64382	2009-02-12 17:33:11 +00:00
Bill Wendling	49baa5465c	Propagate DebugLoc info for spiller call-backs. llvm-svn: 64329	2009-02-11 21:51:19 +00:00
Dan Gohman	2decb4495d	Don't try to set an EFLAGS operand to dead if no instruction was created. This fixes a bug introduced by r61215. llvm-svn: 64316	2009-02-11 19:50:24 +00:00
Evan Cheng	cdb35e3f0f	Handle llvm.x86.sse2.maskmov.dqu in 64-bit. llvm-svn: 64240	2009-02-10 22:06:28 +00:00
Evan Cheng	a1b9cf3143	80 col violations. llvm-svn: 64237	2009-02-10 21:39:44 +00:00
Evan Cheng	3b84024598	Implement FpSET_ST1_*. llvm-svn: 64186	2009-02-09 23:32:07 +00:00
Dan Gohman	548d7c9145	Use doxygen comment syntax. llvm-svn: 64150	2009-02-09 18:12:09 +00:00
Evan Cheng	9dc1507838	Turns out AnalyzeBranch can modify the mbb being analyzed. This is a nasty surprise to some callers, e.g. register coalescer. For now, add a parameter that tells AnalyzeBranch whether it's safe to modify the mbb. A better solution is out there, but I don't have time to deal with it right now. llvm-svn: 64124	2009-02-09 07:14:22 +00:00
Chris Lattner	a50a554333	add a note. llvm-svn: 64093	2009-02-08 20:44:19 +00:00
Dale Johannesen	b22cb23f6f	Use getDebugLoc forwarder instead of getNode()->getDebugLoc. No functional change. llvm-svn: 64026	2009-02-07 19:59:05 +00:00
Dan Gohman	4105a38248	Constify TargetInstrInfo::EmitInstrWithCustomInserter, allowing ScheduleDAG's TLI member to use const. llvm-svn: 64018	2009-02-07 16:15:20 +00:00
Dale Johannesen	a259483aae	Get rid of the last non-DebugLoc versions of getNode! Many targets build placeholder nodes for special operands, e.g. GlobalBaseReg on X86 and PPC for the PIC base. There's no sensible way to associate debug info with these. I've left them built with getNode calls with explicit DebugLoc::getUnknownLoc operands. I'm not too happy about this but don't see a good improvement; I considered adding a getPseudoOperand or something, but it seems to me that'll just make it harder to read. llvm-svn: 63992	2009-02-07 00:55:49 +00:00
Dan Gohman	8437b9efa1	Refactor some repeated logic into a separate function. llvm-svn: 63989	2009-02-07 00:43:41 +00:00
Dan Gohman	8aeae15ebd	Make a comment a doxygen comment. llvm-svn: 63988	2009-02-07 00:42:54 +00:00
Dale Johannesen	1580ab6b7f	Remove more non-DebugLoc getNode variants. Use getCALLSEQ_{END,START} to permit passing no DebugLoc there. UNDEF doesn't logically have DebugLoc; add getUNDEF to encapsulate this. llvm-svn: 63978	2009-02-06 23:05:02 +00:00
Dale Johannesen	c405486235	Remove more non-DebugLoc versions of getNode. llvm-svn: 63969	2009-02-06 21:50:26 +00:00
Bill Wendling	2c7f47d1d4	Record debug location information in the Dwarf writer. A simple test program shows that debugging works. :-) llvm-svn: 63968	2009-02-06 21:45:08 +00:00
Dan Gohman	a2f300f26d	Use .size and .type on ELF systems; this helps tools that map addresses to symbols. llvm-svn: 63962	2009-02-06 21:15:52 +00:00
Evan Cheng	e00df1d39c	Move getPointerRegClass from TargetInstrInfo to TargetRegisterInfo. llvm-svn: 63938	2009-02-06 17:43:24 +00:00

4470 Commits