llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00

Author	SHA1	Message	Date
Chad Rosier	5efd936f43	[ms-inline asm] Extend the MC AsmParser API to match MCInsts (but not emit). This new API will be used by clang to parse ms-style inline asms. One goal of this project is to use this style of inline asm for targets other then x86. Therefore, this API needs to be implemented for non-x86 targets at some point in the future. llvm-svn: 161624	2012-08-09 22:04:55 +00:00
Manman Ren	967804ad0a	X86: enable CSE between CMP and SUB We perform the following: 1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering. 2> Modify MachineCSE to correctly handle implicit defs. 3> Convert SUB back to CMP if possible at peephole. Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled by peephole now. rdar://11873276 llvm-svn: 161462	2012-08-08 00:51:41 +00:00
Jakob Stoklund Olesen	924ff06fdf	Don't scan physreg use-def chains looking for a PIC base. We can't rematerialize a PIC base after register allocation anyway, and scanning physreg use-def chains is very expensive in a function with many calls. <rdar://problem/12047515> llvm-svn: 161461	2012-08-08 00:40:47 +00:00
Evan Cheng	96c6741fad	X86 cmp lowering is looking past truncate on the condition node. It should only do so when the high bits are known zero. This caused a subtle miscompilation. rdar://12027825 llvm-svn: 161451	2012-08-07 22:21:00 +00:00
Andrew Trick	35b938c991	Allow x86 subtargets to use the GenericModel defined in X86Schedule.td. This allows codegen passes to query properties like InstrItins->SchedModel->IssueWidth. It also ensure's that computeOperandLatency returns the X86 defaults for loads and "high latency ops". This should have no significant impact on existing schedulers because X86 defaults happen to be the same as global defaults. llvm-svn: 161370	2012-08-07 00:25:30 +00:00
Eric Christopher	f5132794cd	Add support for the OpenBSD for Bitrig. Patch by David Hill. llvm-svn: 161344	2012-08-06 20:52:18 +00:00
Craig Topper	dc1b95e7de	Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305. llvm-svn: 161318	2012-08-06 06:22:36 +00:00
Craig Topper	e26b30c830	Remove custom inserter for MWAIT. It doesn't do anything that couldn't be represented in a pattern. llvm-svn: 161306	2012-08-05 00:36:57 +00:00
Craig Topper	c716a3f554	Use a COPY node instead of an explicit MOVA opcode in the custom insterter for pcmpestrm/pcmpistrm. Allows the register allocator to handle it better and prevent wasted identity moves. llvm-svn: 161305	2012-08-05 00:17:48 +00:00
Bob Wilson	d1eefbeac2	Fall back to selection DAG isel for calls to builtin functions. Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232	2012-08-03 04:06:28 +00:00
Manman Ren	2c236a30f6	X86 Peephole: fold loads to the source register operand if possible. Add more comments and use early returns to reduce nesting in isLoadFoldable. Also disable folding for V_SET0 to avoid introducing a const pool entry and a const pool load. rdar://10554090 and rdar://11873276 llvm-svn: 161207	2012-08-02 19:37:32 +00:00
Manman Ren	78b8d454cc	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. This patch is a rework of r160919 and was tested on clang self-host on my local machine. rdar://10554090 and rdar://11873276 llvm-svn: 161152	2012-08-02 00:56:42 +00:00
Manman Ren	1c73b43c93	X86: mark GATHER instructios as mayLoad llvm-svn: 161143	2012-08-01 23:28:59 +00:00
Chad Rosier	078db59861	Whitespace. llvm-svn: 161122	2012-08-01 18:39:17 +00:00
Elena Demikhovsky	0fec7026d9	Added FMA functionality to X86 target. llvm-svn: 161110	2012-08-01 12:06:00 +00:00
Craig Topper	3f200a7638	Add more indirection to the disassembler tables to reduce amount of space used to store the operand types and encodings. Store only the unique combinations in a separate table and store indices in the instruction table. Saves about 32K of static data. llvm-svn: 161101	2012-08-01 07:39:18 +00:00
Chad Rosier	9a4ff99710	[x86 frame lowering] In 32-bit mode, use ESI as the base pointer. Previously, we were using EBX, but PIC requires the GOT to be in EBX before function calls via PLT GOT pointer. llvm-svn: 161066	2012-07-31 18:29:21 +00:00
Craig Topper	4f038a7a70	Make INSTRUCTION_SPECIFIER_FIELDS match X86DisassemblerCommon.h. Also remove trailing whitespace. llvm-svn: 161029	2012-07-31 05:18:26 +00:00
Craig Topper	62ece8c8d5	Tidy up trailing whitespace llvm-svn: 161027	2012-07-31 04:58:05 +00:00
Craig Topper	676ae779ef	Tidy up trailing whitespace llvm-svn: 161026	2012-07-31 04:38:27 +00:00
Craig Topper	f374fd0e17	Mark MOVZX16/MOVSX16 as neverHasSideEffects/mayLoad llvm-svn: 160953	2012-07-30 07:14:07 +00:00
Craig Topper	19fc5055ea	Mark MOVZX32_NOREX as isCodeGenOnly and neverHasSideEffects. The isCodeGenOnly change allows special detection of _NOREX instructions to be removed from tablegen disassembler code. llvm-svn: 160951	2012-07-30 06:48:11 +00:00
Craig Topper	80fdfb7f56	Give VCVTTPD2DQ priority over CVTTPD2DQ. llvm-svn: 160942	2012-07-30 02:20:32 +00:00
Craig Topper	492a7af190	Fix patterns for CVTTPS2DQ to specify SSE2 instead of SSE1. llvm-svn: 160941	2012-07-30 02:14:02 +00:00
Craig Topper	9050c15c71	Fix up patterns for VCVTSS2SD. Specifically give it priority over SSE form. Add an OptForSpeed to explicitly pair up with an OptForSize that was already on another pattern. llvm-svn: 160939	2012-07-30 01:38:57 +00:00
Craig Topper	147248a6a0	Fix load types on intrinsic forms of SS2SD and SD2SS AVX/SSE convert instruction patterns. llvm-svn: 160938	2012-07-29 23:26:34 +00:00
Craig Topper	293e781ba6	Move more SSE/AVX convert instruction patterns into their definitions. llvm-svn: 160937	2012-07-29 22:30:06 +00:00
Manman Ren	ceef7c4d9b	Revert r160920 and r160919 due to dragonegg and clang selfhost failure llvm-svn: 160927	2012-07-29 02:44:09 +00:00
Craig Topper	e75418242a	Fold patterns for some of the SSE/AVX convert instructions into their instruction definitions. llvm-svn: 160922	2012-07-28 18:59:19 +00:00
Craig Topper	189349dab2	Mark some of the SSE/AVX convert instructions as mayLoad/neverHasSideEffects. llvm-svn: 160921	2012-07-28 18:36:39 +00:00
Manman Ren	ea77f9076b	X86 Peephole: fold loads to the source register operand if possible. Machine CSE and other optimizations can remove instructions so folding is possible at peephole while not possible at ISel. rdar://10554090 and rdar://11873276 llvm-svn: 160919	2012-07-28 16:48:01 +00:00
Craig Topper	3c15b4afd4	Make CVTSS2SI instruction definition consistent with CVTSD2SI. llvm-svn: 160914	2012-07-28 08:28:23 +00:00
Craig Topper	8121932592	Fix up memory load types for SSE scalar convert intrinsic patterns. llvm-svn: 160913	2012-07-28 07:59:59 +00:00
Manman Ren	fbc9fcdbf2	X86 Peephole: fix PR13475 in optimizeCompare. It is possible that an instruction can use and update EFLAGS. When checking the safety, we should check the usage of EFLAGS first before declaring it is safe to optimize due to the update. llvm-svn: 160912	2012-07-28 03:15:46 +00:00
Jakob Stoklund Olesen	4b1105c720	Remove the X86 sub_ss and sub_sd sub-register indexes completely. llvm-svn: 160833	2012-07-26 23:07:20 +00:00
Jakob Stoklund Olesen	008b36037d	Remove the last mentions of sub_ss and sub_sd from patterns. I'll remove these two sub-register indexes shortly. llvm-svn: 160831	2012-07-26 23:03:08 +00:00
Jakob Stoklund Olesen	c1dd7a213d	Eliminate sub_ss, sub_sd from broadcast patterns. The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but copyPhysReg does the right thing with it. (The old pattern would eventually produce the same cross-class copy). llvm-svn: 160830	2012-07-26 22:59:06 +00:00
Jakob Stoklund Olesen	419ccae442	Eliminate more sub_ss / sub_sd patterns. This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns, simplifying the emitted code a bit. llvm-svn: 160820	2012-07-26 22:30:18 +00:00
Jakob Stoklund Olesen	0a41a45c36	Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd. The SUBREG_TO_REG instruction has magic semantics asserting that the source value was defined by an instruction that cleared the high half of the register. Those semantics are never actually exploited for xmm registers. llvm-svn: 160818	2012-07-26 22:03:21 +00:00
Jakob Stoklund Olesen	44868927b9	Eliminate a batch of uses of sub_ss and sub_sd in the X86 target. These idempotent sub-register indices don't do anything --- They simply map XMM registers to themselves. They no longer affect register classes either since the SubRegClasses field has been removed from Target.td. This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns with COPY_TO_REGCLASS patterns which simply become COPY instructions. The number of IMPLICIT_DEF instructions before register allocation is reduced, and that is the cause of the test case changes. llvm-svn: 160816	2012-07-26 21:40:42 +00:00
Craig Topper	9667d599eb	Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms. llvm-svn: 160775	2012-07-26 07:48:28 +00:00
Rafael Espindola	062cf9ccf9	Fix typos. Thanks to Matt Beaumont-Gay for noticing it. llvm-svn: 160731	2012-07-25 15:42:45 +00:00
Rafael Espindola	8dfc815c3f	When a return struct pointer is passed in registers, the called has nothing to pop. llvm-svn: 160725	2012-07-25 13:41:10 +00:00
Rafael Espindola	e58779ca1b	Factor a long list of conditions into a predicate function. No functionality change. llvm-svn: 160724	2012-07-25 13:35:45 +00:00
Kevin Enderby	9a7bc24c01	Fix a bug in the x86 disassembler's symbolic disassembly support for Jcc-Jump if Condition Is Met instuctions that was not correctly determining the target instruction. So for a jne rel32 instruction: % cat x.s .byte 0x0f, 0x85, 0x09, 0x00, 0x00, 0x00 % as x.s it was incorrectly deterining the target: % otool -q -tv a.out a.out: (__TEXT,__text) section 0000000000000000 jne 0xd and with the fix it gets this correct as: % otool -q -tv a.out a.out: (__TEXT,__text) section 0000000000000000 jne 0xf rdar://11505997 llvm-svn: 160694	2012-07-24 21:40:01 +00:00
David Chisnall	c21e3b3ce1	ELF does not imply GNU/Linux. Do not assume GNU conventions just because we are targeting an ELF platform. Only fold gs-relative (and fs-relative) loads if it is actually sensible to do so for the target platform. This fixes PR13438. llvm-svn: 160687	2012-07-24 20:04:16 +00:00
Sylvestre Ledru	bf8acb65ac	Fix a typo (the the => the) llvm-svn: 160621	2012-07-23 08:51:15 +00:00
Craig Topper	015d52b2dc	Don't use implicit register operands to calculate L-bit for AVX instructions. Needed because super reg defs and kills are added as implicit operands on 128-bit instructions. Fixes PR13349. Patch by Jose Fonseca. llvm-svn: 160543	2012-07-20 07:03:46 +00:00
Preston Gurd	6d82adeada	Adds the family codes for the Midview Atom processors so that the Atom buildbot will auto-detect Atom. llvm-svn: 160521	2012-07-19 19:05:37 +00:00
Bill Wendling	0b007e009e	Remove tabs. llvm-svn: 160479	2012-07-19 00:15:11 +00:00
Bill Wendling	17b12b72bc	Remove tabs. llvm-svn: 160477	2012-07-19 00:11:40 +00:00
Manman Ren	dd8d9c10a3	X86: remove redundant cmp against zero. Updated OptimizeCompare in peephole to remove redundant cmp against zero. We only remove Compare if CF and OF are not used. rdar://11855129 llvm-svn: 160454	2012-07-18 21:40:01 +00:00
Preston Gurd	d2b344c685	This patch fixes 8 out of 20 unexpected failures in "make check" when run on an Intel Atom processor. The failures have arisen due to changes elsewhere in the trunk over the past 8 weeks or so. These failures were not detected by the Atom buildbot because the CPU on the Atom buildbot was not being detected as an Atom CPU. The fix for this problem is in Host.cpp and X86Subtarget.cpp, but shall remain commented out until the current set of Atom test failures are fixed. Patch by Andy Zhang and Tyler Nowicki! llvm-svn: 160451	2012-07-18 20:49:17 +00:00
Nadav Rotem	03d2729392	The vbroadcast family of instructions has 'fallback patterns' in case where the load source operand is used by multiple nodes. The v2i64 broadcast was emulated by shuffling the two lower i32 elements to the upper two. We had a bug in the immediate used for the broadcast. Replacing 0 to 0x44. 0x44 means [01\|00\|01\|00] which corresponds to the correct lane. Patch by Michael Kuperstein. llvm-svn: 160430	2012-07-18 08:14:48 +00:00
Craig Topper	6150f43b28	Remove tab characters. llvm-svn: 160425	2012-07-18 04:59:16 +00:00
Craig Topper	b086c8faf2	Fix typo in error message and remove some tab characters. llvm-svn: 160423	2012-07-18 04:36:35 +00:00
Craig Topper	b144f3b6db	Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas. llvm-svn: 160420	2012-07-18 04:11:12 +00:00
Evan Cheng	5e82ad04d5	Back out r160101 and instead implement a dag combine to recover from instcombine transformation. llvm-svn: 160387	2012-07-17 18:54:11 +00:00
Evan Cheng	f84dd0cf40	Implement r160312 as target indepedenet dag combine. llvm-svn: 160354	2012-07-17 08:31:11 +00:00
Evan Cheng	0b6bcb6e06	This is another case where instcombine demanded bits optimization created large immediates. Add dag combine logic to recover in case the large immediates doesn't fit in cmp immediate operand field. int foo(unsigned long l) { return (l>> 47) == 1; } we produce %shr.mask = and i64 %l, -140737488355328 %cmp = icmp eq i64 %shr.mask, 140737488355328 %conv = zext i1 %cmp to i32 ret i32 %conv which codegens to movq $0xffff800000000000,%rax andq %rdi,%rax movq $0x0000800000000000,%rcx cmpq %rcx,%rax sete %al movzbl %al,%eax ret TargetLowering::SimplifySetCC would transform (X & -256) == 256 -> (X >> 8) == 1 if the immediate fails the isLegalICmpImmediate() test. For x86, that's immediates which are not a signed 32-bit immediate. Based on a patch by Eli Friedman. PR10328 rdar://9758774 llvm-svn: 160346	2012-07-17 06:53:39 +00:00
Evan Cheng	b409a61574	For something like uint32_t hi(uint64_t res) { uint_32t hi = res >> 32; return !hi; } llvm IR looks like this: define i32 @hi(i64 %res) nounwind uwtable ssp { entry: %lnot = icmp ult i64 %res, 4294967296 %lnot.ext = zext i1 %lnot to i32 ret i32 %lnot.ext } The optimizer has optimize away the right shift and truncate but the resulting constant is too large to fit in the 32-bit immediate field. The resulting x86 code is worse as a result: movabsq $4294967296, %rax ## imm = 0x100000000 cmpq %rax, %rdi sbbl %eax, %eax andl $1, %eax This patch teaches the x86 lowering code to handle ult against a large immediate with trailing zeros. It will issue a right shift and a truncate followed by a comparison against a shifted immediate. shrq $32, %rdi testl %edi, %edi sete %al movzbl %al, %eax It also handles a ugt comparison against a large immediate with trailing bits set. i.e. X > 0x0ffffffff -> (X >> 32) >= 1 rdar://11866926 llvm-svn: 160312	2012-07-16 19:35:43 +00:00
Chad Rosier	16c9db9ad6	With r160248 in place this code is no longer needed. llvm-svn: 160293	2012-07-16 17:42:13 +00:00
Nadav Rotem	67ff66bd0c	Fix a bug in the 3-address conversion of LEA when one of the operands is an undef virtual register. The problem is that ProcessImplicitDefs removes the definition of the register and marks all uses as undef. If we lose the undef marker then we get a register which has no def, is not marked as undef. The live interval analysis does not collect information for these virtual registers and we crash in later passes. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160260	2012-07-16 10:52:25 +00:00
Alexey Samsonov	c68bb48704	This CL changes the function prologue and epilogue emitted on X86 when stack needs realignment. It is intended to fix PR11468. Old prologue and epilogue looked like this: push %rbp mov %rsp, %rbp and $alignment, %rsp push %r14 push %r15 ... pop %r15 pop %r14 mov %rbp, %rsp pop %rbp The problem was to reference the locations of callee-saved registers in exception handling: locations of callee-saved had to be re-calculated regarding the stack alignment operation. It would take some effort to implement this in LLVM, as currently MachineLocation can only have the form "Register + Offset". Funciton prologue and epilogue are now changed to: push %rbp mov %rsp, %rbp push %14 push %15 and $alignment, %rsp ... lea -$size_of_saved_registers(%rbp), %rsp pop %r15 pop %r14 pop %rbp Reviewed by Chad Rosier. llvm-svn: 160248	2012-07-16 06:54:09 +00:00
Nadav Rotem	a09775b875	Teach getTargetVShiftNode about TargetConstant nodes. llvm-svn: 160234	2012-07-15 20:27:43 +00:00
Nadav Rotem	0377e0d234	Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention. Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot. PR12782. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 160230	2012-07-15 12:26:30 +00:00
Nadav Rotem	c40d85dda5	AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector. This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions. llvm-svn: 160222	2012-07-14 22:26:05 +00:00
Benjamin Kramer	308eb1b4c0	Make helper functions static. llvm-svn: 160173	2012-07-13 13:25:15 +00:00
Craig Topper	a75da664ff	Mark VINSERTI128rm as MayLoad=1. Fixes PR13348. llvm-svn: 160162	2012-07-13 05:46:28 +00:00
Benjamin Kramer	558c56f216	Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch it. I already had the necessary things in place for IR-level passes but missed the machine passes. llvm-svn: 160137	2012-07-12 18:14:57 +00:00
Benjamin Kramer	f8e67a04f4	Add intrinsics for Ivy Bridge's rdrand instruction. The rdrand/cmov sequence is the same that is emitted by both GCC and ICC. Fixes PR13284. llvm-svn: 160117	2012-07-12 09:31:43 +00:00
Craig Topper	6eef81b65b	Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren. llvm-svn: 160110	2012-07-12 06:52:41 +00:00
Chad Rosier	75817295b6	[x86 fast-isel] Per discussion with Eric, add all cases to switch with verbose comments. llvm-svn: 160069	2012-07-11 19:58:38 +00:00
Manman Ren	93ef864f3c	X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair. When Movr0 is between sub and cmp, we move Movr0 before sub if it enables removal of Cmp. llvm-svn: 160066	2012-07-11 19:35:12 +00:00
Chad Rosier	aaccad80b4	[x86 fast-isel] Rather then call llvm_unreachable() have fast-isel fall back to Selection DAG isel. Patch by Andrew Kaylor <andrew.kaylor@intel.com>. llvm-svn: 160055	2012-07-11 17:23:17 +00:00
Nadav Rotem	22652c85bc	When ext-loading and trunc-storing vectors to memory, on x86 32bit systems, allow loads/stores of 64bit values from xmm registers. llvm-svn: 160044	2012-07-11 13:27:05 +00:00
Chad Rosier	3273667edf	Move [get\|set]BasePtrStackAdjustment() from MachineFrameInfo to X86MachineFunctionInfo as this is currently only used by X86. If this ever becomes an issue on another arch (e.g., ARM) then we can hoist it back out. llvm-svn: 160009	2012-07-10 18:27:15 +00:00
Chad Rosier	5395ec6ee4	Add support for dynamic stack realignment in the presence of dynamic allocas on X86. Basically, this is a reapplication of r158087 with a few fixes. Specifically, (1) the stack pointer is restored from the base pointer before popping callee-saved registers and (2) in obscure cases (see comments in patch) we must cache the value of the original stack adjustment in the prologue and apply it in the epilogue. rdar://11496434 llvm-svn: 160002	2012-07-10 17:45:53 +00:00
Nadav Rotem	5f6e9d5ffe	Improve the loading of load-anyext vectors by allowing the codegen to load multiple scalars and insert them into a vector. Next, we shuffle the elements into the correct places, as before. Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the migration of bitcasts happened too late in the SelectionDAG process. llvm-svn: 159991	2012-07-10 13:25:08 +00:00
Craig Topper	b346ce8240	Reverse assembler/disassembler operand order for gather instructions. llvm-svn: 159983	2012-07-10 06:38:33 +00:00
Manman Ren	dc41586be4	X86: implement functions to analyze & synthesize CMOV\|SET\|Jcc getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond No functional change intended. If we want to update the condition code of CMOV\|SET\|Jcc, we first analyze the opcode to get the condition code, then update the condition code, finally synthesize the new opcode form the new condition code. llvm-svn: 159955	2012-07-09 18:57:12 +00:00
Andrew Trick	b9c8074dcd	I'm introducing a new machine model to simultaneously allow simple subtarget CPU descriptions and support new features of MachineScheduler. MachineModel has three categories of data: 1) Basic properties for coarse grained instruction cost model. 2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD). 3) Instruction itineraties for detailed per-cycle reservation tables. These will all live side-by-side. Any subtarget can use any combination of them. Instruction itineraries will not change in the near term. In the long run, I expect them to only be relevant for in-order VLIW machines that have complex contraints and require a precise scheduling/bundling model. Once itineraries are only actively used by VLIW-ish targets, they could be replaced by something more appropriate for those targets. This tablegen backend rewrite sets things up for introducing MachineModel type #2: per opcode/operand cost model. llvm-svn: 159891	2012-07-07 04:00:00 +00:00
Manman Ren	eca5886e50	X86: Fix optimizeCompare to correctly check safe condition. It is safe if EFLAGS is killed or re-defined. When we are done with the basic block, check whether EFLAGS is live-out. Do not optimize away cmp if EFLAGS is live-out. llvm-svn: 159888	2012-07-07 03:34:46 +00:00
Manman Ren	8cbff1360f	X86: peephole optimization to remove cmp instruction For each Cmp, we check whether there is an earlier Sub which make Cmp redundant. We handle the case where SUB operates on the same source operands as Cmp, including the case where the two source operands are swapped. llvm-svn: 159838	2012-07-06 17:36:20 +00:00
Jakob Stoklund Olesen	6edf66ffe8	Make X86 call and return instructions non-variadic. Function argument and return value registers aren't part of the encoding, so they should be implicit operands. llvm-svn: 159728	2012-07-04 23:53:27 +00:00
Jakob Stoklund Olesen	795083115c	Ensure CopyToReg nodes are always glued to the call instruction. The CopyToReg nodes that set up the argument registers before a call must be glued to the call instruction. Otherwise, the scheduler may emit the physreg copies long before the call, causing long live ranges for the fixed registers. Besides disabling good register allocation, that can also expose problems when EmitInstrWithCustomInserter() splits a basic block during the live range of a physreg. llvm-svn: 159721	2012-07-04 19:28:31 +00:00
Jakob Stoklund Olesen	79846e5c9b	Add early if-conversion support to X86. Implement the TII hooks needed by EarlyIfConversion to create cmov instructions and estimate their latency. Early if-conversion is still not enabled by default. llvm-svn: 159695	2012-07-04 00:09:58 +00:00
Craig Topper	4644d3577c	Remove extra space. llvm-svn: 159647	2012-07-03 06:48:58 +00:00
Craig Topper	60bbc2fde8	Change i128mem/i256mem to f128mem/f256mem on some floating point vector instructions. llvm-svn: 159646	2012-07-03 06:11:06 +00:00
Craig Topper	6fcb4454a0	Add aliases for pblendvb, blendvpd, and blendvps instructions with the implicit xmm0 operand specified. Fixes PR13252. llvm-svn: 159644	2012-07-03 05:49:45 +00:00
Bob Wilson	0a1ef38836	Add all codegen passes to the PassManager via TargetPassConfig. This is a preliminary step toward having TargetPassConfig be able to start and stop the compilation at specified passes for unit testing and debugging. No functionality change. llvm-svn: 159567	2012-07-02 19:48:31 +00:00
Elena Demikhovsky	0617b5a56c	Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2. llvm-svn: 159504	2012-07-01 06:12:26 +00:00
Craig Topper	4fc5342fc7	Reduce code size by using a second switch statement to avoid extra calls to SelectAtomic64. Also catch cases where SelectAtomic64 fails. llvm-svn: 159503	2012-07-01 02:55:34 +00:00
Craig Topper	80279ea39f	Add a break to the end of case statement missed in r159501. llvm-svn: 159502	2012-07-01 02:18:18 +00:00
Craig Topper	8b795d08a5	Fix a crash on release builds if gather intrinsics are passed a non-constant value for the last argument. llvm-svn: 159501	2012-07-01 02:17:08 +00:00
Craig Topper	b2a94bd61c	Use a second switch statement to reduce number of calls to SelectGather in code. Reduces code size a bit. llvm-svn: 159500	2012-07-01 02:05:52 +00:00
Rafael Espindola	53e0eee9de	In the initial exec mode we always do a load to find the address of a variable. Before this patch in pic 32 bit code we would add the global base register and not load from that address. This is a really old bug, but before the introduction of the tls attributes we would never select initial exec for pic code. llvm-svn: 159409	2012-06-29 04:22:35 +00:00
Manman Ren	63bf58865a	X86: add more GATHER intrinsics in LLVM Corrected type for index of llvm.x86.avx2.gather.d.pd.256 from 256-bit to 128-bit. Corrected types for src\|dst\|mask of llvm.x86.avx2.gather.q.ps.256 from 256-bit to 128-bit. Support the following intrinsics: llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256 llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256 llvm-svn: 159402	2012-06-29 00:54:20 +00:00
Bill Wendling	e8949ecfa6	Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h. The reasoning is because the DebugInfo module is simply an interface to the debug info MDNodes and has nothing to do with analysis. llvm-svn: 159312	2012-06-28 00:05:13 +00:00
Chad Rosier	32642a8292	Whitespace. llvm-svn: 159300	2012-06-27 22:34:28 +00:00

1 2 3 4 5 ...

8513 Commits