1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 14:02:50 +01:00
Commit Graph

1444 Commits

Author SHA1 Message Date
Bill Wendling
9bb7ac566f Add an MVT::x86mmx type. It will take the place of all current MMX vector types.
llvm-svn: 113261
2010-09-07 20:03:56 +00:00
Bruno Cardoso Lopes
21e1fc67c3 decouple MMX check from regular splat checks. Some refactoring is coming, and MMX should be left alone to be easily removed after moving to intrinsics
llvm-svn: 113247
2010-09-07 18:41:45 +00:00
Bruno Cardoso Lopes
dcc8690051 Remove now useless check, because the code can be matched below, no need to leave it for isel
llvm-svn: 113242
2010-09-07 18:29:03 +00:00
Bruno Cardoso Lopes
e6f7e4684d Minor change. Since the checks are equivalent, use isMMX
llvm-svn: 113239
2010-09-07 18:24:00 +00:00
Bruno Cardoso Lopes
8d76bbe31c Remove the last bit of isShuffleMaskLegal checks and improve the comment regarding mmx shuffles
llvm-svn: 113059
2010-09-04 02:58:56 +00:00
Bruno Cardoso Lopes
fccf00be8c make explicit that we not handle several mmx shuffles
llvm-svn: 113058
2010-09-04 02:50:13 +00:00
Bruno Cardoso Lopes
0562140ec3 Emit target specific nodes to handle palignr. Do not touch it for MMX versions yet.
llvm-svn: 113056
2010-09-04 02:36:07 +00:00
Bruno Cardoso Lopes
73651844b2 Emit target specific nodes to handle splats starting at zero indicies
llvm-svn: 113055
2010-09-04 02:02:14 +00:00
Bruno Cardoso Lopes
483bb7eed2 Emit target specific nodes for isPSHUFHWMask and isPSHUFLWMask
llvm-svn: 113050
2010-09-04 01:36:45 +00:00
Bruno Cardoso Lopes
742030b3db Emit target specific nodes for isSHUFPMask
llvm-svn: 113048
2010-09-04 01:22:57 +00:00
Bruno Cardoso Lopes
3e3169873e Previous isMOVLMask matching already emits targets nodes, remove check
llvm-svn: 113047
2010-09-04 00:50:08 +00:00
Bruno Cardoso Lopes
b867456bfc One more check from the original isShuffleMaskLegal goes away
llvm-svn: 113045
2010-09-04 00:46:16 +00:00
Bruno Cardoso Lopes
3081ae493b Remove a duplicated but useless check that i've inserted in the previous commit.
llvm-svn: 113044
2010-09-04 00:43:12 +00:00
Bruno Cardoso Lopes
22775e6e65 Refactor some code and remove the extra checks for unpckl_undef and unpckh_undef
llvm-svn: 113043
2010-09-04 00:39:43 +00:00
Bruno Cardoso Lopes
5d71537f4a Remove check for unpckh mask
llvm-svn: 113035
2010-09-03 23:32:47 +00:00
Bruno Cardoso Lopes
d9d2ed558e Remove check for unpckl mask
llvm-svn: 113034
2010-09-03 23:31:50 +00:00
Bruno Cardoso Lopes
ecfa52b251 Inline isShuffleMaskLegal into LowerVECTOR_SHUFFLE, so we can start
checking each standalone condition and decide whether emit target
specific nodes or remove the condition if it's already matched before.

llvm-svn: 113031
2010-09-03 23:24:06 +00:00
Bruno Cardoso Lopes
01dc6f1195 Reapply considered harmfull part of rr112934 and r112942.
"Use target specific nodes instead of relying in unpckl and
unpckh pattern fragments during isel time. Also place a
depth limit in getShuffleScalarElt.

llvm-svn: 113020
2010-09-03 22:09:41 +00:00
Bruno Cardoso Lopes
4753ce5e2c Reintroduce a simple function refactoring done in r112934, also without any functionality changes
llvm-svn: 113008
2010-09-03 20:20:02 +00:00
Bruno Cardoso Lopes
3c43bc3214 Reapply piecies of r112942 and r112934 which don't do
functional changes

llvm-svn: 113007
2010-09-03 20:10:35 +00:00
Bruno Cardoso Lopes
9635a81d34 Reapply Fix comment
llvm-svn: 113006
2010-09-03 19:55:05 +00:00
Daniel Dunbar
26e0e964ab Revert r112934, "- Use specific nodes to match unpckl masks.", which introduced
some infinite loop and select failures.
 - Apologies for eager reverting, but its branch day.

llvm-svn: 113000
2010-09-03 19:38:11 +00:00
Daniel Dunbar
c8af4f3a0a Revert r112938 "Fix comment", which depends on r112934, which introduced some
infinite loop and select failures.

llvm-svn: 112999
2010-09-03 19:38:08 +00:00
Daniel Dunbar
4ece67890b Revert r112942, "Use punpckh and unpckh family of nodes instead of using unpckh
mask pattern fragment", which depends on r112934, which introduced some infinite
loop and select failures.

llvm-svn: 112998
2010-09-03 19:38:05 +00:00
Bruno Cardoso Lopes
70f376e9da Use punpckh and unpckh family of nodes instead of using unpckh mask pattern fragment
llvm-svn: 112942
2010-09-03 01:39:08 +00:00
Bruno Cardoso Lopes
b107a092a5 Fix comment
llvm-svn: 112938
2010-09-03 01:28:51 +00:00
Bruno Cardoso Lopes
e1ad6555a8 - Use specific nodes to match unpckl masks.
- Teach getShuffleScalarElt how to handle more target
specific nodes, so the DAGCombine can make use of it.
- Add another hack to avoid the node update problem
during legalization. More description on the comments

llvm-svn: 112934
2010-09-03 01:24:00 +00:00
Anton Korobeynikov
a65910e5ca Revert win64 changes. They seem to be incomplete
llvm-svn: 112885
2010-09-02 22:31:32 +00:00
Anton Korobeynikov
339ab60a5b Properly allocate win64 shadow reg area.
Patch by Jan Sjodin!

llvm-svn: 112875
2010-09-02 22:16:28 +00:00
Bruno Cardoso Lopes
659f549638 Replace unpckl_undef and unpckh_undef matching with target specific opcodes
llvm-svn: 112806
2010-09-02 05:23:12 +00:00
Bruno Cardoso Lopes
9d4a11d4c6 Move condition out to prepare for more matching
llvm-svn: 112805
2010-09-02 04:20:26 +00:00
Bruno Cardoso Lopes
1b9095fff1 Remove checking for isUNPCKL_v_undef_Mask, the specific node is already emitted for it
llvm-svn: 112804
2010-09-02 03:57:58 +00:00
Bruno Cardoso Lopes
dcdab94661 become more strict about when it's safe to use X86ISD::MOVLPS
llvm-svn: 112799
2010-09-02 02:35:51 +00:00
Bruno Cardoso Lopes
b73f0cbc7a Revert r112689, avoid those kind of checks cause they mess up with mmx
llvm-svn: 112760
2010-09-01 22:59:03 +00:00
Bruno Cardoso Lopes
9375b2f67d Use movlps, movlpd, movss and movsd specific nodes instead of pattern matching with movlp pattern fragment
llvm-svn: 112694
2010-09-01 05:08:25 +00:00
Bruno Cardoso Lopes
b69568ab33 minor change, simplify some logic
llvm-svn: 112689
2010-09-01 00:57:08 +00:00
Bruno Cardoso Lopes
c31697f68c Move some functions around so they can be used for some other to come function
llvm-svn: 112687
2010-09-01 00:51:36 +00:00
Bruno Cardoso Lopes
80613a070e Use x86 specific MOVSLDUP node, add more patterns to match it and remove useless load nodes
llvm-svn: 112661
2010-08-31 22:35:05 +00:00
Bruno Cardoso Lopes
8fc83b1960 Use x86 specific MOVSHDUP node and add more patterns to match it
llvm-svn: 112657
2010-08-31 22:22:11 +00:00
Bruno Cardoso Lopes
dfa177cf81 Use MOVHLPS node instead of matching using movhlps and movhlps_undef pattern fragments
llvm-svn: 112644
2010-08-31 21:38:49 +00:00
Bruno Cardoso Lopes
6fbe7b9ddd Use MOVLHPS and MOVHLPS x86 nodes whenever possible. Also remove some useless nodes
llvm-svn: 112642
2010-08-31 21:15:21 +00:00
Bruno Cardoso Lopes
08d5d62dcb Use X86ISD::MOVSS and MOVSD to represent the movl mask pattern, also fix the handling of those nodes when seeking for scalars inside vector shuffles
llvm-svn: 112570
2010-08-31 02:26:40 +00:00
Chris Lattner
8cb4abbc0e fix the buildvector->insertp[sd] logic to not always create a redundant
insertp[sd] $0, which is a noop.  Before:

_f32:                                   ## @f32
	pshufd	$1, %xmm1, %xmm2
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm2, %xmm3
	addss	%xmm1, %xmm0
                                        ## kill: XMM0<def> XMM0<kill> XMM0<def>
	insertps	$0, %xmm0, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

after:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movdqa	%xmm2, %xmm0
	insertps	$16, %xmm3, %xmm0
	ret

The extra movs are due to a random (poor) scheduling decision.

llvm-svn: 112379
2010-08-28 17:59:08 +00:00
Chris Lattner
c3b630d64b fix the BuildVector -> unpcklps logic to not do pointless shuffles
when the top elements of a vector are undefined.  This happens all
the time for X86-64 ABI stuff because only the low 2 elements of
a 4 element vector are defined.  For example, on:

_Complex float f32(_Complex float A, _Complex float B) {
  return A+B;
}

We used to produce (with SSE2, SSE4.1+ uses insertps):

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$16, %xmm2, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm0
	addss	%xmm1, %xmm0
	pshufd	$16, %xmm0, %xmm1
	movdqa	%xmm2, %xmm0
	unpcklps	%xmm1, %xmm0
	ret

We now produce:

_f32:                                   ## @f32
	movdqa	%xmm0, %xmm2
	addss	%xmm1, %xmm2
	pshufd	$1, %xmm1, %xmm1
	pshufd	$1, %xmm0, %xmm3
	addss	%xmm1, %xmm3
	movaps	%xmm2, %xmm0
	unpcklps	%xmm3, %xmm0
	ret

This implements rdar://8368414

llvm-svn: 112378
2010-08-28 17:28:30 +00:00
Chris Lattner
7fa5fa1207 improve comments in the unpcklps generating logic, introduce
a new EltStride variable instead of reusing NumElems variable
for a non-obvious purpose.  No functionality change.

llvm-svn: 112377
2010-08-28 17:15:43 +00:00
Bruno Cardoso Lopes
1052e6d5d9 Clean up the logic of vector shuffles -> vector shifts.
Also teach this logic how to handle target specific shuffles if
needed, this is necessary while searching recursively for zeroed
scalar elements in vector shuffle operands.

llvm-svn: 112348
2010-08-28 02:46:39 +00:00
Anton Korobeynikov
62a9879ef4 Properly handle passing of FP stuff to varargs function on Win64:
value should be copied to the corresponding shadow reg as well.
Patch by Cameron Esfahani!

llvm-svn: 112262
2010-08-27 14:43:06 +00:00
Bruno Cardoso Lopes
6150648a64 zap the now unused MVT::getIntVectorWithNumElements
llvm-svn: 112218
2010-08-26 20:53:12 +00:00
Chris Lattner
148485f707 implement SplitVecOp_CONCAT_VECTORS, fixing the included testcase with SSE1.
llvm-svn: 112171
2010-08-26 05:51:22 +00:00
Chris Lattner
5256226fc8 fix sse1 only codegen in x86-64 mode, which is something we
apparently try to support.

llvm-svn: 112168
2010-08-26 05:24:29 +00:00
Bruno Cardoso Lopes
28f3261dbd Revert this for now, PUNPCKLDQ dont operate on v4f32
llvm-svn: 112090
2010-08-25 21:26:37 +00:00
Anton Korobeynikov
1544f79e36 Fix nasty mingw32 bug, which e.g. prevented llvm-gcc bootstrap there.
Mark _alloca call as clobberring EFLAGS, otherwise some DCE might remove
other flags-clobberring stuff (e.g. cmp instructions) occuring after
_alloca call.

llvm-svn: 112034
2010-08-25 07:50:11 +00:00
Bruno Cardoso Lopes
af72dd7362 PUNPCKLDQ should also be used for v4f32
llvm-svn: 112020
2010-08-25 02:55:40 +00:00
Bruno Cardoso Lopes
33aa4f7d1c teach lowering to get target specific nodes for pshufd, emulating the same isel behavior for now, so we can pass all vector shuffle tests
llvm-svn: 112017
2010-08-25 02:35:37 +00:00
Dan Gohman
e400c660e4 Fix X86's isLegalAddressingMode to recognize that static addresses
need not be RIP-relative in small mode.

llvm-svn: 111917
2010-08-24 15:55:12 +00:00
Bruno Cardoso Lopes
7939025262 Use pshufhw and pshuflw in more cases and fix getTargetShuffleNode number of arguments
llvm-svn: 111890
2010-08-24 01:16:15 +00:00
Bruno Cardoso Lopes
ed9ff8d8d0 Start using target speficic nodes for shuffles: pshufhw and pshuflw
llvm-svn: 111837
2010-08-23 20:41:02 +00:00
Anton Korobeynikov
a68e2a53a1 Revert invalid r111792. Jump tables are not broken on x86-64 / coff,
it's COFF emitter which does not support differences of two symbols
(and needs to be fixed). GAS is pretty fine with code produced.

llvm-svn: 111801
2010-08-23 07:38:51 +00:00
Michael J. Spencer
c52ac23659 Workaround broken jump tables on x86-64 COFF.
llvm-svn: 111792
2010-08-23 04:45:37 +00:00
Bruno Cardoso Lopes
1998fbbf1a Prepare LowerVECTOR_SHUFFLEv8i16 to use x86 target specific nodes directly
llvm-svn: 111704
2010-08-21 01:32:18 +00:00
Bruno Cardoso Lopes
28d9071635 This is the first step towards refactoring the x86 vector shuffle code. The
general idea here is to have a group of x86 target specific nodes which are
going to be selected during lowering and then directly matched in isel.

The commit includes the addition of those specific nodes and a *bunch* of
patterns, and incrementally we're going to switch between them and what we
have right now. Both the patterns and target specific nodes can change as
we move forward with this work.

llvm-svn: 111691
2010-08-20 22:55:05 +00:00
Anton Korobeynikov
f0600e9e8a More fixes for win64:
- Do not clobber al during variadic calls, this is AMD64 ABI-only feature
  - Emit wincall64, where necessary
Patch by Cameron Esfahani!

llvm-svn: 111289
2010-08-17 21:06:07 +00:00
Eric Christopher
1470fe415c Rework how the non-sse2 memory barrier is lowered so that the
encoding is correct for the built-in assembler.

Based on a patch from Chris.

llvm-svn: 111083
2010-08-14 21:51:50 +00:00
Chris Lattner
8426971169 improve indentation
llvm-svn: 111073
2010-08-14 17:26:09 +00:00
Bruno Cardoso Lopes
8b07859f3a Fix comment to reflect code, and remove an unused argument
llvm-svn: 111022
2010-08-13 17:50:47 +00:00
Bruno Cardoso Lopes
bb491bd56c Begin to support some vector operations for AVX 256-bit intructions. The long
term goal here is to be able to match enough of vector_shuffle and build_vector
so all avx intrinsics which aren't mapped to their own built-ins but to
shufflevector calls can be codegen'd. This is the first (baby) step, support
building zeroed vectors.

llvm-svn: 110897
2010-08-12 02:06:36 +00:00
Dan Gohman
d91d51116b Use ISD::ADD instead of ISD::SUB with a negated constant. This
avoids trouble if the return type of TD->getPointerSize() is
changed to something which doesn't promote to a signed type,
and is simpler anyway.

Also, use getCopyFromReg instead of getRegister to read a
physical register's value.

llvm-svn: 110835
2010-08-11 18:14:00 +00:00
Bruno Cardoso Lopes
6eb24fd744 Add AVX matching patterns to Packed Bit Test intrinsics.
Apply the same approach of SSE4.1 ptest intrinsics but
create a new x86 node "testp" since AVX introduces
vtest{ps}{pd} instructions which set ZF and CF depending
on sign bit AND and ANDN of packed floating-point sources.

This is slightly different from what the "ptest" does.
Tests comming with the other 256 intrinsics tests.

llvm-svn: 110744
2010-08-10 23:25:42 +00:00
Bruno Cardoso Lopes
2a7ed4b5c9 Support AVX 256-bit load and store intrinsics
llvm-svn: 110645
2010-08-10 01:43:16 +00:00
Bruno Cardoso Lopes
a26a97510a Support very basic (doesn't include ABI support in the front-end, varags, ...) 256-bit argument passing and return for AVX
llvm-svn: 110394
2010-08-05 23:35:51 +00:00
Eric Christopher
0e09eb9f77 Make x86-64 membarriers work without sse and clean up some of the
uses.

llvm-svn: 110274
2010-08-04 23:03:04 +00:00
Bruno Cardoso Lopes
0c0dd2173c Support all 128-bit AVX vector intrinsics. Most part of them I already
declared during the addition of the assembler support, the additional
changes are:
- Add missing intrinsics
- Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file.
- Duplicate some patterns to AVX mode.
- Step into PCMPEST/PCMPIST custom inserter and add AVX versions.

llvm-svn: 109878
2010-07-30 19:54:33 +00:00
Jakob Stoklund Olesen
1dee3913d6 Revert r109652, and remove the offending assert in loadRegFromStackSlot instead.
We do sometimes load from a too small stack slot when dealing with x86 arguments
(varargs and smaller-than-32-bit args). It looks like we know what we are doing
in those cases, so I am going to remove the assert instead of artifically
enlarging stack slot sizes.

The assert in storeRegToStackSlot stays in. We don't want to write beyond the
bounds of a stack slot.

llvm-svn: 109764
2010-07-29 17:42:27 +00:00
Jakob Stoklund Olesen
d4c60eed5e Create a fixed stack object for varargs that is as large as any register.
The size of this object isn't used for anything - technically it is of variable
size.

This avoids a false positive from the assert in
X86InstrInfo::loadRegFromStackSlot, and fixes PR7735.

llvm-svn: 109652
2010-07-28 20:55:38 +00:00
Nate Begeman
133820e806 Implement a vectorized algorithm for <16 x i8> << <16 x i8>
This is about 4x faster and smaller than the existing scalarization.

llvm-svn: 109566
2010-07-28 00:21:48 +00:00
Nate Begeman
068e932975 ~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller types coming in future patches.
For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                                   ## @shl
	pslld	$23, %xmm1
	paddd	LCPI0_0, %xmm1
	cvttps2dq	%xmm1, %xmm1
	pmulld	%xmm1, %xmm0
	ret

Instead of:

_shl:                                   ## @shl
	pshufd	$3, %xmm0, %xmm2
	movd	%xmm2, %eax
	pshufd	$3, %xmm1, %xmm2
	movd	%xmm2, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	pshufd	$1, %xmm0, %xmm3
	movd	%xmm3, %eax
	pshufd	$1, %xmm1, %xmm3
	movd	%xmm3, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm3
	punpckldq	%xmm2, %xmm3
	movd	%xmm0, %eax
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	movhlps	%xmm0, %xmm0
	movd	%xmm0, %eax
	movhlps	%xmm1, %xmm1
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm0
	punpckldq	%xmm0, %xmm2
	movdqa	%xmm2, %xmm0
	punpckldq	%xmm3, %xmm0
	ret

llvm-svn: 109549
2010-07-27 22:37:06 +00:00
Evan Cheng
4cee8136a7 On x86, f32 / f64 nodes share the same registers as 128-bit vector values.
llvm-svn: 109450
2010-07-26 21:50:05 +00:00
Evan Cheng
a0b74d8804 Add an ILP scheduler. This is a register pressure aware scheduler that's
appropriate for targets without detailed instruction iterineries.
The scheduler schedules for increased instruction level parallelism in
low register pressure situation; it schedules to reduce register pressure
when the register pressure becomes high.

On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.

llvm-svn: 109300
2010-07-24 00:39:05 +00:00
Dale Johannesen
1ee436f417 The only supported calling convention for X86-64 uses
SSE, so we can't return floating point values if this
is disabled.  Detect this error for clang.

With SSE1 only, f64 is a problem; it can be done, but
neither llvm-gcc nor clang has ever generated correct
code for it.  Since nobody noticed this I think it's
OK to treat it as an error for now.

This also handles SSE-sized vectors of floating point.
8207686, 8204109.

llvm-svn: 109201
2010-07-23 00:30:35 +00:00
Eric Christopher
4924d5fb93 Custom lower the memory barrier instructions and add support
for lowering without sse2.  Add a couple of new testcases.

Fixes a few libgomp tests and latent bugs.  Remove a few todos.

llvm-svn: 109078
2010-07-22 02:48:34 +00:00
Eric Christopher
5901214b6a 80-columns.
llvm-svn: 109070
2010-07-22 00:26:08 +00:00
Nate Begeman
e7ca21ab3b Fix a couple issues with Win64 ABI
1) all registers were spilled as xmm, regardless of actual size
2) win64 abi doesn't do the varargs-size-in-%al thing

Still to look into:

xmm6-15 are marked as clobbered by call instructions on win64 even though they aren't.

llvm-svn: 109035
2010-07-21 20:49:52 +00:00
Eric Christopher
959481ec87 Pulling out previous patch, must've run the tests in
the wrong directory.

llvm-svn: 109005
2010-07-21 09:23:56 +00:00
Eric Christopher
5c12ad2a4b Lower MEMBARRIER on x86 and support processors without SSE2.
Fixes a pile of libgomp failures in the llvm-gcc testsuite due
to the libcall not existing.

llvm-svn: 109004
2010-07-21 09:05:23 +00:00
Evan Cheng
ffbae6ad52 Split -enable-finite-only-fp-math to two options:
-enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN.

llvm-svn: 108465
2010-07-15 22:07:12 +00:00
Jakob Stoklund Olesen
e3aafe4988 Use TargetOpcode::COPY instead of X86-native register copy instructions when
lowering atomics. This will allow those copies to still be coalesced after
TII::isMoveInstr is removed.

llvm-svn: 108385
2010-07-14 23:50:27 +00:00
Evan Cheng
f6478f489d Fix for PR7193 was overly conservative. The only case where sibcall callee
address cannot be allocated a register is in 32-bit mode where the first
three arguments are marked inreg. In that case EAX, EDX, and ECX will be
used for argument passing.

This fixes PR7610.

llvm-svn: 108327
2010-07-14 06:44:01 +00:00
Dan Gohman
fef30fcd5e Reapply bottom-up fast-isel, with several fixes for x86-32:
- Check getBytesToPopOnReturn().
 - Eschew ST0 and ST1 for return values.
 - Fix the PIC base register initialization so that it doesn't ever
   fail to end up the top of the entry block.

llvm-svn: 108039
2010-07-10 09:00:22 +00:00
Jakob Stoklund Olesen
bf7dddd5b7 An x86 function returns a floating point value in st(0), and we must make sure
it is popped, even if it is ununsed. A CopyFromReg node is too weak to represent
the required sideeffect, so insert an FpGET_ST0 instruction directly instead.

This will matter when CopyFromReg gets lowered to a generic COPY instruction.

llvm-svn: 108037
2010-07-10 04:04:25 +00:00
Bob Wilson
9e8c9204ef --- Reverse-merging r107947 into '.':
U    utils/TableGen/FastISelEmitter.cpp
--- Reverse-merging r107943 into '.':
U    test/CodeGen/X86/fast-isel.ll
U    test/CodeGen/X86/fast-isel-loads.ll
U    include/llvm/Target/TargetLowering.h
U    include/llvm/Support/PassNameParser.h
U    include/llvm/CodeGen/FunctionLoweringInfo.h
U    include/llvm/CodeGen/CallingConvLower.h
U    include/llvm/CodeGen/FastISel.h
U    include/llvm/CodeGen/SelectionDAGISel.h
U    lib/CodeGen/LLVMTargetMachine.cpp
U    lib/CodeGen/CallingConvLower.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
U    lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp
U    lib/CodeGen/SelectionDAG/FastISel.cpp
U    lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
U    lib/CodeGen/SelectionDAG/ScheduleDAGSDNodes.cpp
U    lib/CodeGen/SelectionDAG/InstrEmitter.cpp
U    lib/CodeGen/SelectionDAG/TargetLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.cpp
U    lib/Target/XCore/XCoreISelLowering.h
U    lib/Target/X86/X86ISelLowering.cpp
U    lib/Target/X86/X86FastISel.cpp
U    lib/Target/X86/X86ISelLowering.h

llvm-svn: 107987
2010-07-09 16:37:18 +00:00
Dan Gohman
dad9d461c3 Fix the memoperand offsets in code generated for va_start.
llvm-svn: 107948
2010-07-09 01:06:48 +00:00
Dan Gohman
7e6e4dd058 Re-apply bottom-up fast-isel, with fixes. Be very careful to avoid emitting
a DBG_VALUE after a terminator, or emitting any instructions before an EH_LABEL.

llvm-svn: 107943
2010-07-09 00:39:23 +00:00
Chris Lattner
49ac65543c Change LEA to have 5 operands for its memory operand, just
like all other instructions, even though a segment is not
allowed.  This resolves a bunch of gross hacks in the 
encoder and makes LEA more consistent with the rest of the
instruction set.

No functionality change.

llvm-svn: 107934
2010-07-08 23:46:44 +00:00
Chris Lattner
18802e1a55 add some long-overdue enums to refer to the parts of the 5-operand
X86 memory operand.

llvm-svn: 107925
2010-07-08 22:41:28 +00:00
Dan Gohman
4dcc56a102 Revert 107840 107839 107813 107804 107800 107797 107791.
Debug info intrinsics win for now.

llvm-svn: 107850
2010-07-08 01:00:56 +00:00
Evan Cheng
22b3e8f3b1 Move getExtLoad() and (some) getLoad() DebugLoc argument after EVT argument for consistency sake.
llvm-svn: 107820
2010-07-07 22:15:37 +00:00
Dan Gohman
424cc6b616 Add X86FastISel support for return statements. This entails refactoring
a bunch of stuff, to allow the target-independent calling convention
logic to be employed.

llvm-svn: 107800
2010-07-07 18:32:53 +00:00
Dan Gohman
b87c534168 Simplify FastISel's constructor by giving it a FunctionLoweringInfo
instance, rather than pointers to all of FunctionLoweringInfo's
members.

This eliminates an NDEBUG ABI sensitivity.

llvm-svn: 107789
2010-07-07 16:29:44 +00:00
Dan Gohman
c768525273 Split the SDValue out of OutputArg so that SelectionDAG-independent
code can do calling-convention queries. This obviates OutputArgReg.

llvm-svn: 107786
2010-07-07 15:54:55 +00:00
Dale Johannesen
81ea05c193 Accept RIP-relative symbols with 'i' constraint, and
print the (%rip) only if the 'a' modifier is present.
PR 7528.

llvm-svn: 107727
2010-07-06 23:27:00 +00:00