1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

2331 Commits

Author SHA1 Message Date
Andrew Trick
aec414c298 Order CALLSEQ_START and CALLSEQ_END nodes.
Fixes PR16146: gdb.base__call-ar-st.exp fails after
pre-RA-sched=source fixes.

Patch by Xiaoyi Guo!

This also fixes an unsupported dbg.value test case. Codegen was
previously incorrect but the test was passing by luck.

llvm-svn: 182885
2013-05-29 22:03:55 +00:00
Andrew Trick
2790ee3a8e Track IR ordering of SelectionDAG nodes 2/4.
Change SelectionDAG::getXXXNode() interfaces as well as call sites of
these functions to pass in SDLoc instead of DebugLoc.

llvm-svn: 182703
2013-05-25 02:42:55 +00:00
Michael J. Spencer
c195b8a813 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
llvm-svn: 182680
2013-05-24 22:23:49 +00:00
Nadav Rotem
bfc39207bf X86: Fix a bug in EltsFromConsecutiveLoads. We can't generate new loads without chains.
llvm-svn: 182507
2013-05-22 19:28:41 +00:00
Benjamin Kramer
dced131e7e X86: When expanding PCMPGTQ to PCMPGTD we always want to compare the lower halves as unsigned.
Take #2 on fixing PR15977.

llvm-svn: 182486
2013-05-22 17:01:12 +00:00
Benjamin Kramer
6b339bee4e X86: When emulating unsigned PCMPGTQ with PCMPGTD, fix the sign bit for the smaller type.
Otherwise we'll get a mix of signed and unsigned compares.
Fixes PR15977.

llvm-svn: 182364
2013-05-21 09:58:54 +00:00
Matt Arsenault
118196f0ca Add LLVMContext argument to getSetCCResultType
llvm-svn: 182180
2013-05-18 00:21:46 +00:00
Benjamin Kramer
e9efb3252f X86: Make shuffle -> shift conversion more aggressive about undefs.
Shuffles that only move an element into position 0 of the vector are common in
the output of the loop vectorizer and often generate suboptimal code when SSSE3
is not available. Lower them to vector shifts if possible.

We still prefer palignr over psrldq because it has higher throughput on
sandybridge.

llvm-svn: 182102
2013-05-17 14:48:34 +00:00
David Majnemer
d5ba4da281 Remove a recently redundant transform from X86ISelLowering.
X86ISelLowering has support to treat:
(icmp ne (and (xor %flags, -1), (shl 1, flag)), 0)

as if it were actually:
(icmp eq (and %flags, (shl 1, flag)), 0)

However, r179386 has code at the InstCombine level to handle this.

llvm-svn: 181145
2013-05-05 02:00:10 +00:00
Nadav Rotem
1330c4432b Fix an odd comment.
llvm-svn: 181136
2013-05-04 23:24:56 +00:00
Michael Liao
11eb7f45ef 80-col fixup.
llvm-svn: 180915
2013-05-02 09:22:04 +00:00
Michael Liao
7a427b3c8f Avoid duplicating logic on frame register selecting when lowering eh_return
No functionality change

llvm-svn: 180914
2013-05-02 09:18:38 +00:00
Michael Liao
c1eaac67f3 Avoid duplicating logic on frame register selecting when lowering frameaddr
No functionality change

llvm-svn: 180912
2013-05-02 08:21:56 +00:00
Tim Northover
6a53674c70 Remove unused ShouldFoldAtomicFences flag.
I think it's almost impossible to fold atomic fences profitably under
LLVM/C++11 semantics. As a result, this is now unused and just
cluttering up the target interface.

llvm-svn: 179940
2013-04-20 12:32:43 +00:00
Tim Northover
d12b2f24c8 Remove unused MEMBARRIER DAG node; it's been replaced by ATOMIC_FENCE.
llvm-svn: 179939
2013-04-20 12:32:17 +00:00
Michael Liao
3b258b6b24 ArrayRefize getMachineNode(). No functionality change.
llvm-svn: 179901
2013-04-19 22:22:57 +00:00
Michael Liao
10bba2164e Use 'array_lengthof' as possible to avoid magic numbers
llvm-svn: 179833
2013-04-19 04:03:37 +00:00
Benjamin Kramer
aeff9e581b X86: Add an SSE2 lowering for 64 bit compares when pcmpgtq (SSE4.2) isn't available.
This pattern started popping up in vectorized min/max reductions.

llvm-svn: 179797
2013-04-18 21:37:45 +00:00
Michael Liao
877d1576e6 Optimize vector select from all 0s or all 1s
As packed comparisons in AVX/SSE produce all 0s or all 1s in each SIMD lane,
vector select could be simplified to AND/OR or removed if one or both values
being selected is all 0s or all 1s.

llvm-svn: 179267
2013-04-11 05:15:54 +00:00
Michael Liao
87125582e9 Enhance bool simplifcation in X86 to handle more cases
This patch is revised based on patch from Victor Umansky
<victor.umansky@intel.com>. More cases are handled in X86's bool
simplification, i.e.
- SETCC_CARRY
- value is truncated to i1 with AND

As a by-product, PR5443 is also fixed.

llvm-svn: 179265
2013-04-11 04:43:09 +00:00
Evan Cheng
9f82233851 __sincosf_stret returns sinf / cosf in bits 0:31 and 32:63 of xmm0, not in
xmm0 / xmm1.

rdar://13599493

llvm-svn: 179141
2013-04-10 01:26:07 +00:00
Bill Wendling
f2bb7aa5f8 Use the target options specified on a function to reset the back-end.
During LTO, the target options on functions within the same Module may
change. This would necessitate resetting some of the back-end. Do this for X86,
because it's a Friday afternoon.

llvm-svn: 178917
2013-04-05 21:52:40 +00:00
Benjamin Kramer
790bd5fb50 X86: Promote sitofp <8 x i16> to <8 x i32> when AVX is available.
A vector sext + sitofp is a lot cheaper than 8 scalar conversions.

llvm-svn: 178448
2013-03-31 12:49:15 +00:00
Benjamin Kramer
279e5cfa9a Remove the old CodePlacementOpt pass.
It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1.

llvm-svn: 178349
2013-03-29 17:14:24 +00:00
Michael Liao
427149cbcf Add support of RDSEED defined in AVX2 extension
llvm-svn: 178314
2013-03-28 23:41:26 +00:00
Michael Liao
aec693ab31 Enhance boolean simplification to handle 16-/64-bit RDRAND
- RDRAND always clears the destination value when a random value is not
  available (i.e. CF == 0). This value is truncated or zero-extended as
  the false boolean value to be returned. Boolean simplification needs
  to skip this 'zext' or 'trunc' node.

llvm-svn: 178312
2013-03-28 23:38:52 +00:00
Michael Liao
d961d7a7b3 Skip moving call address loading into callseq when targets prefer register indirect call.
To enable a load of a call address to be folded with that call, this
load is moved from outside of callseq into callseq. Such a moving
adds a non-glued node (that load) into a glued sequence. This non-glue
load is only removed when DAG selection folds them into a memory form
call instruction. When such instruction selection is disabled, it breaks
DAG schedule.

To prevent that, such moving is disabled when target favors register
indirect call.

Previous workaround disabling CALL32m/CALL64m insn selection is removed.

llvm-svn: 178308
2013-03-28 23:13:21 +00:00
Timur Iskhodzhanov
d7de83d51c Make Win32 put the SRet address into EAX, fixes PR15556
llvm-svn: 178291
2013-03-28 21:30:04 +00:00
Preston Gurd
b6ed645cb6 For the current Atom processor, the fastest way to handle a call
indirect through a memory address is to load the memory address into
a register and then call indirect through the register.

This patch implements this improvement by modifying SelectionDAG to
force a function address which is a memory reference to be loaded
into a virtual register.

Patch by Sriram Murali.

llvm-svn: 178171
2013-03-27 19:14:02 +00:00
Hal Finkel
ba08d9519e Fix typo (common to both X86 and PPC)
Thanks to Bill Schmidt for pointing this out during code review!

llvm-svn: 178170
2013-03-27 19:10:42 +00:00
Michael Liao
bd3f6b0eea Add XTEST codegen support
llvm-svn: 178083
2013-03-26 22:47:01 +00:00
Michael Liao
a0a4d0c6f7 Revise alignment checking/calculation on 256-bit unaligned memory access
- It's still considered aligned when the specified alignment is larger
  than the natural alignment;
- The new alignment for the high 128-bit vector should be min(16,
  alignment) as the pointer is advanced by 16, a power-of-2 offset.

llvm-svn: 177947
2013-03-25 23:50:10 +00:00
Michael Liao
d0e167edfb Fix PR15296
- Move SRA/SRL/SHL lowering support from DAG combination to DAG lowering
  to support extended 256-bit integer in AVX but not AVX2.

llvm-svn: 177478
2013-03-20 02:33:21 +00:00
Michael Liao
8be4fbefe3 Mark all variable shifts needing customizing
- Prepare moving logic from DAG combining into DAG lowering. There's no
  functionality change.

llvm-svn: 177477
2013-03-20 02:28:20 +00:00
Michael Liao
3b72fc2823 Move scalar immediate shift lowering into a dedicated func
- no functionality change

llvm-svn: 177476
2013-03-20 02:20:36 +00:00
Nadav Rotem
317ff20b46 Optimize sext <4 x i8> and <4 x i16> to <4 x i64>.
Patch by Ahmad, Muhammad T <muhammad.t.ahmad@intel.com>

llvm-svn: 177421
2013-03-19 18:38:27 +00:00
Anton Korobeynikov
97423324fa TLS support for MinGW targets.
MinGW is almost completely compatible to MSVC, with the exception of the _tls_array global not being available.

Patch by David Nadlinger!

llvm-svn: 177257
2013-03-18 08:12:28 +00:00
Michael Liao
89d165e673 Fix PR15309
- Fix the typo on type checking

llvm-svn: 177010
2013-03-14 06:57:42 +00:00
Tom Stellard
fa72758e1d DAGCombiner: Use correct value type for checking legality of BR_CC v3
LegalizeDAG.cpp uses the value of the comparison operands when checking
the legality of BR_CC, so DAGCombiner should do the same.

v2:
  - Expand more BR_CC value types for NVPTX

v3:
  - Expand correct BR_CC value types for Hexagon, Mips, and XCore.

llvm-svn: 176694
2013-03-08 15:36:57 +00:00
Benjamin Kramer
d2f85ae895 X86: Fold EXTRACT_SUBVECTORs of a BUILD_VECTOR into a smaller BUILD_VECTOR.
That can usually be lowered efficiently and is common in sandybridge code.
It would be nice to do this in DAGCombiner but we can't insert arbitrary
BUILD_VECTORs this late.

Fixes PR15462.

llvm-svn: 176634
2013-03-07 18:48:40 +00:00
Michael Liao
32f3aca77c Fix two remaining issue after fixing PR15355 when CMOV is not available
- Phi nodes should be replaced/updated after lowering CMOV into branch
  because 'mainMBB' updating operand in Phi node is changed.
- Add EFLAGS in livein before lowering the 2nd CMOV. It's necessary as
  we will reuse the EFLAGS generated before the 1st lowered CMOV, which
  won't clobber EFLAGS. However, we need explicitly specify that.
- '-attr=-cmov' test case are added.

llvm-svn: 176598
2013-03-07 01:01:29 +00:00
Michael Liao
5859ab0234 Fix PR15355
- Clear 'mayStore' flag when loading from the atomic variable before the
  spin loop
- Clear kill flag from one use to multiple use in registers forming the
  address to that atomic variable
- don't use a physical register as live-in register in BB (neither entry
  nor landing pad.) by copying it into virtual register

(patch by Cameron Zwarich)

llvm-svn: 176538
2013-03-06 00:17:04 +00:00
Preston Gurd
66b9c4fcf9 Bypass Slow Divides
* Only apply divide bypass optimization when not optimizing for size. 
* Fixed bug caused by constant for 0 value of type Int32,
  used dividend type to generate the constant instead.
* For atom x86-64 apply the divide bypass to use 16-bit divides instead of
  64-bit divides when operand values are small enough.
* Added lit tests for 64-bit divide bypass.

Patch by Tyler Nowicki!

llvm-svn: 176442
2013-03-04 18:13:57 +00:00
Michael Liao
1e621fbd2f Fix PR10475
- ISD::SHL/SRL/SRA must have either both scalar or both vector operands
  but TLI.getShiftAmountTy() so far only return scalar type. As a
  result, backend logic assuming that breaks.
- Rename the original TLI.getShiftAmountTy() to
  TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to
  return target-specificed scalar type or the same vector type as the
  1st operand.
- Fix most TICG logic assuming TLI.getShiftAmountTy() a simple scalar
  type.

llvm-svn: 176364
2013-03-01 18:40:30 +00:00
Michael Liao
ad0b9ecc47 Refine fix to PR10499, no functionality change
- Put expensive checking after simple one

llvm-svn: 176060
2013-02-25 23:16:36 +00:00
Michael Liao
ff7d7ec88b Fix PR10499
- Check whether SSE is available before lowering all 1s vector building with
  PCMPEQD, which is only available from SSE2

llvm-svn: 176058
2013-02-25 23:01:03 +00:00
Nadav Rotem
0740239f87 Revert r169638 because it broke Mesa llvmpipe tests.
Fix PR15239.

llvm-svn: 175985
2013-02-24 07:09:35 +00:00
Jim Grosbach
233487d8a2 Update TargetLowering ivars for name policy.
http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

ivars should be camel-case and start with an upper-case letter. A few in
TargetLowering were starting with a lower-case letter.

No functional change intended.

llvm-svn: 175667
2013-02-20 21:13:59 +00:00
Elena Demikhovsky
0886fb4d55 I optimized the following patterns:
sext <4 x i1> to <4 x i64>
 sext <4 x i8> to <4 x i64>
 sext <4 x i16> to <4 x i64>
 
I'm running Combine on SIGN_EXTEND_IN_REG and revert SEXT patterns:
 (sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT)))
 
 The sext_in_reg (v4i32 x) may be lowered to shl+sar operations.
 The "sar" does not exist on 64-bit operation, so lowering sext_in_reg (v4i64 x) has no vector solution.

I also added a cost of this operations to the AVX costs table.

llvm-svn: 175619
2013-02-20 12:42:54 +00:00
Craig Topper
6b083f50d1 Fix capitalization in comment to match function name.
llvm-svn: 175497
2013-02-19 07:43:59 +00:00