mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

497 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
b15912aafd Don't fold indexed loads into TCRETURNmi64.
We don't have enough GR64_TC registers when calling a varargs function
with 6 arguments. Since %al holds the number of vector registers used,
only %r11 is available as a scratch register.

This means that addressing modes using both base and index registers
can't be folded into TCRETURNmi64.

<rdar://problem/12282281>
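
For illustration, a hypothetical reproducer of the shape involved (the
callee name is invented; any varargs callee taking six arguments will do):

  // Hypothetical example, not from the commit: a tail call to a varargs
  // function with six arguments. %al carries the number of vector
  // registers used, leaving only %r11 as a scratch register, so a
  // base+index memory operand cannot be folded into the tail call.
  extern "C" int log_message(const char *fmt, ...);

  int forward(const char *fmt, long a, long b, long c, long d, long e) {
    return log_message(fmt, a, b, c, d, e); // tail call at -O2
  }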

llvm-svn: 163761
2012-09-13 00:25:00 +00:00
Michael Liao
e600a8a616 Fix PR11985
- BlockAddress has no support for the BA + offset form, and there is no way to
  propagate that offset into a machine operand;
- Add BA + offset support and a new interface, 'getTargetBlockAddress', to
  simplify forming target block addresses;
- All targets are modified to use the new interface, and the X86 backend is
  enhanced to support BA + offset addressing.
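
A minimal sketch of how a target might adopt the new interface, assuming
the signature implied above (XTargetLowering is a placeholder target):

  // Sketch only; XTargetLowering is a hypothetical target. The offset
  // carried on the BlockAddressSDNode is propagated in one call instead
  // of being dropped.
  SDValue XTargetLowering::LowerBlockAddress(SDValue Op,
                                             SelectionDAG &DAG) const {
    const BlockAddressSDNode *N = cast<BlockAddressSDNode>(Op);
    return DAG.getTargetBlockAddress(N->getBlockAddress(), MVT::i32,
                                     N->getOffset());
  }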

llvm-svn: 163743
2012-09-12 21:43:09 +00:00
Manman Ren
1a047422a0 Release build: guard dump functions with
"#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)"

No functional change. Update r163339.
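
Applied to a typical dump method, the guard looks like this (SomePass and
its print method are placeholders):

  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  void SomePass::dump() const { // placeholder class, pattern as above
    print(dbgs());
  }
  #endif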

llvm-svn: 163653
2012-09-11 22:23:19 +00:00
Manman Ren
b9d2a6fa2e Release build: guard dump functions with "ifndef NDEBUG"
No functional change.

llvm-svn: 163339
2012-09-06 19:06:06 +00:00
Richard Smith
865f47cbb6 Fix integer undefined behavior due to signed left shift overflow in LLVM.
Reviewed offline by chandlerc.
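
The class of bug being fixed, in miniature (illustrative values only, not
taken from the patch):

  #include <cstdint>

  uint64_t shiftExamples() {
    // int bad = 1 << 31;             // UB: shifts into the sign bit
    uint32_t ok1 = 1u << 31;          // defined: unsigned operand
    uint64_t ok2 = uint64_t(1) << 33; // defined: widen before shifting
    return ok1 + ok2;
  }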

llvm-svn: 162623
2012-08-24 23:29:28 +00:00
Craig Topper
3929432178 Add a couple of default: llvm_unreachable() cases to some switch statements. Fix a bad message in an existing llvm_unreachable.
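
The pattern in question, with placeholder cases and message:

  #include "llvm/Support/ErrorHandling.h"

  // Placeholder switch showing the added pattern: an explicit default
  // that traps in debug builds instead of falling off the function end.
  static unsigned encodeOpcode(unsigned Opc) {
    switch (Opc) {
    case 0: return 10;
    case 1: return 11;
    default: llvm_unreachable("Unexpected opcode!");
    }
  }
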
llvm-svn: 161725
2012-08-11 17:44:14 +00:00
Manman Ren
967804ad0a X86: enable CSE between CMP and SUB
We perform the following:
1> Use SUB instead of CMP for i8, i16, i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.

Removed pattern matching of (a>b) ? (a-b):0 and the like, since they are
handled by the peephole now.

rdar://11873276
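
The removed match covered source of the following shape (function name
invented), which the peephole now handles by sharing one SUB between the
flags and the difference:

  // SUB sets the same EFLAGS a CMP would, so a > b and a - b can CSE
  // into a single SUB; the peephole converts SUB back to CMP only when
  // the result is otherwise unused.
  unsigned clampedDiff(unsigned a, unsigned b) {
    return (a > b) ? (a - b) : 0;
  }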

llvm-svn: 161462
2012-08-08 00:51:41 +00:00
Craig Topper
dc1b95e7de Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
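
A usage sketch for one of the affected intrinsics (SSE4.2): PCMPISTRI
implicitly defines ECX (and EFLAGS), which is the implicit def TableGen
could not express:

  #include <nmmintrin.h> // compile with -msse4.2

  // Sketch, not from the commit: index of the first position at which
  // the bytes of `needle` and `hay` are equal.
  int firstEqualByte(__m128i needle, __m128i hay) {
    return _mm_cmpistri(needle, hay,
                        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH);
  }
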
llvm-svn: 161318
2012-08-06 06:22:36 +00:00
Chad Rosier
078db59861 Whitespace.
llvm-svn: 161122
2012-08-01 18:39:17 +00:00
David Chisnall
c21e3b3ce1 ELF does not imply GNU/Linux. Do not assume GNU conventions just because we
are targeting an ELF platform.  Only fold gs-relative (and fs-relative) loads
if it is actually sensible to do so for the target platform.

This fixes PR13438.
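
The construct involved, sketched with Clang's address-space extension (in
LLVM's x86 convention, address space 256 is GS-relative and 257 is
FS-relative); whether such a load may be folded now depends on the target
OS rather than on the object format:

  // Sketch using a Clang extension; not from the commit.
  typedef int __attribute__((address_space(256))) *gs_int_ptr;

  int readViaGS(gs_int_ptr p) {
    return *p; // may lower to a %gs-relative load where sensible
  }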

llvm-svn: 160687
2012-07-24 20:04:16 +00:00
Craig Topper
6eef81b65b Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren.
llvm-svn: 160110
2012-07-12 06:52:41 +00:00
Craig Topper
4fc5342fc7 Reduce code size by using a second switch statement to avoid extra calls to SelectAtomic64. Also catch cases where SelectAtomic64 fails.
llvm-svn: 159503
2012-07-01 02:55:34 +00:00
Craig Topper
80279ea39f Add a break to the end of case statement missed in r159501.
llvm-svn: 159502
2012-07-01 02:18:18 +00:00
Craig Topper
8b795d08a5 Fix a crash on release builds if gather intrinsics are passed a non-constant value for the last argument.
llvm-svn: 159501
2012-07-01 02:17:08 +00:00
Craig Topper
b2a94bd61c Use a second switch statement to reduce number of calls to SelectGather in code. Reduces code size a bit.
llvm-svn: 159500
2012-07-01 02:05:52 +00:00
Manman Ren
63bf58865a X86: add more GATHER intrinsics in LLVM
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
  from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
  from 256-bit to 128-bit.

Support the following intrinsics:
  llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
  llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
  llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
  llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256

llvm-svn: 159402
2012-06-29 00:54:20 +00:00
Manman Ren
6be46b7b4c X86: add GATHER intrinsics (AVX2) in LLVM
Support the following intrinsics:
llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd
llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256
llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps
llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256

Modified Disassembler to handle VSIB addressing mode.
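
A usage sketch for one of these gathers, via the corresponding AVX2
intrinsic from <immintrin.h>:

  #include <immintrin.h> // compile with -mavx2

  // Sketch, not from the commit: gather four doubles from
  // base + idx[i] * 8, encoded with the VSIB addressing mode.
  __m256d gather4(const double *base, __m128i idx) {
    return _mm256_i32gather_pd(base, idx, /*scale=*/8);
  }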

llvm-svn: 159221
2012-06-26 19:47:59 +00:00
Craig Topper
49c52dde2b Tidy up spacing.
llvm-svn: 157313
2012-05-23 05:44:51 +00:00
Evan Cheng
9a33fc17be Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474
llvm-svn: 156896
2012-05-16 01:54:27 +00:00
Evan Cheng
d9958dcd91 Generalize r153635 to deal with TokenFactor chains; also clean up the logic and fix the tests. rdar://11069732, rdar://11236106
llvm-svn: 154604
2012-04-12 19:14:21 +00:00
Chandler Carruth
bb1db0e66a Clean up and relax a restriction on the matching of global offsets into
x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.

To understand why it is correct for this patch to match these offsets into
addressing-mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
   the match to try to fit RIP into the base register. If it fails, it
   now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
   as it did before. The reason these immediates are safe is because the
   ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
   This is the only case changed by the patch, and the primary place you
   see it is in TLS, either the win64 section offset TLS or Linux
   local-exec TLS model in a PIC compilation. Here the ABI again ensures
   that the immediates fit because we are in small mode, and any other
   operations required due to the PIC relocation model have been handled
   externally to the Wrapper node (extra loads etc are made around the
   wrapper node in ISelLowering).

I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.
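
Case 5 in source form, as a hypothetical reproducer (x86-64 Linux,
compiled with -fPIC):

  // Hypothetical reproducer for case 5: local-exec TLS forced inside
  // a PIC compile. The @tpoff immediate can now sit directly in the
  // addressing-mode displacement, e.g. movl %fs:counter@tpoff, %eax.
  static __thread int counter __attribute__((tls_model("local-exec")));

  int bump() { return ++counter; }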

llvm-svn: 154304
2012-04-09 02:13:06 +00:00
Rafael Espindola
88a1aeb123 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Benjamin Kramer
e3b0c81c27 Replace assert(0) with llvm_unreachable to avoid warnings about dropping off the end of a non-void function in Release builds.
llvm-svn: 153643
2012-03-29 12:37:26 +00:00
Joel Jones
486c38b0cf For X86, change load/dec-or-inc/store into dec-or-inc, respectively.
This is a code change to add support for changing instruction sequences of the form:

  load
  inc/dec of 8/16/32/64 bits
  store

into the appropriate X86 inc/dec through memory instruction:

  inc[qlwb] / dec[qlwb]

The checks that were in the ISD::STORE case of X86DAGToDAGISel::Select(SDNode *Node)
have been extracted into isLoadIncOrDecStore and reworked to use the better-named
wrappers for getOperand(unsigned) (e.g. getOffset()), and Chain.getNode() has been
replaced with LoadNode. The comments have also been expanded.
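
The source shape being matched, for concreteness (illustrative; whether
inc or add is chosen also depends on CPU tuning):

  // A read-modify-write increment: the load/inc/store sequence now
  // selects to a single memory-operand instruction, e.g. incq (%rdi)
  // for the 64-bit case.
  void bump64(long *p) {
    ++*p;
  }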

llvm-svn: 153635
2012-03-29 05:45:48 +00:00
Joel Jones
32f97db4b2 Reverted to revision 153616 to unblock build
llvm-svn: 153623
2012-03-29 01:20:56 +00:00
Joel Jones
b4477ee31f For X86, change load/dec-or-inc/store into dec-or-inc, respectively.
This is a code change to add support for changing instruction sequences of the form:

  load
  inc/dec of 8/16/32/64 bits
  store

into the appropriate X86 inc/dec through memory instruction:

  inc[qlwb] / dec[qlwb]

The checks that were in the ISD::STORE case of X86DAGToDAGISel::Select(SDNode *Node)
have been extracted into isLoadIncOrDecStore and reworked to use the better-named
wrappers for getOperand(unsigned) (e.g. getOffset()), and Chain.getNode() has been
replaced with LoadNode. The comments have also been expanded.

llvm-svn: 153617
2012-03-29 00:37:47 +00:00
Craig Topper
bf6a47d0ec Prune some includes
llvm-svn: 153502
2012-03-27 07:54:11 +00:00
Craig Topper
6bb276ae72 Remove unnecessary llvm:: qualifications
llvm-svn: 153500
2012-03-27 07:21:54 +00:00
Craig Topper
b1f171a213 Reorder includes in Target backends to following coding standards. Remove some superfluous forward declarations.
llvm-svn: 152997
2012-03-17 18:46:09 +00:00
Craig Topper
3dee24b8c3 Use uint16_t to store opcodes in static tables in X86 backend.
llvm-svn: 152391
2012-03-09 07:45:21 +00:00
Craig Topper
d8faffd93b Declare register classes as const. Fix a couple of pointers to register classes that weren't already const.
llvm-svn: 151138
2012-02-22 07:28:11 +00:00
Jakob Stoklund Olesen
b498ebe5b7 Use the same CALL instructions for Windows as for everything else.
The different calling conventions and call-preserved registers are
represented with regmask operands that are added dynamically.

llvm-svn: 150708
2012-02-16 17:56:02 +00:00
Pete Cooper
21409dd760 Stop custom lowering for x86 DEC64m from happening if the load in the lowered sequence has more than 1 user
llvm-svn: 150537
2012-02-15 00:33:37 +00:00
Pete Cooper
b1229a8866 Fixed a bug when custom lowering DEC64m on x86.
If the DEC node had more than one user, it was doing this lowering but
leaving the original DEC node around and so decrementing twice.

Fixes PR11964.
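
A hypothetical shape for the PR11964 bug (not the actual test case): the
decremented value has a second user, so leaving the original DEC node
alive decremented the location twice:

  long decrementAndRead(long *p) {
    long v = *p - 1; // DEC result has two users: the store and the return
    *p = v;
    return v;
  }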

llvm-svn: 150356
2012-02-13 00:10:03 +00:00
David Blaikie
06ecc99a56 More dead code removal (using -Wunreachable-code)
llvm-svn: 148578
2012-01-20 21:51:11 +00:00
Chandler Carruth
751438a272 Switch all of the uses of my InsertDAGNode helper to follow the exact
same pattern. We already had this pattern in a few places, but others
tried to make a rough approximation of an actual DAG structure. As not
everywhere went to this trouble, nothing could rely on this being done.
In fact, I've checked all references to these node Ids, and the ones
that are using the topo-sort properties are actually satisfied with
a strict-weak-ordering. The requirement appears to be that Use >= Def.

I've added a big blurb of comments to this bit of the transform to
clarify why the order is so important for the next reader of the code.

I'm starting with this change as it is very small, and trivially
reverted if something breaks or the >= above really does need to be >.
If that proves the case, we can hide the problem by reverting this
patch, but the problem exists elsewhere as well, and so a more
comprehensive solution will be needed.

llvm-svn: 148001
2012-01-12 01:34:44 +00:00
Chandler Carruth
e3facd7860 Revert r147945 which disabled an addressing mode transformation. I had
hoped this would revive one of the llvm-gcc selfhost build bots, but it
didn't, so it doesn't appear that my transform is the culprit.

If anyone else is seeing failures, please let me know!

llvm-svn: 147957
2012-01-11 18:36:12 +00:00
Chandler Carruth
e443e5e55d Disable the transformation I added in r147936 to see if it fixes some
strange build bot failures that look like a miscompile into an infloop.
I'll investigate this tomorrow, but I'd both like to know whether my
patch is the culprit, and get the bots back to green.

llvm-svn: 147945
2012-01-11 12:17:47 +00:00
Chandler Carruth
9729a20760 Hoist a really redundant code pattern into a helper function, and delete
lots of lines of code. No functionality changed.

llvm-svn: 147942
2012-01-11 11:04:36 +00:00
Chandler Carruth
f44e3c7745 Simplify the AND-rooted mask+shift checking code to match that of the
SRL-rooted code.

llvm-svn: 147941
2012-01-11 09:35:04 +00:00
Chandler Carruth
5a0ea8a5fd Unify the interface of the three mask+shift transform helpers, and
factor the differences that were hiding in one of them into its other
caller, the SRL handling code. No change in behavior.

llvm-svn: 147940
2012-01-11 09:35:02 +00:00
Chandler Carruth
7d8eac052d Clarify and make explicit some of the requirements for transforming
mask+shift pairs at the beginning of the ISD::AND case block, and then
hoist the final pattern into a helper function, simplifying and
reflowing it appropriately. This should have no observable behavior
change, but several simplifications fell out of this such as directly
computing the new mask constant, etc.

llvm-svn: 147939
2012-01-11 09:35:00 +00:00
Chandler Carruth
068b0ca15a Hoist the logic to transform shift+mask combinations into sub-register
extracts and scaled addressing modes into its own helper function. No
functionality changed here, just hoisting and layout fixes falling out
of that hoisting.

llvm-svn: 147937
2012-01-11 08:48:20 +00:00
Chandler Carruth
b3371fa250 Teach the X86 instruction selection to do some heroic transforms to
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

  unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

  *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.
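
For concreteness (derived from the commit's own example): the
canonicalizer rewrites (input >> 11) << 2 into (input >> 9) & ~3u, hiding
the shift; the patch matches that mask form and re-emits the longer shift
so the remaining scale folds into the address, roughly
shrl $11, %esi ; movl (%rdi,%rsi,4), %eax:

  // Equivalent of the table lookup above after canonicalization.
  unsigned lookup(const unsigned *table, unsigned input) {
    return *(const unsigned *)((const char *)table +
                               ((input >> 9) & ~3u));
  }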

llvm-svn: 147936
2012-01-11 08:41:08 +00:00
Chandler Carruth
f762ca322b Don't rely on the fact that shift values are never very large, and thus
that this subtraction will result in small negative numbers at worst, which
become very large positive numbers on assignment and are thus caught by
the <= 4 check on the next line. The > 0 check was clearly intended to catch
these as negative numbers.

Spotted by inspection, and impossible to trigger given the shift widths
that can be used.
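
The idiom under discussion, in miniature (names invented):

  // Demonstrates the wrap-around the old code relied on: when b > a,
  // the unsigned difference becomes huge and is rejected by the <= 4
  // bound, not by the (meaningless on unsigned) > 0 test.
  bool withinSlack(unsigned a, unsigned b) {
    unsigned diff = a - b; // wraps to a large value when b > a
    return diff > 0 && diff <= 4;
  }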

llvm-svn: 147773
2012-01-09 09:47:25 +00:00
Pete Cooper
4f4a9794b2 Added missing comment about new custom lowering of DEC64
llvm-svn: 144811
2011-11-16 19:03:23 +00:00
Pete Cooper
8441c08e0b Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Dan Gohman
a5f382da8b Reapply r143206, with fixes. Disallow physical register lifetimes
across calls, and only check for nested dependences on the special
call-sequence-resource register.

llvm-svn: 143660
2011-11-03 21:49:52 +00:00
Dan Gohman
826cec9a4b Revert r143206, as there are still some failing tests.
llvm-svn: 143262
2011-10-29 00:41:52 +00:00
Dan Gohman
dedcc22bcd Reapply r143177 and r143179 (reverting r143188), with scheduler
fixes: Use a separate register, instead of SP, as the
calling-convention resource, to avoid spurious conflicts with
actual uses of SP. Also, fix unscheduling of calling sequences,
which can be triggered by pseudo-two-address dependencies.

llvm-svn: 143206
2011-10-28 17:55:38 +00:00