1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-01 00:12:50 +01:00
Commit Graph

3381 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
914857b29a Remove the -live-regunits command line option.
Register allocators depend on it being permanently enabled now.

llvm-svn: 158873
2012-06-20 23:31:34 +00:00
Jakob Stoklund Olesen
6d2db5c3d9 Only update regunit live ranges that have been precomputed.
Regunit live ranges are computed on demand, so when mi-sched calls
handleMove, some regunits may not have live ranges yet.

That makes updating them easier: Just skip the non-existing ranges. They
will be computed correctly from the rescheduled machine code when they
are needed.

llvm-svn: 158831
2012-06-20 18:00:57 +00:00
Craig Topper
d63e429d68 Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit vector. Original patch by Elena Demikhovsky. Tweaked by me to allow possibility of covering more cases.
llvm-svn: 158792
2012-06-20 05:39:26 +00:00
Rafael Espindola
0f267bbe04 really add a triple :-(
llvm-svn: 158696
2012-06-19 02:17:35 +00:00
Rafael Espindola
1cc3be37a0 Add a triple to the test.
llvm-svn: 158695
2012-06-19 01:42:34 +00:00
Rafael Espindola
38c45a939d Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

llvm-svn: 158692
2012-06-19 00:48:28 +00:00
Chandler Carruth
1c3df655ea Add a regression test for the bug exposed by r158087, which has been
temporarily reverted.

This test is annoyingly overspecified, but I don't know of another way
to thoroughly test the saving and restoring of the registers. While this
will have to be adjusted even with the issue fixed in order to re-apply
r158087, those adjustments should very clearly indicate that it is still
correct (%esp getting restored prior to pops), whereas without it, this
case can easily slip under the radar.

Still, any suggestions for improvements are very welcome.

All credit to Matt Beaumont-Gay for reducing this out of an insane
Address Sanitizer crash to a reasonably small seg-faulting C program
when built with -mstackrealign. I just reduced it to IR, which was much
simpler. =]

llvm-svn: 158656
2012-06-18 09:15:04 +00:00
Chandler Carruth
d2716ae111 Temporarily revert r158087.
This patch causes problems when both dynamic stack realignment and
dynamic allocas combine in the same function. With this patch, we no
longer build the epilog correctly, and silently restore registers from
the wrong position in the stack.

Thanks to Matt for tracking this down, and getting at least an initial
test case to Chad. I'm going to try to check a variation of that test
case in so we can easily track the fixes required.

llvm-svn: 158654
2012-06-18 07:03:12 +00:00
Craig Topper
b2299168d3 Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them.
llvm-svn: 158396
2012-06-13 07:18:53 +00:00
Craig Topper
ad5e38e410 Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument.
llvm-svn: 158278
2012-06-09 16:46:13 +00:00
Jakob Stoklund Olesen
ce0f9aef12 Don't run RAFast in the optimizing regalloc pipeline.
The fast register allocator is not supposed to work in the optimizing
pipeline. It doesn't make sense to compute live intervals, run full copy
coalescing, and then run RAFast.

Fast register allocation in the optimizing pipeline is better done by
RABasic.

llvm-svn: 158242
2012-06-08 23:15:12 +00:00
Manman Ren
40901656e6 Test case for r158160
llvm-svn: 158218
2012-06-08 18:42:37 +00:00
Manman Ren
f51a6d5fae X86: optimize generated code for integer ABS
This patch will generate the following for integer ABS:
      movl    %edi, %eax
      negl    %eax
      cmovll  %edi, %eax
INSTEAD OF
      movl    %edi, %ecx
      sarl    $31, %ecx
      leal    (%rdi,%rcx), %eax
      xorl    %ecx, %eax

There exists a target-independent DAG combine for integer ABS, which converts
integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. 
This is implemented in PerformXorCombine.

rdar://10695237

llvm-svn: 158175
2012-06-07 22:39:10 +00:00
Rafael Espindola
4c9d611360 Use a base register instead of an index register with the local dynamic model.
Fixes pr13048.

llvm-svn: 158158
2012-06-07 18:39:19 +00:00
Manman Ren
1d91fc3342 X86: replace SUB with CMP if possible
This patch will optimize the following
    movq    %rdi, %rax
    subq    %rsi, %rax
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax
to
    cmpq    %rsi, %rdi
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 158126
2012-06-07 00:42:47 +00:00
Manman Ren
f591de61da Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.

llvm-svn: 158122
2012-06-06 23:53:03 +00:00
Chad Rosier
5a354cd5e8 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.
rdar://11496434

llvm-svn: 158087
2012-06-06 17:37:40 +00:00
Nadav Rotem
6969a673e9 Remove the "-promote-elements" flag. This flag is now enabled by default.
llvm-svn: 157925
2012-06-04 11:27:21 +00:00
Craig Topper
5837bcfc02 Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang.
llvm-svn: 157903
2012-06-03 18:58:46 +00:00
Craig Topper
8d3031fa46 Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit.
llvm-svn: 157898
2012-06-03 07:26:46 +00:00
Manman Ren
c3a6de9953 Revert r157831
llvm-svn: 157896
2012-06-03 03:14:24 +00:00
Craig Topper
685b86b007 Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior.
llvm-svn: 157895
2012-06-03 01:40:43 +00:00
Manman Ren
74ccc117d5 X86: peephole optimization to remove cmp instruction
This patch will optimize the following:
  sub r1, r3
  cmp r3, r1 or cmp r1, r3
  bge L1
TO
  sub r1, r3
  bge L1 or ble L1

If the branch instruction can use flag from "sub", then we can eliminate
the "cmp" instruction.

llvm-svn: 157831
2012-06-01 19:49:33 +00:00
Chris Lattner
773aa116f2 testcase for PR13006, thanks to Duncan for filing it.
llvm-svn: 157824
2012-06-01 18:19:46 +00:00
Hans Wennborg
4344ad4a86 Implement the local-dynamic TLS model for x86 (PR3985)
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.

llvm-svn: 157818
2012-06-01 16:27:21 +00:00
Craig Topper
02e5a00a70 Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enablle intrinsic use at least though.
llvm-svn: 157804
2012-06-01 06:07:48 +00:00
Chris Lattner
8b71fa09ec enhance the logic for looking through tailcalls to look through transparent casts
in multiple-return value scenarios, like what happens on X86-64 when returning
small structs.

llvm-svn: 157800
2012-06-01 05:29:15 +00:00
Chris Lattner
9f7a47d886 enhance getNoopInput to know about vector<->vector bitcasts of legal
types, as well as int<->ptr casts.  This allows us to tailcall functions
with some trivial casts between the call and return (i.e. because the
return types disagree).

llvm-svn: 157798
2012-06-01 05:16:33 +00:00
Chris Lattner
748ecee801 add some simple 64-bit tail call tests.
llvm-svn: 157797
2012-06-01 05:03:31 +00:00
Chris Lattner
f4a7be0c55 merge some tests.
llvm-svn: 157795
2012-06-01 05:00:54 +00:00
Chris Lattner
39d2bf08f8 rename test
llvm-svn: 157794
2012-06-01 04:58:50 +00:00
Manman Ren
82e2c9debf X86: replace SUB with CMP if possible
This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 157755
2012-05-31 17:20:29 +00:00
Elena Demikhovsky
194da7364d Added FMA3 Intel instructions.
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for GodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.

llvm-svn: 157737
2012-05-31 09:20:20 +00:00
Craig Topper
c6bd90e646 Add intrinsic for pclmulqdq instruction.
llvm-svn: 157731
2012-05-31 04:37:40 +00:00
Jakob Stoklund Olesen
8a558631c0 Prioritize smaller register classes for urgent evictions.
It helps compile exotic inline asm. In the test case, normal GR32
virtual registers use up eax-edx so the final GR32_ABCD live range has
no registers left. Since all the live ranges were tiny, we had no way of
prioritizing the smaller register class.

This patch allows tiny unspillable live ranges to be evicted by tiny
unspillable live ranges from a smaller register class.

<rdar://problem/11542429>

llvm-svn: 157715
2012-05-30 21:46:58 +00:00
Chris Lattner
1d39403e7b it's pointed out that R11 can be used for magic things, and doing things just for 64-bit registers is silly. Just optimize 3 more.
llvm-svn: 157699
2012-05-30 18:08:02 +00:00
Chris Lattner
505bcf70a0 Extend the (abi-irrelevant) return convention to be able to return more than two values in
integer registers.  This is already supported by the fastcc convention, but it doesn't
hurt to support it in the standard conventions as well.

In cases where we can cheat at the calling convention, this allows us to avoid returning
things through memory in more cases.

llvm-svn: 157698
2012-05-30 17:50:14 +00:00
Benjamin Kramer
0c823ae0ed Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions.
This required light surgery on the assembler and disassembler
because the instructions use an uncommon encoding. They are
the only two instructions in x86 that use register operands
and two immediates.

llvm-svn: 157634
2012-05-29 19:05:25 +00:00
Chris Lattner
3ae033bbe9 These tests used intrinsics with the wrong prototype. They weren't caught because
the old verifier just checked that something "was a pointer", but not that the pointee
was correct.

llvm-svn: 157544
2012-05-27 19:35:41 +00:00
Benjamin Kramer
07f18e78ba SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights.
SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move
the most likely condition to the front so it is checked first and the others can
be skipped. This is currently not as effective as it could be because SimplifyCFG
destroys profiling metadata when merging branches and switches. Merging branch
weight metadata is tricky though.

This code touches at most 3 cases so I didn't use a proper sorting algorithm.

llvm-svn: 157521
2012-05-26 20:01:32 +00:00
NAKAMURA Takumi
ad733c5014 test/CodeGen/X86/bigstructret.ll: Suppress one test. It is msvc-incompatible. (compatible to mingw32 and netbsd, though)
llvm-svn: 157474
2012-05-25 15:40:54 +00:00
NAKAMURA Takumi
93dbb2e681 test/CodeGen/X86/bigstructret.ll: Relax stack offsets for hosts of stack-align=8, eg. win32 and netbsd.
llvm-svn: 157471
2012-05-25 15:12:21 +00:00
Eli Friedman
d89582030a Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process.
llvm-svn: 157446
2012-05-25 00:09:29 +00:00
David Blaikie
74e7769ec9 Fix for CHECK-NOT misspelling.
Patch by Nicklas Bo Jensen.

llvm-svn: 157421
2012-05-24 22:08:29 +00:00
Jakob Stoklund Olesen
ce44a7a9ae Correctly deal with identity copies in RegisterCoalescer.
Now that the coalescer keeps live intervals and machine code in sync at
all times, it needs to deal with identity copies differently.

When merging two virtual registers, all identity copies are removed
right away. This means that other identity copies must come from
somewhere else, and they are going to have a value number.

Deal with such copies by merging the value numbers before erasing the
copy instruction. Otherwise, we leave dangling value numbers in the live
interval.

This fixes PR12927.

llvm-svn: 157340
2012-05-23 20:21:06 +00:00
Nuno Lopes
944814b41a revert my previous patches that introduced an additional parameter to the objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking

llvm-svn: 157255
2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen
151b044b75 Only erase virtregs with no uses left.
Also make sure registers aren't erased twice if the dead def mentions
the register twice.

This fixes PR12911.

llvm-svn: 157254
2012-05-22 14:52:12 +00:00
Craig Topper
57d54b1831 Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces.
llvm-svn: 157175
2012-05-21 06:40:16 +00:00
Peter Collingbourne
c9e3c75c4d When legalising shifts, do not pre-build a list of operands which
may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve
the other operands when calling UpdateNodeOperands.  Fixes PR12889.

llvm-svn: 157162
2012-05-20 18:36:15 +00:00
Jakob Stoklund Olesen
ed19f92618 Properly constrain register classes for sub-registers.
Not all GR64 registers have sub_8bit sub-registers.

llvm-svn: 157150
2012-05-20 06:38:37 +00:00