1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00
Commit Graph

8429 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
4b1105c720 Remove the X86 sub_ss and sub_sd sub-register indexes completely.
llvm-svn: 160833
2012-07-26 23:07:20 +00:00
Jakob Stoklund Olesen
008b36037d Remove the last mentions of sub_ss and sub_sd from patterns.
I'll remove these two sub-register indexes shortly.

llvm-svn: 160831
2012-07-26 23:03:08 +00:00
Jakob Stoklund Olesen
c1dd7a213d Eliminate sub_ss, sub_sd from broadcast patterns.
The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but
copyPhysReg does the right thing with it. (The old pattern would
eventually produce the same cross-class copy).

llvm-svn: 160830
2012-07-26 22:59:06 +00:00
Jakob Stoklund Olesen
419ccae442 Eliminate more sub_ss / sub_sd patterns.
This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns,
simplifying the emitted code a bit.

llvm-svn: 160820
2012-07-26 22:30:18 +00:00
Jakob Stoklund Olesen
0a41a45c36 Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd.
The SUBREG_TO_REG instruction has magic semantics asserting that the
source value was defined by an instruction that cleared the high half of
the register. Those semantics are never actually exploited for xmm
registers.

llvm-svn: 160818
2012-07-26 22:03:21 +00:00
Jakob Stoklund Olesen
44868927b9 Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves.  They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.

This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.

The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.

llvm-svn: 160816
2012-07-26 21:40:42 +00:00
Craig Topper
9667d599eb Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms.
llvm-svn: 160775
2012-07-26 07:48:28 +00:00
Rafael Espindola
062cf9ccf9 Fix typos. Thanks to Matt Beaumont-Gay for noticing it.
llvm-svn: 160731
2012-07-25 15:42:45 +00:00
Rafael Espindola
8dfc815c3f When a return struct pointer is passed in registers, the called has nothing
to pop.

llvm-svn: 160725
2012-07-25 13:41:10 +00:00
Rafael Espindola
e58779ca1b Factor a long list of conditions into a predicate function. No functionality
change.

llvm-svn: 160724
2012-07-25 13:35:45 +00:00
Kevin Enderby
9a7bc24c01 Fix a bug in the x86 disassembler's symbolic disassembly support for Jcc-Jump
if Condition Is Met instuctions that was not correctly determining the target
instruction.

So for a jne rel32 instruction:

% cat x.s
.byte 0x0f, 0x85, 0x09, 0x00, 0x00, 0x00
% as x.s

it was incorrectly deterining the target:

% otool -q -tv a.out 
a.out:
(__TEXT,__text) section
0000000000000000	jne	0xd

and with the fix it gets this correct as:

% otool -q -tv a.out
a.out:
(__TEXT,__text) section
0000000000000000	jne	0xf

rdar://11505997

llvm-svn: 160694
2012-07-24 21:40:01 +00:00
David Chisnall
c21e3b3ce1 ELF does not imply GNU/Linux. Do not assume GNU conventions just because we
are targeting an ELF platform.  Only fold gs-relative (and fs-relative) loads
if it is actually sensible to do so for the target platform.

This fixes PR13438.

llvm-svn: 160687
2012-07-24 20:04:16 +00:00
Sylvestre Ledru
bf8acb65ac Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Craig Topper
015d52b2dc Don't use implicit register operands to calculate L-bit for AVX instructions. Needed because super reg defs and kills are added as implicit operands on 128-bit instructions. Fixes PR13349. Patch by Jose Fonseca.
llvm-svn: 160543
2012-07-20 07:03:46 +00:00
Preston Gurd
6d82adeada Adds the family codes for the Midview Atom processors so that the
Atom buildbot will auto-detect Atom.

llvm-svn: 160521
2012-07-19 19:05:37 +00:00
Bill Wendling
0b007e009e Remove tabs.
llvm-svn: 160479
2012-07-19 00:15:11 +00:00
Bill Wendling
17b12b72bc Remove tabs.
llvm-svn: 160477
2012-07-19 00:11:40 +00:00
Manman Ren
dd8d9c10a3 X86: remove redundant cmp against zero.
Updated OptimizeCompare in peephole to remove redundant cmp against zero.
We only remove Compare if CF and OF are not used.

rdar://11855129

llvm-svn: 160454
2012-07-18 21:40:01 +00:00
Preston Gurd
d2b344c685 This patch fixes 8 out of 20 unexpected failures in "make check"
when run on an Intel Atom processor. The failures have arisen due
to changes elsewhere in the trunk over the past 8 weeks or so.

These failures were not detected by the Atom buildbot because the
CPU on the Atom buildbot was not being detected as an Atom CPU.
The fix for this problem is in Host.cpp and X86Subtarget.cpp, but
shall remain commented out until the current set of Atom test failures
are fixed.

Patch by Andy Zhang and Tyler Nowicki!

llvm-svn: 160451
2012-07-18 20:49:17 +00:00
Nadav Rotem
03d2729392 The vbroadcast family of instructions has 'fallback patterns' in case where the
load source operand is used by multiple nodes. The v2i64 broadcast was emulated
by shuffling the two lower i32 elements to the upper two.
We had a bug in the immediate used for the broadcast.
Replacing 0 to 0x44.
0x44 means [01|00|01|00] which corresponds to the correct lane.

Patch by Michael Kuperstein.

llvm-svn: 160430
2012-07-18 08:14:48 +00:00
Craig Topper
6150f43b28 Remove tab characters.
llvm-svn: 160425
2012-07-18 04:59:16 +00:00
Craig Topper
b086c8faf2 Fix typo in error message and remove some tab characters.
llvm-svn: 160423
2012-07-18 04:36:35 +00:00
Craig Topper
b144f3b6db Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas.
llvm-svn: 160420
2012-07-18 04:11:12 +00:00
Evan Cheng
5e82ad04d5 Back out r160101 and instead implement a dag combine to recover from instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
Evan Cheng
f84dd0cf40 Implement r160312 as target indepedenet dag combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng
0b6bcb6e06 This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
Evan Cheng
b409a61574 For something like
uint32_t hi(uint64_t res)
{
        uint_32t hi = res >> 32;
        return !hi;
}

llvm IR looks like this:
define i32 @hi(i64 %res) nounwind uwtable ssp {
entry:
  %lnot = icmp ult i64 %res, 4294967296
  %lnot.ext = zext i1 %lnot to i32
  ret i32 %lnot.ext
}

The optimizer has optimize away the right shift and truncate but the resulting
constant is too large to fit in the 32-bit immediate field. The resulting x86
code is worse as a result:
        movabsq $4294967296, %rax       ## imm = 0x100000000
        cmpq    %rax, %rdi
        sbbl    %eax, %eax
        andl    $1, %eax

This patch teaches the x86 lowering code to handle ult against a large immediate
with trailing zeros. It will issue a right shift and a truncate followed by
a comparison against a shifted immediate.
        shrq    $32, %rdi
        testl   %edi, %edi
        sete    %al
        movzbl  %al, %eax

It also handles a ugt comparison against a large immediate with trailing bits
set. i.e. X >  0x0ffffffff -> (X >> 32) >= 1

rdar://11866926

llvm-svn: 160312
2012-07-16 19:35:43 +00:00
Chad Rosier
16c9db9ad6 With r160248 in place this code is no longer needed.
llvm-svn: 160293
2012-07-16 17:42:13 +00:00
Nadav Rotem
67ff66bd0c Fix a bug in the 3-address conversion of LEA when one of the operands is an
undef virtual register. The problem is that ProcessImplicitDefs removes the
definition of the register and marks all uses as undef. If we lose the undef
marker then we get a register which has no def, is not marked as undef. The
live interval analysis does not collect information for these virtual
registers and we crash in later passes.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160260
2012-07-16 10:52:25 +00:00
Alexey Samsonov
c68bb48704 This CL changes the function prologue and epilogue emitted on X86 when stack needs realignment.
It is intended to fix PR11468.

Old prologue and epilogue looked like this:
push %rbp
mov %rsp, %rbp
and $alignment, %rsp
push %r14
push %r15
...
pop %r15
pop %r14
mov %rbp, %rsp
pop %rbp

The problem was to reference the locations of callee-saved registers in exception handling:
locations of callee-saved had to be re-calculated regarding the stack alignment operation. It would
take some effort to implement this in LLVM, as currently MachineLocation can only have the form
"Register + Offset". Funciton prologue and epilogue are now changed to:

push %rbp
mov %rsp, %rbp
push %14
push %15
and $alignment, %rsp
...
lea -$size_of_saved_registers(%rbp), %rsp
pop %r15
pop %r14
pop %rbp

Reviewed by Chad Rosier.

llvm-svn: 160248
2012-07-16 06:54:09 +00:00
Nadav Rotem
a09775b875 Teach getTargetVShiftNode about TargetConstant nodes.
llvm-svn: 160234
2012-07-15 20:27:43 +00:00
Nadav Rotem
0377e0d234 Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention.
Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot.

PR12782.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160230
2012-07-15 12:26:30 +00:00
Nadav Rotem
c40d85dda5 AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector.
This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions.

llvm-svn: 160222
2012-07-14 22:26:05 +00:00
Benjamin Kramer
308eb1b4c0 Make helper functions static.
llvm-svn: 160173
2012-07-13 13:25:15 +00:00
Craig Topper
a75da664ff Mark VINSERTI128rm as MayLoad=1. Fixes PR13348.
llvm-svn: 160162
2012-07-13 05:46:28 +00:00
Benjamin Kramer
558c56f216 Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch it.
I already had the necessary things in place for IR-level passes but missed the machine passes.

llvm-svn: 160137
2012-07-12 18:14:57 +00:00
Benjamin Kramer
f8e67a04f4 Add intrinsics for Ivy Bridge's rdrand instruction.
The rdrand/cmov sequence is the same that is emitted by both
GCC and ICC.

Fixes PR13284.

llvm-svn: 160117
2012-07-12 09:31:43 +00:00
Craig Topper
6eef81b65b Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren.
llvm-svn: 160110
2012-07-12 06:52:41 +00:00
Chad Rosier
75817295b6 [x86 fast-isel] Per discussion with Eric, add all cases to switch with verbose
comments.

llvm-svn: 160069
2012-07-11 19:58:38 +00:00
Manman Ren
93ef864f3c X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair.
When Movr0 is between sub and cmp, we move Movr0 before sub if it enables
removal of Cmp.

llvm-svn: 160066
2012-07-11 19:35:12 +00:00
Chad Rosier
aaccad80b4 [x86 fast-isel] Rather then call llvm_unreachable() have fast-isel fall back
to Selection DAG isel.  Patch by Andrew Kaylor <andrew.kaylor@intel.com>.

llvm-svn: 160055
2012-07-11 17:23:17 +00:00
Nadav Rotem
22652c85bc When ext-loading and trunc-storing vectors to memory, on x86 32bit systems, allow loads/stores of 64bit values from xmm registers.
llvm-svn: 160044
2012-07-11 13:27:05 +00:00
Chad Rosier
3273667edf Move [get|set]BasePtrStackAdjustment() from MachineFrameInfo to
X86MachineFunctionInfo as this is currently only used by X86. If this ever
becomes an issue on another arch (e.g., ARM) then we can hoist it back out.

llvm-svn: 160009
2012-07-10 18:27:15 +00:00
Chad Rosier
5395ec6ee4 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.  Basically, this is a reapplication of r158087 with a few fixes.

Specifically, (1) the stack pointer is restored from the base pointer before
popping callee-saved registers and (2) in obscure cases (see comments in patch)
we must cache the value of the original stack adjustment in the prologue and
apply it in the epilogue.

rdar://11496434

llvm-svn: 160002
2012-07-10 17:45:53 +00:00
Nadav Rotem
5f6e9d5ffe Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.

llvm-svn: 159991
2012-07-10 13:25:08 +00:00
Craig Topper
b346ce8240 Reverse assembler/disassembler operand order for gather instructions.
llvm-svn: 159983
2012-07-10 06:38:33 +00:00
Manman Ren
dc41586be4 X86: implement functions to analyze & synthesize CMOV|SET|Jcc
getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond

No functional change intended.
If we want to update the condition code of CMOV|SET|Jcc, we first analyze the
opcode to get the condition code, then update the condition code, finally
synthesize the new opcode form the new condition code.

llvm-svn: 159955
2012-07-09 18:57:12 +00:00
Andrew Trick
b9c8074dcd I'm introducing a new machine model to simultaneously allow simple
subtarget CPU descriptions and support new features of
MachineScheduler.

MachineModel has three categories of data:
1) Basic properties for coarse grained instruction cost model.
2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
3) Instruction itineraties for detailed per-cycle reservation tables.

These will all live side-by-side. Any subtarget can use any
combination of them. Instruction itineraries will not change in the
near term. In the long run, I expect them to only be relevant for
in-order VLIW machines that have complex contraints and require a
precise scheduling/bundling model. Once itineraries are only actively
used by VLIW-ish targets, they could be replaced by something more
appropriate for those targets.

This tablegen backend rewrite sets things up for introducing
MachineModel type #2: per opcode/operand cost model.

llvm-svn: 159891
2012-07-07 04:00:00 +00:00
Manman Ren
eca5886e50 X86: Fix optimizeCompare to correctly check safe condition.
It is safe if EFLAGS is killed or re-defined.
When we are done with the basic block, check whether EFLAGS is live-out.
Do not optimize away cmp if EFLAGS is live-out.

llvm-svn: 159888
2012-07-07 03:34:46 +00:00
Manman Ren
8cbff1360f X86: peephole optimization to remove cmp instruction
For each Cmp, we check whether there is an earlier Sub which make Cmp
redundant. We handle the case where SUB operates on the same source operands as
Cmp, including the case where the two source operands are swapped.

llvm-svn: 159838
2012-07-06 17:36:20 +00:00