1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 22:12:57 +02:00
Commit Graph

12874 Commits

Author SHA1 Message Date
David Majnemer
572acaa24c [X86] Permit reading of the FLAGS register without it being previously defined
We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS.
While technically correct, this is not be desirable for folks who want
to examine aspects of the FLAGS register which are not related to
computation like whether or not CPUID is a valid instruction.

Differential Revision: http://reviews.llvm.org/D17782

llvm-svn: 262465
2016-03-02 06:46:52 +00:00
Craig Topper
a32ceccef8 [X86] Remove assertion I accidentally left in.
llvm-svn: 262464
2016-03-02 06:35:22 +00:00
Craig Topper
05f010fdc4 [X86] Be more structured about how we capture the register number when it is encoded in bits 7:4 of the immediate.
For some instructions the register is not the last operand and the immediate handling had to detect this and hardcode the index to find it. It also required CurOp to be pointing at the last operand handled in the Form switch whereas for any instruction it would be pointing at the next operand.

Now we just capture the value in the Form switch when we know exactly where it is and the CurOp pointer can behave normally.

llvm-svn: 262462
2016-03-02 06:06:18 +00:00
Craig Topper
f43237f0ae [X86] Use MCPhysReg and uint16_t for static arrays of registers and opcodes respectively should reduce size tiny bit. NFC
llvm-svn: 262458
2016-03-02 04:42:31 +00:00
Sanjay Patel
ed5fa039a5 [x86] use getBitcast()
This isn't quite NFC because some of the SDLocs may change which could
cause scheduling differences. But no regression tests are affected and
there is no functional change intended.

llvm-svn: 262391
2016-03-01 20:47:02 +00:00
Matthias Braun
6958800987 TableGen: Check scheduling models for completeness
TableGen checks at compiletime that for scheduling models with
"CompleteModel = 1" one of the following holds:

- Is marked with the hasNoSchedulingInfo flag
- The instruction is a subclass of Sched
- There are InstRW definitions in the scheduling model

Typical steps necessary to complete a model:

- Ensure all pseudo instructions that are expanded before machine
  scheduling (usually everything handled with EmitYYY() functions in
  XXXTargetLowering).
- If a CPU does not support some instructions mark the corresponding
  resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
- Add missing scheduling information.

Differential Revision: http://reviews.llvm.org/D17747

llvm-svn: 262384
2016-03-01 20:03:21 +00:00
David Majnemer
3003d6c7c0 [X86] Elide references to _chkstk for dynamic allocas
The _chkstk function is called by the compiler to probe the stack in an
order consistent with Windows' expectations.  However, it is possible to
elide the call to _chkstk and manually adjust the stack pointer if we
can prove that the allocation is fixed size and smaller than the probe
size.

This shrinks chrome.dll, chrome_child.dll and chrome.exe by a
cummulative ~133 KB.

Differential Revision: http://reviews.llvm.org/D17679

llvm-svn: 262370
2016-03-01 19:20:23 +00:00
Sanjay Patel
97d933d122 fix function names; NFC
llvm-svn: 262367
2016-03-01 19:14:09 +00:00
Matt Arsenault
807567a0a9 DAGCombiner: Turn extract of bitcasted integer into truncate
This reduces the number of bitcast nodes and generally cleans up the
DAG when bitcasting between integers and vectors everywhere.

llvm-svn: 262358
2016-03-01 18:01:37 +00:00
Michael Zuckerman
71f617e26e [LLVM][AVX512] PSRL{DI|QI} Change imm8 to int
Differential Revision: http://reviews.llvm.org/D17713

llvm-svn: 262353
2016-03-01 17:46:32 +00:00
Hans Wennborg
f725e062bd [X86] Check that attribute parameters match for tail calls (PR26590)
In the code below on 32-bit targets, x would previously get forwarded to g()
without sign-extension to 32 bits as required by the parameter attribute.

  void g(signed short);
  void f(unsigned short x) {
    g(x);
  }

llvm-svn: 262352
2016-03-01 17:45:23 +00:00
Sanjay Patel
d47daeaf1d fix documentation comments; NFC
llvm-svn: 262351
2016-03-01 17:25:35 +00:00
Sanjay Patel
74bcf66c0f function names start with a lowercase letter; NFC
llvm-svn: 262347
2016-03-01 16:17:48 +00:00
Michael Zuckerman
c05422513f [AVX512][PSRAQ][PSRAD] Change imm8 to int.
Differential Revision: http://reviews.llvm.org/D17692

llvm-svn: 262320
2016-03-01 11:36:23 +00:00
Amjad Aboud
557cf6fe56 Disallow generating vzeroupper before return instruction (iret) in interrupt handler function.
This resolves https://llvm.org/bugs/show_bug.cgi?id=26412

Differential Revision: http://reviews.llvm.org/D17542

llvm-svn: 262319
2016-03-01 11:32:03 +00:00
Craig Topper
ec7d7b8389 [X86] Centralize the masking of TSFlags with FormMask into a variable earlier so we can stop masking in multiple places. NFC
llvm-svn: 262312
2016-03-01 07:15:59 +00:00
Craig Topper
a90bbbbd14 [X86] Localize a temporary variable into the cases its need in. NFC
llvm-svn: 262310
2016-03-01 06:42:48 +00:00
Craig Topper
a60b62369e [X86] Be consistent about using pre/post increment/decrement in nearby code. NFC
llvm-svn: 262309
2016-03-01 06:42:46 +00:00
Craig Topper
f449a9bf2a [X86] Combine some initialization code with variable declaration and comments. NFC
llvm-svn: 262301
2016-03-01 05:42:16 +00:00
Ahmed Bougacha
8edc363aa7 [X86] Move the ATOMIC_LOAD_OP ISel from DAGToDAG to ISelLowering. NFCI.
This is long-standing dirtiness, as acknowledged by r77582:

    The current trick is to select it into a merge_values with
    the first definition being an implicit_def. The proper solution is
    to add new ISD opcodes for the no-output variant.

Doing this before selection will let us combine away some constructs.

Differential Revision: http://reviews.llvm.org/D17659

llvm-svn: 262244
2016-02-29 19:28:07 +00:00
David Majnemer
544b600088 [WinEH] Make setjmp work correctly with EH
32-bit X86 EH on Windows utilizes a stack of registration nodes
allocated and deallocated on entry/exit.  A registration node contains a
bunch of EH personality specific information like which try-state we are
currently in.

Because a setjmp target allows control flow from arbitrary program
points, there is no way to ensure that the try-state we are in is
correctly updated once we transfer control.

MSVC compatible compilers, like MSVC and ICC, utilize runtime helpers to
reinitialize the try-state when a longjmp occurs.  This is implemented
by adding additional arguments to _setjmp3: the desired try-state and
a helper routine to update the try-state.

Differential Revision: http://reviews.llvm.org/D17721

llvm-svn: 262241
2016-02-29 19:16:03 +00:00
Michael Zuckerman
c4dc2f4ba2 [AVX512][PSLLW ][PSLLV] Change imm8 to int
Differential Revision: http://reviews.llvm.org/D17684

llvm-svn: 262176
2016-02-28 07:32:10 +00:00
Duncan P. N. Exon Smith
779625a4a6 CodeGen: Change MachineInstr to use MachineInstr&, NFC
Change MachineInstr API to prefer MachineInstr& over MachineInstr*
whenever the parameter is expected to be non-null.  Slowly inching
toward being able to fix PR26753.

llvm-svn: 262149
2016-02-27 20:01:33 +00:00
Simon Pilgrim
1ad94fd8a6 Tidyup for loops - don't repeat upper limit evaluation if you don't have to. NFCI.
llvm-svn: 262137
2016-02-27 13:26:58 +00:00
Simon Pilgrim
5490389df9 Strip trailing whitespace. NFCI.
llvm-svn: 262131
2016-02-27 11:49:16 +00:00
Ahmed Bougacha
f14633274b [X86] Fix a stale comment. NFC.
llvm-svn: 262087
2016-02-26 22:59:57 +00:00
Ahmed Bougacha
05e77f2852 [X86] Remove the unused SDTX86atomicBinary. NFC.
llvm-svn: 262086
2016-02-26 22:59:41 +00:00
Simon Pilgrim
13e404ae07 Strip trailing whitespace. NFCI.
llvm-svn: 262083
2016-02-26 22:28:50 +00:00
Simon Pilgrim
b240484d51 Fix spelling. NFCI.
llvm-svn: 262078
2016-02-26 21:56:27 +00:00
Sanjay Patel
7e9bef055d [x86] refactor to eliminate duplicated code; NFCI
llvm-svn: 262062
2016-02-26 20:59:05 +00:00
Sanjay Patel
9bf70679ba [x86, AVX] fold 'isPositive' 256-bit vector integer operations (PR26701)
This extends the fold introduced with:
http://reviews.llvm.org/rL262036

llvm-svn: 262047
2016-02-26 18:42:50 +00:00
Sanjay Patel
705cd39feb [x86, SSE] fold 'isPositive' vector integer operations (PR26701)
This is one of the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26701

Shift and negate is what InstCombine appears to prefer, so I've started with that pattern. 
Note that the 'pcmpeq' instructions are always generating the negative one for the actual
'pcmpgt' comparison in each case (side note: why isn't there an alias mnemonic for that?).

Differential Revision: http://reviews.llvm.org/D17630

llvm-svn: 262036
2016-02-26 16:56:03 +00:00
Craig Topper
fa194ccc73 [X86] Null out some redundant patterns for masked vector register to register moves. These can be accomplished with both aligned and unaligned opcodes.
Currently aligned is what is being used so remove the redundant patterns for the unaligned versions. But don't do this for the byte and word vector types since they don't have aligned versions.

llvm-svn: 261985
2016-02-26 06:50:29 +00:00
Craig Topper
3a6e292b6d [X86] Add test cases for r261977 and fix a grammatical error.
llvm-svn: 261983
2016-02-26 06:50:24 +00:00
Craig Topper
29e5870acb [X86] Remove a couple returns after llvm_unreachables. NFC
llvm-svn: 261979
2016-02-26 05:29:39 +00:00
Craig Topper
f1c180e526 [X86] Use inclusive ranges for XMM/YMM/ZMM registers in is32Extended and isX86_64ExtendedReg. NFC
llvm-svn: 261978
2016-02-26 05:29:35 +00:00
Craig Topper
e52def5593 [X86] Explicitly diagnose use of %xmm16-%xmm31, %ymm16-%ymm31 and %zmm16-%zmm31 when AVX512 is not enabled in the asm parser.
llvm-svn: 261977
2016-02-26 05:29:32 +00:00
David L Kreitzer
9eac7219b8 Reformatted a comment to fit the 80 column limit. NFC.
llvm-svn: 261916
2016-02-25 18:50:45 +00:00
Igor Breger
c2763588ac AVX512F: Add GATHER/SCATTER assembler Intel syntax tests for knl/skx/avx . Change memory operand parser handling.
Differential Revision: http://reviews.llvm.org/D17564

llvm-svn: 261862
2016-02-25 13:30:17 +00:00
Simon Pilgrim
67f60b197d [X86][SSE3] Added combine support for MOVDDUP/MOVSHDUP/MOVSLDUP target shuffles
Now that PerformShuffleCombine can handle unary shuffles.

llvm-svn: 261843
2016-02-25 09:12:12 +00:00
Elena Demikhovsky
b3f8b2cb2e Optimized loading (zextload) of i1 value from memory.
This patch is a partial revert of https://llvm.org/svn/llvm-project/llvm/trunk@237793.
Extra "and" causes performance degradation.

We assume that i1 is stored in zero-extended form. And store operation is responsible for zeroing upper bits.

Differential Revision: http://reviews.llvm.org/D17541

llvm-svn: 261828
2016-02-25 07:05:12 +00:00
Simon Pilgrim
9dfc61c3a6 [X86][SSE41] Combine vector blends with zero
Part 2 of 2
This patch add support for combining target shuffles into blends-with-zero.

Differential Revision: http://reviews.llvm.org/D17483

llvm-svn: 261745
2016-02-24 15:14:21 +00:00
Simon Pilgrim
21285b3ce4 [X86][SSE41] Combine insertion of zero scalars into vector blends with zero
Part 1 of 2
This patch attempts to replace the insertion of zero scalars with a vector blend with zero, avoiding the use of the integer insertion instructions (which are particularly slow on many targets).
(Part 2 will add support for combining multiple blends-with-zero).

Differential Revision: http://reviews.llvm.org/D17483

llvm-svn: 261743
2016-02-24 14:53:27 +00:00
David Majnemer
83f0a59a00 [CodeView] Describe variables live in x87 registers
We didn't have a mapping from LLVM's x87 floating point registers to
CodeView's encoding.

llvm-svn: 261730
2016-02-24 10:01:24 +00:00
Simon Pilgrim
962ee76b8b [X86][SSE] Don't get target shuffle operands prematurely.
PerformShuffleCombine should be usable by unary and binary target shuffles, but was attempting to get the first two operands whatever the instruction type. Since these are only used for VECTOR_SHUFFLE instructions for one particular combine I've moved them inside the relevant if statement.

llvm-svn: 261727
2016-02-24 09:07:47 +00:00
Michael Zuckerman
d1c409a5af [LLVM][AVX512][PSHUFHW ][PSHUFLW ] Change imm8 to int
Differential Revision: http://reviews.llvm.org/D17538

llvm-svn: 261725
2016-02-24 08:39:05 +00:00
Igor Breger
8b9daa338d AVX512: Add vpmovzxbw/d/q ,vpmovzxw/d/q ,vpmovzxbdq lowering patterns that support 256bit inputs like AVX patterns ( that are disable in case HasVLX , see SS41I_pmovx_avx2_patterns).
Differential Revision: http://reviews.llvm.org/D17504

llvm-svn: 261724
2016-02-24 08:15:20 +00:00
Justin Bogner
c26c003b44 X86: Wrap a helper for an assert in #ifndef NDEBUG
This function is used in exactly one place, and only in asserts
builds. Move it a few lines up before the use and only define it when
asserts are enabled. Fixes the release build under -Werror.

Also remove the forward declaration and commentary that was basically
identical to the code itself.

llvm-svn: 261722
2016-02-24 07:58:02 +00:00
Davide Italiano
715c1d0a06 [X86ISelLowering] Stop typing the same return over and over and over.
llvm-svn: 261666
2016-02-23 18:39:38 +00:00
Igor Breger
e60efa9c40 AVX512: Fix predicate of AVX pcmpeqw/b , pcmpgtb/w/d instructions . AVX512 version of this instructions return result in kmask register, so AVX patterns should not be disabled.
Differential Revision: http://reviews.llvm.org/D17517

llvm-svn: 261619
2016-02-23 08:55:33 +00:00
Duncan P. N. Exon Smith
53cb4596f6 CodeGen: TII: Take MachineInstr& in predicate API, NFC
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest).  All of these
functions require non-null parameters already, so references are more
clear.  As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.

No functionality change intended.

llvm-svn: 261605
2016-02-23 02:46:52 +00:00
David Majnemer
f566971993 [X86] Create mergeable constant pool entries for AVX
We supported creating mergeable constant pool entries for smaller
constants but not for 32-byte AVX constants.

llvm-svn: 261584
2016-02-22 22:23:11 +00:00
Davide Italiano
f2c3cb292f [X86ISelLowering] Consolidate duplicated code in a single place.
llvm-svn: 261573
2016-02-22 21:06:46 +00:00
Duncan P. N. Exon Smith
0fa6439bcd Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC"
This reverts commit r261504, since it's not obvious the new name is
better:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html

I'll recommit if we get consensus that it's the right direction.

llvm-svn: 261567
2016-02-22 20:49:58 +00:00
Igor Breger
0f4267c518 AVX512F: Add assembler Intel syntax tests for knl, fix minor bugs.
Differential Revision: http://reviews.llvm.org/D17498

llvm-svn: 261521
2016-02-22 12:37:41 +00:00
Igor Breger
2d437b4341 AVX512: Fix scalar mem operands.
Differential Revision: http://reviews.llvm.org/D17500

llvm-svn: 261520
2016-02-22 11:48:27 +00:00
Craig Topper
37f137f856 [X86] Minor formatting fix. NFC
llvm-svn: 261515
2016-02-22 08:00:04 +00:00
Duncan P. N. Exon Smith
052f6c7bc1 Document assumption in X86FrameLowering::inlineStackProbe()
Resolve FIXME from r261504.  Apparently bundled instructions are illegal
here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160215/334146.html

llvm-svn: 261507
2016-02-22 02:32:35 +00:00
Duncan P. N. Exon Smith
b2dab65ba3 CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC
Delete MachineInstr::getIterator(), since the term "iterator" is
overloaded when talking about MachineInstr.

- Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so
  that ilist_node::getIterator() is still available.
- Add it back as MachineInstr::getInstrIterator().  This matches the
  naming in MachineBasicBlock.
- Add MachineInstr::getBundleIterator().  This is explicitly called
  "bundle" (not matching MachineBasicBlock) to disintinguish it clearly
  from ilist_node::getIterator().
- Update all calls.  Some of these I switched to `auto` to remove
  boiler-plate, since the new name is clear about the type.

There was one call I updated that looked fishy, but it wasn't clear what
the right answer was.  This was in X86FrameLowering::inlineStackProbe(),
added in r252578 in lib/Target/X86/X86FrameLowering.cpp.  I opted to
leave the behaviour unchanged, but I'll reply to the original commit on
the list in a moment.

llvm-svn: 261504
2016-02-21 22:58:35 +00:00
Duncan P. N. Exon Smith
d5e432aea7 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Craig Topper
f1ad8f775d [X86] Remove unused encoding types from disassembler. NFC
llvm-svn: 261494
2016-02-21 19:49:16 +00:00
Simon Pilgrim
ec0f8ea81f [X86][AVX] Add shuffle masking support for EltsFromConsecutiveLoads
Add support for the case where we have a consecutive load (which must include the first + last elements) with a mixture of undef/zero elements. We load the vector and then apply a shuffle to clear the zero'd elements.

Differential Revision: http://reviews.llvm.org/D17297

llvm-svn: 261490
2016-02-21 19:15:48 +00:00
Sanjoy Das
7fe777fc31 Fix LLVM's handling and detection of skylake and cannonlake CPUs
Summary:
 - Rename `"skylake"` == SkylakeServerProc to `"skylake-avx512"`
 - Change `"skylake"` to denote SkylakeClientProc
 - Fix the detection of cpu family 6 and model 94 to be
   SkylakeClientProc instead of SkylakeServerProc
 - Remove the `"cnl"` for CannonLake

Reviewers: craig.topper, delena

Subscribers: zansari, echristo, qcolombet, RKSimon, spatel, DavidKreitzer, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17090

llvm-svn: 261482
2016-02-21 17:12:03 +00:00
David Majnemer
174ea5bfc4 [X86] Use the correct alignment for COMDAT constant pool entries
COFF doesn't have sections with mergeable contents.  Instead, each
constant pool entry ends up in a COMDAT section.  The linker, when
choosing between COMDAT sections, doesn't choose the max alignment of
the two sections.  You just get whatever alignment was on the section.

If one constant needed a higher alignment in one object file from
another one, then we will get into trouble if the linker chooses the
lower alignment one.

Instead, lets promote the alignment of the constant pool entry to make
sure we don't use an under aligned constant with an instruction which
assumed otherwise.

This fixes PR26680.

llvm-svn: 261462
2016-02-21 01:30:30 +00:00
Simon Pilgrim
f7fbbebbc2 [X86][SSE] Fixed issue with commutation of 'faux unary' target shuffles (PR26667)
Fixed a bug introduced by D16683 when a binary shuffle is simplified to a unary shuffle (with undef/zero sentinel mask indices) - if this resulted in only the second input being used combineX86ShuffleChain failed to take this into account and still referenced the first input.

llvm-svn: 261434
2016-02-20 14:39:45 +00:00
Simon Pilgrim
3ddb55acee [X86][SSE] Move all undef/zero cases before target shuffle combining.
First small step towards fixing PR26667 - we need to ensure that combineX86ShuffleChain only gets called with a valid shuffle input node (a similar issue was found in D17041).

llvm-svn: 261433
2016-02-20 12:57:32 +00:00
Andrey Turetskiy
0027eb9e6b [X86] Enable the LEA optimization pass by default.
Differential Revision: http://reviews.llvm.org/D16877

llvm-svn: 261429
2016-02-20 11:11:55 +00:00
Andrey Turetskiy
9b5b22afce [X86] PR26575: Fix LEA optimization pass (Part 2).
Handle address displacement operands of a type other than Immediate or Global in LEAs and load/stores.

Ref: https://llvm.org/bugs/show_bug.cgi?id=26575

Differential Revision: http://reviews.llvm.org/D17374

llvm-svn: 261428
2016-02-20 10:58:28 +00:00
David Majnemer
56dae9ed9d Move some code from doInitialization to runOnFunction
This has no observable behavior change, it just makes the state
insertion pass look a little more like normal passes.

llvm-svn: 261420
2016-02-20 07:34:21 +00:00
Craig Topper
f8c17bbd8f [X86] Add some missing reversed forms of XOP instructions.
llvm-svn: 261417
2016-02-20 06:20:17 +00:00
Davide Italiano
93a025fa59 [X86ISelLowering] Fix TLSADDR lowering when shrink-wrapping is enabled.
TLSADDR nodes are lowered into actuall calls inside MC. In order to prevent
shrink-wrapping from pushing prologue/epilogue past them (which result
in TLS variables being accessed before the stack frame is set up), we 
put markers, so that the stack gets adjusted properly.
Thanks to Quentin Colombet for guidance/help on how to fix this problem!

llvm-svn: 261387
2016-02-20 00:44:47 +00:00
Davide Italiano
f288ed2d05 [X86ISelLowering] Provide a more informative assert message.
I stumbled upon this while debugging a lowering bug.

llvm-svn: 261371
2016-02-19 22:18:49 +00:00
Davide Italiano
2cdf3c2ffb [X86ISelLowering] Merge two conditions inside a single if.
llvm-svn: 261370
2016-02-19 22:01:07 +00:00
Hans Wennborg
a329081c88 Revert r253557 "Alternative to long nops for X86 CPUs, by Andrey Turetsky"
Turns out the new nop sequences aren't actually nops on x86_64 (PR26554).

llvm-svn: 261365
2016-02-19 21:26:31 +00:00
Dimitry Andric
be984b406a Fix incorrect selection of AVX512 sqrt when OptForSize is on
Summary:
When optimizing for size, sqrt calls can be incorrectly selected as
AVX512 VSQRT instructions.  This is because X86InstrAVX512.td has a
`Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass
definition.  Even if the target does not support AVX512, the class can
apparently still be chosen, leading to an incorrect selection of
`vsqrtss`.

In PR26625, this lead to an assertion: Reg >= X86::FP0 && Reg <=
X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction
requires an XMM register, which is not available on i686 CPUs.

Reviewers: grosbach, resistor, joker.eph

Subscribers: spatel, emaste, llvm-commits

Differential Revision: http://reviews.llvm.org/D17414

llvm-svn: 261360
2016-02-19 20:14:11 +00:00
Craig Topper
fea5656c80 [X86] Remove unused entries from the disassembler type enum.
llvm-svn: 261311
2016-02-19 06:57:40 +00:00
Sanjay Patel
3941775761 [x86] fix initialization of PredictableSelectIsExpensive
This is effectively NFC because Atom is the only in-order x86 subtarget currently,
but the predicate would have become wrong if any other in-order CPU came along.

See related discussion in:
http://reviews.llvm.org/D16836

llvm-svn: 261275
2016-02-18 23:08:48 +00:00
Richard Trieu
5a759985de Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
David Majnemer
40015c4774 [WinEH] Hoist state stores from successors
If we know that all of our successors want to be in the exact same
state, it makes sense to hoist the state transition into their common
predecessor.

Differential Revision: http://reviews.llvm.org/D17391

llvm-svn: 261262
2016-02-18 21:13:35 +00:00
Davide Italiano
74aa817e29 [X86ISelLowering] Use isPowerof2 instead of rewriting it. NFC.
llvm-svn: 261255
2016-02-18 20:43:15 +00:00
Hans Wennborg
0667f54980 Revert to extend i8/i16 return values on Darwin (PR26665)
In r260133, LLVM was changed to no longer extend i8/i16 return values,
as it's not required by the ABI. However, code was found in the wild
that relies on the old behaviour on Darwin, so this commit reverts
back to that old behaviour for Darwin.

On other platforms, it's less likely that code would be depending on
the old behaviour, as GCC and MSVC haven't been extending such return
values.

llvm-svn: 261235
2016-02-18 18:17:05 +00:00
Simon Pilgrim
82dcce5934 [X86][SSE] Improve PSHUFB shuffle mask decoding.
In cases where the PSHUFB shuffle mask is shared it might not be bitcasted to a vXi8 byte vector. This patch adds support for decoding these wider shuffle masks from the ConstantPool.

The test case in question makes use of this to recognise the shuffle mask is an unary UNPCKL pattern and simplifies accordingly.

llvm-svn: 261201
2016-02-18 10:17:40 +00:00
Michael Zuckerman
285264f877 [AVX512][PRORQ][PRORD] Change imm8 to int
Differential Revision: http://reviews.llvm.org/D17024

llvm-svn: 261198
2016-02-18 09:52:12 +00:00
Nico Weber
abdea66c1a Remove superfluous semicolon.
llvm-svn: 261128
2016-02-17 18:48:08 +00:00
David Majnemer
81eb2d6bae [WinEH] Optimize WinEH state stores
32-bit x86 Windows targets use a linked-list of nodes allocated on the
stack, referenced to via thread-local storage.  The personality routine
interprets one of the fields in the node as a 'state number' which
indicates where the personality routine should transfer control.

State transitions are possible only before call-sites which may throw
exceptions.  Our previous scheme had us update the state number before
all call-sites which may throw.

Instead, we can try to minimize the number of times we need to store by
reasoning about the nearest store which dominates the current call-site.
If the last store agrees with the current call-site, then we know that
the state-update is redundant and can be elided.

This is largely straightforward: an RPO walk of the blocks allows us to
correctly forward propagate the information when the function is a DAG.
Currently, loops are not handled optimally and may trigger superfluous
state stores.

Differential Revision: http://reviews.llvm.org/D16763

llvm-svn: 261122
2016-02-17 18:37:11 +00:00
Igor Breger
0cae06de47 AVX512: Fix LowerMSCATTER() return value.
Bug description:
  The bug was discovered when test was compiled with -O0.
  In case scatter result is DAG root , VectorLegalizer failed (assert) due to LowerMSCATTER() return kmask as result.
Change LowerMSCATTER() to return chain as original node do.

Differential Revision: http://reviews.llvm.org/D17331

llvm-svn: 261090
2016-02-17 14:04:33 +00:00
Simon Pilgrim
54bcb546f1 [X86][AVX] Support bit-blend integer shuffles for 256-bit integer vectors
AVX1 doesn't support the shuffling of 256-bit integer vectors. For 32/64-bit elements we get around this by shuffling as float/double but for 8/16-bit elements (assuming they can't widen) we currently just split, shuffle as 128-bit vectors and concatenate the results back.

This patch adds the ability to lower using the bit-blend patterns before defaulting to the splitting behaviour.

Part 2 of 2

Differential Revision: http://reviews.llvm.org/D17292

llvm-svn: 261082
2016-02-17 10:50:06 +00:00
Simon Pilgrim
b13c68898e [X86][AVX] Support bit-mask integer shuffles for 256-bit integer vectors
AVX1 doesn't support the shuffling of 256-bit integer vectors. For 32/64-bit elements we get around this by shuffling as float/double but for 8/16-bit elements (assuming they can't widen) we currently just split, shuffle as 128-bit vectors and concatenate the results back.

This patch adds the ability to lower using the bit-mask patterns before defaulting to the splitting behaviour. In some cases this ends up matching what AVX2 would do anyhow or what AVX1 does on the split vectors.

Part 1 of 2

Differential Revision: http://reviews.llvm.org/D17292

llvm-svn: 261081
2016-02-17 10:37:49 +00:00
Simon Pilgrim
d35c533371 [X86][SSE] Tidyup BUILD_VECTOR operand collection. NFCI.
Avoid reuse of operand variables, keep them local to a particular lowering - the operand collection is unique to each case anyhow.

Renamed from V to Ops to more closely match their purpose.

llvm-svn: 261078
2016-02-17 10:12:30 +00:00
Hans Wennborg
8484662941 Revert r260979 "[X86] Enable the LEA optimization pass by default."
Asserts are still firing in Chromium builds. PR26575.

llvm-svn: 261058
2016-02-17 02:49:59 +00:00
Reid Kleckner
30231d4f95 [X86] Fix a shrink-wrapping miscompile around __chkstk
__chkstk clobbers EAX. If EAX is live across the prologue, then we have
to take extra steps to save it. We already had code to do this if EAX
was a register parameter. This change adapts it to work when shrink
wrapping is used.

llvm-svn: 261039
2016-02-17 00:17:33 +00:00
Ahmed Bougacha
c1f422afe6 [X86] Remove the now-unused X86ISD::PSIGN. NFC.
llvm-svn: 261025
2016-02-16 22:14:12 +00:00
Ahmed Bougacha
0b74af0c16 [X86] Generalize logic blend of (x, -x) combine to match (-x, x).
I suspect this is what let PR26110 lie dormant for so long.

llvm-svn: 261024
2016-02-16 22:14:07 +00:00
Ahmed Bougacha
c6b1c28e14 [X86] Don't turn (c?-v:v) into (c?-v:0) by blindly using PSIGN.
Currently, we sometimes miscompile this vector pattern:
    (c ? -v : v)
We lower it to (because "c" is <4 x i1>, lowered as a vector mask):
    (~c & v) | (c & -v)

When we have SSSE3, we incorrectly lower that to PSIGN, which does:
    (c < 0 ? -v : c > 0 ? v : 0)
in other words, when c is either all-ones or all-zero:
    (c ? -v : 0)
While this is an old bug, it rarely triggers because the PSIGN combine
is too sensitive to operand order. This will be improved separately.

Note that the PSIGN tests are also incorrect. Consider:
    %b.lobit = ashr <4 x i32> %b, <i32 31, i32 31, i32 31, i32 31>
    %sub = sub nsw <4 x i32> zeroinitializer, %a
    %0 = xor <4 x i32> %b.lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
    %1 = and <4 x i32> %a, %0
    %2 = and <4 x i32> %b.lobit, %sub
    %cond = or <4 x i32> %1, %2
    ret <4 x i32> %cond
if %b is zero:
    %b.lobit = <4 x i32> zeroinitializer
    %sub = sub nsw <4 x i32> zeroinitializer, %a
    %0 = <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
    %1 = <4 x i32> %a
    %2 = <4 x i32> zeroinitializer
    %cond = or <4 x i32> %a, zeroinitializer
    ret <4 x i32> %a
whereas we currently generate:
    psignd %xmm1, %xmm0
    retq
which returns 0, as %xmm1 is 0.

Instead, use a pure logic sequence, as described in:
https://graphics.stanford.edu/~seander/bithacks.html#ConditionalNegate

Fixes PR26110.

Differential Revision: http://reviews.llvm.org/D17181

llvm-svn: 261023
2016-02-16 22:14:03 +00:00
Ahmed Bougacha
7fc85e4b5a [X86] Extract PSIGN/BLENDVP combine. NFC.
llvm-svn: 261021
2016-02-16 22:13:55 +00:00
Ahmed Bougacha
8f9b9ee793 [X86] Extract ANDNP combine. NFC.
This makes it IMO more readable and reduces indentation.

llvm-svn: 261020
2016-02-16 22:13:49 +00:00
Andrey Turetskiy
951c4b5ff8 [X86] Enable the LEA optimization pass by default.
Differential Revision: http://reviews.llvm.org/D16877

llvm-svn: 260979
2016-02-16 16:41:38 +00:00
Andrey Turetskiy
4106876da9 [X86] PR26575: Fix LEA optimization pass.
Add a missing check for a type of address displacement operand of the load/store instruction being a candidate for LEA substitution.

Ref: https://llvm.org/bugs/show_bug.cgi?id=26575

Differential Revision: http://reviews.llvm.org/D17261

llvm-svn: 260959
2016-02-16 12:47:45 +00:00
Craig Topper
1c3ea77731 [X86] Fix typos. NFC
llvm-svn: 260943
2016-02-16 07:45:07 +00:00
Craig Topper
724defa666 [X86] Use range-based for loop. NFC
llvm-svn: 260942
2016-02-16 07:45:04 +00:00