1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 14:33:02 +02:00
Commit Graph

1149 Commits

Author SHA1 Message Date
Peter Collingbourne
53c709eaaf Re-apply r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate.", with a fix for 32-bit x86.
Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions
that take a global address operand.

llvm-svn: 286420
2016-11-09 23:53:43 +00:00
Peter Collingbourne
5b818e4321 Revert r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate."
Suspected to be the cause of a sanitizer-windows bot failure:
Assertion failed: isImm() && "Wrong MachineOperand accessor", file C:\b\slave\sanitizer-windows\llvm\include\llvm/CodeGen/MachineOperand.h, line 420

llvm-svn: 286385
2016-11-09 18:17:50 +00:00
Peter Collingbourne
b305b8de78 X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate.
A relocatable immediate is either an immediate operand or an operand that
can be relocated by the linker to an immediate, such as a regular symbol
in non-PIC code.

Start using relocImm for 32-bit and 64-bit MOV instructions, and for operands
of type "imm32_su". Remove a number of now-redundant patterns.

Differential Revision: https://reviews.llvm.org/D25812

llvm-svn: 286384
2016-11-09 17:51:58 +00:00
Pierre Gousseau
1c56ea6dd5 [X86] Take advantage of the lzcnt instruction on btver2 architectures when ORing comparisons to zero.
This change adds transformations such as:
  zext(or(setcc(eq, (cmp x, 0)), setcc(eq, (cmp y, 0))))
  To:
  srl(or(ctlz(x), ctlz(y)), log2(bitsize(x))
This optimisation is beneficial on Jaguar architecture only, where lzcnt has a good reciprocal throughput.
Other architectures such as Intel's Haswell/Broadwell or AMD's Bulldozer/PileDriver do not benefit from it.
For this reason the change also adds a "HasFastLZCNT" feature which gets enabled for Jaguar.

Differential Revision: https://reviews.llvm.org/D23446

llvm-svn: 284248
2016-10-14 16:41:38 +00:00
Marina Yatsina
269811ad82 [x86] Accept 'retn' as an alias to 'ret[lqw]'\'ret' (At&t\Intel)
Implement 'retn' simply by aliasing it to the relevant 'ret' instruction

Commit on behalf of coby

Differential Revision: https://reviews.llvm.org/D24346

llvm-svn: 282601
2016-09-28 15:52:56 +00:00
Craig Topper
6651d4f318 [AVX-512] Use 512-bit vcvtps2ph/vcvtph2ps to implement fp_to_f16/f16_to_fp when F16C and VLX are not supported.
Fixes PR23941.

llvm-svn: 281958
2016-09-20 05:44:47 +00:00
Craig Topper
21e185006b [X86] Create a new instruction format to handle 4VOp3 encoding. This saves one bit in TSFlags and simplifies MRMSrcMem/MRMSrcReg format handling.
llvm-svn: 279424
2016-08-22 07:38:50 +00:00
Sanjay Patel
36b572f1e0 [x86] Allow merging multiple instances of an immediate within a basic block for code size savings, for 64-bit constants.
This patch handles 64-bit constants which can be encoded as 32-bit immediates.

It extends the functionality added by https://reviews.llvm.org/D11363 for 32-bit constants to 64-bit constants.

Patch by Sunita Marathe!

Differential Revision: https://reviews.llvm.org/D23391

llvm-svn: 278857
2016-08-16 21:35:16 +00:00
Ahmed Bougacha
2e4800fe66 [X86] Don't mark addressing mode operands as "outs". NFC-ish.
Nothing in-tree can tell the difference, but it's incorrect: the
addressing mode registers aren't what's defined.

llvm-svn: 275426
2016-07-14 14:53:17 +00:00
Michael Zuckerman
702609b9a6 [LLVM][INTRINSICS] adding intrinsics of CLFLUSHOPT
Differential Revision: http://reviews.llvm.org/D21789

llvm-svn: 274553
2016-07-05 14:42:12 +00:00
Rafael Espindola
6d57a71621 Convert a few more comparisons to isPositionIndependent(). NFC.
llvm-svn: 273945
2016-06-27 21:33:08 +00:00
Rafael Espindola
8feebdfbda Delete the IsStatic predicate.
In all its uses it was equivalent to IsNotPIC.

llvm-svn: 273943
2016-06-27 21:09:14 +00:00
Ahmed Bougacha
458b6b7f6e [X86] Remove dead ISD opcodes. NFC.
llvm-svn: 273716
2016-06-24 20:37:55 +00:00
Ahmed Bougacha
efa3d8d88c [X86] Define segment MI operands as regs instead of i8imm.
We've been pretending that segments are i8imm since the initial
support (r68645), predating the addition of the SEGMENT_REG class
(r81895).  That happens to works, but is wrong, and inconsistent
with how we print (e.g., X86ATTInstPrinter::printMemReference)
and parse them (e.g., X86Operand::addMemOperands).

This change shouldn't affect any tool users, but is visible to
library users or out-of-tree tablegen backends: this causes
MCOperandInfo for the segment op to have an RC instead of "unknown",
and TII::getRegClass to actually return something.  As the registers
are reserved and no vregs of the class ever created, that shouldn't
change anything.

No test change; no suspicious getRegClass() in X86 and CodeGen.

llvm-svn: 271559
2016-06-02 18:29:15 +00:00
Saleem Abdulrasool
f85a029a33 X86: permit using SjLj EH on x86 targets as an option
This adds support to the backed to actually support SjLj EH as an exception
model.  This is *NOT* the default model, and requires explicitly opting into it
from the frontend.  GCC supports this model and for MinGW can still be enabled
via the `--using-sjlj-exceptions` options.

Addresses PR27749!

llvm-svn: 271244
2016-05-31 01:48:07 +00:00
Rafael Espindola
4ed3f6e3d4 Remember the relocation model. NFC.
This avoids passing a TargetMachine in a few places.

llvm-svn: 270095
2016-05-19 18:49:29 +00:00
Rafael Espindola
084f91d955 Style fixes. NFC.
llvm-svn: 270093
2016-05-19 18:34:20 +00:00
Hans Wennborg
5b89989aa5 Re-commit r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions"
with an additional fix to make RegAllocFast ignore undef physreg uses. It would
previously get confused about the "push %eax" instruction's use of eax. That
method for adjusting the stack pointer is used in X86FrameLowering::emitSPUpdate
as well, but since that runs after register-allocation, we didn't run into the
RegAllocFast issue before.

llvm-svn: 269949
2016-05-18 16:10:17 +00:00
Ashutosh Nema
0cfbe42fbc Add new flag and intrinsic support for MWAITX and MONITORX instructions
Summary:

MONITORX/MWAITX instructions provide similar capability to the MONITOR/MWAIT
pair while adding a timer function, such that another termination of the MWAITX
instruction occurs when the timer expires. The presence of the MONITORX and
MWAITX instructions is indicated by CPUID 8000_0001, ECX, bit 29.

The MONITORX and MWAITX instructions are intercepted by the same bits that
intercept MONITOR and MWAIT. MONITORX instruction establishes a range to be
monitored. MWAITX instruction causes the processor to stop instruction execution
and enter an implementation-dependent optimized state until occurrence of a
class of events.

Opcode of MONITORX instruction is "0F 01 FA". Opcode of MWAITX instruction is
"0F 01 FB". These opcode information is used in adding tests for the
disassembler.

These instructions are enabled for AMD's bdver4 architecture.

Patch by Ganesh Gopalasubramanian!

Reviewers: echristo, craig.topper, RKSimon
Subscribers: RKSimon, joker.eph, llvm-commits
Differential Revision: http://reviews.llvm.org/D19795

llvm-svn: 269911
2016-05-18 11:59:12 +00:00
Hans Wennborg
90018c04c2 Revert r269828 "X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions"
Seems to have broken the Windows ASan bot. Reverting while investigating.

llvm-svn: 269833
2016-05-17 20:38:56 +00:00
Hans Wennborg
09bf3bedad X86: Avoid using _chkstk when lowering WIN_ALLOCA instructions
This patch moves the expansion of WIN_ALLOCA pseudo-instructions
into a separate pass that walks the CFG and lowers the instructions
based on a conservative estimate of the offset between the stack
pointer and the lowest accessed stack address.

The goal is to reduce binary size and run-time costs by removing
calls to _chkstk. While it doesn't fix all the code quality problems
with inalloca calls, it's an incremental improvement for PR27076.

Differential Revision: http://reviews.llvm.org/D20263

llvm-svn: 269828
2016-05-17 20:13:29 +00:00
Craig Topper
7b7547a4dc [X86] Fix InstAliases to not allow FARCALL32i/FARCALL16i/FARJMP32i/FARJMP16i in 64-bit mode.
llvm-svn: 268863
2016-05-07 19:25:56 +00:00
Craig Topper
7cda2fdad5 [X86] Remove isel patterns for selecting tzcnt/lzcnt from cmove/ne+cttz/ctlz. These are folded by DAG combine now.
llvm-svn: 267326
2016-04-24 04:38:34 +00:00
Craig Topper
917b9d7a47 [X86] Fix patterns that turn cmove/cmovne+ctlz/cttz into lzcnt/tzcnt instructions. Only one of the conditions should be valid for each pattern, not both. Update tests accordingly.
llvm-svn: 267311
2016-04-24 02:01:22 +00:00
Hans Wennborg
9fe6bf47fd X86: Use push-pop for materializing 8-bit immediates for minsize (take 2)
This is the same as r255936, with added logic for avoiding clobbering of the
red zone (PR26023).

Differential Revision: http://reviews.llvm.org/D18246

llvm-svn: 264375
2016-03-25 01:10:56 +00:00
Craig Topper
f34a1d74e9 [X86] Remove many operands that represent memory stores from outs to ins. These operands are the registers and immediates that specify the memory address not the memory itself thus they are inputs.
llvm-svn: 263354
2016-03-13 02:56:31 +00:00
Quentin Colombet
aaf2db6c80 [X86] Make sure we do not clobber RBX with cmpxchg when used as a base pointer.
cmpxchg[8|16]b uses RBX as one of its argument.
In other words, using this instruction clobbers RBX as it is defined to hold one
the input. When the backend uses dynamically allocated stack, RBX is used as a
reserved register for the base pointer. 

Reserved registers have special semantic that only the target understands and
enforces, because of that, the register allocator don’t use them, but also,
don’t try to make sure they are used properly (remember it does not know how
they are supposed to be used).

Therefore, when RBX is used as a reserved register but defined by something that
is not compatible with that use, the register allocator will not fix the
surrounding code to make sure it gets saved and restored properly around the
broken code. This is the responsibility of the target to do the right thing with
its reserved register.

To fix that, when the base pointer needs to be preserved, we use a different
pseudo instruction for cmpxchg that save rbx.
That pseudo takes two more arguments than the regular instruction:
- One is the value to be copied into RBX to set the proper value for the
  comparison.
- The other is the virtual register holding the save of the value of RBX as the
  base pointer. This saving is done as part of isel (i.e., we emit a copy from
  rbx).

cmpxchg_save_rbx <regular cmpxchg args>, input_for_rbx_reg, save_of_rbx_as_bp

This gets expanded into:
rbx = copy input_for_rbx_reg
cmpxchg <regular cmpxchg args>
rbx = save_of_rbx_as_bp

Note: The actual modeling of the pseudo is a bit more complicated to make sure
the interferes that appears after the pseudo gets expanded are properly modeled
before that expansion.

This fixes PR26883.

llvm-svn: 263325
2016-03-12 02:25:27 +00:00
David Majnemer
0db3c7acce [X86] Support cleaning more than 2**16 bytes of stack
The x86 ret instruction has a 16 bit immediate indicating how many bytes
to pop off of the stack beyond the return address.

There is a problem when extremely large structs are passed by value: we
might not be able to fit the number of bytes to pop into the return
instruction.

To fix this, expand RET_FLAG a little later and use a special sequence
to clean the stack:

pop  %ecx     ; return address is now in %ecx
add  $n, %esp ; clean the stack
push %ecx     ; bring the return address back on the stack
ret           ; pop the return address and jmp to it's value

llvm-svn: 262755
2016-03-04 22:56:17 +00:00
David Majnemer
572acaa24c [X86] Permit reading of the FLAGS register without it being previously defined
We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS.
While technically correct, this is not be desirable for folks who want
to examine aspects of the FLAGS register which are not related to
computation like whether or not CPUID is a valid instruction.

Differential Revision: http://reviews.llvm.org/D17782

llvm-svn: 262465
2016-03-02 06:46:52 +00:00
Ahmed Bougacha
8edc363aa7 [X86] Move the ATOMIC_LOAD_OP ISel from DAGToDAG to ISelLowering. NFCI.
This is long-standing dirtiness, as acknowledged by r77582:

    The current trick is to select it into a merge_values with
    the first definition being an implicit_def. The proper solution is
    to add new ISD opcodes for the no-output variant.

Doing this before selection will let us combine away some constructs.

Differential Revision: http://reviews.llvm.org/D17659

llvm-svn: 262244
2016-02-29 19:28:07 +00:00
Ahmed Bougacha
05e77f2852 [X86] Remove the unused SDTX86atomicBinary. NFC.
llvm-svn: 262086
2016-02-26 22:59:41 +00:00
Igor Breger
c2763588ac AVX512F: Add GATHER/SCATTER assembler Intel syntax tests for knl/skx/avx . Change memory operand parser handling.
Differential Revision: http://reviews.llvm.org/D17564

llvm-svn: 261862
2016-02-25 13:30:17 +00:00
Igor Breger
e60efa9c40 AVX512: Fix predicate of AVX pcmpeqw/b , pcmpgtb/w/d instructions . AVX512 version of this instructions return result in kmask register, so AVX patterns should not be disabled.
Differential Revision: http://reviews.llvm.org/D17517

llvm-svn: 261619
2016-02-23 08:55:33 +00:00
Sanjay Patel
d5eac4bb77 [x86-64] allow mfence even with -mno-sse (PR23203)
As shown in:
https://llvm.org/bugs/show_bug.cgi?id=23203
...we currently die because lowering believes that mfence is allowed without SSE2 on x86-64,
but the instruction def doesn't know that.

I don't know if allowing mfence without SSE is right, but if not, at least now it's consistently wrong. :)

Differential Revision: http://reviews.llvm.org/D17219

llvm-svn: 260828
2016-02-13 17:26:29 +00:00
Oliver Stannard
4ff6a9f924 [TableGen] Fix sort order of asm operand classes
This is a fix for https://llvm.org/bugs/show_bug.cgi?id=22796.

The previous implementation of ClassInfo::operator< allowed cycles of classes
such that x < y < z < x, meaning that a list of them cannot be correctly
sorted, and the sort order could differ with different standard libraries.

The original implementation sorted classes by ValueName if they were otherwise
equal. This isn't strictly necessary, but some backends seem to accidentally
rely on it. If I reverse this comparison I get 8 test failures spread across
the AArch64, Mips and X86 backends, so I have left it in until those backends
can be fixed.

There was one case in the X86 backend where the observable behaviour of the
assembler is changed by this patch. This was because some of the memory asm
operands were not marked as children of X86MemAsmOperand.

Differential Revision: http://reviews.llvm.org/D16141

llvm-svn: 258677
2016-01-25 10:20:19 +00:00
Elena Demikhovsky
832e2d5858 Added Skylake client to X86 targets and features
Changes in X86.td:

I set features of Intel processors in incremental form: IVB = SNB + X HSW = IVB + X ..
I added Skylake client processor and defined it's features
FeatureADX was missing on KNL
Added some new features to appropriate processors SMAP, IFMA, PREFETCHWT1, VMFUNC and others

Differential Revision: http://reviews.llvm.org/D16357

llvm-svn: 258659
2016-01-24 10:41:28 +00:00
Michael Zuckerman
71a84dc5a5 [AVX512] Adding VPERMB instruction
Differential Revision: http://reviews.llvm.org/D16294

llvm-svn: 258144
2016-01-19 17:07:43 +00:00
Marina Yatsina
d7dac8fde4 [X86] Adding support for missing variations of X86 string related instructions
The following are legal according to X86 spec:
ins mem, DX
outs DX, mem
lods mem
stos mem
scas mem
cmps mem, mem
movs mem, mem

Differential Revision: http://reviews.llvm.org/D14827

llvm-svn: 258132
2016-01-19 15:37:56 +00:00
Michael Zuckerman
65f549e895 [AVX512] adding AVXVBMI feature flag
The feature flag is for VPERMB,VPERMI2B,VPERMT2B and VPMULTISHIFTQB instructions. 
More about the instruction can be found in:
hattps://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf

Differential Revision: http://reviews.llvm.org/D16190

llvm-svn: 258012
2016-01-17 13:42:12 +00:00
Craig Topper
e2f6754f73 [X86] Use \t instead of space after mnemonics in a bunch InstAliases for consistency.
llvm-svn: 257148
2016-01-08 06:09:13 +00:00
Craig Topper
1e8c72699f [X86] Add hasSideEffects=0 and mayLoad=1 to MOVZX64* instructions. While there remove a superfluous _Q from the instruction names.
llvm-svn: 257032
2016-01-07 05:57:39 +00:00
Craig Topper
aa4d1466b3 [X86] STOSQ without a rep prefix doesn't read or write RCX.
llvm-svn: 257030
2016-01-07 05:18:49 +00:00
Craig Topper
f0f0ca85b5 [X86] Use PS instead of TB for instructions that have PD/XS/XD variations. Use OpSize32 on an instruction that has an OpSize16 variant.
llvm-svn: 256918
2016-01-06 06:18:41 +00:00
David Majnemer
c60cb66b34 Revert "[X86] Use push-pop for materializing small constants under 'minsize'"
The red zone consists of 128 bytes beyond the stack pointer so that the
allocation of objects in leaf functions doesn't require decrementing
rsp.  In r255656, we introduced an optimization that would cheaply
materialize certain constants via push/pop.  Push decrements the stack
pointer and stores it's result at what is now the top of the stack.
However, this means that using push/pop would encroach on the red zone.
PR26023 gives an example where this corrupts an object in the red zone.

llvm-svn: 256808
2016-01-05 02:32:06 +00:00
David Majnemer
93803262f4 [X86] Add intrinsics for reading and writing to the flags register
LLVM's targets need to know if stack pointer adjustments occur after the
prologue.  This is needed to correctly determine if the red-zone is
appropriate to use or if a frame pointer is required.

Normally, LLVM can figure this out very precisely by reasoning about the
contents of the MachineFunction.  There is an interesting corner case:
inline assembly.

The vast majority of inline assembly which will perform a push or pop is
done so to pair up with pushf or popf as appropriate.  Unfortunately,
this inline assembly doesn't mark the stack pointer as clobbered
because, well, it isn't.  The stack pointer is decremented and then
immediately incremented.  Because of this, LLVM was changed in r256456
to conservatively assume that inline assembly contain a sequence of
stack operations.  This is unfortunate because the vast majority of
inline assembly will not end up manipulating the stack pointer in any
way at all.

Instead, let's provide a more principled solution: an intrinsic.
FWIW, other compilers (MSVC and GCC among them) also provide this
functionality as an intrinsic.

llvm-svn: 256685
2016-01-01 06:50:01 +00:00
Craig Topper
bfdc4a3764 [X86] Add proper Uses/Defs/mayLoad flags for AAA/AAD/AAM/AAS/DAA/DAS/XLAT instructions.
llvm-svn: 256481
2015-12-28 06:11:37 +00:00
Amjad Aboud
8197f10787 Implemented Support of IA interrupt and exception handlers:
http://lists.llvm.org/pipermail/cfe-dev/2015-September/045171.html

Differential Revision: http://reviews.llvm.org/D15567

llvm-svn: 256155
2015-12-21 14:07:14 +00:00
Hans Wennborg
6b696434e4 [X86] Use push-pop for materializing small constants under 'minsize'
Use the 3-byte (4 with REX prefix) push-pop sequence for materializing
small constants. This is smaller than using a mov (5, 6 or 7 bytes
depending on size and REX prefix), but it's likely to be slower, so
only used for 'minsize'.

This is a follow-up to r255656.

Differential Revision: http://reviews.llvm.org/D15549

llvm-svn: 255936
2015-12-17 23:18:39 +00:00
Asaf Badouh
2b8c4d5c98 [x86] adding PKU feature flag
the feature flag is essential for RDPKRU and WRPKRU instruction 
more about the instruction can be found in the SDM rev 56, vol 2 from http://www.intel.com/sdm

Differential Revision: http://reviews.llvm.org/D15491

llvm-svn: 255644
2015-12-15 13:35:29 +00:00
Chih-Hung Hsieh
bd51d0c9b1 [X86] Part 2 to fix x86-64 fp128 calling convention.
Part 1 was submitted in http://reviews.llvm.org/D15134.
Changes in this part:
* X86RegisterInfo.td, X86RecognizableInstr.cpp: Add FR128 register class.
* X86CallingConv.td: Pass f128 values in XMM registers or on stack.
* X86InstrCompiler.td, X86InstrInfo.td, X86InstrSSE.td:
  Add instruction selection patterns for f128.
* X86ISelLowering.cpp:
  When target has MMX registers, configure MVT::f128 in FR128RegClass,
  with TypeSoftenFloat action, and custom actions for some opcodes.
  Add missed cases of MVT::f128 in places that handle f32, f64, or vector types.
  Add TODO comment to support f128 type in inline assembly code.
* SelectionDAGBuilder.cpp:
  Fix infinite loop when f128 type can have
  VT == TLI.getTypeToTransformTo(Ctx, VT).
* Add unit tests for x86-64 fp128 type.

Differential Revision: http://reviews.llvm.org/D11438

llvm-svn: 255558
2015-12-14 22:08:36 +00:00