1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

127 Commits

Author SHA1 Message Date
Pete Cooper
01b2132972 Fix a stackmap bug introduced in r220710.
For a call to not return in to the stackmap shadow, the shadow must end with the call.

To do this, we must insert any required nops *before* the call, and not after it.

llvm-svn: 220728
2014-10-27 22:38:45 +00:00
Pete Cooper
87efb91a50 Stackmap shadows should consider call returns a branch target.
To avoid emitting too many nops, a stackmap shadow can include emitted instructions in the shadow, but these must not include branch targets.

A return from a call should count as a branch target as patching over the instructions after the call would lead to incorrect behaviour for threads currently making that call, when they return.

llvm-svn: 220710
2014-10-27 19:40:35 +00:00
Rafael Espindola
6ffbd5bf5d Fix a bit of confusion about .set and produce more readable assembly.
Every target we support has support for assembly that looks like

a = b - c
.long a

What is special about MachO is that the above combination suppresses the
production of a relocation.

With this change we avoid producing the intermediary labels when they don't
add any value.

llvm-svn: 220256
2014-10-21 01:17:30 +00:00
Chandler Carruth
98785bb00d [x86] Implement v16i16 support with AVX2 in the new vector shuffle
lowering.

This also implements the fancy blend lowering for v16i16 using AVX2 and
teaches the X86 backend to print shuffle masks for 256-bit PSHUFB
and PBLENDW instructions. It also makes the mask decoding correct for
PBLENDW instructions. The yaks, they are legion.

Tests are updated accordingly. There are some missing tests for the
VBLENDVB lowering, but I'll add those in a follow-up as this commit has
accumulated enough cruft already.

llvm-svn: 218430
2014-09-25 00:24:19 +00:00
Chandler Carruth
5fdf21544a [x86] Teach the instruction lowering to add comments describing constant
pool data being loaded into a vector register.

The comments take the form of:

  # ymm0 = [a,b,c,d,...]
  # xmm1 = <x,y,z...>

The []s are used for generic sequential data and the <>s are used for
specifically ConstantVector loads. Undef elements are printed as the
letter 'u', integers in decimal, and floating point values as floating
point values. Suggestions on improving the formatting or other aspects
of the display are very welcome.

My primary use case for this is to be able to FileCheck test masks
passed to vector shuffle instructions in-register. It isn't fantastic
for that (no decoding special zeroing semantics or other tricks), but it
at least puts the mask onto an instruction line that could reasonably be
checked. I've updated many of the new vector shuffle lowering tests to
leverage this in their test cases so that we're actually checking the
shuffle masks remain as expected.

Before implementing this, I tried a *bunch* of different approaches.
I looked into teaching the MCInstLower code to scan up the basic block
and find a definition of a register used in a shuffle instruction and
then decode that, but this seems incredibly brittle and complex.
I talked to Hal a lot about the "right" way to do this: attach the raw
shuffle mask to the instruction itself in some form of unencoded
operands, and then use that to emit the comments. I still think that's
the optimal solution here, but it proved to be beyond what I'm up for
here. In particular, it seems likely best done by completing the
plumbing of metadata through these layers and attaching the shuffle mask
in metadata which could have fully automatic dropping when encoding an
actual instruction.

llvm-svn: 218377
2014-09-24 09:39:41 +00:00
Chandler Carruth
b025884a94 [x86] More refactoring of the shuffle comment emission. The previous
attempt didn't work out so well. It looks like it will be much better
for introducing extra logic to find a shuffle mask if the finding logic
is totally separate. This also makes it easy to sink the opcode logic
completely out of the routine so we don't re-dispatch across it.

Still no functionality changed.

llvm-svn: 218363
2014-09-24 03:06:37 +00:00
Chandler Carruth
aa73f0333d [x86] Bypass the shuffle mask comment generation when not using verbose
asm. This can be somewhat expensive and there is no reason to do it
outside of tests or debugging sessions. I'm also likely to make it
significantly more expensive to support more styles of shuffles.

llvm-svn: 218362
2014-09-24 03:06:34 +00:00
Chandler Carruth
4d88297f25 [x86] Hoist the logic for extracting the relevant bits of information
from the MachineInstr into the caller which is already doing a switch
over the instruction.

This will make it more clear how to compute different operands to feed
the comment selection for example.

Also, in a drive-by-fix, don't append an empty comment string (which is
a no-op ultimately).

No functionality changed.

llvm-svn: 218361
2014-09-24 02:24:41 +00:00
Chandler Carruth
4f26b542c2 [x86] Start refactoring the comment printing logic in the MC lowering of
vector shuffles.

This is just the beginning by hoisting it into its own function and
making use of early exit to dramatically simplify the flow of the
function. I'm going to be incrementally refactoring this until it is
a bit less magical how this applies to other instructions, and I can
teach it how to dig a shuffle mask out of a register. Then I plan to
hook it up to VPERMD so we get our mask comments for it.

No functionality changed yet.

llvm-svn: 218357
2014-09-24 02:16:12 +00:00
Chandler Carruth
d2677fff24 [x86] Teach the vector comment parsing and printing to correctly handle
undef in the shuffle mask. This shows up when we're printing comments
during lowering and we still have an IR-level constant hanging around
that models undef.

A nice consequence of this is *much* prettier test cases where the undef
lanes actually show up as undef rather than as a particular set of
values. This also allows us to print shuffle comments in cases that use
undef such as the recently added variable VPERMILPS lowering. Now those
test cases have nice shuffle comments attached with their details.

The shuffle lowering for PSHUFB has been augmented to use undef, and the
shuffle combining has been augmented to comprehend it.

llvm-svn: 218301
2014-09-23 11:15:19 +00:00
Chandler Carruth
257fbb56ae [x86] Teach the AVX1 path of the new vector shuffle lowering one more
trick that I missed.

VPERMILPS has a non-immediate memory operand mode that allows it to do
asymetric shuffles in the two 128-bit lanes. Use this rather than two
shuffles and a blend.

However, it turns out the variable shuffle path to VPERMILPS (and
VPERMILPD, although that one offers no functional differenc from the
immediate operand other than variability) wasn't even plumbed through
codegen. Do such plumbing so that we can reasonably emit
a variable-masked VPERMILP instruction. Also plumb basic comment parsing
and printing through so that the tests are reasonable.

There are still a few tests which don't show the shuffle pattern. These
are tests with undef lanes. I'll teach the shuffle decoding and printing
to handle undef mask entries in a follow-up. I've looked at the masks
and they seem reasonable.

llvm-svn: 218300
2014-09-23 10:08:29 +00:00
Robin Morisset
7114ac8258 [X86] Allow atomic operations using immediates to avoid using a register
The only valid lowering of atomic stores in the X86 backend was mov from
register to memory. As a result, storing an immediate required a useless copy
of the immediate in a register. Now these can be compiled as a simple mov.

Similarily, adding/and-ing/or-ing/xor-ing an
immediate to an atomic location (but through an atomic_store/atomic_load,
not a fetch_whatever intrinsic) can now make use of an 'add $imm, x(%rip)'
instead of using a register. And the same applies to inc/dec.

This second point matches the first issue identified in
  http://llvm.org/bugs/show_bug.cgi?id=17281

llvm-svn: 216980
2014-09-02 22:16:29 +00:00
Sanjay Patel
8cd2aae34c fix typo
llvm-svn: 214995
2014-08-06 21:08:38 +00:00
Eric Christopher
99307e99a2 Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

llvm-svn: 214781
2014-08-04 21:25:23 +00:00
Reid Kleckner
07d84ca71d Fix failure to invoke exception handler on Win64
When the last instruction prior to a function epilogue is a call, we
need to emit a nop so that the return address is not in the epilogue IP
range.  This is consistent with MSVC's behavior, and may be a workaround
for a bug in the Win64 unwinder.

Differential Revision: http://reviews.llvm.org/D4751

Patch by Vadim Chugunov!

llvm-svn: 214775
2014-08-04 21:05:27 +00:00
Chandler Carruth
a780975d83 [x86] Switch to using the variable we extracted this operand into.
Spotted this missed refactoring by inspection when reading code, and it
doesn't changethe functionality at all.

llvm-svn: 214627
2014-08-02 10:29:36 +00:00
Chandler Carruth
746864d542 [x86] Teach my pshufb comment printer to handle VPSHUFB forms as well as
PSHUFB forms. This will be important to update some AVX tests when I add
PSHUFB combining.

llvm-svn: 214624
2014-08-02 10:08:17 +00:00
Chandler Carruth
4cc339923e [x86] Fix unused variable warning in no-asserts build.
llvm-svn: 213989
2014-07-26 00:04:41 +00:00
Chandler Carruth
27abbcdaaa [x86] Teach the X86 backend to print shuffle comments for PSHUFB
instructions which happen to have a constant mask.

Currently, this only handles a very narrow set of cases, but those
happen to be the cases that I care about for testing shuffles sanely.
This is a bit trickier than other shuffle instructions because we're
decoding constants out of the constant pool. The current MC layer makes
it completely impossible to inspect a constant pool entry, so we have to
do it at the MI level and attach the comment to the streamer on its way
out. So no joy for disassembling, but it does make test cases and asm
dumps *much* nicer.

Sorry for no test cases, but it didn't really seem that valuable to go
trolling through existing old test cases and updating them. I'll have
lots of testing of this in the upcoming patch for SSSE3 emission in the
new vector shuffle lowering code paths.

llvm-svn: 213986
2014-07-25 23:47:11 +00:00
Lang Hames
aebc33c2ec [X86] Clarify some stackmap shadow optimization code as based on review
feedback from Eric Christopher.

No functional change.

llvm-svn: 213917
2014-07-25 02:29:19 +00:00
Lang Hames
6af0c82d1f [X86] Optimize stackmap shadows on X86.
This patch minimizes the number of nops that must be emitted on X86 to satisfy
stackmap shadow constraints.

To minimize the number of nops inserted, the X86AsmPrinter now records the
size of the most recent stackmap's shadow in the StackMapShadowTracker class,
and tracks the number of instruction bytes emitted since the that stackmap
instruction was encountered. Padding is emitted (if it is required at all)
immediately before the next stackmap/patchpoint instruction, or at the end of
the basic block.

This optimization should reduce code-size and improve performance for people
using the llvm stackmap intrinsic on X86.

<rdar://problem/14959522>

llvm-svn: 213892
2014-07-24 20:40:55 +00:00
Saleem Abdulrasool
d45105fbb4 MC: rename EmitWin64EH routines
Rename the routines to reflect the reality that they are more related to call
frame information than to Win64 EH. Although EH is implemented in an intertwined
manner by augmenting with an exception handler and an associated parameter, the
majority of these routines emit information required to unwind the frames. This
also helps identify that these routines are generic for most windows platforms
(they apply equally to nearly all architectures except x86) although the
encoding of the information is architecture dependent.

Unwinding data is emitted via EmitWinCFI* and exception handling information via
EmitWinEH*.

llvm-svn: 211994
2014-06-29 01:52:01 +00:00
NAKAMURA Takumi
c5a2c81f7e Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore SEH pseudo ops in X86 JIT emitter.
--
This patch enables LLVM to emit Win64-native unwind info rather than
DWARF CFI.  It handles all corner cases (I hope), including stack
realignment.

Because the unwind info is not flexible enough to describe stack frames
with a gap of unknown size in the middle, such as the one caused by
stack realignment, I modified register spilling code to place all spills
into the fixed frame slots, so that they can be accessed relative to the
frame pointer.

Patch by Vadim Chugunov!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4081

llvm-svn: 211691
2014-06-25 12:41:52 +00:00
NAKAMURA Takumi
eca89a0522 Revert r211399, "Generate native unwind info on Win64"
It broke Legacy JIT Tests on x86_64-{mingw32|msvc}, aka Windows x64.

llvm-svn: 211480
2014-06-22 22:00:56 +00:00
Reid Kleckner
40d9c6f936 Generate native unwind info on Win64
This patch enables LLVM to emit Win64-native unwind info rather than
DWARF CFI.  It handles all corner cases (I hope), including stack
realignment.

Because the unwind info is not flexible enough to describe stack frames
with a gap of unknown size in the middle, such as the one caused by
stack realignment, I modified register spilling code to place all spills
into the fixed frame slots, so that they can be accessed relative to the
frame pointer.

Patch by Vadim Chugunov!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4081

llvm-svn: 211399
2014-06-20 20:35:47 +00:00
Craig Topper
6d411cb95a [C++] Use 'nullptr'. Target edition.
llvm-svn: 207197
2014-04-25 05:30:21 +00:00
Craig Topper
fb6649907e Prune includes in X86 target.
llvm-svn: 204216
2014-03-19 06:53:25 +00:00
Manuel Jacob
b550acfd2d X86: Use enums for memory operand decoding instead of integer literals.
Summary:
X86BaseInfo.h defines an enum for the offset of each operand in a memory operand
sequence.  Some code uses it and some does not.  This patch replaces (hopefully)
all remaining locations where an integer literal was used instead of this enum.
No functionality change intended.

Reviewers: nadav

CC: llvm-commits, t.p.northover

Differential Revision: http://llvm-reviews.chandlerc.com/D3108

llvm-svn: 204158
2014-03-18 16:14:11 +00:00
Rafael Espindola
aea6192f20 Add back r201608, r201622, r201624 and r201625
r201608 made llvm corretly handle private globals with MachO. r201622 fixed
a bug in it and r201624 and r201625 were changes for using private linkage,
assuming that llvm would do the right thing.

They all got reverted because r201608 introduced a crash in LTO. This patch
includes a fix for that. The issue was that TargetLoweringObjectFile now has
to be initialized before we can mangle names of private globals. This is
trivially true during the normal codegen pipeline (the asm printer does it),
but LTO has to do it manually.

llvm-svn: 201700
2014-02-19 17:23:20 +00:00
Daniel Jasper
bf4e7d8ac3 Revert r201622 and r201608.
This causes the LLVMgold plugin to segfault. More information on the
replies to r201608.

llvm-svn: 201669
2014-02-19 12:26:01 +00:00
Rafael Espindola
d39a573c72 Fix PR18743.
The IR
@foo = private constant i32 42

is valid, but before this patch we would produce an invalid MachO from it. It
was invalid because it would use an L label in a section where the liker needs
the labels in order to atomize it.

One way of fixing it would be to just reject this IR in the backend, but that
would not be very front end friendly.

What this patch does is use an 'l' prefix in sections that we know the linker
requires symbols for atomizing them. This allows frontends to just use
private and not worry about which sections they go to or how the linker handles
them.

One small issue with this strategy is that now a symbol name depends on the
section, which is not available before codegen. This is not a problem in
practice. The reason is that it only happens with private linkage, which will
be ignored by the non codegen users (llvm-nm and llvm-ar).

llvm-svn: 201608
2014-02-18 22:24:57 +00:00
David Woodhouse
5d0b529d58 Change MCStreamer EmitInstruction interface to take subtarget info
llvm-svn: 200345
2014-01-28 23:12:42 +00:00
Rafael Espindola
ecb9c01bc7 Add an emitRawComment function and use it to simplify some uses of EmitRawText.
llvm-svn: 199397
2014-01-16 16:28:37 +00:00
Craig Topper
7930939a81 Copy segment register when optimizing to MOV8ao8/MOV16ao16/MOV32ao32.
llvm-svn: 199365
2014-01-16 07:57:45 +00:00
David Woodhouse
e757b998ec [x86] Disambiguate RET[QL] and fix aliases for 16-bit mode
I couldn't see how to do this sanely without splitting RETQ from RETL.

Eric says: "sad about the inability to roundtrip them now, but...".
I have no idea what that means, but perhaps it wants preserving in the
commit comment.

llvm-svn: 198756
2014-01-08 12:58:07 +00:00
Rafael Espindola
4dc5af8bc2 Move the llvm mangler to lib/IR.
This makes it available to tools that don't link with target (like llvm-ar).

llvm-svn: 198708
2014-01-07 21:19:40 +00:00
Rafael Espindola
eae6386a1e Make the llvm mangler depend only on DataLayout.
Before this patch any program that wanted to know the final symbol name of a
GlobalValue had to link with Target.

This patch implements a compromise solution where the mangler uses DataLayout.
This way, any tool that already links with Target (llc, clang) gets the exact
behavior as before and new IR files can be mangled without linking with Target.

With this patch the mangler is constructed with just a DataLayout and DataLayout
is extended to include the information the Mangler needs.

llvm-svn: 198438
2014-01-03 19:21:54 +00:00
Craig Topper
ed98df1d3a Handle MOV32r0 in expandPostRAPseudo instead of MCInst lowering. No functional change intended.
llvm-svn: 198254
2013-12-31 03:05:38 +00:00
Rafael Espindola
2b4db8d379 Remove the isImplicitlyPrivate argument of getNameWithPrefix.
getSymbolWithGlobalValueBase use is to create a name of a new symbol based
on the name of an existing GV. Assert that and then remove the last call
to pass true to isImplicitlyPrivate.

This gives the mangler API a 1:1 mapping from GV to names, which is what we
need to drop the mangler dependency on the target (and use an extended
datalayout instead).

llvm-svn: 196472
2013-12-05 05:53:12 +00:00
Rafael Espindola
b4226966a9 Hide the stub created for MO_ExternalSymbol too.
given

declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
declare void @foo()
define void @bar() {
  call void @foo()
  call void @llvm.memset.p0i8.i32(i8* null, i8 0, i32 188, i32 1, i1 false)
  ret void
}

We used to produce

L_foo$stub:
        .indirect_symbol        _foo
        .ascii  "\364\364\364\364\364"

_memset$stub:
        .indirect_symbol        _memset
        .ascii  "\364\364\364\364\364"

We not produce a private stub for memset too.

Stubs are not needed with recent linkers, but we still produce them for darwin8.

Thanks to David Fang for confirming that gcc used to do this too.

llvm-svn: 196468
2013-12-05 05:19:12 +00:00
Juergen Ributzka
f7f5626671 [Stackmap] Emit multi-byte nops for X86.
llvm-svn: 196334
2013-12-04 00:39:08 +00:00
Lang Hames
067c025250 Refactor a lot of patchpoint/stackmap related code to simplify and make it
target independent.

Most of the x86 specific stackmap/patchpoint handling was necessitated by the
use of the native address-mode format for frame index operands. PEI has now
been modified to treat stackmap/patchpoint similarly to DEBUG_INFO, allowing
us to use a simple, platform independent register/offset pair for frame
indexes on stackmap/patchpoints.

Notes:
  - Folding is now platform independent and automatically supported.
  - Emiting patchpoints with direct memory references now just involves calling
    the TargetLoweringBase::emitPatchPoint utility method from the target's
    XXXTargetLowering::EmitInstrWithCustomInserter method. (See
    X86TargetLowering for an example).
  - No more ugly platform-specific operand parsers.

This patch shouldn't change the generated output for X86. 

llvm-svn: 195944
2013-11-29 03:07:54 +00:00
Rafael Espindola
3f4a857bd5 Refactor to remove a bit of duplication. No functionality change.
llvm-svn: 195933
2013-11-28 20:12:44 +00:00
Rafael Espindola
05d05c4e8a Use the mangler consistently instead of using getGlobalPrefix directly.
llvm-svn: 195911
2013-11-28 08:59:52 +00:00
Andrew Trick
15aac659a7 Add an abstraction to handle patchpoint operands.
Hard-coded operand indices were scattered throughout lowering stages
and layers. It was super bug prone.

llvm-svn: 195093
2013-11-19 03:29:56 +00:00
Andrew Trick
bd486c29f4 Added a size field to the stack map record to handle subregister spills.
Implementing this on bigendian platforms could get strange. I added a
target hook, getStackSlotRange, per Jakob's recommendation to make
this as explicit as possible.

llvm-svn: 194942
2013-11-17 01:36:23 +00:00
Lang Hames
37fe732f75 Remove unused arguments.
llvm-svn: 194882
2013-11-15 23:19:01 +00:00
Andrew Trick
1eb87f0d42 Minor extension to llvm.experimental.patchpoint: don't require a call.
If a null call target is provided, don't emit a dummy call. This
allows the runtime to reserve as little nop space as it needs without
the requirement of emitting a call.

llvm-svn: 194676
2013-11-14 06:54:10 +00:00
Lang Hames
dcd012fa30 Lower X86::MORESTACK_RET and X86::MORESTACK_RET_RESTORE_R10 in
X86AsmPrinter::EmitInstruction, rather than X86MCInstLower::Lower.

The aim is to improve the reusability of the X86MCInstLower class by making it
more function-like. The X86::MORESTACK_RET_RESTORE_R10 pseudo broke the
function model by emitting an extra instruction to the MCStreamer attached to
the AsmPrinter.

The patch should have no impact on generated code. 
 

llvm-svn: 194431
2013-11-11 23:00:41 +00:00
Juergen Ributzka
a748d55906 [Stackmap] Materialize the jump address within the patchpoint noop slide.
This patch moves the jump address materialization inside the noop slide. This
enables patching of the materialization itself or its complete removal. This
patch also adds the ability to define scratch registers that can be used safely
by the code called from the patchpoint intrinsic. At least one scratch register
is required, because that one is used for the materialization of the jump
address. This patch depends on D2009.

Differential Revision: http://llvm-reviews.chandlerc.com/D2074

Reviewed by Andy

llvm-svn: 194306
2013-11-09 01:51:33 +00:00