1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
Commit Graph

165130 Commits

Author SHA1 Message Date
Craig Topper
e5c44e87ab [X86] Remove masking from the 512-bit masked floating point add/sub/mul/div intrinsics. Use a select in IR instead.
llvm-svn: 334358
2018-06-10 06:01:36 +00:00
Fangrui Song
7601d5a8b3 Cleanup. NFC
llvm-svn: 334357
2018-06-10 04:53:14 +00:00
Zachary Turner
0a3823ab82 Revert "Resubmit "[Support] Expose flattenWindowsCommandLine.""
This reverts commit 65243b6d19143cb7a03f68df0169dcb63e8b4632.

Seems like it's not a flake.  It might have something to do with
the '*' character being in a command line.

llvm-svn: 334356
2018-06-10 03:16:25 +00:00
Zachary Turner
f2f74ba36a Resubmit "[Support] Expose flattenWindowsCommandLine."
There were a few linux compilation failures, but other than that
I think this was just a flake that caused the tests to fail.  I'm
going to resubmit and see if the failures go away, if not I'll
revert again.

llvm-svn: 334355
2018-06-10 02:46:11 +00:00
Zachary Turner
77c48da34f Revert "[Support] Expose flattenWindowsCommandLine."
This reverts commit 10d2e88e87150a35dc367ba30716189d2af26774.

This is causing some test failures for some reason, reverting
while I investigate.

llvm-svn: 334354
2018-06-09 23:07:39 +00:00
Zachary Turner
b81cc17dd4 [Support] Expose flattenWindowsCommandLine.
This function was internal to Program.inc, but I've needed this
on several occasions when I've had to use CreateProcess without
llvm's sys::Execute functions.  In doing so, I noticed that the
function was written using unsafe C-string access and was pretty
hard to understand / make sense of, so I've also re-written the
functions to use more modern LLVM constructs.

llvm-svn: 334353
2018-06-09 22:44:44 +00:00
Simon Pilgrim
d4b723f66d [CostModel][X86] Add 'select' style shuffle costs tests (PR33744)
llvm-svn: 334351
2018-06-09 16:08:25 +00:00
Roman Lebedev
38a002d88b [NFC][InstCombine] More tests for (x >> y) << y -> x & (-1 << y) fold.
Followup for rL334347.
The fold is also valid for ashr.
https://rise4fun.com/Alive/0HF

https://bugs.llvm.org/show_bug.cgi?id=37603
https://reviews.llvm.org/D46760#1123713
https://rise4fun.com/Alive/cplX

llvm-svn: 334349
2018-06-09 14:01:46 +00:00
Roman Lebedev
ec49918862 [NFC][InstCombine] Tests for (x >> y) << y -> x & (-1 << y) fold.
We already do it for splat constants, but not just values.
Also, undef cases are mostly non-functional.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://reviews.llvm.org/D46760#1123713
https://rise4fun.com/Alive/cplX

llvm-svn: 334347
2018-06-09 09:27:43 +00:00
Roman Lebedev
46a4ee3296 [NFC][InstCombine] Tests for (x << y) >> y -> x & (-1 >> y) fold.
We already do it for splat constants, but not just values.
Also, undef cases are mostly non-functional.

https://bugs.llvm.org/show_bug.cgi?id=37603
https://rise4fun.com/Alive/cplX

llvm-svn: 334346
2018-06-09 09:27:39 +00:00
Gabor Buella
3ba1b81d69 [X86] NFC Use member initialization in X86Subtarget
The separate initializeEnvironment function was sort of
useless since r217071.
ARM did this move already with r273556.

llvm-svn: 334345
2018-06-09 09:19:40 +00:00
Serge Pavlov
7f357d43f7 Use uniform mechanism for OOM errors handling
This is a recommit of r333506, which was reverted in r333518.
The original commit message is below.

In r325551 many calls of malloc/calloc/realloc were replaces with calls of
their safe counterparts defined in the namespace llvm. There functions
generate crash if memory cannot be allocated, such behavior facilitates
handling of out of memory errors on Windows.

If the result of *alloc function were checked for success, the function was
not replaced with the safe variant. In these cases the calling function made
the error handling, like:

    T *NewElts = static_cast<T*>(malloc(NewCapacity*sizeof(T)));
    if (NewElts == nullptr)
      report_bad_alloc_error("Allocation of SmallVector element failed.");

Actually knowledge about the function where OOM occurred is useless. Moreover
having a single entry point for OOM handling is convenient for investigation
of memory problems. This change removes custom OOM errors handling and
replaces them with calls to functions `llvm::safe_*alloc`.

Declarations of `safe_*alloc` are moved to a separate include file, to avoid
cyclic dependency in SmallVector.h

Differential Revision: https://reviews.llvm.org/D47440

llvm-svn: 334344
2018-06-09 05:19:45 +00:00
Craig Topper
3e90558514 Use SmallPtrSet instead of SmallSet in places where we iterate over the set.
SmallSet forwards to SmallPtrSet for pointer types. SmallPtrSet supports iteration, but a normal SmallSet doesn't. So if it wasn't for the forwarding, this wouldn't work.

These places were found by hiding the begin/end methods in the SmallSet forwarding

llvm-svn: 334343
2018-06-09 05:04:20 +00:00
Daniel Sanders
1799831af2 [tablegen] Improve performance on *GenRegisterInfo.inc by replacing SparseVector with BitVector. NFC
Summary: Generating X86GenRegisterInfo.inc and AArch64GenRegisterInfo.inc is 8-9% faster on my build.

Reviewers: bogner, javed.absar

Reviewed By: bogner

Subscribers: llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D47907

llvm-svn: 334337
2018-06-08 23:12:29 +00:00
Craig Topper
7917d76653 [X86] Remove GCCBuiltin from some intrinsics so we can do custom IR generation from clang.
llvm-svn: 334328
2018-06-08 21:49:09 +00:00
Eli Friedman
a762ef5b3b [LangRef] fptosi and fptoui return poison on overflow.
I think we assume poison, not undef, for certain transforms we
currently do. In any case, we should clarify the language here.

(This sort of conversion is undefined behavior according to the C
and C++ standards. And in practice, hardware implementations handle
overflow inconsistently, so it would be difficult to define the
result here.)

Differential Revision: https://reviews.llvm.org/D47851

llvm-svn: 334326
2018-06-08 21:33:33 +00:00
Eli Friedman
7fb2376b24 [LangRef] insertelement/extractelement return poison for out of range.
We need to clarify the language here. I think poison makes more sense
than undef, since it's an undefined operation rather than uninitialized
memory. I don't think anything depends on the difference at the moment,
though.

Differential Revision: https://reviews.llvm.org/D47859

llvm-svn: 334325
2018-06-08 21:23:09 +00:00
Ryan Prichard
0d7be22265 Test commit: remove a blank line
Test commit access

llvm-svn: 334324
2018-06-08 21:21:55 +00:00
Eli Friedman
204b6b7989 [ARM] Allow CMPZ transforms even if the input has multiple uses.
It looks like this got left in by accident in r289794; I can't think of
any reason this check would be necessary.  (Maybe it was meant to be a
check that the AND has one use? But we check that a few lines earlier.)

Differential Revision: https://reviews.llvm.org/D47921

llvm-svn: 334322
2018-06-08 21:16:56 +00:00
Florian Hahn
ae14785007 [SmallSet] Add some simple unit tests.
Reviewers: craig.topper, dblaikie

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D47940

llvm-svn: 334321
2018-06-08 21:14:49 +00:00
Krzysztof Parzyszek
016e021c0d [SCEV] Look through zero-extends in howFarToZero
An expression like
  (zext i2 {(trunc i32 (1 + %B) to i2),+,1}<%while.body> to i32)
will become zero exactly when the nested value becomes zero in its type.
Strip injective operations from the input value in howFarToZero to make
the value simpler.

Differential Revision: https://reviews.llvm.org/D47951

llvm-svn: 334318
2018-06-08 20:43:07 +00:00
Davide Italiano
0743cdd30b [InstCombine] Skip dbg.value(s) when looking at stack{save,restore}.
Fixes PR37713.

llvm-svn: 334317
2018-06-08 20:42:36 +00:00
Sanjay Patel
e3045d78cb [InstCombine] add llvm.assume + debuginfo test (PR37726); NFC
llvm-svn: 334314
2018-06-08 18:47:33 +00:00
Reid Kleckner
5e8302f896 [asan] Instrument comdat globals on COFF targets
Summary:
If we can use comdats, then we can make it so that the global metadata
is thrown away if the prevailing definition of the global was
uninstrumented. I have only tested this on COFF targets, but in theory,
there is no reason that we cannot also do this for ELF.

This will allow us to re-enable string merging with ASan on Windows,
reducing the binary size cost of ASan on Windows.

Reviewers: eugenis, vitalybuka

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D47841

llvm-svn: 334313
2018-06-08 18:33:16 +00:00
Sanjay Patel
89cfe4b747 [DAGCombiner] clean up comments; NFC
llvm-svn: 334312
2018-06-08 18:00:46 +00:00
Simon Pilgrim
15afaca086 [X86][SSE] Support v8i16/v16i16 rotations
Extension to D46954 (PR37426), this patch adds support for v8i16/v16i16 rotations in a similar manner - the conversion of the shift/rotate amount to a multiplication factor and the use of PMULLW to shift left and PMULHUW (ISD::MULHU) to shift the wrapped bits back around to be ORd together.

Differential Revision: https://reviews.llvm.org/D47822

llvm-svn: 334309
2018-06-08 17:58:42 +00:00
Sanjay Patel
c3730d582a [x86] add tests for node-level FMF; NFC
These cases should be optimized using the change from D47911.

llvm-svn: 334308
2018-06-08 17:54:28 +00:00
Sanjay Patel
44872f2e42 [x86] regenerate test checks; NFC
llvm-svn: 334307
2018-06-08 17:42:35 +00:00
Michael Berg
e3719e608c Utilize new SDNode flag functionality to expand current support for fsub
Summary: This patch originated from D46562 and is a proper subset, with some issues addressed for fsub.

Reviewers: spatel, hfinkel, wristow, arsenm

Reviewed By: spatel

Subscribers: wdng

Differential Revision: https://reviews.llvm.org/D47910

llvm-svn: 334306
2018-06-08 17:39:50 +00:00
Florian Hahn
acb34e914c [VPlan] Move recipe construction to VPRecipeBuilder.
This patch moves the recipe-creation functions out of
LoopVectorizationPlanner, which should do the high-level
orchestration of the transformations.

Reviewers: dcaballe, rengolin, hsaito, Ayal

Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D47595

llvm-svn: 334305
2018-06-08 17:30:45 +00:00
Simon Pilgrim
c07dc477f2 [X86][BtVer2] Add support for all SUB/XOR 32/64 scalar instructions that should match the dependency-breaking 'zero-idiom'
As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions), these instructions are dependency breaking and fast-path zero the destination register (and appropriate EFLAGS bits).

llvm-svn: 334303
2018-06-08 17:00:45 +00:00
Simon Pilgrim
3a15a3a392 [X86] Fix schedule-x86_64.s tests to use different registers in reg-reg cases
Same fix as rL334110: I noticed while working on zero-idiom + dependency-breaking support (PR36671) that most of our binary instruction schedule tests were reusing the same src registers, which would cause the tests to fail once we enable scalar zero-idiom support on btver2.

llvm-svn: 334302
2018-06-08 16:40:15 +00:00
Daniil Fukalov
1d375d9d5b [AMDGPU] Inline asm - added i16, half and i128 types support
AMDGPU inline assembler support i16, half and i128 typed variables in constraints, but they were reported as error.
Needed to fix https://github.com/RadeonOpenCompute/ROCm/issues/341,
e.g. to be able to load with global_load_dwordx4 to a 128bit integer variable

Differential Revision: https://reviews.llvm.org/D44920

llvm-svn: 334301
2018-06-08 16:29:04 +00:00
Daniil Fukalov
59789f7759 reapply r334209 with fixes for harfbuzz in Chromium
r334209 description:
[LSR] Check yet more intrinsic pointer operands

the patch fixes another assertion in isLegalUse()

Differential Revision: https://reviews.llvm.org/D47794

llvm-svn: 334300
2018-06-08 16:22:52 +00:00
Roman Lebedev
82acc95a19 [NFC][InstSimplify] SimplifyAddInst(): coding style: variable names.
llvm-svn: 334299
2018-06-08 15:44:53 +00:00
Roman Lebedev
84dcefc37c [InstSimplify] add nuw %x, -1 -> -1 fold.
Summary:
`%ret = add nuw i8 %x, C`
From [[ https://llvm.org/docs/LangRef.html#add-instruction | langref ]]:
    nuw and nsw stand for “No Unsigned Wrap” and “No Signed Wrap”,
    respectively. If the nuw and/or nsw keywords are present,
    the result value of the add is a poison value if unsigned
    and/or signed overflow, respectively, occurs.

So if `C` is `-1`, `%x` can only be `0`, and the result is always `-1`.

I'm not sure we want to use `KnownBits`/`LVI` here, because there is
exactly one possible value (all bits set, `-1`), so some other pass
should take care of replacing the known-all-ones with constant `-1`.

The `test/Transforms/InstCombine/set-lowbits-mask-canonicalize.ll` change *is* confusing.
What happening is, before this: (omitting `nuw` for simplicity)
1. First, InstCombine D47428/rL334127 folds `shl i32 1, %NBits`) to `shl nuw i32 -1, %NBits`
2. Then, InstSimplify D47883/rL334222 folds `shl nuw i32 -1, %NBits` to `-1`,
3. `-1` is inverted to `0`.
But now:
1. *This* InstSimplify fold `%ret = add nuw i32 %setbit, -1` -> `-1` happens first,
   before InstCombine D47428/rL334127 fold could happen.
Thus we now end up with the opposite constant,
and it is all good: https://rise4fun.com/Alive/OA9

https://rise4fun.com/Alive/sldC
Was mentioned in D47428 review.
Follow-up for D47883.

Reviewers: spatel, craig.topper

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47908

llvm-svn: 334298
2018-06-08 15:44:47 +00:00
Simon Pilgrim
0c5a72d3c4 [X86][BtVer2] Remove SBB tests that were accidentally added in rL334296
These aren't true zero-idiom instructions (just dependency breaking).

llvm-svn: 334297
2018-06-08 15:43:00 +00:00
Simon Pilgrim
52eedf5092 [X86][BtVer2] Add tests for scalar SUB/XOR instructions that should match the dependency-breaking 'zero-idiom'
As detailed on Agner's Microarchitecture doc (21.8 AMD Bobcat and Jaguar pipeline - Dependency-breaking instructions).

llvm-svn: 334296
2018-06-08 15:28:43 +00:00
Alexander Kornienko
73a92659e5 commandLineFitsWithinSystemLimits Overestimates System Limits
Summary:
The function `llvm::sys::commandLineFitsWithinSystemLimits` appears to be overestimating the system limits. This issue was discovered while attempting to enable response files in the Swift compiler. When the compiler submits its frontend jobs, those jobs are subjected to the system limits on command line length. `commandLineFitsWithinSystemLimits` is used to determine if the job's arguments need to be wrapped in a response file. There are some cases where the argument size for the job passes `commandLineFitsWithinSystemLimits`, but actually exceeds the real system limit, and the job fails.

`clang` also uses this function to decide whether or not to wrap it's job arguments in response files. See: https://github.com/llvm-mirror/clang/blob/master/lib/Driver/Driver.cpp#L1341. Clang will also fail for response files who's size falls within a certain range. I wrote a script that should find a failure point for `clang++`. All that is needed to run it is Python 2.7, and a simple "hello world" program for `test.cc`. It should run on Linux and on macOS. The script is available here: https://gist.github.com/dabelknap/71bd083cd06b91c5b3cef6a7f4d3d427. When it hits a failure point, you should see a `clang: error: unable to execute command: posix_spawn failed: Argument list too long`.

The proposed solution is to mirror the behavior of `xargs` in `commandLinefitsWithinSystemLimits`. `xargs` defaults to 128k for the command line length size (See: https://fossies.org/dox/findutils-4.6.0/buildcmd_8c_source.html#l00551). It adjusts this depending on the value of `ARG_MAX`.

Reviewers: alexfh

Reviewed By: alexfh

Subscribers: llvm-commits

Tags: #clang

Patch by Austin Belknap!

Differential Revision: https://reviews.llvm.org/D47795

llvm-svn: 334295
2018-06-08 15:19:16 +00:00
Zachary Turner
5f66a3a103 Clean up some code in Program.
NFC here, this just raises some platform specific ifdef hackery
out of a class and creates proper platform-independent typedefs
for the relevant things.  This allows these typedefs to be
reused in other places without having to reinvent this preprocessor
logic.

llvm-svn: 334294
2018-06-08 15:16:25 +00:00
Zachary Turner
58a53155d0 Add a file open flag that disables O_CLOEXEC.
O_CLOEXEC is the right default, but occasionally you don't
want this.  This is especially true for tools like debuggers
where you might need to spawn the child process with specific
files already open, but it's occasionally useful in other
scenarios as well, like when you want to do some IPC between
parent and child.

llvm-svn: 334293
2018-06-08 15:15:56 +00:00
Simon Pilgrim
16ac6d814f [X86][BtVer2] Limit zero idiom tests to a single iteration.
Reduces output size and we're only wanting to check that the instructions are fast-path'd (just Dispatch+Retire) anyhow

llvm-svn: 334292
2018-06-08 15:01:40 +00:00
Simon Pilgrim
99cd1d17f5 Fix Wdocumentation warning for unknown param. NFCI.
llvm-svn: 334291
2018-06-08 14:53:52 +00:00
Simon Pilgrim
df38bb1554 [X86][SSE] Add SSE2/AVX2 vector rotate tests
Now that we're custom lowering vector rotates for SSE in general we should be testing the combines with them as well.

llvm-svn: 334290
2018-06-08 14:07:21 +00:00
Simon Pilgrim
5590ac9777 [X86][SSE] Simplify combineVectorTruncationWithPACKUS to reduce code duplication
Simplify combineVectorTruncationWithPACKUS to mask the upper bits followed by calling truncateVectorWithPACK instead of duplicating with similar code.

This results in the codegen using (V)PACKUSDW on SSE41+ targets for vXi64/vXi32 inputs where before it always used PACKUSWB (along with a lot more bitcasting).

I've raised PR37749 as until we avoid unnecessary concats back to 256-bit for bitwise ops, we can't avoid splitting the input value into 128-bit subvectors for masking.

llvm-svn: 334289
2018-06-08 13:59:11 +00:00
Sanjay Patel
7df0e638b6 [x86] restore test comment; NFC
The description got deleted along with the FIXME note in
rL334268.

llvm-svn: 334288
2018-06-08 13:53:13 +00:00
Artur Pilipenko
fecfcf704c [BPI] Apply invoke heuristic before loop branch heuristic
Currently the loop branch heuristic is applied before the invoke heuristic which makes us overestimate the probability of the unwind destination of invokes inside loops. This in turn makes us grossly underestimate the frequencies of loops with invokes.

Reviewed By: skatkov, vsk

Differential Revision: https://reviews.llvm.org/D47371

llvm-svn: 334285
2018-06-08 13:03:21 +00:00
Florian Hahn
148a435017 [VPlan] Move recipe based VPlan generation to separate function.
This first step separates VPInstruction-based and VPRecipe-based
VPlan creation, which should make it easier to migrate to VPInstruction
based code-gen step by step.

Reviewers: Ayal, rengolin, dcaballe, hsaito, mkuper, mzolotukhin

Reviewed By: dcaballe

Subscribers: bollu, tschuett, rkruppe, llvm-commits

Differential Revision: https://reviews.llvm.org/D47477

llvm-svn: 334284
2018-06-08 12:53:51 +00:00
Henry Wong
bfa46b6381 [ADT] Add StringRef::rsplit(StringRef Separator).
Summary: Add `StringRef::rsplit(StringRef Separator)` to achieve the function of getting the tail substring according to the separator. A typical usage is to get `data` in `std::basic_string::data`.

Reviewers: mehdi_amini, zturner, beanz, xbolva00, vsk

Reviewed By: zturner, xbolva00, vsk

Subscribers: vsk, xbolva00, llvm-commits, MTC

Differential Revision: https://reviews.llvm.org/D47406

llvm-svn: 334283
2018-06-08 12:42:12 +00:00
Simon Dardis
577d09f55f [mips] Correct the predicates for a number of codegen only instructions
Reviewers: smaksimovic, atanasyan, abeserminji

Differential Revision: https://reviews.llvm.org/D47638

llvm-svn: 334280
2018-06-08 10:55:34 +00:00