1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

145509 Commits

Author SHA1 Message Date
Simon Pilgrim
70004e62ed [X86] Regenerate CSE test with codegen instead of just the instruction count
llvm-svn: 295819
2017-02-22 10:12:46 +00:00
Roger Ferrer Ibanez
3111a83266 [ARM] Fix constant islands pass.
The pass tries to fix a spill of LR that turns out to be unnecessary.
So it removes the tPOP but forgets to remove tPUSH.
This causes the stack be misaligned upon returning the function.

Thus, remove the tPUSH as well in this case.

Differential Revision: https://reviews.llvm.org/D30207

llvm-svn: 295816
2017-02-22 09:06:21 +00:00
Benjamin Kramer
5f05e35412 Write to a temporary file in test instead of random file in the test directory.
llvm-svn: 295815
2017-02-22 09:02:27 +00:00
Ayman Musa
2059a627c4 [X86] Fix memory operands definition for some instructions.
Change integer memory operands to FP memory operands to some FP instructions.

Differential Revision: https://reviews.llvm.org/D30201

llvm-svn: 295813
2017-02-22 08:06:29 +00:00
Justin Bogner
dbcb2141ed OptDiag: Add const to some interfaces that don't modify anything. NFC
This needed a const_cast for the dominator tree recalculation in
OptimizationRemarkEmitter, but we do that all over the place already
and it's safe.

llvm-svn: 295812
2017-02-22 07:38:17 +00:00
Javed Absar
a1d1885fcc [ARM] Classification Improvements to ARM Sched-Models. NFCI.
This patch adds missing sched classes for Thumb2 instructions.
This has been missing so far, and as a consequence, machine
scheduler models for individual sub-targets have tended to
be larger than they needed to be. These patches should help
write schedulers better and faster in the future
for ARM sub-targets.

Reviewer: Diana Picus
Differential Revision: https://reviews.llvm.org/D29953

llvm-svn: 295811
2017-02-22 07:22:57 +00:00
Craig Topper
b50ca6d3e4 [AVX-512] Allow legacy scalar min/max intrinsics to select EVEX instructions when available
This patch introduces new X86ISD::FMAXS and X86ISD::FMINS opcodes. The legacy intrinsics now lower to this node. As do the AVX-512 masked intrinsics when the rounding mode is CUR_DIRECTION.

I've merged a copy of the tablegen multiclass avx512_fp_scalar into avx512_fp_scalar_sae. avx512_fp_scalar still needs to support CUR_DIRECTION appearing as a rounding mode for X86ISD::FADD_ROUND and others.

Differential revision: https://reviews.llvm.org/D30186

llvm-svn: 295810
2017-02-22 06:54:18 +00:00
Sanjoy Das
20031f752d [ValueTracking] Make poison propagation more aggressive
Summary:
Motivation: fix PR31181 without regression (the actual fix is still in
progress).  However, the actual content of PR31181 is not relevant
here.

This change makes poison propagation more aggressive in the following
cases:

 1. poision * Val == poison, for any Val.  In particular, this changes
    existing intentional and documented behavior in these two cases:
     a. Val is 0
     b. Val is 2^k * N
 2. poison << Val == poison, for any Val
 3. getelementptr is poison if any input is poison

I think all of these are justified (and are axiomatically true in the
new poison / undef model):

1a: we need poison * 0 to be poison to allow transforms like these:

  A * (B + C) ==> A * B + A * C

If poison * 0 were 0 then the above transform could not be allowed
since e.g. we could have A = poison, B = 1, C = -1, making the LHS

  poison * (1 + -1) = poison * 0 = 0

and the RHS

  poison * 1 + poison * -1 = poison + poison = poison

1b: we need e.g. poison * 4 to be poison since we want to allow

  A * 4 ==> A + A + A + A

If poison * 4 were a value with all of their bits poison except the
last four; then we'd not be able to do this transform since then if A
were poison the LHS would only be "partially" poison while the RHS
would be "full" poison.

2: Same reasoning as (1b), we'd like have the following kinds
transforms be legal:

  A << 1 ==> A + A

Reviewers: majnemer, efriedma

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30185

llvm-svn: 295809
2017-02-22 06:52:32 +00:00
Sean Silva
c35c905230 Use const-ref in range-loop for to avoid copying pairs of std::string
No reason to create temporaries.

Differential Revision: https://reviews.llvm.org/D29871

Patch by sergio.martins!

llvm-svn: 295807
2017-02-22 06:34:04 +00:00
Dan Gohman
da00194d3c [WebAssembly] Add skeleton MC support for the Wasm container format
This just adds the basic skeleton for supporting a new object file format.
All of the actual encoding will be implemented in followup patches.

Differential Revision: https://reviews.llvm.org/D26722

llvm-svn: 295803
2017-02-22 01:23:18 +00:00
Rui Ueyama
f5c0e59e24 Fix -Wcovered-switch-default.
llvm-svn: 295799
2017-02-22 01:01:45 +00:00
Matt Arsenault
d2e2dba6a0 AMDGPU: Add cvt.pkrtz intrinsic
Convert llvm.SI.packf16 test uses

llvm-svn: 295797
2017-02-22 00:27:34 +00:00
Michael Kuperstein
bd4a813606 [LoopUnroll] Enable PGO-based loop peeling by default.
This enables peeling of loops with low dynamic iteration count by default,
when profile information is available.

Differential Revision: https://reviews.llvm.org/D27734

llvm-svn: 295796
2017-02-22 00:27:34 +00:00
Matt Arsenault
3320e649a3 AMDGPU: Remove some uses of llvm.SI.export in tests
Merge some of the old, smaller tests into more complete versions.

llvm-svn: 295792
2017-02-22 00:02:21 +00:00
Matt Arsenault
65d8dccee7 AMDGPU: Remove llvm.AMDGPU.clamp intrinsic
llvm-svn: 295789
2017-02-21 23:46:04 +00:00
Matt Arsenault
73b8eb1cc6 AMDGPU: Redefine clamp node as clamp 0.0-1.0
Change implementation to use max instead of add.
min/max/med3 do not flush denormals regardless of the mode,
so it is OK to use it whether or not they are enabled.

Also allow using clamp with f16, and use knowledge
of dx10_clamp.

llvm-svn: 295788
2017-02-21 23:35:48 +00:00
Artem Belevich
e90ce769e9 [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values.
Original code only used vector loads/stores for explicit vector arguments.
It could also do more loads/stores than necessary (e.g v5f32 would
touch 8 f32 values). Aggregate types were loaded one element at a time,
even the vectors contained within.

This change attempts to generalize (and simplify) parameter space
loads/stores so that vector loads/stores can be used more broadly.
Functionality of the patch has been verified by compiling thrust
test suite and manually checking the differences between PTX
generated by llvm with and without the patch.

General algorithm:
* ComputePTXValueVTs() flattens input/output argument into a flat list
  of scalars to load/store and returns their types and offsets.
* VectorizePTXValueVTs() uses that data to create vectorization plan
  which returns an array of flags marking boundaries of vectorized
  load/stores. Scalars are represented as 1-element vectors.
* Code that generates loads/stores implements a simple state machine
  that constructs a vector according to the plan.

Differential Revision: https://reviews.llvm.org/D30011

llvm-svn: 295784
2017-02-21 22:56:05 +00:00
Matt Arsenault
180c1a1fd3 AMDGPU: Formatting fixes
llvm-svn: 295783
2017-02-21 22:50:41 +00:00
Matt Arsenault
c0dfae3f67 DAG: Check if extract_vector_elt is legal or custom
Avoids test regressions in future AMDGPU commits when
more vector types are custom lowered.

llvm-svn: 295782
2017-02-21 22:47:27 +00:00
Evandro Menezes
4fd4f6ce11 [AArch64, X86] Add statistics for the MacroFusion pass
llvm-svn: 295777
2017-02-21 22:16:13 +00:00
Evandro Menezes
409d9fa95f [AArch64, X86] Guard against both instrs being wild cards
If both instrs are wild cards, the result can be a crash.

llvm-svn: 295776
2017-02-21 22:16:11 +00:00
Evandro Menezes
2f2683b095 [AArch64] Add test case for fusion of literal generation
Add test case from https://reviews.llvm.org/D28698 that was somehow lost in
transit.

llvm-svn: 295775
2017-02-21 22:16:09 +00:00
Evandro Menezes
5f61a0e363 [AArch64] Add test case for fusion of AES crypto operations
Add test case from https://reviews.llvm.org/D28491 that was somehow lost in
transit.

llvm-svn: 295774
2017-02-21 22:16:06 +00:00
Eugene Zelenko
6fa4aaca51 [CodeGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 295773
2017-02-21 22:07:52 +00:00
Zachary Turner
be93149cb9 Try to fix the buildbot on OSX.
Since I'm only seeing failures on OSX, and it's saying
permission denied, I'm suspecting this is due to the addition
of the MAP_RESILIENT_CODESIGN and/or MAP_RESILIENT_MEDIA flags.
Speculatively trying to remove those to get the bots working.

llvm-svn: 295770
2017-02-21 21:31:28 +00:00
Zachary Turner
524e1e382b Try to fix Android build.
llvm-svn: 295769
2017-02-21 21:13:10 +00:00
Zachary Turner
1d2f86a79f [Support] Add a function to check if a file resides locally.
Differential Revision: https://reviews.llvm.org/D30010

llvm-svn: 295768
2017-02-21 20:55:47 +00:00
Xin Tong
85ef4fb106 Make default value for disable-licm-promotion in licm explicit.
llvm-svn: 295767
2017-02-21 20:53:48 +00:00
Rafael Espindola
98752fcfe6 Don't modify archive members unless really needed.
For whatever reason ld64 requires that member headers (not the member
themselves) should be aligned. The only way to do that is to edit the
previous member so that it ends at an aligned boundary.

Since modifying data put in an archive is an undesirable property,
llvm-ar should only do it when it is absolutely necessary.

llvm-svn: 295765
2017-02-21 20:40:54 +00:00
Evgeniy Stepanov
c010343bc2 Fix PR31896.
Address of an alias of a global with offset is incorrectly lowered as an address of the global (i.e. ignoring offset).

llvm-svn: 295762
2017-02-21 20:17:34 +00:00
Zachary Turner
3231ab7daa Try to fix line endings.
llvm-svn: 295759
2017-02-21 19:52:57 +00:00
Sanjay Patel
b8b0c8bff5 [InstCombine] canonicalize non-obivous forms of integer min/max
This is part of trying to clean up our handling of min/max patterns in IR.
By converting these to canonical form, we're more likely to recognize them
because there are various places in InstCombine that don't use 
matchSelectPattern or m_SMax and friends.

The backend fixups referenced in the now deleted TODO comment were added with:
https://reviews.llvm.org/rL291392
https://reviews.llvm.org/rL289738

If there's any codegen fallout from this change, we should be able to address
it in DAGCombiner or target-specific lowering. 

llvm-svn: 295758
2017-02-21 19:33:53 +00:00
Matt Arsenault
b94f4fd9d0 AMDGPU: Remove dead declarations in tests
llvm-svn: 295757
2017-02-21 19:31:33 +00:00
Zachary Turner
aabc236581 Remove svn:eol-style property from 2 files.
There are still over 3400 files remaining with this property set, but there are tens of thousands more with the property not set.  Until we decide what to do on a global scale, this at least unblocks me temporarily.

llvm-svn: 295756
2017-02-21 19:29:56 +00:00
Matt Arsenault
7a680c9a28 AMDGPU: Remove dead declarations from MIR tests
llvm-svn: 295755
2017-02-21 19:27:36 +00:00
Matt Arsenault
0f8d55acef AMDGPU: Remove llvm.AMDGPU.flbit intrinsic
llvm-svn: 295754
2017-02-21 19:27:33 +00:00
Matt Arsenault
85a1bec778 AMDGPU: Don't use stack space for SGPR->VGPR spills
Before frame offsets are calculated, try to eliminate the
frame indexes used by SGPR spills. Then we can delete them
after.

I think for now we can be sure that no other instruction
will be re-using the same frame indexes. It should be easy
to notice if this assumption ever breaks since everything
asserts if it tries to use a dead frame index later.

The unused emergency stack slot seems to still be left behind,
so an additional 4 bytes is still wasted.

llvm-svn: 295753
2017-02-21 19:12:08 +00:00
Xin Tong
93a7fb51c4 [LoopSimplify] Simplify how we compute UniqueExit
Summary: Simplify how we compute UniqueExit. Reuse ExitBlockSet.

Reviewers: sanjoy, efriedma, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D30182

llvm-svn: 295751
2017-02-21 19:10:58 +00:00
Xin Tong
4a781593be More comments for getUniqueExitBlocks. NFCI
llvm-svn: 295750
2017-02-21 19:08:03 +00:00
Adrian Prantl
458ca95045 Teach the IR verifier to reject conflicting debug info for function arguments.
Conflicting debug info for function arguments causes hard-to-debug
assertions in the DWARF backend, so the Verifier should reject it.
For performance reasons this only checks function arguments from
non-inlined debug intrinsics for now.

rdar://problem/30520286

llvm-svn: 295749
2017-02-21 19:03:15 +00:00
Geoff Berry
0599d58aba [CodeGenPrepare] Sink and duplicate more 'and' instructions.
Summary:
Rework the code that was sinking/duplicating (icmp and, 0) sequences
into blocks where they were being used by conditional branches to form
more tbz instructions on AArch64.  The new code is more general in that
it just looks for 'and's that have all icmp 0's as users, with a target
hook used to select which subset of 'and' instructions to consider.
This change also enables 'and' sinking for X86, where it is more widely
beneficial than on AArch64.

The 'and' sinking/duplicating code is moved into the optimizeInst phase
of CodeGenPrepare, where it can take advantage of the fact the
OptimizeCmpExpression has already sunk/duplicated any icmps into the
blocks where they are used.  One minor complication from this change is
that optimizeLoadExt needed to be updated to always mark 'and's it has
determined should be in the same block as their feeding load in the
InsertedInsts set to avoid an infinite loop of hoisting and sinking the
same 'and'.

This change fixes a regression on X86 in the tsan runtime caused by
moving GVNHoist to a later place in the optimization pipeline (see
PR31382).

Reviewers: t.p.northover, qcolombet, MatzeB

Subscribers: aemerson, mcrosier, sebpop, llvm-commits

Differential Revision: https://reviews.llvm.org/D28813

llvm-svn: 295746
2017-02-21 18:53:14 +00:00
Wei Ding
98a8a85308 AMDGPU : AMDGPU : Update AMDGPU Trap Handler ABI.
Differential Revision: http://reviews.llvm.org/D29913

llvm-svn: 295745
2017-02-21 18:48:01 +00:00
Dmitry Preobrazhensky
3856c94e30 Test commit
llvm-svn: 295740
2017-02-21 18:07:07 +00:00
Simon Pilgrim
e24a14b735 [X86] EltsFromConsecutiveLoads SDLoc argument should be const&.
There appears never to have been a time that the reference was updated.

llvm-svn: 295739
2017-02-21 17:42:28 +00:00
Vassil Vassilev
b693393360 Do not leak OpenedHandles.
Reviewed by Vedant Kumar (D30178)

llvm-svn: 295737
2017-02-21 17:30:43 +00:00
Simon Pilgrim
33db2a0876 [X86][AVX512] Update VPBROADCASTQ test to combine from VPERMQ instead of VPERMI2Q.
VPERMI2Q doesn't have shuffle decoding from re-materializable constants.

llvm-svn: 295736
2017-02-21 17:04:11 +00:00
Simon Pilgrim
fbe76de494 [X86][AVX] Rename shuffle combine tests to show combined shuffle type. NFCI.
llvm-svn: 295735
2017-02-21 16:45:31 +00:00
John Brawn
e4c382b509 [ARM] Correct SP/PC handling in t2MOVr
Add a missing test that I forgot to svn add in my previous commit

llvm-svn: 295734
2017-02-21 16:45:04 +00:00
Simon Pilgrim
b3b922b351 [X86][AVX2] Fix VPBROADCASTQ folding on 32-bit targets.
As i64 isn't a value type on 32-bit targets, we need to fold the VZEXT_LOAD into VPBROADCASTQ.

llvm-svn: 295733
2017-02-21 16:41:44 +00:00
John Brawn
c37c2d0c1e [ARM] Correct SP/PC handling in t2MOVr
PC isn't allowed in the source operand of t2MOVr, so change the register class
to one without PC. SP handling is slightly trickier and changes depending on if
we're in ARMv8, so do that in checkTargetMatchPredicate.

Differential Revision: https://reviews.llvm.org/D30199

llvm-svn: 295732
2017-02-21 16:41:29 +00:00