1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

94339 Commits

Author SHA1 Message Date
Sanjay Patel
d4c3ab24ab [InstCombine] use m_APInt to allow icmp (and X, Y), C folds for splat constant vectors
llvm-svn: 279937
2016-08-28 18:18:00 +00:00
Simon Pilgrim
913f5c9084 [X86][AVX512] Only combine EVEX targets shuffles to shuffles of the same number of vector elements
Over eager combing prevents the correct folding of writemasks.

At the moment this occurs for ALL EVEX shuffles, in the future we need to check that the user of the root shuffle is a VSELECT that can fold to a writemask.

llvm-svn: 279934
2016-08-28 17:27:14 +00:00
Hal Finkel
9e501da293 [PowerPC] Implement lowering for atomicrmw min/max/umin/umax
Implement lowering for atomicrmw min/max/umin/umax. Fixes PR28818.

llvm-svn: 279933
2016-08-28 16:17:58 +00:00
Elena Demikhovsky
eed0bb3d71 [Loop Vectorizer] Fixed memory confilict checks.
Fixed a bug in run-time checks for possible memory conflicts inside loop.
The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size".

Differential revision: https://reviews.llvm.org/D23176

llvm-svn: 279930
2016-08-28 08:53:53 +00:00
Craig Topper
0f68f0b2ab [AVX-512] Promote AND/OR/XOR to v2i64/v4i64/v8i64 even when we have AVX512F/AVX512VL.
Previously we weren't creating masked logical operations if bitcasts appeared between the logic operation and the select. The IR optimizers can move bitcasts across logic operations and create these cases. To minimize the number of cases we need to handle, this change promotes all logic ops to an i64 vector type just like when only SSE or AVX is available.

Unfortunately, this also has the consequence of making it difficult to select unmasked VPANDD/VPORD/VPXORD in all the cases it was previously used. This is the cause of most of the test change. This shouldn't result in any functional change though.

llvm-svn: 279929
2016-08-28 06:06:28 +00:00
Craig Topper
9a9a09e60f [X86] Rename PABSB/D/W instructions to be consistent with SSE/AVX instructions instead of ending 128/256. NFC
llvm-svn: 279927
2016-08-28 06:06:21 +00:00
Jan Vesely
0d9671ace0 AMDGPU/R600: Enable Load combine
Fix and improve tests

Differential Revision: https://reviews.llvm.org/D23899

llvm-svn: 279925
2016-08-27 19:09:43 +00:00
Craig Topper
d6a4523f86 [X86] Rename predicate function that detects if requires one of the REX.B, REX.X or REX.R bits. It's old name conflicted with a function in X8II namespace that doesnt' quite do the same thing. NFC
llvm-svn: 279924
2016-08-27 17:13:43 +00:00
Craig Topper
ce1e9291d5 [X86] Keep looping over operands looking for byte registers even if we already found a register that requires a REX prefix. Otherwise we don't error if a high byte register is used after SPL/BPL/DIL/SIL.
llvm-svn: 279923
2016-08-27 17:13:41 +00:00
Craig Topper
929e8489be [X86] Include XMM/YMM/ZMM16-23 in X86II::isX86_64ExtendedReg. This feels more consistent with its name and simplifies assembler code.
llvm-svn: 279922
2016-08-27 17:13:37 +00:00
Craig Topper
c65627ae77 [X86] Don't allow DR8-DR15 to be assembled in 32-bit mode. Add missing test for CR8-CR15.
llvm-svn: 279921
2016-08-27 17:13:34 +00:00
Craig Topper
c727b74959 [X86] Remove stale comment about FixupBWInsts pass being off by default. NFC
llvm-svn: 279915
2016-08-27 05:26:54 +00:00
Craig Topper
7c9fd2b2be [AVX-512] Allow EVEX encoding unordered/ordered/equal/notequal VCMPPS/PD/SS/SD to be commuted just like the SSE and AVX counterparts.
llvm-svn: 279914
2016-08-27 05:22:15 +00:00
Craig Topper
dc033bf380 [X86] Enable FR32/FR64 cmpeq/cmpne/cmpunord/cmpord to be commuted.
llvm-svn: 279913
2016-08-27 05:22:12 +00:00
Craig Topper
1964e92665 [AVX-512] Add load folding for EVEX vcmpps/pd/ss/sd.
llvm-svn: 279912
2016-08-27 05:22:08 +00:00
Teresa Johnson
864ef41e7f [LTO] Don't create a new common unless merged has different size
Summary:
This addresses a regression in common handling from the new LTO
API in r278338. Only create a new common if the size is different.
The type comparison against an array type fails when the size is
different but not an array. GlobalMerge does not handle the
array types as well and we lose some global merging opportunities.

Reviewers: mehdi_amini

Subscribers: junbuml, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23955

llvm-svn: 279911
2016-08-27 04:41:22 +00:00
Matt Arsenault
7d35696bfc AMDGPU: Mark sched model complete
Fixes bug 26800

llvm-svn: 279910
2016-08-27 03:39:27 +00:00
Matt Arsenault
5c1096e49d AMDGPU: Remove unneeded implicit exec uses/defs
SI_BREAK, SI_IF_BREAK, and SI_ELSE_BREAK do not def exec.
SI_IF_BREAK and SI_ELSE_BREAK do not read it either.

llvm-svn: 279909
2016-08-27 03:00:51 +00:00
Sebastian Pop
c6175f06bd GVN-hoist: invalidate MD cache (PR29144)
Without invalidating the entries in the MD cache we would try to access instructions
that were removed in previous iterations of hoisting.

Differential Revision: https://reviews.llvm.org/D23927

llvm-svn: 279907
2016-08-27 02:48:41 +00:00
Quentin Colombet
a56859d74d [RegBankSelect] Do not abort when the target wants to fall back.
llvm-svn: 279906
2016-08-27 02:38:27 +00:00
Quentin Colombet
cd84f181b1 [InstructionSelect] Do not abort when the target wants to fall back.
llvm-svn: 279905
2016-08-27 02:38:24 +00:00
Quentin Colombet
ff81dc2c09 [MachineLegalize] Do not abort when the target wants to fall back.
llvm-svn: 279904
2016-08-27 02:38:21 +00:00
Matt Arsenault
d3892e17d5 AMDGPU: Select mulhi 24-bit instructions
llvm-svn: 279902
2016-08-27 01:32:27 +00:00
Matt Arsenault
af7cdf6a46 AMDGPU: Move cndmask pseudo to be isel pseudo
There's only one use of this for the convenience
of a pattern. I think v_mov_b64_pseudo should also be
moved, but SIFoldOperands does currently make use of it.

llvm-svn: 279901
2016-08-27 01:00:37 +00:00
Matt Arsenault
9b698596fe AMDGPU: Fix sched type for branches
llvm-svn: 279900
2016-08-27 00:51:02 +00:00
Matt Arsenault
09098bc2b3 AMDGPU: Remove register operand from si_mask_branch
It isn't used for anything, and is also misleading since
it could be spilled at the end of the block, so it can't be relied
on. There ends up being a verifier error about using an undefined
register since the spill kills the register.

llvm-svn: 279899
2016-08-27 00:42:21 +00:00
Matt Arsenault
2922d104bf AMDGPU: Improve error reporting for maximum branch distance
Unfortunately this seems to only help the assembler diagnostic.

llvm-svn: 279895
2016-08-27 00:21:22 +00:00
Quentin Colombet
1636e7c63b [GlobalISel] Add a fallback path to SDISel.
When global-isel fails on a MachineFunction MF, MF will be cleaned up
and given to SDISel.
Thanks to this fallback, we can already perform correctness test even if
we support only a small portion of the functions in a test.

llvm-svn: 279891
2016-08-27 00:18:31 +00:00
Quentin Colombet
7a3f77bdff [AArch64][CallLowering] Do not assert for not implemented part.
When doing the ABI lowering, report a failure to the caller instead of
asserting. This gives a chance for the caller to recover.

llvm-svn: 279890
2016-08-27 00:18:28 +00:00
Quentin Colombet
c8ffecf434 [GlobalISel] Teach the core pipeline not to run if ISel failed.
llvm-svn: 279889
2016-08-27 00:18:24 +00:00
Quentin Colombet
a46582d729 [IRTranslator] Do not abort when the target wants to fall back.
Every pass in the GlobalISel pipeline will need to do something similar.

llvm-svn: 279886
2016-08-26 23:49:05 +00:00
Quentin Colombet
5f3ddf5325 [MFProperties] Introduce a FailedISel property.
This is used to communicate that the instruction selection pipeline
failed at some point.
Another way to achieve that would be to have some kind of conditional
scheduling in the PassManager, such that we only schedule a pass based
on the success/failure of another one. The property approach has the
advantage of being lightweight and solve the problem at stake.

llvm-svn: 279885
2016-08-26 23:49:01 +00:00
Teresa Johnson
048eeff265 [ThinLTO] Move loading of cache entry to client
Summary:
Have the cache pass back the path to the cache entry when it
is ready to be loaded, instead of a buffer.

For gold-plugin we can simply pass this file back to gold directly,
which avoids expensive writing of a separate tmp file. Ensure
the cache entry is not deleted on cleanup by adjusting the setting
of the IsTemporary flags.

Moved the loading of the buffer into llvm-lto2 to maintain current
behavior.

Reviewers: mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D23946

llvm-svn: 279883
2016-08-26 23:29:14 +00:00
Quentin Colombet
9f9ccd2d5a [TargetPassConfig] Add a target hook to know what GlobalISel should do on error.
By default, this hook tells GlobalISel to abort (report a fatal error)
when it encounters an error. The alternative will be to fall back on
SDISel.
This fall back will be removed when the bring-up of GlobalISel is over.

llvm-svn: 279879
2016-08-26 22:32:59 +00:00
Quentin Colombet
0b16bca0b7 [IRTranslator][NFC] Use DEBUG_TYPE instead of repeating the name.
llvm-svn: 279878
2016-08-26 22:32:57 +00:00
Quentin Colombet
2cd9ce7522 [SelectionDAG] Do not run the ISel process on already selected code.
Right now, this cannot happen, but with the fall back path of GlobalISel
it will show up eventually.

llvm-svn: 279877
2016-08-26 22:32:55 +00:00
Quentin Colombet
947cfac2b5 [MachineFunction] Introduce a reset method.
This method allows to reset the state of a MachineFunction as if it was
just created. This will be used during the bring-up of GlobalISel to
provide a way to fallback on SelectionDAG. That way, we can start doing
correctness testing even if we are not able to select all functions via
the global instruction selector.

llvm-svn: 279876
2016-08-26 22:32:53 +00:00
Quentin Colombet
fcb250bf0c [MFProperties] Introduce a reset method with no argument.
This method allows to reset all the properties in one go.

llvm-svn: 279874
2016-08-26 22:09:11 +00:00
Quentin Colombet
e0a55b84fa [MFProperties][NFC] Rename clear into reset to match BitVector naming.
The name clear is used to reset all the bit in bitvectors and using it
to reset just properties was confusing.

llvm-svn: 279873
2016-08-26 22:09:08 +00:00
Tom Stellard
2dbd047f3a AMDGPU/SI: Canonicalize offset order for merged DS instructions
Summary:
If the scheduler clusters the loads, then the offsets will be sorted,
but it is possible for the scheduler to scheduler loads together
without out explicitly clustering them, which would give us non-sorted
offsets.

Also, we will want to do this if we move the load/store optimizer before
the scheduler.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23776

llvm-svn: 279870
2016-08-26 21:36:47 +00:00
Tom Stellard
8b436348bd XXX
llvm-svn: 279868
2016-08-26 21:16:40 +00:00
Tom Stellard
13f830c61a AMDGPU/SI: Use a better method for determining the largest pressure sets
Summary:
There are a few different sgpr pressure sets, but we only care about
the one which covers all of the sgprs.  We were using hard-coded
register pressure set names to determine the reg set id for the
biggest sgpr set.  However, we were using the wrong name, and this
method is pretty fragile, since the reg pressure set names may
change.

The new method just looks for the pressure set that contains the most
reg units and sets that set as our SGPR pressure set.  We've also
adopted the same technique for determining our VGPR pressure set.

Reviewers: arsenm

Subscribers: MatzeB, arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23687

llvm-svn: 279867
2016-08-26 21:16:37 +00:00
Adam Nemet
68515406cf [Inliner] Report when inlining fails because callee's def is unavailable
Summary:
This is obviously an interesting case because it may motivate code
restructuring or LTO.

Reporting this requires instantiation of ORE in the loop where the call
sites are first gathered.  I've checked compile-time
overhead *with* -Rpass-with-hotness and the worst slow-down was 6% in
mcf and quickly tailing off.  As before without -Rpass-with-hotness
there is no overhead.

Because this could be a pretty noisy diagnostics, it is currently
qualified as 'verbose'.  As of this patch, 'verbose' diagnostics are
only emitted with -Rpass-with-hotness, i.e. when the output is expected
to be filtered.

Reviewers: eraman, chandlerc, davidxl, hfinkel

Subscribers: tejohnson, Prazek, davide, llvm-commits

Differential Revision: https://reviews.llvm.org/D23415

llvm-svn: 279860
2016-08-26 20:21:05 +00:00
Rafael Espindola
26c1b18156 Make writeToResolutionFile a static helper.
llvm-svn: 279859
2016-08-26 20:19:35 +00:00
Kyle Butt
160e4d9563 TailDuplication: Record blocks that received the duplicated block. NFC.
This will allow tail duplication during layout to handle the cfg changes more
cleanly.

llvm-svn: 279858
2016-08-26 20:12:40 +00:00
Kevin Enderby
9991a7696a Next set of additional error checks for invalid Mach-O files for bad LC_SYMTAB’s.
This contains the missing checks for LC_SYMTAB load command fields.

llvm-svn: 279854
2016-08-26 19:34:07 +00:00
Manman Ren
23c5a589f3 Swift Calling Convetion: add support for AArch64.
It will just be the same as the regular calling convention.

rdar://28029509

llvm-svn: 279853
2016-08-26 19:28:17 +00:00
Tim Northover
0f9e84c16f AArch64: avoid assertion on illegal types in performFDivCombine.
In the code to detect fixed-point conversions and make use of AArch64's special
instructions, we weren't prepared for weird types. The fptosi direction got
fixed recently, but not the similar sitofp code.

llvm-svn: 279852
2016-08-26 18:52:31 +00:00
Sanjay Patel
66b7ce5846 [InstCombine] add helper function for icmp (and (sh X, Y), C2), C1 ; NFC
Like other recent changes near here, the goal is to allow vector types for
all of these folds. Splitting things up makes it easier to incrementally 
enhance the code and easier to read.

llvm-svn: 279851
2016-08-26 18:28:46 +00:00
Chad Rosier
ab1bc0d119 [AArch64] Avoid materializing constant values when generating csel instructions.
Differential Revision: https://reviews.llvm.org/D23677

llvm-svn: 279849
2016-08-26 18:05:50 +00:00