mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

127486 Commits

Author SHA1 Message Date
Duncan P. N. Exon Smith
ea645eec42 Support: Fix incremental build when re-configuring targets
r180893 added an indirect include of llvm/Config/Targets.def to
llvm/Support/CodeGen.h, which in turn is included by things like
llvm/IR/Module.h.  After a full build of LLVM and Clang, ninja had to
rebuild 1274 files after reconfiguring.

This commit strips CodeGen.h back down to just a pile of enums and moves
the expensive includes over to CodeGenCWrappers.h (which is only
included in two places).  This gets ninja down to 88 files if you
reconfigure with, e.g., -DLLVM_TARGETS_TO_BUILD=X86.

llvm-svn: 260835
2016-02-13 22:58:43 +00:00
Simon Pilgrim
2acf1a4365 [X86][AVX] Lower shuffles as repeated lane shuffles then lane-crossing shuffles
This patch attempts to represent a shuffle as a repeating shuffle (recognisable by is128BitLaneRepeatedShuffleMask) with the source input(s) in their original lanes, followed by a single permutation of the 128-bit lanes to their final destinations.

On AVX2 we can additionally attempt to match using 64-bit sub-lane permutation. AVX2 can also now match a similar 'broadcasted' repeating shuffle.

This patch has several benefits:

 * Avoids prematurely matching with lowerVectorShuffleByMerging128BitLanes which can require both inputs to have their input lanes permuted before shuffling.
 * Can replace PERMPS/PERMD instructions - although these are useful for cross-lane unary shuffling, they require their shuffle mask to be pre-loaded (and increase register pressure).
 * Matching the repeating shuffle makes use of a lot of existing shuffle lowering.

There is an outstanding minor AVX1 regression (combine_unneeded_subvector1 in vector-shuffle-combining.ll): a previously 128-bit shuffle + subvector splat is now converted to a subvector splat + (2 instruction) 256-bit shuffle. I intend to fix this in a follow-up patch for review.

Differential Revision: http://reviews.llvm.org/D16537

llvm-svn: 260834
2016-02-13 21:54:04 +00:00
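
The "repeating shuffle" test that the X86 shuffle commit above relies on can be illustrated with a small standalone check. This is only a sketch under stated assumptions, not LLVM's is128BitLaneRepeatedShuffleMask: the function name isLaneRepeatedMask, the use of std::vector, and the LaneElts parameter are all hypothetical. It assumes -1 marks an undefined mask element, that indices in [0, N) select from the first input and [N, 2N) from the second, and that the mask size is a non-zero multiple of LaneElts.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical standalone analogue of a lane-repeated mask check (not LLVM code).
// Mask:     shuffle mask, -1 = undef, [0, N) = first input, [N, 2N) = second input.
// LaneElts: elements per 128-bit lane (e.g. 4 for 32-bit elements).
// Repeated: on success, the per-lane mask that every lane repeats.
bool isLaneRepeatedMask(const std::vector<int> &Mask, std::size_t LaneElts,
                        std::vector<int> &Repeated) {
  const int N = static_cast<int>(Mask.size());
  const int L = static_cast<int>(LaneElts);
  Repeated.assign(LaneElts, -1);
  for (int I = 0; I < N; ++I) {
    const int M = Mask[I];
    if (M < 0)
      continue; // An undef element is compatible with any repeated mask.
    // The source element must sit in the same lane as the destination element.
    if ((M % N) / L != I / L)
      return false;
    // Lane-local index; elements of the second input are offset by L.
    const int Local = (M % L) + (M < N ? 0 : L);
    int &Slot = Repeated[I % L];
    if (Slot < 0)
      Slot = Local; // First lane to define this position.
    else if (Slot != Local)
      return false; // Lanes disagree, so the mask does not repeat.
  }
  return true;
}
```

A mask that passes this check can be handled by in-lane shuffle lowering; the lowering described in the commit then follows it with a single permutation of the 128-bit lanes, which this sketch does not model.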
Craig Topper
7cf1a380ee Remove Proc feature flags for X86 processors that are used to inherit features from one processor to another. This exposed extra features to the -mattr command line that we shouldn't. Replace with just inherited listconcats.
llvm-svn: 260832
2016-02-13 21:35:37 +00:00
Craig Topper
9430c4414c [TableGen] Fix comment about 64-bit type I missed when I removed the underlying type in r260808.
llvm-svn: 260830
2016-02-13 17:58:14 +00:00
Kostya Serebryany
abc380db58 [libFuzzer] remove std::vector operations from hot paths, NFC
llvm-svn: 260829
2016-02-13 17:56:51 +00:00
Sanjay Patel
d5eac4bb77 [x86-64] allow mfence even with -mno-sse (PR23203)
As shown in:
https://llvm.org/bugs/show_bug.cgi?id=23203
...we currently die because lowering believes that mfence is allowed without SSE2 on x86-64,
but the instruction def doesn't know that.

I don't know if allowing mfence without SSE is right, but if not, at least now it's consistently wrong. :)

Differential Revision: http://reviews.llvm.org/D17219

llvm-svn: 260828
2016-02-13 17:26:29 +00:00
Benjamin Kramer
2316eec7c0 [APInt] No need for a copy when taking min/max of an APInt.
llvm-svn: 260827
2016-02-13 17:23:27 +00:00
Benjamin Kramer
2bfcfaa743 [ConstantFolding] Reduce APInt and APFloat copying.
llvm-svn: 260826
2016-02-13 16:54:14 +00:00
Benjamin Kramer
0225b4aec7 [AggressiveAntiDepBreaker] Skip some unnecessary BitVector copies.
llvm-svn: 260825
2016-02-13 16:39:39 +00:00
Benjamin Kramer
dceab201ec Use ArrayRef to hide SmallVector details, kill a useless vector copy along the way.
llvm-svn: 260824
2016-02-13 16:01:12 +00:00
Krzysztof Parzyszek
252d6c1ac6 [Hexagon] Replace use of "std::map::emplace" with "insert"
Gcc 4.7.2-4 does not seem to have "emplace" in its implementation of map.
This should fix the build failure on polly-amd64-linux.

llvm-svn: 260816
2016-02-13 14:06:01 +00:00
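
For readers unfamiliar with the portability issue behind the Hexagon change above, the following sketch shows the general pattern of using std::map::insert where emplace is unavailable. The helper addName and its parameters are hypothetical and are not taken from the Hexagon sources.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: records a name for a register number without relying
// on std::map::emplace, which some older standard libraries lack.
bool addName(std::map<unsigned, std::string> &Names, unsigned Reg,
             const std::string &Name) {
  // insert() returns a pair {iterator, inserted}; an existing key is left
  // untouched, which matches the behaviour of emplace().
  return Names.insert(std::make_pair(Reg, Name)).second;
}
```

The trade-off is that insert constructs the pair before the call rather than in place, which is usually negligible for small value types.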
Chandler Carruth
710380f52f [attrs] Move the norecurse deduction to operate on the node set rather
than the SCC object, and have it scan the instruction stream directly
rather than relying on call records.

This makes the behavior of this routine consistent between libc routines
and LLVM intrinsics for libc routines. We can go and start teaching it
about those being norecurse, but we should behave the same for the
intrinsic and the libc routine rather than differently. I chatted with
James Molloy and the inconsistency doesn't seem intentional and likely
is due to intrinsic calls not being modelled in the call graph analyses.

This also fixes a bug where we would deduce norecurse on optnone
functions, when generally we try to handle optnone functions as-if they
were replaceable and thus unanalyzable.

llvm-svn: 260813
2016-02-13 08:47:51 +00:00
NAKAMURA Takumi
41afed0a52 HexagonFrameLowering.cpp: Appease msc18 to give an explicit constructor SlotInfo() instead of member initializers.
llvm-svn: 260812
2016-02-13 07:29:49 +00:00
Kostya Serebryany
bf966c5f23 [libFuzzer] don't require seed in fuzzer::Mutate, instead use the global Fuzzer object for fuzzer::Mutate. This makes custom mutators fast
llvm-svn: 260810
2016-02-13 06:24:18 +00:00
Craig Topper
b9483bdc4b [TableGen] Use range-based for loops. NFC
llvm-svn: 260809
2016-02-13 06:03:32 +00:00
Craig Topper
149f7e2025 No need to make the subtarget feature bit enum a uint64_t. This was a leftover from when the feature bit enum contained masks instead of bit indices.
llvm-svn: 260808
2016-02-13 06:03:29 +00:00
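
The distinction between masks and bit indices mentioned in the commit above can be sketched as follows. With masks, an enumerator such as 1 << 40 needs a 64-bit underlying type; with indices, the enumerators are small integers and the wide storage lives in a separate bit set. The names Feature, FeatureBits, makeFeatures and hasFeature are hypothetical and do not come from TableGen's generated code.

```cpp
#include <bitset>
#include <initializer_list>

// Hypothetical features. Each enumerator is a bit INDEX, not a mask, so the
// enum itself never needs a 64-bit underlying type, even past 64 features.
enum Feature { FeatureA = 0, FeatureB = 1, FeatureWide = 70, NumFeatures = 128 };

using FeatureBits = std::bitset<NumFeatures>;

// Builds a feature set from a list of bit indices.
FeatureBits makeFeatures(std::initializer_list<Feature> Fs) {
  FeatureBits Bits;
  for (Feature F : Fs)
    Bits.set(F);
  return Bits;
}

// Tests a single feature by index.
bool hasFeature(const FeatureBits &Bits, Feature F) { return Bits.test(F); }
```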
Matthias Braun
17fddb8c58 LiveIntervalAnalysis: Remove LiveVariables requirement
This requirement was a huge hack to keep LiveVariables alive because it
was optionally used by TwoAddressInstructionPass and PHIElimination.
However we have AnalysisUsage::addUsedIfAvailable() which we can use in
those passes.

llvm-svn: 260806
2016-02-13 04:35:31 +00:00
Matt Arsenault
d964164e4a AMDGPU: Prepare for reducing private element size.
Tests for the new scalarize all private access options will be
included with a future commit.

The only functional change is to make the split/scalarize behavior
for private access of > 4 element vectors to be consistent
with the flat/global handling. This makes the spilling worse
in the two changed tests.

llvm-svn: 260804
2016-02-13 04:18:53 +00:00
Kostya Serebryany
ef83d4c558 [libFuzzer] remove the C++-ish variant of FuzzerDriver from the interface
llvm-svn: 260801
2016-02-13 03:59:26 +00:00
Kostya Serebryany
abf7df0972 [libFuzzer] simplify CTOR of MutationDispatcher
llvm-svn: 260800
2016-02-13 03:46:26 +00:00
Kostya Serebryany
cca951bf4c [libFuzzer] get rid of MutationDispatcher::Impl (simplify the code; NFC)
llvm-svn: 260799
2016-02-13 03:37:24 +00:00
Kostya Serebryany
b9687a1cc3 [libFuzzer] get rid of UserSuppliedFuzzer; NFC
llvm-svn: 260798
2016-02-13 03:25:16 +00:00
Kostya Serebryany
9bf814b9ec [libFuzzer] simplify the code around Random. NFC
llvm-svn: 260797
2016-02-13 03:00:53 +00:00
Kostya Serebryany
1bb500faf8 [libFuzzer] remove UserSuppliedFuzzer from the interface (it was a bad idea).
llvm-svn: 260796
2016-02-13 02:39:30 +00:00
Kostya Serebryany
b2451a8b09 [libFuzzer] provide a plain C interface for custom mutators (experimental)
llvm-svn: 260794
2016-02-13 02:29:38 +00:00
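
As an illustration of what a plain C custom mutator can look like, here is a minimal sketch. It uses the LLVMFuzzerCustomMutator signature documented for later libFuzzer releases; the experimental interface added in this commit may have differed, so treat the signature as an assumption. The mutation strategy (flip one bit chosen from Seed) is purely illustrative.

```cpp
#include <cstddef>
#include <cstdint>

// Sketch of a custom mutator with C linkage. Assumption: the signature matches
// the one documented for later libFuzzer versions, not necessarily r260794.
// Mutates Data in place and returns the new size, which must not exceed MaxSize.
extern "C" size_t LLVMFuzzerCustomMutator(uint8_t *Data, size_t Size,
                                          size_t MaxSize, unsigned int Seed) {
  if (Size == 0) {
    if (MaxSize == 0)
      return 0; // No room to grow the input.
    Data[0] = static_cast<uint8_t>(Seed); // Start from a single seeded byte.
    return 1;
  }
  // Flip one bit; both the byte and the bit are derived from Seed so the
  // mutation is reproducible for a given seed.
  Data[Seed % Size] ^= static_cast<uint8_t>(1u << ((Seed >> 8) % 8));
  return Size;
}
```

Deriving every choice from Seed keeps the mutator deterministic, which makes crashes found with it easier to reproduce.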
Tom Stellard
04e6c06525 AMDGPU/SI: Add llvm.amdgcn.mov.dpp intrinsic
This intrinsic will be used to expose dpp functionality to higher-level
languages. It will map to the dpp version of v_mov_b32.

llvm-svn: 260792
2016-02-13 02:09:49 +00:00
Keno Fischer
2e799f5e5e [Cloning] Clone every Function's Debug Info
Summary:
Export the CloneDebugInfoMetadata utility, which clones all debug info
associated with a function into the first module. Also use this function
in CloneModule on each function we clone (the CloneFunction entrypoint
already does this).

Without this, cloning a module will lead to DI quality regressions,
especially since r252219 reversed the Function <-> DISubprogram edge
(before we could get lucky and have this edge preserved if the
DISubprogram itself was, e.g. due to location metadata).

This was verified to fix missing debug information in julia and
a unittest to verify the new behavior is included.

Patch by Yichao Yu! Thanks!

Reviewers: loladiro, pcc
Differential Revision: http://reviews.llvm.org/D17165

llvm-svn: 260791
2016-02-13 02:04:29 +00:00
Matt Arsenault
b576160845 Add AMDGPU related triple vendors/OSes
As support expands to more runtimes, we'll need to
distinguish between more than just HSA and unknown.
This also lets us stop using unknown everywhere.

llvm-svn: 260790
2016-02-13 01:56:21 +00:00
Davide Italiano
d5038b8299 [llvm-size] Remove variable used only once.
The use of auto and the name were very weird anyway.

llvm-svn: 260789
2016-02-13 01:52:47 +00:00
Davide Italiano
e227c6b3fe [llvm-size] Make error handling uniform.
llvm-svn: 260786
2016-02-13 01:38:16 +00:00
Matt Arsenault
8dbe43e457 AMDGPU: Cleanup includes and random macros
llvm-svn: 260784
2016-02-13 01:24:08 +00:00
Matt Arsenault
c77e92f437 AMDGPU: Add intrinsics for sin/cos
These provide direct access to the hardware instruction without
the unit conversion that llvm.sin/llvm.cos lowering requires.

llvm-svn: 260782
2016-02-13 01:19:56 +00:00
Matt Arsenault
4ff4c396c1 AMDGPU: Rename intrinsic to better match instruction name
Also fixes missing f32 test.

llvm-svn: 260780
2016-02-13 01:03:00 +00:00
Tom Stellard
a308dba9ed AMDGPU/SI: Add instruction defs for VOP1 DPP instructions
Reviewers: nhaustov, cfang, arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17159

llvm-svn: 260774
2016-02-13 00:51:31 +00:00
Matt Arsenault
4cdd9956f3 AMDGPU: Fix broken condition causing warning
llvm-svn: 260773
2016-02-13 00:36:10 +00:00
Tom Stellard
6bf7a73a66 AMDGPU/SI: Organize intrinsics by subtarget
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17210

llvm-svn: 260771
2016-02-13 00:29:57 +00:00
Pirama Arumuga Nainar
d09ef3dd88 Don't combine fp_round (fp_round x) if f80 to f16 is generated
Summary:
This patch skips DAG combine of fp_round (fp_round x) if it results in
an fp_round from f80 to f16.

fp_round from f80 to f16 always generates an expensive (and as yet,
unimplemented) libcall to __truncxfhf2.  This prevents selection of
native f16 conversion instructions from f32 or f64.  Moreover, the first
(value-preserving) fp_round from f80 to either f32 or f64 may become a
NOP in platforms like x86.

Reviewers: ab

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D17221

llvm-svn: 260769
2016-02-13 00:08:05 +00:00
Alexey Samsonov
cd76db6136 Fix Windows buildbot breakage.
llvm-svn: 260766
2016-02-12 23:51:06 +00:00
Tom Stellard
9943755afb AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions
Reviewers: arsenm

Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16603

llvm-svn: 260765
2016-02-12 23:45:29 +00:00
Yunzhong Gao
9e56bc9706 Disable the vzeroupper insertion pass on PS4.
Differential Revision: http://reviews.llvm.org/D16837

llvm-svn: 260764
2016-02-12 23:37:57 +00:00
Justin Bogner
dec93a4fca cmake: Simplify the iOS.cmake toolchain
- Remove a comment that was clearly copy pasted from Android.cmake and
  isn't relevant.
- Remove the toolchain's sensitivity to the environment. It's less
  error prone to just allow users to set CMAKE_OSX_SYSROOT if they
  want to use a custom SDK.
- Stop explicitly setting -mios-version-min to the default value. It
  just adds needless complexity.

This makes building the native tablegen work for me even when SDKROOT
is set in the environment (or passed in as -DCMAKE_OSX_SYSROOT).

llvm-svn: 260763
2016-02-12 23:36:05 +00:00
Derek Schuff
eed189f4ef [WebAssembly] Report more meaningful error messages for some unsupported
ops.

Computed gotos and RETURNADDR may never be supported; we can do
FRAMEADDR in the future.

llvm-svn: 260759
2016-02-12 22:56:03 +00:00
Krzysztof Parzyszek
702277f07f [Hexagon] Optimize stack slot spills
Replace spills to memory with spills to registers, if possible. This
applies mostly to predicate registers (both scalar and vector), since
they are very limited in number. A spill of a predicate register may
happen even if there is a general-purpose register available. In cases
like this the stack spill/reload may be eliminated completely.

This optimization will consider all stack objects, regardless of where
they came from and try to match the live range of the stack slot with
a dead range of a register from an appropriate register class.

llvm-svn: 260758
2016-02-12 22:53:35 +00:00
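
The matching step described in the Hexagon spill commit above, pairing a stack slot's live range with a dead range of a register, can be sketched with plain intervals. This is a simplified model and not the pass's actual data structures: Range, findFreeRegister, and the half-open [Start, End) convention are all assumptions made for illustration, and register classes are ignored.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical half-open interval [Start, End) over instruction indices.
struct Range {
  unsigned Start;
  unsigned End;
};

// Returns the index of the first register that has a dead range fully covering
// the stack slot's live range, or -1 if no register qualifies.
// DeadRanges[R] lists the intervals during which register R holds no live value.
int findFreeRegister(const Range &SlotLive,
                     const std::vector<std::vector<Range>> &DeadRanges) {
  for (std::size_t R = 0; R < DeadRanges.size(); ++R)
    for (const Range &D : DeadRanges[R])
      if (D.Start <= SlotLive.Start && SlotLive.End <= D.End)
        return static_cast<int>(R);
  return -1;
}
```

When such a register exists, the spill and reload through memory could be replaced by register copies, which is the effect the commit message describes.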
David Majnemer
85466f02fc [llvm-pdbdump] Start to decode some streams
We can decode a little bit of the first stream now.

llvm-svn: 260754
2016-02-12 22:27:44 +00:00
Krzysztof Parzyszek
882483351a [Hexagon] Mark HVX registers as volatile
llvm-svn: 260753
2016-02-12 22:26:44 +00:00
Sanjay Patel
9507ea18eb fix test to use FileCheck
llvm-svn: 260751
2016-02-12 22:07:54 +00:00
Derek Schuff
6f279569a2 [WebAssembly] Update test expectations after r260737
llvm-svn: 260750
2016-02-12 22:05:08 +00:00
Krzysztof Parzyszek
4004463702 [Hexagon] Recognize more cases in copyPhysReg and stack slot load/store
llvm-svn: 260748
2016-02-12 21:56:41 +00:00
Reid Kleckner
7c03262156 [codeview] Describe local variables in registers
llvm-svn: 260746
2016-02-12 21:48:30 +00:00
Rong Xu
669262d490 [PGO] Add another interface for annotateValueSite
Add another interface to function annotateValueSite() which directly uses the
ValueData array.

Differential Revision: http://reviews.llvm.org/D17108

llvm-svn: 260741
2016-02-12 21:36:17 +00:00