mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 03:33:20 +01:00
208037 Commits

Author SHA1 Message Date
David Green
67f2592469 [ARM][RegAlloc] Add t2LoopEndDec
We currently have problems with the way that low overhead loops are
specified, with LR being spilled between the t2LoopDec and the t2LoopEnd
forcing the entire loop to be reverted late in the backend. As they will
eventually become a single instruction, this patch introduces a
t2LoopEndDec which is the combination of the two, combined before
register allocation to make sure this does not fail.

Unfortunately this instruction is a terminator that produces a value
(and also branches - it only produces the value around the branching
edge). So this needs some adjustment to phi elimination and the register
allocator to make sure that we do not spill this LR def around the loop
(needing to put a spill after the terminator). We treat the loop very
carefully, making sure that there is nothing else like calls that would
break its ability to use LR. For that, this adds an
isUnspillableTerminator to opt in to the new behaviour.

There is a chance that this could cause problems, and so I have added an
escape option in case. But I have not seen any problems in the testing
that I've tried, and not reverting Low overhead loops is important for
our performance. If this does work then we can hopefully do the same for
t2WhileLoopStart and t2DoLoopStart instructions.

This patch also contains the code needed to convert or revert the
t2LoopEndDec in the backend (which just needs a subs; bne) and the code
pre-ra to create them.

Differential Revision: https://reviews.llvm.org/D91358
2020-12-10 12:14:23 +00:00
Martin Storsjö
102c71a022 [llvm-rc] Handle driveless absolute windows paths when loading external files
When llvm-rc loads an external file, it looks for it relative to
a number of include directories and the current working directory.
If the path is considered absolute, llvm-rc tries to open the
filename as such, and doesn't try to open it relative to other
paths.

On Windows, a path name like "\dir\file" isn't considered absolute
as it lacks the drive name, but by appending it on top of the search
dirs, it's not found.

LLVM's sys::path::append just appends such a path (same with a properly
absolute posix path) after the paths it's supposed to be relative to.

This fix doesn't handle the case if the resource script and the
external file are on a different drive than the current working
directory; to fix that, we'd have to make LLVM's sys::path::append
handle appending fully absolute and partially absolute paths (ones
lacking a drive prefix but containing a root directory), or switch
to C++17's std::filesystem.
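
The distinction drawn above can be sketched in standalone C++. This is a minimal illustration, not llvm-rc's code: `PathKind`, `classifyWindowsPath` and `shouldSearchIncludeDirs` are hypothetical names, UNC paths are ignored, and the drive-relative case (`C:file`) is an extra category that the message does not discuss.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical classification of a Windows-style path (illustration only).
enum class PathKind { Relative, RootedNoDrive, DriveRelative, FullyAbsolute };

bool isSeparator(char C) { return C == '\\' || C == '/'; }

PathKind classifyWindowsPath(const std::string &P) {
  // A drive prefix is a letter followed by ':' (for example "C:").
  bool HasDrive = P.size() >= 2 &&
                  std::isalpha(static_cast<unsigned char>(P[0])) &&
                  P[1] == ':';
  std::size_t Rest = HasDrive ? 2 : 0;
  // A root directory is a separator right after the optional drive prefix.
  bool HasRoot = P.size() > Rest && isSeparator(P[Rest]);
  if (HasDrive)
    return HasRoot ? PathKind::FullyAbsolute : PathKind::DriveRelative;
  return HasRoot ? PathKind::RootedNoDrive : PathKind::Relative;
}

// Assumption: only plain relative paths are worth appending to search dirs;
// "\dir\file" has a root directory, so appending it to a search dir is futile.
bool shouldSearchIncludeDirs(const std::string &P) {
  return classifyWindowsPath(P) == PathKind::Relative;
}
```

With this model, `\dir\file` is classified as `RootedNoDrive` and is opened as given instead of being appended to each search directory.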

Differential Revision: https://reviews.llvm.org/D92558
2020-12-10 14:11:06 +02:00
Alexey Lapshin
8c9a1f9e87 [dsymutil][DWARFLinker][NFC] Make interface of AddressMap more general.
Current interface of AddressMap assumes that relocations exist.
That is correct for not-linked object file but is not correct
for linked executable. This patch changes interface in such way
that AddressMap could be used not only with not-linked object files:

hasValidRelocationAt()

replaced with:

hasLiveMemoryLocation()
hasLiveAddressRange()

Differential Revision: https://reviews.llvm.org/D87723
2020-12-10 14:57:08 +03:00
Mirko Brkusanin
e195dd75ce [AMDGPU] Resolve issues when picking between ds_read/write and ds_read2/write2
Both ds_read_b128 and ds_read2_b64 are valid for 128bit 16-byte aligned
loads but the one that will be selected is determined either by the order in
tablegen or by the AddedComplexity attribute. Currently ds_read_b128 has
priority.

While ds_read2_b64 has lower alignment requirements, we cannot always
restrict ds_read_b128 to 16-byte alignment because of unaligned-access-mode
option. This was causing ds_read_b128 to be selected for 8-byte aligned
loads regardless of chosen access mode.

To resolve this we use two patterns for selecting ds_read_b128. One
requires alignment of 16-byte and the other requires
unaligned-access-mode option.

Same goes for ds_write2_b64 and ds_write_b128.
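
The resulting choice can be modelled as a small decision function. This is a toy sketch, not the TableGen patterns themselves: `LdsLoad128` and `select128BitLdsLoad` are hypothetical names, and it assumes that 8-byte alignment is what ds_read2_b64 needs.

```cpp
#include <cstdint>

// Hypothetical outcome of selecting a 128-bit LDS load (illustration only).
enum class LdsLoad128 { ReadB128, Read2B64, Other };

LdsLoad128 select128BitLdsLoad(std::uint32_t AlignBytes,
                               bool UnalignedAccessMode) {
  // First ds_read_b128 pattern: the load is 16-byte aligned.
  if (AlignBytes >= 16)
    return LdsLoad128::ReadB128;
  // Second ds_read_b128 pattern: unaligned-access-mode is enabled.
  if (UnalignedAccessMode)
    return LdsLoad128::ReadB128;
  // Otherwise fall back to ds_read2_b64 (assumed to need 8-byte alignment).
  if (AlignBytes >= 8)
    return LdsLoad128::Read2B64;
  // Lower alignments are outside the scope of this sketch.
  return LdsLoad128::Other;
}
```

An 8-byte aligned load without unaligned-access-mode now maps to `Read2B64` rather than `ReadB128`, which is the case the message describes as broken before the patch.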

Differential Revision: https://reviews.llvm.org/D92767
2020-12-10 12:40:49 +01:00
David Green
b4f282c77f [ARM] Additional test for Min loop. NFC
2020-12-10 10:49:00 +00:00
David Green
04038723cc [ARM] Remove copies from low overhead phi inductions.
The phi created in a low overhead loop gets created with a default
register class it seems. There are then copies inserted between the low
overhead loop pseudo instructions (which produce/consume GPRlr
instructions) and the phi holding the induction. This patch removes
those as a step towards attempting to make t2LoopDec and t2LoopEnd a
single instruction, and appears useful in its own right as shown in the
tests.

Differential Revision: https://reviews.llvm.org/D91267
2020-12-10 10:30:31 +00:00
Jun Ma
0d59bcd1c0 [TruncInstCombine] Remove scalable vector restriction
Differential Revision: https://reviews.llvm.org/D92819
2020-12-10 18:00:19 +08:00
Benjamin Kramer
5449a6bbe1 Remove Shapet assignment operator that's identical to the default. NFC.
2020-12-10 10:58:41 +01:00
Benjamin Kramer
94cf8f1d41 [Hexagon] Fold single-use variables into assert. NFCI.
Silences unused variable warnings in Release builds.
2020-12-10 10:53:56 +01:00
David Green
683e29b9a4 [ARM] MVE vcreate tests, for dual lane moves. NFC
2020-12-10 09:17:34 +00:00
LLVM GN Syncbot
dce1606292 [gn build] Port f80b29878b0
2020-12-10 09:13:09 +00:00
Luo, Yuanke
4a2765406d [X86] AMX programming model.
This patch implements the AMX programming model that was discussed on llvm-dev
(http://lists.llvm.org/pipermail/llvm-dev/2020-August/144302.html).
Thanks to Hal for the good suggestion on the RA. The fast RA is not in the patch yet.
This patch implements 7 components.

1. The C interface for the end user.
2. The AMX intrinsics in LLVM IR.
3. Transform load/store <256 x i32> to AMX intrinsics or split the
   type into two <128 x i32>.
4. The Lowering from AMX intrinsics to AMX pseudo instruction.
5. Insert pseudo ldtilecfg and build the def-use between ldtilecfg and AMX
   instructions.
6. The register allocation for tile register.
7. Morph AMX pseudo instruction to AMX real instruction.

Change-Id: I935e1080916ffcb72af54c2c83faa8b2e97d5cb0

Differential Revision: https://reviews.llvm.org/D87981
2020-12-10 17:01:54 +08:00
Lang Hames
bc304940bf [JITLink][ELF] Reformat/add debug logging in ELF_x86_64.cpp.
Moves symbol name to the end of the output and makes other columns fixed width
so that they line up.
2020-12-10 18:46:44 +11:00
Kazu Hirata
9793163151 [Tablegen] Use llvm::is_contained (NFC)
2020-12-09 23:34:07 -08:00
Sergey Dmitriev
9302af918f [llvm-link][NFC] Minor cleanup
llvm::Linker::linkModules() is a static member, so there is no need
to pass reference to llvm::Linker instance to loadArFile() function.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D92918
2020-12-09 23:16:13 -08:00
Kazushi (Jam) Marukawa
4bf07b5b90 [VE][NFC] Disable VP tests
The recently added VP tests don't work in Release mode.  They work in
Debug mode, so I disable them in Release mode to make the tests pass.
2020-12-10 15:13:05 +09:00
Arthur Eubanks
07235ffbaf [test] Fix coro-retcon.ll under NPM
The full aa-pipeline is required to remove the extra store.
2020-12-09 22:04:59 -08:00
Alina Sbirlea
1574bc6938 [MemorySSA/docs] Extend MemorySSA documentation.
2020-12-09 18:00:16 -08:00
Arthur Eubanks
3c001d0408 [LTO][NPM] Default to using NPM under ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER
This affects users of LTO that don't explicitly set UseNewPM.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92894
2020-12-09 17:48:47 -08:00
Fangrui Song
ead0e3209d Rename -plugin-opt=no-new-pass-manager to -plugin-opt=legacy-pass-manager
2020-12-09 16:43:30 -08:00
Stanislav Mekhanoshin
7b785e39c8 [AMDGPU] Fix expansion of 192 bit spills in PEI
Differential Revision: https://reviews.llvm.org/D92979
2020-12-09 16:36:29 -08:00
Krzysztof Parzyszek
071e5ba62e [Hexagon] Silence warnings about unused objects
2020-12-09 17:54:10 -06:00
Krzysztof Parzyszek
1a325ba866 [Hexagon] Fix build: move template specialization into namespace scope
2020-12-09 17:40:15 -06:00
Scott Linder
6218f09b03 [MC] Fix ICE with non-newline terminated input
There is an explicit option for the lexer to support this, but we crash
when `-preserve-comments` is enabled because it checks for
`getTok().getString().empty()` to detect the case. This doesn't
work currently because the lexer reports this case as a string of length
1, containing a null byte.

Change the lexer to instead report this case via an empty string, as the
null terminator isn't logically a part of the textual input, and the
check for `.empty()` seems natural and obvious in the calling code.
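
The difference between the two reporting styles is easy to reproduce with `std::string` alone. In this sketch the function names are hypothetical stand-ins for the token text the lexer hands back at the end of non-newline-terminated input; none of them are MC APIs.

```cpp
#include <string>

// Old behaviour as described above: a string of length 1 holding a null byte.
std::string eofTokenTextBefore() { return std::string(1, '\0'); }

// New behaviour: an empty string, since the terminator is not textual input.
std::string eofTokenTextAfter() { return std::string(); }

// The caller-side check quoted in the message, applied to plain strings.
bool looksLikeUnterminatedEnd(const std::string &TokenText) {
  return TokenText.empty();
}
```

A one-character string containing `'\0'` is not empty, so the `.empty()` check only detects the end-of-input case once the lexer reports an empty string.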

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D92681
2020-12-09 23:39:32 +00:00
LLVM GN Syncbot
89ea615411 [gn build] Port f5d07a05bbd
2020-12-09 23:12:27 +00:00
Krzysztof Parzyszek
deb082d99d [Hexagon] Realign HVX vectors wherever possible
Introduce HexagonVectorCombine as a helper class for vector-related
optimizations.
2020-12-09 17:11:25 -06:00
Saleem Abdulrasool
4ae2c1c200 X86: use a data driven configuration of Windows x86 libcalls (NFC)
Rather than creating a series of associated calls and ensuring that
everything is lined up, use a table driven approach that ensures that
the two always stay in sync.
2020-12-09 22:49:11 +00:00
Scott Linder
19b5d1fffc [MC][AMDGPU] Consume EndOfStatement in asm parser
Avoids spurious newlines showing up in the output when emitting assembly
via MC.

Reviewed By: MaskRay, arsenm

Differential Revision: https://reviews.llvm.org/D92690
2020-12-09 21:45:55 +00:00
Craig Topper
067e0b2781 [X86] Use APInt::isSignedIntN instead of isIntN for 64-bit ANDs in X86DAGToDAGISel::IsProfitableToFold
Pretty sure we meant to be checking signed 32 immediates here
rather than unsigned 32 bit. I suspect I messed this up because
in MathExtras.h we have isIntN and isUIntN so isIntN differs in
signedness depending on whether you're using APInt or plain integers.
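
The signed/unsigned distinction matters for values near the 32-bit boundary. The helpers below are a standalone sketch with hypothetical names (`fitsSigned`, `fitsUnsigned`); they do not reproduce APInt or MathExtras.h, they only show how the two range checks disagree.

```cpp
#include <cstdint>

// Does X fit in an N-bit two's-complement signed integer? Requires 1 <= N.
bool fitsSigned(unsigned N, std::int64_t X) {
  if (N >= 64)
    return true;
  std::int64_t Lo = -(std::int64_t(1) << (N - 1));
  std::int64_t Hi = (std::int64_t(1) << (N - 1)) - 1;
  return X >= Lo && X <= Hi;
}

// Does X fit in an N-bit unsigned integer?
bool fitsUnsigned(unsigned N, std::uint64_t X) {
  if (N >= 64)
    return true;
  return X < (std::uint64_t(1) << N);
}
```

For example, 0xFFFFFFFF passes the unsigned 32-bit check but fails the signed one, while -1 passes the signed check and, reinterpreted as a 64-bit unsigned value, fails the unsigned one.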

This fixes a case where we didn't fold a constant created
by shrinkAndImmediate. Since shrinkAndImmediate doesn't topologically
sort constants it creates, we can fail to convert the Constant
to a TargetConstant. This leads to very strange behavior later.

Fixes PR48458.
2020-12-09 13:39:07 -08:00
Fangrui Song
7f2a5362d1 [LLD][gold] Add -plugin-opt=no-new-pass-manager
-DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=on configured LLD and LLVMgold.so
will use the new pass manager by default. Add an option to
use the legacy pass manager. This will also be used by the Clang driver
when -fno-new-pass-manager (D92915) / -fno-experimental-new-pass-manager is set.

Reviewed By: aeubanks, tejohnson

Differential Revision: https://reviews.llvm.org/D92916
2020-12-09 13:31:03 -08:00
Yuanfang Chen
bbd8d2f9e4 [NFCI] Add missing triple to several LTO tests
Also remove the module triple of clang/test/CodeGenObjC/arc.ll, the
command-line triple is all it needs.
2020-12-09 13:13:58 -08:00
Scott Linder
ad95bab280 [AMDGPU][MC] Restore old error position for "too few operands"
Revert part of https://reviews.llvm.org/D92084 to make it simpler to
start consuming the EndOfStatement token within AMDGPU's
ParseInstruction in a future patch. This also brings us back to what
every other target currently does.

A future change to move the position back to the end of the statement
would likely need to audit all of the AMDGPUOperand SMLoc ranges, and
determine the SMLoc for the last character of the last operand.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D92960
2020-12-09 21:09:47 +00:00
Sam Clegg
cee0963ebc [WebAssembly] Add support for named data sections in wasm binaries
Followup to https://reviews.llvm.org/D91769 which added support
for named globals.

Differential Revision: https://reviews.llvm.org/D92909
2020-12-09 12:57:07 -08:00
Mircea Trofin
99ada595aa [NFC] Removed unused prefixes in llvm/test/CodeGen/AArch64
Differential Revision: https://reviews.llvm.org/D92943
2020-12-09 12:47:51 -08:00
Florian Hahn
2e708d6d56 [AArch64] Add aarch64_neon_vcmla{_rot{90,180,270}} intrinsics.
Add builtins required to implement vcmla and rotated variants from
the ACLE

Reviewed By: t.p.northover

Differential Revision: https://reviews.llvm.org/D92929
2020-12-09 19:46:49 +00:00
Michael Munday
0e3bafc4e2 [RISCV][NFC] Regenerate RISCV CodeGen tests
Regenerated using:

./llvm/utils/update_llc_test_checks.py -u llvm/test/CodeGen/RISCV/*.ll

This has added comments to spill-related instructions and added @plt to
some symbols.

Differential Revision: https://reviews.llvm.org/D92841
2020-12-09 19:42:49 +00:00
Jianzhou Zhao
38add4d4b7 [dfsan] Track field/index-level shadow values in variables
*************
* The problem
*************
See motivation examples in compiler-rt/test/dfsan/pair.cpp. The current
DFSan always uses a 16bit shadow value for a variable with any type by
combining all shadow values of all bytes of the variable. So it cannot
distinguish two fields of a struct: each field's shadow value equals the
combined shadow value of all fields. This introduces an overtaint issue.

Consider a parsing function

   std::pair<char*, int> get_token(char* p);

where p points to a buffer to parse, the returned pair includes the next
token and the pointer to the position in the buffer after the token.

If the token is tainted, then both the returned pointer and int are
tainted. If the parser keeps on using get_token for the rest of the parsing,
all the following outputs are tainted because of the tainted pointer.

This CL is the first change to address the issue.

**************************
* The proposed improvement
**************************
Eventually all fields and indices have their own shadow values in
variables and memory.

For example, variables with type {i1, i3}, [2 x i1], {[2 x i4], i8},
[2 x {i1, i1}] have shadow values with type {i16, i16}, [2 x i16],
{[2 x i16], i16}, [2 x {i16, i16}] correspondingly; variables with
primary type still have shadow values i16.
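
The overtaint problem and the proposed shape of the fix can be modelled in a few lines. This is a toy model, not DFSan code: all names are hypothetical, and labels are unioned with a bitwise OR purely for illustration.

```cpp
#include <cstdint>

using Shadow = std::uint16_t; // one 16-bit shadow value, as in the message

// Toy label union (assumption: bitwise OR stands in for combining labels).
Shadow combine(Shadow A, Shadow B) { return static_cast<Shadow>(A | B); }

// Current scheme: a single shadow value covers the whole pair.
struct PairShadowCombined {
  Shadow All;
};

// Proposed scheme: a {i16, i16} shadow, one value per field.
struct PairShadowPerField {
  Shadow First;
  Shadow Second;
};

PairShadowCombined shadowCombined(Shadow First, Shadow Second) {
  return {combine(First, Second)};
}

PairShadowPerField shadowPerField(Shadow First, Shadow Second) {
  return {First, Second};
}
```

With a tainted first field (label 1) and a clean second field (label 0), the combined scheme reports label 1 for both fields, while the per-field scheme keeps the second field at 0.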

***************************
* A potential implementation plan
***************************

The idea is to adopt the change incrementally.

1) This CL
Support field-level accuracy at variables/args/ret in TLS mode,
load/store/alloca still use combined shadow values.

After the alloca promotion and SSA construction phases (>=-O1), we
assume alloca and memory operations are reduced. So if struct
variables do not relate to memory, their tracking is accurate at
field level.

2) Support field-level accuracy at alloca
3) Support field-level accuracy at load/store

These two should make O0 and real memory access work.

4) Support vector if necessary.
5) Support Args mode if necessary.
6) Support passing more accurate shadow values via custom functions if
necessary.

***************
* About this CL.
***************
The CL did the following

1) extended TLS arg/ret to work with aggregate types. This is similar
to what MSan does.

2) implemented how to map between an original type/value/zero-const to
its shadow type/value/zero-const.

3) extended (insert|extract)value to use field/index-level propagation.

4) for other instructions, propagation rules are combining inputs by or.
The CL converts between aggregate and primary shadow values at the
cases.

5) Custom function interfaces also need such a conversion because
all existing custom functions use i16. It is unclear whether custom
functions need more accurate shadow propagation yet.

6) Added test cases for aggregate type related cases.

Reviewed-by: morehouse

Differential Revision: https://reviews.llvm.org/D92261
2020-12-09 19:38:35 +00:00
Justin Bogner
833d76977e Limit the recursion depth of SelectionDAG::isSplatValue()
This method previously always recursively checked both the left-hand
side and right-hand side of binary operations for splatted (broadcast)
vector values to determine if the parent DAG node is a splat.

Like several other SelectionDAG methods, limit the recursion depth to
MaxRecursionDepth (6). This prevents stack overflow.
See also https://issuetracker.google.com/173785481
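
The depth cap can be illustrated on a toy expression tree. The sketch below is not SelectionDAG code: `Node`, `isSplat` and the builder helpers are hypothetical, and only the limit of 6 is taken from the message.

```cpp
#include <memory>
#include <utility>

constexpr unsigned MaxRecursionDepth = 6; // value quoted in the message

// Toy node: a leaf that is or is not a splat, or a binary operation.
struct Node {
  enum Kind { Splat, NonSplat, BinOp } K = Splat;
  std::unique_ptr<Node> LHS, RHS;
};

// A binary op is a splat if both operands are; give up past the depth limit.
bool isSplat(const Node &N, unsigned Depth = 0) {
  if (Depth >= MaxRecursionDepth)
    return false; // conservative answer instead of unbounded recursion
  switch (N.K) {
  case Node::Splat:
    return true;
  case Node::NonSplat:
    return false;
  case Node::BinOp:
    return isSplat(*N.LHS, Depth + 1) && isSplat(*N.RHS, Depth + 1);
  }
  return false;
}

std::unique_ptr<Node> leaf(bool IsSplatLeaf) {
  auto N = std::make_unique<Node>();
  N->K = IsSplatLeaf ? Node::Splat : Node::NonSplat;
  return N;
}

std::unique_ptr<Node> binop(std::unique_ptr<Node> L, std::unique_ptr<Node> R) {
  auto N = std::make_unique<Node>();
  N->K = Node::BinOp;
  N->LHS = std::move(L);
  N->RHS = std::move(R);
  return N;
}

// Build Levels nested binary ops whose leaves are all splats.
std::unique_ptr<Node> chain(unsigned Levels) {
  std::unique_ptr<Node> N = leaf(true);
  for (unsigned I = 0; I < Levels; ++I)
    N = binop(std::move(N), leaf(true));
  return N;
}
```

In this model a chain of five nested operations is still recognised as a splat, while a chain of six is conservatively rejected even though every leaf is a splat.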

Patch by Nicolas Capens. Thanks!

Differential Revision: https://reviews.llvm.org/D92421
2020-12-09 10:35:07 -08:00
Alexey Bader
792476a7c3 [MCJIT] Add cmake variables to customize ittapi git location and revision.
To support llorg builds this patch provides the following changes:

1)  Added cmake variable ITTAPI_GIT_REPOSITORY to control the location of ITTAPI repository.
     Default value of ITTAPI_GIT_REPOSITORY is github location: https://github.com/intel/ittapi.git
     Also, the separate cmake variable ITTAPI_GIT_TAG was added for repo tag.
2)  Added cmake variable ITTAPI_SOURCE_DIR to control the place where the repo will be cloned.
     Default value of ITTAPI_SOURCE_DIR is build area: PROJECT_BINARY_DIR

Reviewed By: etyurin, bader

Patch by ekovanov.

Differential Revision: https://reviews.llvm.org/D91935
2020-12-09 21:04:24 +03:00
Arthur Eubanks
286730195f Reland Pin -loop-reduce to legacy PM
This was accidentally reverted by a later change.

LSR currently only runs in the codegen pass manager.
There are a couple issues with LSR and the NPM.
1) Lots of tests assume that LCSSA isn't run before LSR. This breaks a
bunch of tests' expected output. This is fixable with some time put in.
2) LSR doesn't preserve LCSSA. See
llvm/test/Analysis/MemorySSA/update-remove-deadblocks.ll. LSR's use of
SCEVExpander is the only use of SCEVExpander where the PreserveLCSSA option is
off. Turning it on causes some code sinking out of loops to fail due to
SCEVExpander's inability to handle the newly created trivial PHI nodes in the
broken critical edge (I was looking at
llvm/test/Transforms/LoopStrengthReduce/X86/2011-11-29-postincphi.ll).
I also tried simply just calling formLCSSA() at the end of LSR, but the extra
PHI nodes cause regressions in codegen tests.

We'll delay figuring these issues out until later.

This causes the number of check-llvm failures with -enable-new-pm true
by default to go from 60 to 29.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D92796
2020-12-09 09:57:57 -08:00
Fangrui Song
7056fcfd30 [CMake] Add llvm-profgen to LLVM_TEST_DEPENDS
Otherwise `check-llvm-*` may not rebuild llvm-profgen, causing llvm-profgen tests
to fail if llvm-profgen happens to be stale.
2020-12-09 09:34:51 -08:00
Mircea Trofin
5dea8b8d27 [FileCheck] Enforce --allow-unused-prefixes=false for llvm/test/Transforms
Explicitly opt-out llvm/test/Transforms/Attributor.

Verified by flipping the default value of allow-unused-prefixes and
observing that none of the failures were under llvm/test/Transforms.

Differential Revision: https://reviews.llvm.org/D92404
2020-12-09 08:51:38 -08:00
LLVM GN Syncbot
5fe8872286 [gn build] Port b804eef0905
2020-12-09 16:19:07 +00:00
LLVM GN Syncbot
f4bc2a0129 [gn build] Port ac7864ec019
2020-12-09 16:19:07 +00:00
LLVM GN Syncbot
d32f21ec22 [gn build] Port 5934a79196b
2020-12-09 16:19:06 +00:00
Kazushi (Jam) Marukawa
12d012b50c [VE] Add vsum and vfsum intrinsic instructions
Add vsum and vfsum intrinsic instructions and regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92938
2020-12-10 01:11:53 +09:00
Paul C. Anagnostopoulos
02220af447 [TableGen] Cache the vectors of records returned by getAllDerivedDefinitions().
Differential Revision: https://reviews.llvm.org/D92674
2020-12-09 10:54:04 -05:00
Sanjay Patel
d6e6ff92ec [VectorCombine] allow peeking through an extractelt when creating a vector load
This is an enhancement to load vectorization that is motivated by
a pattern in https://llvm.org/PR16739.
Unfortunately, it's still not enough to make a difference there.
We will have to handle multi-use cases in some better way to avoid
creating multiple overlapping loads.

Differential Revision: https://reviews.llvm.org/D92858
2020-12-09 10:36:14 -05:00
Roman Lebedev
ae9bbdd2bb [InstCombine] canonicalizeSaturatedAdd(): last fold is only valid for strict comparison (PR48390)
We could create uadd.sat under incorrect circumstances
if a select with -1 as the false value was canonicalized
by swapping the T/F values. Unlike the other transforms
in the same function, it is not invariant to equality.
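
The sensitivity to equality can be checked exhaustively on 8-bit values. The sketch assumes the fold in question is the wrap-detection idiom `((X + Y) u< X) ? -1 : (X + Y)`, which the message does not spell out; all function names are hypothetical.

```cpp
#include <cstdint>

// Reference semantics of an 8-bit unsigned saturating add.
std::uint8_t uaddSat(std::uint8_t X, std::uint8_t Y) {
  unsigned Sum = unsigned(X) + unsigned(Y);
  return Sum > 0xFF ? std::uint8_t(0xFF) : static_cast<std::uint8_t>(Sum);
}

// Strict form: ((X + Y) u< X) ? -1 : (X + Y)
std::uint8_t strictIdiom(std::uint8_t X, std::uint8_t Y) {
  std::uint8_t Sum = static_cast<std::uint8_t>(X + Y); // wraps modulo 256
  return Sum < X ? std::uint8_t(0xFF) : Sum;
}

// Non-strict form: ((X + Y) u<= X) ? -1 : (X + Y)
std::uint8_t nonStrictIdiom(std::uint8_t X, std::uint8_t Y) {
  std::uint8_t Sum = static_cast<std::uint8_t>(X + Y); // wraps modulo 256
  return Sum <= X ? std::uint8_t(0xFF) : Sum;
}

// Count the (X, Y) pairs on which an idiom disagrees with uaddSat.
unsigned countMismatches(std::uint8_t (*Idiom)(std::uint8_t, std::uint8_t)) {
  unsigned Bad = 0;
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned Y = 0; Y < 256; ++Y)
      if (Idiom(static_cast<std::uint8_t>(X), static_cast<std::uint8_t>(Y)) !=
          uaddSat(static_cast<std::uint8_t>(X), static_cast<std::uint8_t>(Y)))
        ++Bad;
  return Bad;
}
```

The strict form agrees with the saturating add on every input. The non-strict form differs whenever Y is 0 and X is not already the maximum: the sum equals X, so the select yields -1 instead of X.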

Some alive proofs: https://alive2.llvm.org/ce/z/emmKKL

Based on original patch by David Green!

Fixes https://bugs.llvm.org/show_bug.cgi?id=48390

Differential Revision: https://reviews.llvm.org/D92717
2020-12-09 18:19:09 +03:00
Roman Lebedev
be06a88cc2 [NFC][InstCombine] Add test coverage for @llvm.uadd.sat canonicalization
The non-strict variants are already handled because they are canonicalized
to strict variants by swapping hands in both the select and icmp,
and the fold simply considers that strictness is irrelevant here.

But that isn't actually true for the last pattern, as PR48390 reports.
2020-12-09 18:19:08 +03:00