mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

140890 Commits

Author SHA1 Message Date
Christudasan Devadasan
f694eea8d3 [AMDGPU] Some refactoring after D90404. NFC. 2020-11-01 13:18:53 +05:30
Christudasan Devadasan
45dd9c1c1d [AMDGPU] Add alignment check for v3 to v4 load type promotion
It should be enabled only when the load alignment is at least 8-byte.

Fixes: SWDEV-256824

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D90404
2020-11-01 12:05:34 +05:30
Ayke van Laethem
a7eefcbc05 [AVR] Improve inline rotate/shift expansions
These expansions were rather inefficient and were done with more code
than necessary. This change optimizes them to use expansions more
similar to GCC. The code size is the same (when optimizing for code
size) but somehow LLVM reorders blocks in a non-optimal way. Still, this
should be an improvement with a reduction in code size of around 0.12%
(when building compiler-rt).

Differential Revision: https://reviews.llvm.org/D86418
2020-10-31 23:15:49 +01:00
Florian Hahn
dc7616bd85 [DSE] Use same logic as legacy impl to check if free kills a location.
This patch updates DSE + MemorySSA to use the same check as the legacy
implementation to determine if a location is killed by a free call.

This changes the existing behavior so that a free does not kill
locations before the start of the freed pointer.

This should fix PR48036.
2020-10-31 20:09:25 +00:00
Florian Hahn
94fd7ed787 Reland "[SLP] Consider alternatives for cost of select instructions."
This reverts the revert commit a1b53db32418cb6ed6f5b2054d15a22b5aa3aeb9.

This patch includes a fix for a reported issue, caused by
matchSelectPattern returning UMIN for selects of pointers in
some cases by looking through some connected casts.

For now, ensure integer intrinsics are only returned for selects of
ints or int vectors.
2020-10-31 16:52:36 +00:00
Paul C. Anagnostopoulos
2d0f8816ca [TableGen] Eliminate uses of true and false in .td files.
They occurred in one NVPTX file and some test files.

Differential Revision: https://reviews.llvm.org/D90513
2020-10-31 10:54:33 -04:00
David Green
e722c3abc4 [ARM] Fix crash for gather of pointer costs.
If the elt size is unknown due to it being a pointer, a comparison
against 0 will cause an assert. Make sure the elt size is large enough
before comparing and for the moment just return the scalar cost.
2020-10-31 13:10:14 +00:00
Simon Pilgrim
70069c9ff3 [InstCombine] foldSelectRotate - generalize to foldSelectFunnelShift
This is the last of the rotate->funnel shift InstCombine generalizations for PR46896

We still have foldGuardedRotateToFunnelShift to deal with in AggressiveInstCombine

Differential Revision: https://reviews.llvm.org/D90382
2020-10-31 12:32:34 +00:00
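The generalization in 70069c9ff3 rests on a rotate being a funnel shift whose two inputs are the same value. A minimal sketch of that relationship follows; the helper names `fshl` and `rotl` and the 32-bit width are assumptions chosen for illustration, not LLVM APIs.

```python
WIDTH = 32                      # assumed bit width for this sketch
MASK = (1 << WIDTH) - 1

def fshl(hi, lo, shift):
    """Funnel shift left: concatenate hi:lo, shift left, keep the top WIDTH bits."""
    shift %= WIDTH              # the shift amount is taken modulo the bit width
    wide = ((hi & MASK) << WIDTH) | (lo & MASK)
    return ((wide << shift) >> WIDTH) & MASK

def rotl(x, shift):
    """Plain rotate left, written without the funnel-shift helper."""
    shift %= WIDTH
    x &= MASK
    if shift == 0:
        return x
    return ((x << shift) | (x >> (WIDTH - shift))) & MASK

# A rotate is the special case where both funnel-shift inputs are equal.
assert fshl(0x80000001, 0x80000001, 1) == rotl(0x80000001, 1)
```

With different `hi` and `lo` inputs the same helper models the general funnel shift that the renamed fold targets.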
Simon Pilgrim
c63dceb3c6 [X86] Make some basic VarArgsLoweringHelper helper methods const. NFCI.
Fixes a number of cppcheck remarks.
2020-10-31 12:16:49 +00:00
Simon Pilgrim
6ecc9d5cd6 [X86] Make the X86FrameSortingComparator operator const. NFCI.
Fixes a cppcheck remark.
2020-10-31 12:16:49 +00:00
Simon Pilgrim
8ba90ee18c [CSE] Make some basic EarlyCSE::StackNode helper methods const. NFCI.
Fixes a number of cppcheck remarks.
2020-10-31 12:16:48 +00:00
Simon Pilgrim
cd24afa0df [Bitcode] Make some basic PlaceholderQueue/MetadataLoaderImpl helper methods const. NFCI.
Fixes a number of cppcheck remarks.
2020-10-31 12:16:48 +00:00
Andrea Di Biagio
7a72a7505a [MCA][LSUnit] Correctly update the internal group flags on store barrier execution. Fixes PR48024.
This is likely to be a regression introduced by my last refactoring of the
LSUnit (commit 5578ec32f9c4f). Before this patch, the
"CurrentStoreBarrierGroupID" index was not correctly reset on store barrier
executions.  This was leading to unexpected crashes like the one reported as
PR48024.
2020-10-31 11:57:27 +00:00
Simon Pilgrim
3efd9f3456 [X86] X86MCTargetDesc - ensure the declaration/definition variable names match. NFCI.
Silences cppcheck mismatch warnings.
2020-10-31 11:50:00 +00:00
Simon Pilgrim
a2ea3f5fb1 [X86] Reduce scope of DestReg and use specific Register type not unsigned. NFCI. 2020-10-31 11:46:07 +00:00
Simon Pilgrim
bbc158d7d8 [X86] printAsmMRegister - make the X86AsmPrinter arg a const reference. NFC.
Fixes cppcheck warning.
2020-10-31 11:41:14 +00:00
Simon Pilgrim
d0fa3ff1b3 [X86] assignValueToReg - fix Wshadow warning. NFCI.
X86OutgoingValueHandler already has a MIB member
2020-10-31 11:39:26 +00:00
Simon Pilgrim
15af1d25b2 [X86] printAsmVRegister - remove unused argument. NFC. 2020-10-31 11:34:28 +00:00
Simon Pilgrim
b884db032a [X86] X86AsmPrinter - ensure the declaration/definition variable names match. NFCI.
Silences cppcheck mismatch warnings.
2020-10-31 11:31:46 +00:00
Simon Pilgrim
ce989b4ba4 [X86] No need to determine pointer when the type is already a MachineInstr*. NFCI.
Caught by cppcheck - appears to be a copy+paste typo as the other var is an iterator that does need the &* pointer operation.
2020-10-31 11:26:25 +00:00
Nikita Popov
3b8d8fb2a4 [Inliner] Consistently apply callsite noalias metadata
Previously, !noalias and !alias.scope metadata on the call site was
applied as part of CloneAliasScopeMetadata(), which short-circuits
if the callee does not use any noalias metadata itself. However,
these two things have no relation to each other.

Consistently apply !noalias and !alias.scope metadata by integrating
this into an existing function that handled !llvm.access.group and
!llvm.mem.parallel_loop_access metadata. The handling for all of
these metadata kinds is essentially the same.
2020-10-31 10:54:45 +01:00
Arthur Eubanks
bb84082e59 Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit 10f2a0d662d8d72eaac48d3e9b31ca8dc90df5a4.

More uint64_t overflows.
2020-10-31 00:25:32 -07:00
Liu, Chen3
0f29f1e458 [X86] Support Intel avxvnni
This patch mainly made the following changes:

1. Support AVX-VNNI instructions;
2. Introduce ExplicitVEXPrefix flag so that vpdpbusd/vpdpbusds/vpdpwssd/vpdpwssds instructions only use vex-encoding when the user explicitly adds the {vex} prefix.

Differential Revision: https://reviews.llvm.org/D89105
2020-10-31 12:39:51 +08:00
Thomas Lively
c38355d3a0 [WebAssembly] Prototype i64x2.bitmask
As proposed in https://github.com/WebAssembly/simd/pull/368.

Differential Revision: https://reviews.llvm.org/D90514
2020-10-30 17:23:30 -07:00
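As a rough illustration of what a bitmask operation such as the one prototyped in c38355d3a0 computes: it gathers the most significant bit of each lane into the low bits of a scalar. The function name below is hypothetical, and lanes are modelled as plain Python integers holding 64-bit values.

```python
def i64x2_bitmask(lanes):
    """Collect the top (sign) bit of each 64-bit lane; lane i lands in result bit i."""
    result = 0
    for i, lane in enumerate(lanes):
        result |= ((lane >> 63) & 1) << i
    return result

# Lane 0 has its top bit set, lane 1 does not, so only result bit 0 is set.
assert i64x2_bitmask([0x8000000000000000, 1]) == 0b01
```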
Wouter van Oortmerssen
9bad51347e [WebAssembly] Fixed DWARF DW_AT_low_pc encoded as 64-bit in wasm64
Also added general wasm64 DWARF test
Also added asserts for unsupported reloc combinations that triggered this bug.

Differential Revision: https://reviews.llvm.org/D90503
2020-10-30 16:42:48 -07:00
Thomas Lively
dcc2517656 [WebAssembly] Prototype i64x2.eq
As proposed in https://github.com/WebAssembly/simd/pull/381. Since it is still
in the prototyping phase, it is only accessible via a target builtin function
and a target intrinsic.

Depends on D90504.

Differential Revision: https://reviews.llvm.org/D90508
2020-10-30 16:38:15 -07:00
Thomas Lively
5b1fe216f7 [WebAssembly] Prototype i64x2.widen_{low,high}_i32x4_{s,u}
As proposed in https://github.com/WebAssembly/simd/pull/290. As usual, these
instructions are available only via builtin functions and intrinsics while they
are in the prototyping stage.

Differential Revision: https://reviews.llvm.org/D90504
2020-10-30 15:44:04 -07:00
Florian Hahn
7964568963 Revert "[SLP] Consider alternatives for cost of select instructions."
This reverts commit 19225704890632cd2552f41ada41600a20db1371.

This appears to cause a crash in the following example

 a, b, c;
 l() {
   int e = a, f = l, g, h, i, j;
   float *d = c, *k = b;
   for (;;)
     for (; g < f; g++) {
       k[h] = d[i];
       k[h - 1] = d[j];
       h += e << 1;
       i += e;
     }
 }

 clang -cc1 -triple i386-unknown-linux-gnu -emit-obj -target-cpu pentium-m -O1 -vectorize-loops -vectorize-slp reduced.c

 llvm::Type *llvm::Type::getWithNewBitWidth(unsigned int) const: Assertion `isIntOrIntVectorTy() && "Original type expected to be a vector of integers or a scalar integer."' failed.
2020-10-30 21:26:14 +00:00
Florian Hahn
f44af1a603 Revert "[TTI] Add VecPred argument to getCmpSelInstrCost."
This reverts commit 73f01e3df58dca9d1596440b866b52929e3878de.

This appears to break
http://lab.llvm.org:8011/#/builders/85/builds/383.
2020-10-30 21:26:14 +00:00
Peter Collingbourne
51d3ffbbc5 hwasan: Support for outlined checks in the Linux kernel.
Add support for match-all tags and GOT-free runtime calls, which
are both required for the kernel to be able to support outlined
checks. This requires extending the access info to let the backend
know when to enable these features. To make the code easier to maintain
introduce an enum with the bit field positions for the access info.

Allow outlined checks to be enabled with -mllvm
-hwasan-inline-all-checks=0. Kernels that contain runtime support for
outlined checks may pass this flag. Kernels lacking runtime support
will continue to link because they do not pass the flag. Old versions
of LLVM will ignore the flag and continue to use inline checks.

With a separate kernel patch [1] I measured the code size of defconfig
+ tag-based KASAN, as well as boot time (i.e. time to init launch)
on a DragonBoard 845c with an Android arm64 GKI kernel. The results
are below:

         code size    boot time
before    92824064      6.18s
after     38822400      6.65s

[1] https://linux-review.googlesource.com/id/I1a30036c70ab3c3ee78d75ed9b87ef7cdc3fdb76

Depends on D90425

Differential Revision: https://reviews.llvm.org/D90426
2020-10-30 14:25:40 -07:00
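The before/after figures quoted in 51d3ffbbc5 can be turned into relative changes with a small calculation. This is only a sketch over the numbers in the table above; `percent_change` is an illustrative helper name.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent (negative means a decrease)."""
    return (after - before) / before * 100.0

code_size = percent_change(92824064, 38822400)   # bytes, from the table
boot_time = percent_change(6.18, 6.65)           # seconds, from the table

print(f"code size: {code_size:+.1f}%")
print(f"boot time: {boot_time:+.1f}%")
```

On these figures the code size drops by roughly 58% while boot time grows by roughly 8%.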
Cameron McInally
7b7e236aab [Legalize] Add legalizations for VECREDUCE_SEQ_FADD
Add Legalization support for VECREDUCE_SEQ_FADD, so that we don't need to depend on ExpandReductionsPass.

Differential Revision: https://reviews.llvm.org/D90247
2020-10-30 16:02:55 -05:00
Peter Collingbourne
1e61e7c7e0 hwasan: Move fixed shadow behind opaque no-op cast as well.
This is a workaround for poor heuristics in the backend where we can
end up materializing the constant multiple times. This is particularly
bad when using outlined checks because we materialize it for every call
(because the backend considers it trivial to materialize).

As a result the field containing the shadow base value will always
be set so simplify the code taking that into account.

Differential Revision: https://reviews.llvm.org/D90425
2020-10-30 13:23:52 -07:00
Peter Collingbourne
6c8896ea88 AArch64: Use SBFX instead of UBFX to extract address granule in outlined HWASan checks.
In a kernel (or in general in environments where bit 55 of the address
is set) the shadow base needs to point to the end of the shadow region,
not the beginning. Bit 55 needs to be sign extended into bits 52-63
of the shadow base offset, otherwise we end up loading from an invalid
address. We can do this by using SBFX instead of UBFX.

Using SBFX should have no effect in the userspace case where bit 55
of the address is clear so we do so unconditionally. I don't think
we need an ABI version bump for this (but one will come anyway when
we switch to x20 for the shadow base register).

Differential Revision: https://reviews.llvm.org/D90424
2020-10-30 12:53:15 -07:00
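The difference between the two bitfield extracts in 6c8896ea88 can be modelled as below. This is a sketch, not the generated code: `ubfx` and `sbfx` are Python stand-ins for the AArch64 instructions, and the field position (least significant bit 4, width 52) is an assumption inferred from the message's statement that bit 55 is extended into bits 52-63.

```python
MASK64 = (1 << 64) - 1

def ubfx(value, lsb, width):
    """Unsigned bitfield extract: take `width` bits from `lsb`, zero-extend."""
    return (value >> lsb) & ((1 << width) - 1)

def sbfx(value, lsb, width):
    """Signed bitfield extract: same field, but replicate its top bit upward."""
    field = ubfx(value, lsb, width)
    if field & (1 << (width - 1)):
        field |= MASK64 & ~((1 << width) - 1)
    return field

user_addr = 0x00007FFF12345670      # bit 55 clear (illustrative address)
kernel_addr = 0xFFFF800012345670    # bit 55 set (illustrative address)

# With bit 55 clear both extracts agree; with it set, only sbfx fills bits 52-63.
assert sbfx(user_addr, 4, 52) == ubfx(user_addr, 4, 52)
assert sbfx(kernel_addr, 4, 52) >> 52 == 0xFFF
assert ubfx(kernel_addr, 4, 52) >> 52 == 0
```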
Peter Collingbourne
1263637c86 AArch64: Switch to x20 as the shadow base register for outlined HWASan checks.
From a code size perspective it turns out to be better to use a
callee-saved register to pass the shadow base. For non-leaf functions
it avoids the need to reload the shadow base into x9 after each
function call, at the cost of an additional stack slot to save the
caller's x20. But with x9 there is also a stack size cost, either
as a result of copying x9 to a callee-saved register across calls or
by spilling it to stack, so for the non-leaf functions the change to
stack usage is largely neutral.

It is also code size (and stack size) neutral for many leaf functions.
Although they now need to save/restore x20 this can typically be
combined via LDP/STP into the x30 save/restore. In the case where
the function needs callee-saved registers or stack spills we end up
needing, on average, 8 more bytes of stack and 1 more instruction
but given the improvements to other functions this seems like the
right tradeoff.

Unfortunately we cannot change the register for the v1 (non short
granules) check because the runtime assumes that the shadow base
register is stored in x9, so the v1 check still uses x9.

Aside from that there is no change to the ABI because the choice
of shadow base register is a contract between the caller and the
outlined check function, both of which are compiler generated. We do
need to rename the v2 check functions though because the functions
are deduplicated based on their names, not on their contents, and we
need to make sure that when object files from old and new compilers
are linked together we don't end up with a function that uses x9
calling an outlined check that uses x20 or vice versa.

With this change code size of /system/lib64/*.so in an Android build
with HWASan goes from 200066976 bytes to 194085912 bytes, or a 3%
decrease.

Differential Revision: https://reviews.llvm.org/D90422
2020-10-30 12:51:30 -07:00
Mircea Trofin
3afc00f390 [FileCheck] Report missing prefixes when more than one is provided.
If more than one prefix is provided - e.g. --check-prefixes=CHECK,FOO - we
don't report if (say) FOO is never used. This may lead to a gap in our
test coverage.

This patch introduces a new option, --allow-unused-prefixes. It
currently is set to true, keeping today's behavior. After we explicitly
set it in tests where this behavior was actually intentional, we will
switch it to false by default.

Differential Revision: https://reviews.llvm.org/D90281
2020-10-30 12:39:29 -07:00
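The gap described in 3afc00f390 is a prefix that is requested but never appears in the check file. A simplified sketch of detecting that follows. It is not FileCheck's implementation: `unused_prefixes` is a hypothetical helper, and the regular expression only recognizes `PREFIX:` and simple `PREFIX-SUFFIX:` directives.

```python
import re

def unused_prefixes(prefixes, check_file_text):
    """Return the requested prefixes that never occur as a directive in the text."""
    unused = []
    for prefix in prefixes:
        # Match "PREFIX:" or "PREFIX-NEXT:" etc., but not "OTHER-PREFIX:".
        pattern = re.compile(r"(?<![\w-])" + re.escape(prefix) + r"(?:-[A-Z]+)?:")
        if not pattern.search(check_file_text):
            unused.append(prefix)
    return unused

text = "; CHECK: add\n; CHECK-NEXT: ret\n"
assert unused_prefixes(["CHECK", "FOO"], text) == ["FOO"]
```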
Anna Thomas
4583a75ff3 [CFG] Replace hardcoded max BBs explored as CL option. NFC.
This option was hardcoded to 32. Changing this as a CL option since we
have seen some cases downstream where increasing this limit allows us to
disprove reachability.

Reviewed-By: jdoerfert
Differential Revision: https://reviews.llvm.org/D90487
2020-10-30 15:11:48 -04:00
Craig Topper
1ca50ab309 [RISCV] Don't use DCI.CombineTo to replace a single result. NFCI
Just return the new node, which is the standard practice.

I also noticed what appeared to be an unnecessary attempt at
creating an ANY_EXTEND where the type should already be correct.
I replaced it with an assert to verify the type.

Differential Revision: https://reviews.llvm.org/D90444
2020-10-30 10:46:32 -07:00
Ronald Wampler
3d6203fc7f [Support] PR42623: Avoid setting the delete-on-close bit if a TempFile doesn't reside on a local drive
On Windows, after commit 881ba104656c40098d4bc90c52613c08136f0fe1, tools
using TempFile would error with "bad file descriptor" when writing the
file on a network drive. It appears that setting the delete-on-close bit via
SetFileInformationByHandle/FileDispositionInfo prevented it from
accessing the file on network drives, and although using
FILE_DISPOSITION_INFO seems to work, it causes other troubles.

Differential Revision: https://reviews.llvm.org/D81803
2020-10-30 13:37:40 -04:00
Arthur Eubanks
3102160c9b [NFC] Clean up PassBuilder
Make DebugLogging a member variable so that users of PassBuilder don't
need to pass it around so much.

Move call to TargetMachine::registerPassBuilderCallbacks() within
PassBuilder so users don't need to remember to call it.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90437
2020-10-30 10:03:59 -07:00
Arthur Eubanks
f52f1e83f5 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-30 10:03:46 -07:00
Pedro Tammela
e61104b08c [NFC][Reg2Mem] modernize loops iterators
This patch updates the Reg2Mem loops to use more modern iterators.

Differential Revision: https://reviews.llvm.org/D90122
2020-10-30 16:50:07 +00:00
Pedro Tammela
5cfbd72aaf [NFC][LoopSimplify] modernize for loops over LoopInfo
This patch modifies two for loops to use the range based syntax.
Since they are equivalent, this patch is tagged NFC.

Differential Revision: https://reviews.llvm.org/D90069
2020-10-30 16:50:07 +00:00
Sanjay Patel
a82a184b76 [x86] add cost overrides for mul with overflow
I'm assuming the standard size integer instructions for this end up as something like:
mulq %rsi
seto %al

And the 'mul' generally has reciprocal throughput of 1 on typical implementations
(higher latency, but that's not handled here).
The default costs may end up much higher than that, and that's what we see in the test diffs.

Vector types are left as a 'TODO'.

Differential Revision: https://reviews.llvm.org/D90431
2020-10-30 12:38:16 -04:00
Amy Huang
a59c18de66 [CodeView] Encode signed int values correctly when emitting S_CONSTANTs
Differential Revision: https://reviews.llvm.org/D90199
2020-10-30 09:28:41 -07:00
Michael Liao
15bc14c9fa [gvn] PRE needs to skip convergent intrinsics/calls.
- As convergent intrinsics/calls could only be moved to
  control-equivalent blocks, or more precisely the same divergent
  branch, PRE needs to skip them.

Differential Revision: https://reviews.llvm.org/D90391
2020-10-30 11:24:40 -04:00
Evgeniy Brevnov
d278d84c98 [DSE] Improve partial overlap detection
Currently isOverwrite returns OW_MaybePartial even for accesses known not to overlap. This is not a big problem for the legacy implementation (since isPartialOverwrite follows isOverwrite and clarifies the result). In contrast, the SSA-based version does a lot of work only to find out later that the accesses don't overlap. Besides the negative impact on compile time, we quickly reach MemorySSAPartialStoreLimit and miss optimization opportunities.

Note: In fact, I think it would be a cleaner implementation if isOverwrite returned a fully clarified result in the first place without the need to call isPartialOverwrite. This can be done as a follow up. What do you think?

Reviewed By: fhahn, asbirlea

Differential Revision: https://reviews.llvm.org/D90371
2020-10-30 22:23:20 +07:00
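The point of d278d84c98 is that two accesses with known offsets and sizes can be classified as non-overlapping up front. A minimal interval sketch of that classification follows; the function and the result strings are hypothetical and do not mirror the actual isOverwrite code or its OW_* results.

```python
def classify_overlap(earlier_off, earlier_size, later_off, later_size):
    """Classify how a later store [later_off, later_end) covers an earlier one."""
    earlier_end = earlier_off + earlier_size
    later_end = later_off + later_size
    if later_end <= earlier_off or earlier_end <= later_off:
        return "none"        # disjoint byte ranges: known not to overlap
    if later_off <= earlier_off and earlier_end <= later_end:
        return "complete"    # later store covers every byte of the earlier one
    return "partial"

assert classify_overlap(0, 4, 4, 4) == "none"
assert classify_overlap(0, 4, 0, 8) == "complete"
assert classify_overlap(0, 8, 4, 8) == "partial"
```

Returning "none" early is what avoids the extra partial-overlap work the message describes.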
Simon Pilgrim
d3042bf5bc Use cast<> instead of dyn_cast<> as we dereference the pointers immediately. NFCI.
Fix clang static analyzer warnings - we're better off relying on cast<> asserting on failure rather than a null dereference crash.
2020-10-30 15:20:40 +00:00
Simon Moll
4fac409eb4 [VE][NFC] Split up lowering init
Split up the monolithic VETargetLowering ctor into three initialization phases:
1. initRegisterClasses()
2. initSPUActions()
3. // TODO initVPUActions()

Reviewed By: kaz7

Differential Revision: https://reviews.llvm.org/D90463
2020-10-30 16:18:27 +01:00
Matt Arsenault
f52c63425c AMDGPU: Fix missing writelane cases to skip with exec=0 2020-10-30 11:15:11 -04:00
Florian Hahn
f1c34f2040 [VPlan] Use isa<> instead of getVPRecipeID in getFirstNonPhi (NFC).
As per the comment in VPRecipeBase, clients should not rely on
getVPRecipeID, as it may change in the future. It should only be used in
classof implementations. Use isa instead in getFirstNonPhi.
2020-10-30 14:56:06 +00:00