1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
Commit Graph

214241 Commits

Author SHA1 Message Date
Sanjay Patel
8bb5944166 [ValueTracking] don't recursively compute known bits using multiple llvm.assumes
This is an alternative to D99759 to avoid the compile-time explosion seen in:
https://llvm.org/PR49785

Another potential solution would make the exclusion logic stronger to avoid
blowing up, but note that we reduced the complexity of the exclusion mechanism
in D16204 because it was too costly.

So I'm questioning the need for recursion/exclusion entirely - what is the
optimization value vs. cost of recursively computing known bits based on
assumptions?
This was built into the implementation from the start with 60db058,
and we have kept adding code/cost to deal with that capability.

By clearing the query's AssumptionCache inside computeKnownBitsFromAssume(),
this patch retains all existing assume functionality except refining known
bits based on even more assumptions.

We have 1 regression test that shows a difference in optimization power.

Differential Revision: https://reviews.llvm.org/D100573
2021-04-16 08:43:35 -04:00
Roman Lebedev
6ab1a4858d [X86][CostModel] Fix cost model for non-power-of-two vector load/stores
Sometimes LV has to produce really wide vectors,
and sometimes they end up being not powers of two.
As it can be seen from the diff, the cost computation
is currently completely non-sensical in those cases.

Instead of just scalarizing everything, split/factorize the wide vector
into a number of subvectors, each one having a power-of-two elements,
recurse to get the cost of op on this subvector. Also, check how we'd
legalize this subvector, and if the legalized type is scalar,
also account for the scalarization cost.

Note that for sub-vector loads, we might be able to do better,
when the vectors are properly aligned.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D100099
2021-04-16 15:30:57 +03:00
Abhina Sreeskantharajan
eda191ef72 [SystemZ][z/OS][Windows] Add new functions that set Text/Binary mode for Stdin and Stdout based on OpenFlags
On Windows, we want to open a file in Binary mode if OF_CRLF bit is not set. On z/OS, we want to open a file in Binary mode if the OF_Text bit is not set.

This patch creates two new functions called ChangeStdinMode and ChangeStdoutMode which will take OpenFlags as an arg to determine which mode to set stdin and stdout to. This will enable patches like https://reviews.llvm.org/D100056 to not affect Windows when setting the OF_Text flag for raw_fd_streams.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D100130
2021-04-16 08:09:19 -04:00
Nigel Perks
8886d90737 Restore lit feature object-emission. Omit DebugInfo/Generic on XCore.
D73568 removed the lit feature object-emission, because it was introduced for a
target which did not support the integrated assembler, and that target no longer
required the feature. XCore still does not support the integrated assembler,
so a build with XCore as the default target fails tests requiring
object-emission. This issue was not publicly visible because there was not a
buildbot for XCore as the default target. We fixed the failures downstream. We
now have builder clang-xcore-ubuntu-20-x64 on the staging buildmaster, which
shows the failures. We would like to make upstream build green.

Omit DebugInfo/Generic on XCore to avoid annotating 70 separate files.

Differential Revision: https://reviews.llvm.org/D98508
2021-04-16 13:02:14 +01:00
Nico Weber
4baa05b316 [llvm-objcopy] clang-format a line 2021-04-16 07:24:43 -04:00
Florian Hahn
8fb6dc84e6 [SimplifyCFG] Regenerate CHECK lines and add test for PR49982. 2021-04-16 12:05:54 +01:00
Caroline Concatto
b2eb6c4c05 [NFC][AArch64][SVE] Move select-sve.ll tests to sve-select.ll
This patch merges the two select tests: select-sve.ll and sve-select.ll into
sve-select.ll as they are both testing SELECT instruction
2021-04-16 11:59:53 +01:00
David Green
fbac53e1e4 [ARM] Combine sub 0, csinc X, Y, CC -> csinv -X, Y, CC
Combine sub 0, csinc X, Y, CC to csinv -X, Y, CC providing that the
negation of X is cheap, currently just handling constants. This comes up
during the splat of an i1 to a predicate, where we now generate csetm,
as opposed to cset; rsb.

Differential Revision: https://reviews.llvm.org/D99940
2021-04-16 11:52:31 +01:00
Fraser Cormack
10d6aa129c [RISCV] Rerun stack test through update_llc_test_checks.py
Adjusts formatting of comments only. Just to reduce diffs in future
patches.
2021-04-16 11:08:58 +01:00
Simon Pilgrim
3ee3327a8b [CostModel][X86] Add fully aligned load/store tests
As noted on D100099, if these illegal vector types are suitably aligned they should be much cheaper to load (but probably not store).
2021-04-16 10:35:40 +01:00
Simon Moll
d32b153555 [docs] Add vector predication call
Add the syncup call to the table

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D100474
2021-04-16 10:49:34 +02:00
Nick Desaulniers
71ba3c97c5 [Aarch64] handle "o" inline asm memory constraints
This Linux kernel is making use of this inline asm constraint which is
causing an ICE.

PR49956

Link: https://github.com/ClangBuiltLinux/linux/issues/1348

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D100412
2021-04-15 23:36:21 -07:00
Jim Lin
04ff5f4d15 [RISCV] Don't emit save-restore call if function is a interrupt handler
It has to save all caller-saved registers before a call in the handler.
So don't emit a call that save/restore registers.

Reviewed By: simoncook, luismarques, asb

Differential Revision: https://reviews.llvm.org/D100532
2021-04-16 12:54:47 +08:00
hsmahesha
ae11cac4bb [AMDGPU] Refactor ds_read/ds_write related select code for better readability.
Part of the code related to ds_read/ds_write ISel is refactored, and the
corresponding comment is re-written for better readability, which would help
while implementing any future ds_read/ds_write ISel related modifications.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D100300
2021-04-16 08:24:00 +05:30
Mircea Trofin
ed178714f6 [MLGO] Fix use of AM.invalidate post D100519
The ML inline advisors more aggressively invalidate certain analyses
after each call site inlining, to more accurately capture the problem
state.
2021-04-15 18:45:39 -07:00
Marcythm
3e47f7f7ca [LICM][NFC] Fix typo
fixed some typos which may lead to misunderstandings in LICM.cpp

Reviewed By: nikic, asbirlea
Differential Revision: https://reviews.llvm.org/D100470
2021-04-16 09:42:00 +08:00
Juneyoung Lee
bc62ce57fa [LangRef] formatting 2021-04-16 10:41:30 +09:00
LLVM GN Syncbot
6139db4174 [gn build] Port 3bc88eb3924f 2021-04-16 01:16:51 +00:00
Juneyoung Lee
2f6006376d [LangRef] fix unexepcted unindent errror 2021-04-16 09:58:55 +09:00
Juneyoung Lee
4fba55a0b5 [LangRef] clarify the semantics of nocapture
This patch clarifies the semantics of nocapture attribute.

A 'Pointer Capture' subsection is added to describe the semantics of pointer capture first.

For the nocapture example with two same pointer arguments, it is consistent with the semantics that Alive2 used to run lit tests.

Reviewed By: nlopes

Differential Revision: https://reviews.llvm.org/D97924
2021-04-16 09:48:42 +09:00
Arthur Eubanks
ccbb33afde [NFC][NewPM] Remove some AnalysisManager invalidate methods
These were misleading, they're more of a "clear" than an "invalidate".

We shouldn't be individually clearing analysis results. Either we clear
all analyses when some IR becomes invalid, or we properly go through
invalidation.

There was only one use of this, which can be simulated with
AM.invalidate(F, PA).

Reviewed By: mtrofin

Differential Revision: https://reviews.llvm.org/D100519
2021-04-15 16:51:26 -07:00
River Riddle
13a7aadf73 [mlir] Add support for walking locations similarly to Operations
This allows for walking all nested locations of a given location, and is generally useful when processing locations.

Differential Revision: https://reviews.llvm.org/D100437
2021-04-15 16:09:34 -07:00
Jon Roelofs
87c0b7c256 s/setGenerator/addGenerator/ in the JIT docs. NFC 2021-04-15 15:54:28 -07:00
Florian Hahn
91b8e2e75a [InferAttrs] Do not mark first argument of str(n)cat as writeonly.
str(n)cat appends a copy of the second argument to the end of the first
argument. To find the end of the first argument, str(n)cat has to read
from it until it finds the terminating 0. So it should not be marked as
writeonly. I think this means the argument should not be marked as
writeonly.

(This is causing a mis-compile with legacy DSE, before it got removed)

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D100601
2021-04-15 23:00:21 +01:00
Momchil Velikov
d98e321d12 [clang][AArch64] Correctly align HFA arguments when passed on the stack
When we pass a AArch64 Homogeneous Floating-Point
Aggregate (HFA) argument with increased alignment
requirements, for example

    struct S {
      __attribute__ ((__aligned__(16))) double v[4];
    };

Clang uses `[4 x double]` for the parameter, which is passed
on the stack at alignment 8, whereas it should be at
alignment 16, following Rule C.4 in
AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules)

Currently we don't have a way to express in LLVM IR the
alignment requirements of the function arguments. The align
attribute is applicable to pointers only, and only for some
special ways of passing arguments (e..g byval). When
implementing AAPCS32/AAPCS64, clang resorts to dubious hacks
of coercing to types, which naturally have the needed
alignment. We don't have enough types to cover all the
cases, though.

This patch introduces a new use of the stackalign attribute
to control stack slot alignment, when and if an argument is
passed in memory.

The attribute align is left as an optimizer hint - it still
applies to pointer types only and pertains to the content of
the pointer, whereas the alignment of the pointer itself is
determined by the stackalign attribute.

For byval arguments, the stackalign attribute assumes the
role, previously perfomed by align, falling back to align if
stackalign` is absent.

On the clang side, when passing arguments using the "direct"
style (cf. `ABIArgInfo::Kind`), now we can optionally
specify an alignment, which is emitted as the new
`stackalign` attribute.

Patch by Momchil Velikov and Lucas Prates.

Differential Revision: https://reviews.llvm.org/D98794
2021-04-15 22:58:14 +01:00
Craig Topper
6b675e5ab5 [TableGen] Reduce the number of map lookups in TypeSetByHwMode::getOrCreate. NFCI
hasMode was looking up the map once. Then we'd either call get which
would look up again, or we'd insert into the map which requires
walking the map to find the insertion point.

I believe the hasMode was needed because get has a special case
to look for DefaultMode if the mode being asked for doesn't exist.
We don't want that here so we were using hasMode to make sure we
wouldn't hit that case.

Simplify to a regular operator[] access which will default
construct a SetType if the lookup fails.
2021-04-15 12:32:21 -07:00
Stanislav Mekhanoshin
e718ff5dc3 [AMDGPU] Factor out predicate FmaakFmamkF32Insts
Differential Revision: https://reviews.llvm.org/D100409
2021-04-15 12:29:16 -07:00
Florian Hahn
1003f18483 [VPlan] Replace a few unnecessary includes with forward decls. 2021-04-15 20:08:31 +01:00
Stanislav Mekhanoshin
5b15ced47a [AMDGPU] Add new EmitDstSel field to VOPPofile. NFC.
Differential Revision: https://reviews.llvm.org/D100589
2021-04-15 12:07:08 -07:00
LLVM GN Syncbot
3e03953b19 [gn build] Port 82787eb2285d 2021-04-15 18:54:08 +00:00
hsmahesha
eb7757a102 [AMDGPU] Move LDS lowering related utility functions to a separate utils file.
Move some utility functions which are used within LDS lowering pass to a separate utils
file so that other LDS related passes can make use of them when required.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D100526
2021-04-16 00:15:48 +05:30
Krzysztof Parzyszek
685c4cfa64 [Hexagon] Avoid infinite loops in type legalization when lowering SETCC
Only widen SETCC if the operands can be widened. Not checking that caused
infinite widen-split loops in legalization.
2021-04-15 13:34:37 -05:00
Craig Topper
3583af245c [RISCV] Share RVInstIShift and RVInstIShiftW instruction format classes with the B extension.
This generalizes RVInstIShift/RVInstIShiftW to take the upper
5 or 7 bits of the immediate as an input instead of only bit 30. Then
we can share them.

For RVInstIShift I left a hardcoded 0 at bit 26 where RV128 gets
a 7th bit for the shift amount.

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D100424
2021-04-15 11:08:28 -07:00
cchen
7304924c0f [OpenMP] Added codegen for masked directive
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D100514
2021-04-15 12:55:07 -05:00
Danilo C. Grael
779b4e72af [LoopUnrollAndJam] Avoid repeated instructions for UAJ analysis
Avoid visiting repeated instructions for processHeaderPhiOperands as it can cause a scenario of endless loop. Test case is attached and can be ran with `opt -basic-aa -tbaa -loop-unroll-and-jam  -allow-unroll-and-jam -unroll-and-jam-count=4`.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D97407
2021-04-15 12:59:42 -04:00
Arthur Eubanks
2a2896669d [NewPM] Cleanup IR printing instrumentation
Being lazy with printing the banner seems hard to reason with, we should print it
unconditionally first (it could also lead to duplicate banners if we
have multiple functions in -filter-print-funcs).

The printIR() functions were doing too many things. I separated out the
call from PrintPassInstrumentation since we were essentially doing two
completely separate things in printIR() from different callers.

There were multiple ways to generate the name of some IR. That's all
been moved to getIRName(). The printing of the IR name was also
inconsistent, now it's always "IR Dump on $foo" where "$foo" is the
name. For a function, it's the function name. For a loop, it's what's
printed by Loop::print(), which is more detailed. For an SCC, it's the
list of functions in parentheses. For a module it's "[module]", to
differentiate between a possible SCC with a function called "module".

To preserve D74814, we have to check if we're going to print anything at
all first. This is unfortunate, but I would consider this a special
case that shouldn't be handled in the core logic.

Reviewed By: jamieschmeiser

Differential Revision: https://reviews.llvm.org/D100231
2021-04-15 09:50:55 -07:00
Mark Johnston
f964e041e4 [asan] Add an offset for the kernel address sanitizer on FreeBSD
This is based on a port of the sanitizer runtime to the FreeBSD kernel
that has been commited as https://cgit.freebsd.org/src/commit/?id=38da497a4dfcf1979c8c2b0e9f3fa0564035c147
and the following commits.

Reviewed By: emaste, dim
Differential Revision: https://reviews.llvm.org/D98285
2021-04-15 17:49:00 +01:00
Stefan Pintilie
3c5cd9faed [PowerPC] Add ROP Protection Instructions for PowerPC
There are four new PowerPC instructions that are introduced in
Power 10. They are hashst, hashchk, hashstp, hashchkp.

These instructions will be used for ROP Protection.
This patch adds the four instructions.

Reviewed By: nemanjai, amyk, #powerpc

Differential Revision: https://reviews.llvm.org/D99375
2021-04-15 11:38:38 -05:00
Stelios Ioannou
4da00eddbe [LSR] Fix for pre-indexed generated constant offset
This patch changed the isLegalUse check to ensure that
LSRInstance::GenerateConstantOffsetsImpl generates an
offset that results in a legal addressing mode and
formula. The check is changed to look similar to the
assert check used for illegal formulas.

Differential Revision: https://reviews.llvm.org/D100383

Change-Id: Iffb9e32d59df96b8f072c00f6c339108159a009a
2021-04-15 16:44:42 +01:00
OCHyams
9fd912e6a2 Revert "[DebugInfo] Replace debug uses in replaceUsesOutsideBlock"
This reverts commit 96a1e6b7cf72d9bd625903ea4b441404200383cf.

Failing build bots e.g. https://lab.llvm.org/buildbot/#/builders/161/builds/163
2021-04-15 16:35:45 +01:00
OCHyams
94edb782c9 [DebugInfo] Replace debug uses in replaceUsesOutsideBlock
Value::replaceUsesOutsideBlock doesn't replace debug uses which leads to an
unnecessary reduction in variable location coverage. Fix this, add a unittest for
it, and add a regression test demonstrating the change through instcombine's
replacedSelectWithOperand.

Reviewed By: djtodoro

Differential Revision: https://reviews.llvm.org/D99169
2021-04-15 16:19:36 +01:00
Sanjay Patel
94869b0900 [InstCombine] update RUN lines in assume test; NFC
This was in a draft of from D82703, but it got left out
of the committed version, so we were not actually testing
the new code.
2021-04-15 10:48:00 -04:00
Kerry McLaughlin
7624945380 [NFC] Remove the -instcombine flag from strict-fadd.ll
This also fixes a CHECK line in @fadd_strict_unroll which ensures the
changes made to fixReduction() to support in-order reductions with
unrolling are being tested correctly.
2021-04-15 15:10:48 +01:00
LemonBoy
0ebe53ad79 [yaml2obj/obj2yaml/llvm-readobj] Support printing and parsing AVR-specific e_flags
The `e_flags` contains a mixture of bitfields and regular ones, ensure all of them can be serialized and deserialized.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D100250
2021-04-15 15:54:28 +02:00
Paul C. Anagnostopoulos
b1e2c33e83 [TableGen] [docs] Correct a reference in the TableGen Overview document
Differential Revision: https://reviews.llvm.org/D100382
2021-04-15 09:25:09 -04:00
Sebastian Neubauer
033c97f321 [AMDGPU] Fix large return values with amdgpu_gfx
Returning in memory is not supported, so fall back to sret.
Also, extend i1 and i16 to i32. Otherwise, they would be passed through
memory.

Differential Revision: https://reviews.llvm.org/D100543
2021-04-15 14:57:56 +02:00
Simon Pilgrim
fd3b975a9f [X86] combineCMP - fold cmpEQ/NE(TRUNC(X),0) -> cmpEQ/NE(X,0)
If we are truncating from a i32 source before comparing the result against zero, then see if we can directly compare the source value against zero.

If the upper (truncated) bits are known to be zero then we can compare against that, hopefully increasing the chances of us folding the compare into a EFLAG result of the source's operation.

Fixes PR49028.

Differential Revision: https://reviews.llvm.org/D100491
2021-04-15 13:55:51 +01:00
Bradley Smith
843db3db21 [AArch64][NEON] Match (or (and -a b) (and (a+1) b)) => bit select
With this patch vbslq_f32(vnegq_s32(a), b, c) lowers to a BIT instruction.

Co-authored-by: Paul Walker <paul.walker@arm.com>

Differential Revision: https://reviews.llvm.org/D100304
2021-04-15 13:52:47 +01:00
Alex Orlov
e7b532348e Fix bug in .eh_frame/.debug_frame PC offset calculation for DW_EH_PE_pcrel
This fixes the following bugs:
https://bugs.llvm.org/show_bug.cgi?id=27249
https://bugs.llvm.org/show_bug.cgi?id=46414

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D100328
2021-04-15 15:06:20 +04:00
Florian Hahn
09bcc572cb [VPlan] Add VPRecipeBase::mayHaveSideEffects.
Add an initial version of a helper to determine whether a recipe may
have side-effects.

Reviewed By: a.elovikov

Differential Revision: https://reviews.llvm.org/D100259
2021-04-15 11:49:40 +01:00