1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

42145 Commits

Author SHA1 Message Date
Roman Lebedev
5d89fa42b3 [NFC][STLExtras] Add make_first_range(), similar to existing make_second_range()
Having just one of the two seens weird.
I wanted to use it a few times, but it wasn't there.
2020-08-29 09:58:07 +03:00
Xing GUO
1d4cd95b57 [DWARFYAML] Make the debug_abbrev_offset field optional.
This patch helps make the debug_abbrev_offset field optional. We don't
need to calculate the value of this field in the future.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D86614
2020-08-29 14:54:52 +08:00
Fangrui Song
311c781d4a [gcov] Increment counters with atomicrmw if -fsanitize=thread
Without this patch, `clang --coverage -fsanitize=thread` may fail spuriously
because non-atomic counter increments can be detected as data races.
2020-08-28 16:32:35 -07:00
Matt Arsenault
9e320d7adb GlobalISel: Combine out redundant sext_inreg
The scalar tests don't work yet, since computeNumSignBits apparently
doesn't handle sextload yet, and sext folds into the load first.
2020-08-28 17:57:31 -04:00
Craig Topper
09050e4cf7 [Attributes] Add a method to check if an Attribute has AttrKind None. Use instead of hasAttribute(Attribute::None)
There's a special case in hasAttribute for None when pImpl is null. If pImpl is not null we dispatch to pImpl->hasAttribute which will always return false for Attribute::None.

So if we just want to check for None its sufficient to just check that pImpl is null. Which can even be done inline.

This patch adds a helper for that case which I hope will speed up our getSubtargetImpl implementations.

Differential Revision: https://reviews.llvm.org/D86744
2020-08-28 13:23:45 -07:00
Arthur Eubanks
82da713351 [ObjCARCOpt] Port objc-arc to NPM
Since doInitialization() in the legacy pass modifies the module, the NPM
pass is a Module pass.

Reviewed By: ahatanak, ychen

Differential Revision: https://reviews.llvm.org/D86178
2020-08-28 12:59:33 -07:00
Tyker
70c01602af [SROA] Improve handleling of assumes bundles by SROA
This patch fixes this crash https://gcc.godbolt.org/z/Ps8d1e
And gives SROA the ability to remove assumes if it allows promoting an alloca to register
Without removing assumes when it can't promote to register.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86570
2020-08-28 21:55:45 +02:00
Snehasish Kumar
69b963173e [llvm][CodeGen] Machine Function Splitter
We introduce a codegen optimization pass which splits functions into hot and cold
parts. This pass leverages the basic block sections feature recently
introduced in LLVM from the Propeller project. The pass targets
functions with profile coverage, identifies cold blocks and moves them
to a separate section. The linker groups all cold blocks across
functions together, decreasing fragmentation and improving icache and
itlb utilization.

We evaluated the Machine Function Splitter pass on clang bootstrap and
SPECInt 2017.

For clang bootstrap we observe a mean 2.33% runtime improvement with a
~32% reduction in itlb and stlb misses. Additionally, L1 icache misses
reduced by 9.5% while L2 instruction misses reduced by 20%.

For SPECInt we report the change in IntRate the C/C++
benchmarks. All benchmarks apart from mcf and x264 improve, on average
by 0.6% with the max for deepsjeng at 1.6%.

Benchmark		% Change
500.perlbench_r		 0.78
502.gcc_r		 0.82
505.mcf_r		-0.30
520.omnetpp_r		 0.18
523.xalancbmk_r		 0.37
525.x264_r		-0.46
531.deepsjeng_r		 1.61
541.leela_r		 0.83
557.xz_r		 0.15

Differential Revision: https://reviews.llvm.org/D85368
2020-08-28 11:10:14 -07:00
David Sherwood
56b8c35591 [SVE] Make ElementCount members private
This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). In addition I've
added some other member functions for more commonly used operations.
Hopefully this makes the class more useful and will reduce the
need for calling getKnownMinValue().

Differential Revision: https://reviews.llvm.org/D86065
2020-08-28 14:43:53 +01:00
Sam Parker
710437b36d [ARM][LowOverheadLoops] Liveouts and reductions
Remove the code that tried to look for reduction patterns, since the
vectorizer and isel can now produce predicated arithmetic instructios
within the loop body. This has required some reorganisation and fixes
around live-out and predication checks, as well as looking for cases
where an input/output is initialised to zero.

Differential Revision: https://reviews.llvm.org/D86613
2020-08-28 13:56:16 +01:00
Martin Storsjö
831184291b [MC] [Win64EH] Avoid producing malformed xdata records
If there's no unwinding opcodes, omit writing the xdata/pdata records.

Previously, this generated truncated xdata records, and llvm-readobj
would error out when trying to print them.

If writing of an xdata record is forced via the .seh_handlerdata
directive, skip it if there's no info to make a sensible unwind
info structure out of, and clearly error out if such info appeared
later in the process.

Differential Revision: https://reviews.llvm.org/D86527
2020-08-28 09:05:36 +03:00
serge-sans-paille
a12b4db565 (Expensive) Check for Loop, SCC and Region pass return status
This generalizes the logic introduced in https://reviews.llvm.org/D80916 to
other passes.

It's needed by https://reviews.llvm.org/D86442 to assert passes correctly report
their status.

Differential Revision: https://reviews.llvm.org/D86589
2020-08-28 07:56:35 +02:00
Valentin Clement
ba3e93ca35 [flang][openacc] Add check for tile clause restriction
The tile clause in OpenACC 3.0 imposes some restriction. Element in the tile size list are either * or a
constant positive integer expression. If there are n tile sizes in the list, the loop construct must be immediately
followed by n tightly-nested loops.
This patch implement these restrictions and add some tests.

Reviewed By: klausler

Differential Revision: https://reviews.llvm.org/D86655
2020-08-27 22:13:46 -04:00
Harmen Stoppels
e6870b67d6 Revert "Use find_library for ncurses"
The introduction of find_library for ncurses caused more issues than it solved problems. The current open issue is it makes the static build of LLVM fail. It is better to revert for now, and get back to it later.

Revert "[CMake] Fix an issue where get_system_libname creates an empty regex capture on windows"
This reverts commit 1ed1e16ab83f55d85c90ae43a05cbe08a00c20e0.

Revert "Fix msan build"
This reverts commit 34fe9613dda3c7d8665b609136a8c12deb122382.

Revert "[CMake] Always mark terminfo as unavailable on Windows"
This reverts commit 76bf26236f6fd453343666c3cd91de8f74ffd89d.

Revert "[CMake] Fix OCaml build failure because of absolute path in system libs"
This reverts commit 8e4acb82f71ad4effec8895b8fc957189ce95933.

Revert "[CMake] Don't look for terminfo libs when LLVM_ENABLE_TERMINFO=OFF"
This reverts commit 495f91fd33d492941c39424a32cf24bcfe192f35.

Revert "Use find_library for ncurses"
This reverts commit a52173a3e56553d7b795bcf3cdadcf6433117107.

Differential revision: https://reviews.llvm.org/D86521
2020-08-27 17:57:26 -07:00
Matt Arsenault
2cd41c208e GlobalISel: Implement known bits for min/max 2020-08-27 16:56:17 -04:00
Matt Arsenault
fcf8b40603 MIR: Infer not-SSA for subregister defs
It's possible to have a single virtual register def with a subreg
index that would pass the previous check, but it's not possible to
have a subregister def in SSA.

This is in preparation for adding stricter checks for SSA MIR.
2020-08-27 16:56:16 -04:00
Vitaly Buka
4f88f9a474 [NFC][ValueTracking] Add OffsetZero into findAllocaForValue
For StackLifetime after finding alloca we need to check that
values ponting to the begining of alloca.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D86692
2020-08-27 13:46:22 -07:00
Matt Arsenault
cac4e51351 GlobalISel: Add and_trivial_mask to all_combines
Also make up a new category of combines.
2020-08-27 16:42:09 -04:00
Eli Friedman
4c050f47e9 [RegisterScavenging] Delete dead function unprocess(). 2020-08-27 13:19:32 -07:00
Shinji Okumura
912a13d81d [Attributor] Do not add AA to dependency graph after the update stage
If an AA is registered to the dependency graph in the manifest stage, Attributor aborts in `::manifestAttributes()`.
This patch prevents such termination.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86734
2020-08-28 05:16:18 +09:00
Shinji Okumura
6ff8397df1 [Attributor] Guarantee getAAFor not to update AA in the manifestation stage
If we query an AA with `Attributor::getAAFor` in `AbstractAttribute::manifest`, the AA may be updated.
This patch makes use of the phase flag in Attributor, and handle `getAAFor` behavior according to the flag.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86635
2020-08-28 04:07:42 +09:00
Christopher Tetreault
c849a98c1b [SVE] Remove calls to VectorType::getNumElements from IR
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D81500
2020-08-27 11:16:10 -07:00
Matt Arsenault
b3488037c4 GlobalISel: Implement known bits for G_MERGE_VALUES 2020-08-27 14:07:18 -04:00
Mikhail Maltsev
f7e914e2c5 [ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics
This patch adjusts the following ARM/AArch64 LLVM IR intrinsics:
- neon_bfmmla
- neon_bfmlalb
- neon_bfmlalt
so that they take and return bf16 and float types. Previously these
intrinsics used <8 x i8> and <4 x i8> vectors (a rudiment from
implementation lacking bf16 IR type).

The neon_vbfdot[q] intrinsics are adjusted similarly. This change
required some additional selection patterns for vbfdot itself and
also for vector shuffles (in a previous patch) because of SelectionDAG
transformations kicking in and mangling the original code.

This patch makes the generated IR cleaner (less useless bitcasts are
produced), but it does not affect the final assembly.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D86146
2020-08-27 18:43:16 +01:00
Aditya Nandakumar
90acb0696f [GISel] Add new GISel combiners for G_SELECT
https://reviews.llvm.org/D83833

Patch adds two new GICombinerRules for G_SELECT. The rules include:
combining selects with undef comparisons into their first selectee value,
and to combine away selects with constant comparisons. Patch additionally
adds a new combiner test for the AArch64 target to test these new G_SELECT
combiner rules and the existing select_same_val combiner rule.

Patch by  mkitzan
2020-08-27 09:40:15 -07:00
Shinji Okumura
1054ad3409 [Attributor] Add a phase flag to Attributor
Add a new flag that indicates which stage in the process we are in.
This flag is introduced for handling behavior of `getAAFor` according to the stage. (discussed in D86635)

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86678
2020-08-28 01:16:38 +09:00
Aditya Nandakumar
46055bcec5 [GISel]: Fix one more CSE Non determinism
https://reviews.llvm.org/D86676

Sometimes we can have the following code

 x:gpr(s32) = G_OP

Say we build G_OP2 to the same x and then delete the previous instruction. Using something like

 Register X = ...;
 auto NewMIB = CSEBuilder.buildOp2(X, ... args);

Currently there's a mismatch in how NewMIB is profiled and inserted into the CSEMap (ie it doesn't consider register bank/register class along with type).Unify the profiling by refactoring and calling the common method.

This was found by turning on the CSEInfo::verify in at the end of each of our GISel passes which turns inconsistent state/non determinism in CSEing into crashes which likely usually indicates missing calls to Observer on mutations (the most common case). Here non determinism usually means not cseing sometimes, but almost never about producing incorrect code.
Also this patch adds this verification at the end of the combiners as well.
2020-08-27 09:06:21 -07:00
Teresa Johnson
bce40acc54 [HeapProf] Clang and LLVM support for heap profiling instrumentation
See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html

Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).

This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.

The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.

Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.

Differential Revision: https://reviews.llvm.org/D85948
2020-08-27 08:50:35 -07:00
diggerlin
5d038d06f5 Revert "[AIX][XCOFF] emit symbol visibility for xcoff object file."
This reverts commit a0818689213234d5a078641432d10eccccf61a13.

Based on the Hubert Tong'comment  https://reviews.llvm.org/D84265#inline-799085
2020-08-27 11:07:58 -04:00
OCHyams
463930d1cb [DwarfDebug] Improve single location detection in validThroughout (2/4)
With this patch we're now accounting for two more cases which should be
considered 'valid throughout': First, where RangeEnd is ScopeEnd. Second, where
RangeEnd comes before ScopeEnd when including meta instructions, but are both
preceded by the same non-meta instruction.

CTMark shows a geomean binary size reduction of 1.5% for RelWithDebInfo builds.
`llvm-locstats` (using D85636) shows a very small variable location coverage
change in 2 of 10 binaries, but it is in the order of 10s of bytes which lines
up with my expectations.

I've added a test which checks both of these new cases. The first check in the
test isn't strictly necessary for this patch. But I'm not sure that it is
explicitly tested anywhere else, and is useful for the final patch in the
series.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D86151
2020-08-27 11:52:29 +01:00
Shinji Okumura
cc1db81009 [Attributor] Add flag for undef value to the state of AAPotentialValues
Currently, an undef value is reduced to 0 when it is added to a set of potential values.
This patch introduces a flag for under values. By this, for example, we can merge two states `{undef}`, `{1}` to `{1}` (because we can reduce the undef to 1).

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D85592
2020-08-27 16:30:29 +09:00
Jianzhou Zhao
0ded22b2ea Fix an overflow issue at BackpatchWord
This happens when generating a huge file by LTO, for example, with -gmlt.
When BitNo is > 2^35, ByteNo is overflowed, and an incorrect output offset is overwritten.
This generates ill-formed bitcodes.

Reviewed-by: tejohnson, vitalybuka

Differential Revision: https://reviews.llvm.org/D86645
2020-08-27 04:46:19 +00:00
Amy Kwan
daa2ee7b26 [PowerPC] Implement Vector Multiply High/Divide Extended Builtins in LLVM/Clang
This patch implements the function prototypes vec_mulh and vec_dive in order to
utilize the vector multiply high (vmulh[s|u][w|d]) and vector divide extended
(vdive[s|u][w|d]) instructions introduced in Power10.

Differential Revision: https://reviews.llvm.org/D82609
2020-08-26 23:14:34 -05:00
Matt Arsenault
a03ce058a1 GlobalISel: Add generic instructions for memory intrinsics
AArch64, X86 and Mips currently directly consumes these and custom
lowering to produce a libcall, but really these should follow the
normal legalization process through the libcall/lower action.
2020-08-26 20:08:45 -04:00
Lang Hames
02bae0a8ff [ORC][JITLink] Switch to unique ownership for EHFrameRegistrars.
This will make stateful registrars (e.g. a future TargetProcessControl based
registrar) easier to deal with.
2020-08-26 16:59:45 -07:00
Arthur Eubanks
d74ec65308 [ConstProp] Remove ConstantPropagation
As discussed in
http://lists.llvm.org/pipermail/llvm-dev/2020-July/143801.html.

Currently no users outside of unit tests.

Replace all instances in tests of -constprop with -instsimplify.
Notable changes in tests:
* vscale.ll - @llvm.sadd.sat.nxv16i8 is evaluated by instsimplify, use a fake intrinsic instead
* InsertElement.ll - insertelement undef is removed by instsimplify in @insertelement_undef
llvm/test/Transforms/ConstProp moved to llvm/test/Transforms/InstSimplify/ConstProp

Reviewed By: lattner, nikic

Differential Revision: https://reviews.llvm.org/D85159
2020-08-26 15:51:30 -07:00
Juneyoung Lee
ecc3e51e55 [IR] Remove noundef from masked store/load/gather/scatter's pointer operands
As discussed in D86576, noundef attribute is removed from masked store/load/gather/scatter's
pointer operands.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D86656
2020-08-27 06:41:43 +09:00
Wei Mi
d9e172d0fb [SampleFDO] Enhance profile remapping support for searching inline instance
and indirect call promotion candidate.

Profile remapping is a feature to match a function in the module with its
profile in sample profile if the function name and the name in profile look
different but are equivalent using given remapping rules. This is a useful
feature to keep the performance stable by specifying some remapping rules
when sampleFDO targets are going through some large scale function signature
change.

However, currently profile remapping support is only valid for outline
function profile in SampleFDO. It cannot match a callee with an inline
instance profile if they have different but equivalent names. We found
that without the support for inline instance profile, remapping is less
effective for some large scale change.

To add that support, before any remapping lookup happens, all the names
in the profile will be inserted into remapper and the Key to the name
mapping will be recorded in a map called NameMap in the remapper. During
name lookup, a Key will be returned for the given name and it will be used
to extract an equivalent name in the profile from NameMap. So with the help
of the NameMap, we can translate any given name to an equivalent name in
the profile if it exists. Whenever we try to match a name in the module to
a name in the profile, we will try the match with the original name first,
and if it doesn't match, we will use the equivalent name got from remapper
to try the match for another time. In this way, the patch can enhance the
profile remapping support for searching inline instance and searching
indirect call promotion candidate.

In a planned large scale change of int64 type (long long) to int64_t (long),
we found the performance of a google internal benchmark degraded by 2% if
nothing was done. If existing profile remapping was enabled, the performance
degradation dropped to 1.2%. If the profile remapping with the current patch
was enabled, the performance degradation further dropped to 0.14% (Note the
experiment was done before searching indirect call promotion candidate was
added. We hope with the remapping support of searching indirect call promotion
candidate, the degradation can drop to 0% in the end. It will be evaluated
post commit).

Differential Revision: https://reviews.llvm.org/D86332
2020-08-26 11:07:35 -07:00
Juneyoung Lee
5d4d13a2ae [IR] Add NoUndef attribute to Intrinsics.td
This patch adds NoUndef to Intrinsics.td.
The attribute is attached to llvm.assume's operand, because llvm.assume(undef)
is UB.
It is attached to pointer operands of several memory accessing intrinsics
as well.

This change makes ValueTracking::getGuaranteedNonPoisonOps' intrinsic check
unnecessary, so it is removed.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86576
2020-08-27 02:54:48 +09:00
Roman Lebedev
b6a0e78067 [Value][InstCombine] Fix one-use checks in PHI-of-op -> Op-of-PHI[s] transforms to be one-user checks
As FIXME said, they really should be checking for a single user,
not use, so let's do that. It is not *that* unusual to have
the same value as incoming value in a PHI node, not unlike
how a PHI may have the same incoming basic block more than once.

There isn't a nice way to do that, Value::users() isn't uniqified,
and Value only tracks it's uses, not Users, so the check is
potentially costly since it does indeed potentially involes
traversing the entire use list of a value.
2020-08-26 20:20:41 +03:00
Roman Lebedev
19662cd96c [NFC][Value] Fixup comments, "N users" is NOT the same as "N uses".
In those cases, it really means "N uses".
2020-08-26 20:20:41 +03:00
Kai Nacke
2ad3f5c93b [SystemZ/ZOS] Add header file to encapsulate use of <sysexits.h>
The non-standard header file `<sysexits.h>` provides some return values.
`EX_IOERR` is used to as a special value to signal a broken pipe to the clang driver.
On z/OS Unix System Services, this header file does not exists. This patch

- adds a check for `<sysexits.h>`, removing the dependency on `LLVM_ON_UNIX`
- adds a new header file `llvm/Support/ExitCodes`, which either includes
  `<sysexits.h>` or defines `EX_IOERR`
- updates the users of `EX_IOERR` to include the new header file

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D83472
2020-08-26 12:44:30 -04:00
Sjoerd Meijer
20938e0e91 [LV] Fallback strategies if tail-folding fails
This implements 2 different vectorisation fallback strategies if tail-folding
fails: 1) don't vectorise at all, or 2) vectorise using a scalar epilogue. This
can be controlled with option -prefer-predicate-over-epilogue, that has been
changed to take a numeric value corresponding to the tail-folding preference
and preferred fallback.

Patch by: Pierre van Houtryve, Sjoerd Meijer.

Differential Revision: https://reviews.llvm.org/D79783
2020-08-26 16:55:25 +01:00
Dibya Ranjan Mishra
ef1ac54803 [Support] Allow printing the stack trace only for a given depth
Differential Revision: https://reviews.llvm.org/D85458
2020-08-26 09:27:42 -04:00
Matt Arsenault
0ce967f763 GlobalISel: Combine G_ADD of G_PTRTOINT to G_PTR_ADD
This produces less work for addressing mode matching. I think this is
safe since I don't think machine IR is supposed to give the same
aliasing properties as getelementptr in the IR.
2020-08-26 08:57:15 -04:00
Xing GUO
d25b99b1d8 [DWARFYAML] Make the unit_length and header_length fields optional.
This patch makes the unit_length and header_length fields of line tables
optional. yaml2obj is able to infer them for us.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D86590
2020-08-26 20:35:10 +08:00
QingShan Zhang
48664955df [Scheduling] Implement a new way to cluster loads/stores
Before calling target hook to determine if two loads/stores are clusterable,
we put them into different groups to avoid fake cluster due to dependency.
For now, we are putting the loads/stores into the same group if they have
the same predecessor. We assume that, if two loads/stores have the same
predecessor, it is likely that, they didn't have dependency for each other.

However, one SUnit might have several predecessors and for now, we just
pick up the first predecessor that has non-data/non-artificial dependency,
which is too arbitrary. And we are struggling to fix it.

So, I am proposing some better implementation.
1. Collect all the loads/stores that has memory info first to reduce the complexity.
2. Sort these loads/stores so that we can stop the seeking as early as possible.
3. For each load/store, seeking for the first non-dependency instruction with the
   sorted order, and check if they can cluster or not.

Reviewed By: Jay Foad

Differential Revision: https://reviews.llvm.org/D85517
2020-08-26 12:33:59 +00:00
Georgii Rymar
70f1e167ac [llvm/Object] - Make dyn_cast<XCOFFObjectFile> work as it should.
Currently, `dyn_cast<XCOFFObjectFile>` always does cast and returns a pointer,
even when we pass `ELF`/`Wasm`/`Mach-O` or `COFF` instead of `XCOFF`.

It happens because `XCOFFObjectFile` class does not implement `classof`.
I've fixed it and added a unit test.

Differential revision: https://reviews.llvm.org/D86542
2020-08-26 15:09:55 +03:00
Gabriel Hjort Åkerlund
6b25df96c2 [GlobalISel] Fix and tidy up documentation for ValueMapping class (NFC)
The documentation was missing a '*/' in '/*<2x32-bit> vadd {0, 64, VPR}',
and the example code are now aligned to improve readability.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D86201
2020-08-26 12:09:01 +02:00
sstefan1
d40ac7afef Reland [IR] Intrinsics default attributes and opt-out flag
Intrinsic properties can now be set to default and applied to all
intrinsics. If the attributes are not needed, the user can opt-out by
setting the DisableDefaultAttributes flag to true.

Differential Revision: https://reviews.llvm.org/D70365
2020-08-26 11:37:59 +02:00