We were previously doing a post-order traversal and operating on the
list in reverse, however this would occasionaly cause backedges for
loops to be visited before some of the other blocks in the loop.
We know use a reverse post-order traversal, which avoids this issue.
The reverse post-order traversal is not completely ideal, so we need
to manually fixup the list to ensure that inner loop backedges are
visited before outer loop backedges.
llvm-svn: 228186
Track unresolved nodes under distinct `MDNode`s during `MapMetadata()`,
and resolve them at the end. Previously, these cycles wouldn't get
resolved.
llvm-svn: 228180
In case CSE reuses a previoulsy unused register the dead-def flag has to
be cleared on the def operand, as exposed by the arm64-cse.ll test.
This fixes PR22439 and the corresponding rdar://19694987
Differential Revision: http://reviews.llvm.org/D7395
llvm-svn: 228178
Summary:
This change allows users to create SpecialCaseList objects from
multiple local files. This is needed to implement a proper support
for -fsanitize-blacklist flag (allow users to specify multiple blacklists,
in addition to default blacklist, see PR22431).
DFSan can also benefit from this change, as DFSan instrumentation pass now
accepts ABI-lists both from -fsanitize-blacklist= and -mllvm -dfsan-abilist flags.
Go bindings are fixed accordingly.
Test Plan: regression test suite
Reviewers: pcc
Subscribers: llvm-commits, axw, kcc
Differential Revision: http://reviews.llvm.org/D7367
llvm-svn: 228155
This is a bug that was caused due to storing the feature bitset in a 32-bit
variable when it is a 64-bit mask, discarding the top half of the feature set.
llvm-svn: 228151
Currently, Cortex-A72 is modelled as an Cortex-A57 except the fp
load balancing pass isn't enabled for Cortex-A72 as it's not
profitable to have it enabled for this core.
Patch by Ranjeet Singh.
llvm-svn: 228140
This associates movss and movsd with the packed single and packed double
execution domains (resp.). While this is largely cosmetic, as we now
don't have weird ping-pong-ing between single and double precision, it
is also useful because it avoids the domain fixing algorithm from seeing
domain breaks that don't actually exist. It will also be much more
important if we have an execution domain default other than packed
single, as that would cause us to mix movss and movsd with integer
vector code on a regular basis, a very bad mixture.
llvm-svn: 228135
version of the script.
Changes include:
- Using the VEX prefix
- Skipping more detail when we have useful shuffle comments to match
- Matching more shuffle comments that have been added to the printer
(yay!)
- Matching the destination registers of some AVX instructions
- Stripping trailing whitespace that crept in
- Fixing indentation issues
Nothing interesting going on here. I'm just trying really hard to ensure
these changes don't show up in the diffs with actual changes to the
backend.
llvm-svn: 228132
This is done in a bit of a strange way to use a multiline RE instead of
looping over the lines. Suggestions welcome here for a more pythonic way
of doing this as long as its reasonably fast.
llvm-svn: 228131
This reverts patches 223862, 224198, 224203, and 224754, which were all
related to the vector load/store combining and were reverted/reaplied
a few times due to the same alignment problems we're seeing now.
Further tests, mainly self-hosting Clang, will be needed to reapply this
patch in the future.
llvm-svn: 228129
zero for v8i16 as well.
These exhibit the same domain badness, but also exhibit other weaknesses
in our blend lowering. More fixes to come.
llvm-svn: 228126
This is the simplest form of bit-math based blending which only fires
when we are blending with zero and is relatively profitable. I've only
enabled this path on very specific lowering strategies. I'm planning to
widen its applicability in subsequent patches, but so far you'll notice
that even though we get fewer shufps instructions, we *still* do the bit
math in the FP execution port. I'm looking into why this is still
happening.
llvm-svn: 228124
Specifically, the existing patterns were scalar-only. These cover the
packed vector bitwise operations when specifically requested with pseudo
instructions. This is particularly important in SSE1 where we can't
actually emit a logical operation on a v2i64 as that isn't a legal type.
This will be tested in subsequent patches which form the floating point
and patterns in more places.
llvm-svn: 228123
The ARM assembler allows register alias redefinitions as long as it
targets the same register. r222319 broke that. In the AArch64 case
it would just produce a new warning, but in the ARM case it would
error out on previously accepted assembler.
llvm-svn: 228109
update_llc_test_checks.py.
The exact format of the checks has changed over time. This includes
different indenting rules, new shuffle comments that have been added,
and more operand hiding behind regular expressions.
No functional change to the tests are expected here, but this will make
subsequent patches have a clean diff as they change shuffle lowering.
llvm-svn: 228097
update_llc_test_checks.py script uses, and refresh the checks in this
test.
No functionality changed here, just bringing this test up to work with
automated updates using the python script.
llvm-svn: 228096
This will make it easy to update as I change some parts of the X86
backend, makes it more clear what instruction differences are
introduced, and I find it makes it a bit easier to read as well.
llvm-svn: 228095
This pass is responsible for figuring out where to place call safepoints and safepoint polls. It doesn't actually make the relocations explicit; that's the job of the RewriteStatepointsForGC pass (http://reviews.llvm.org/D6975).
Note that this code is not yet finalized. Its moving in tree for incremental development, but further cleanup is needed and will happen over the next few days. It is not yet part of the standard pass order.
Planned changes in the near future:
- I plan on restructuring the statepoint rewrite to use the functions add to the IRBuilder a while back.
- In the current pass, the function "gc.safepoint_poll" is treated specially but is not an intrinsic. I plan to make identifying the poll function a property of the GCStrategy at some point in the near future.
- As follow on patches, I will be separating a collection of test cases we have out of tree and submitting them upstream.
- It's not explicit in the code, but these two patches are introducing a new state for a statepoint which looks a lot like a patchpoint. There's no a transient form which doesn't yet have the relocations explicitly represented, but does prevent reordering of memory operations. Once this is in, I need to update actually make this explicit by reserving the 'unused' argument of the statepoint as a flag, updating the docs, and making the code explicitly check for such a thing. This wasn't really planned, but once I split the two passes - which was done for other reasons - the intermediate state fell out. Just reminds us once again that we need to merge statepoints and patchpoints at some point in the not that distant future.
Future directions planned:
- Identifying more cases where a backedge safepoint isn't required to ensure timely execution of a safepoint poll.
- Tweaking the insertion process to generate easier to optimize IR. (For example, investigating making SplitBackedge) the default.
- Adding opt-in flags for a GCStrategy to use this pass. Once done, add this pass to the actual pass ordering.
Differential Revision: http://reviews.llvm.org/D6981
llvm-svn: 228090