whether functions are removed, and fix the new PM's always inliner to
actually pass this test.
Without this, the new PM's always inliner leaves all the functions
kicking around which won't work out very well given the semantics of
always inline.
Doing this really highlights how frustrating the current alwaysinline
semantic contract is though -- why can we put it on *external*
functions, etc?
Also I've added a number of tricky and interesting test cases for
removing functions with the always inliner. There is one remaining case
not handled -- fully removing comdats -- and I've left a FIXME about
this.
llvm-svn: 290457
Pretty boring and lame as-is but necessary. This is definitely a place
we'll end up with extension hooks longer term. =]
Differential Revision: https://reviews.llvm.org/D28076
llvm-svn: 290449
The pass creates some state which expects to be cleaned up by
a later instance of the same pass. opt-bisect happens to expose
this not ideal design because calling skipLoop() will result in
this state not being cleaned up at times and an assertion firing
in `doFinalization()`. Chandler tells me the new pass manager will
give us options to avoid these design traps, but until it's not ready,
we need a workaround for the current pass infrastructure. Fix provided
by Andy Kaylor, see the review for a complete discussion.
Differential Revision: https://reviews.llvm.org/D25848
llvm-svn: 290427
According to the Cortex-A57 doc, FDIV/FSQRT instructions should use F0 unit
(W-unit in AArch64SchedA57.td, the same as cryptography instructions),
not F1 unit (X-unit in td, like ASIMD absolute diff accum SABA/UABA).
This patch changes FDIV/FSQRT scheduling declarations to use A57UnitW
instead of A57UnitX. Also, latencies for those instructions are
corrected.
Patch by Andrew Zhogin.
llvm-svn: 290426
Summary:
In mergeSPUpdates, debug values need to be ignored when getting the
previous element, otherwise debug data could have an impact on codegen.
In eliminateCallFramePseudoInstr, debug values after the erased element
could have an impact on codegen and should be skipped.
Closes PR31319 (https://llvm.org/bugs/show_bug.cgi?id=31319)
Reviewers: mkuper, MatzeB, aprantl
Subscribers: gbedwell, llvm-commits
Differential Revision: https://reviews.llvm.org/D27688
llvm-svn: 290423
1.Fix pessimized case in FIXME.
2.Add tests for it.
3.The canonicalisation on shifts results in different sequence for
tests of machine-licm.Correct some check lines.
Differential Revision: https://reviews.llvm.org/D27916
llvm-svn: 290410
This patch fixes some ASAN unittest failures on FreeBSD. See the
cfe-commits email thread for r290169 for more on those.
According to the LangRef, the allocsize attribute only tells us about
the number of bytes that exist at the memory location pointed to by the
return value of a function. It does not necessarily mean that the
function will only ever allocate. So, we need to be very careful about
treating functions with allocsize as general allocation functions. This
patch makes us fully conservative in this regard, though I suspect that
we have room to be a bit more aggressive if we want.
This has a FIXME that can be fixed by a relatively straightforward
refactor; I just wanted to keep this patch minimal. If this sticks, I'll
come back and fix it in a few days.
llvm-svn: 290397
Summary:
This change rewrites a core component in the ImplicitNullChecks pass for
greater simplicity since the original design was over-complicated for no
good reason. Please review this as essentially a new pass. The change
is almost NFC and I've added a test case for a scenario that this new
code handles that wasn't handled earlier.
The implicit null check pass, at its core, is a code hoisting transform.
It differs from "normal" code transforms in that it speculates
potentially faulting instructions (by design), but a lot of the usual
hazard detection logic (register read-after-write etc.) still applies.
We previously detected hazards by keeping track of registers defined and
used by machine instructions over an instruction range, but that was
unwieldy and did not actually confer any performance benefits. The
intent was to have linear time complexity over the number of machine
instructions considered, but it ended up being N^2 is practice.
This new version is more obviously O(N^2) (with N capped to 8 by
default) in hazard detection. It does not attempt to be clever in
tracking register uses or defs (the previous cleverness here was a
source of bugs).
Once this is checked in, I'll extract out the `IsSuitableMemoryOp` and
`CanHoistLoadInst` lambda into member functions (they're too complicated
to be inline lambdas) and do some other related NFC cleanups.
Reviewers: reames, anna, atrick
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D27592
llvm-svn: 290394
This patch adds support for YAML<->DWARF for debug_info sections.
This re-lands r290147, reverted in 290148, re-landed in r290204 after fixing the issue that caused bots to fail (thank you UBSan!), and reverted again in r290209 due to failures on big endian systems.
After adding support for preserving endianness, this should be good now.
llvm-svn: 290386
Use a dummy private function with inline asm calls instead of module
level asm blocks for CFI jumptables.
The main advantage is that now jumptable codegen can be affected by
the function attributes (like target_cpu on ARM). Module level asm
gets the default subtarget based on the target triple, which is often
not good enough.
This change also uses asm constraints/arguments to reference
jumptable targets and aliases directly. We no longer do asm name
mangling in an IR pass.
Differential Revision: https://reviews.llvm.org/D28012
llvm-svn: 290384
We used to not check generic vregs, but that is actually a mistake given
nothing in the GlobalISel pipeline is going to fix the constraints on
target specific instructions. Therefore, the target has to have them
right from the start.
llvm-svn: 290380
Target specific instructions have requirements that are not compatible
with what we want to test here. Namely, target specific instructions
must have their operands properly mapped on register classes.
llvm-svn: 290379
The InstructionSelect pass will not look at target specific instructions
since they are already selected. As a result, the operands of target
specific instructions must be properly constrained, because it is not
going to fix them.
This fixes invalid register classes on call instruction.
llvm-svn: 290377
When generic virtual registers get constrained, because of a use on a
target specific operation for instance, we end up with regular virtual
registers with a type and that's perfectly fine.
llvm-svn: 290376
This is going to be needed to be able to constraint register class on
target specific instruction while the RegBankSelect pass did not run
yet.
llvm-svn: 290375
Move the logic to constraint register from InstructionSelector to a
utility function. It will be required by other passes in the GlobalISel
pipeline.
llvm-svn: 290374
Canonicalize a select with a constant to the false side. This
enables more instruction shrinking opportunities since an
inline immediate can be used for the false side of v_cndmask_b32_e32.
This seems to usually be better but causes some code size regressions
in some tests.
llvm-svn: 290372
This is a succeeding patch of https://reviews.llvm.org/D22840 to address the
issue when a value to be merged into an int64 pair is in a different BB. Redoing
the store splitting in CodeGenPrepare so we can match the pattern across multiple
BBs and move some instructions into the same BB. We still keep the code in dag
combine so that we can catch cases that show up after DAG combining runs.
Differential Revision: https://reviews.llvm.org/D25914
llvm-svn: 290365
This is for splitMergedValStore in DAG Combine to share the target query interface
with similar logic in CodeGenPrepare.
Differential Revision: https://reviews.llvm.org/D24707
llvm-svn: 290363