Finally figured out that some bots were failing from r308114
with the message:
llvm-lto2: LTO::run failed: No available targets are compatible with this triple.
after adding in some other checking that finally caused this to show up
in the FileCheck output.
Added "REQUIRES: x86-registered-target" which should fix it.
llvm-svn: 308119
This restores r308078/r308079 with a fix for bot non-determinisim (make
sure we run llvm-lto in single threaded mode so the debug output doesn't get
interleaved).
llvm-svn: 308114
Summary:
Currently these methods call ConstantDataVector::getSplatValue which uses getElementsAsConstant to create a Constant object representing the element value. This method incurs a map lookup to see if we already have created such a Constant before and if not allocates a new Constant object.
This patch changes these methods to use getElementAsAPFloat and getElementAsInteger so we can just examine the data values directly.
Reviewers: spatel, pcc, dexonsmith, bogner, craig.topper
Reviewed By: craig.topper
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35040
llvm-svn: 308112
Summary:
If one side simplifies to the identity value for inner opcode, we can replace the value with just the operation that can't be simplified.
I've removed a couple now unneeded special cases in visitAnd and visitOr. There are probably other cases I missed.
Reviewers: spatel, majnemer, hfinkel, dberlin
Reviewed By: spatel
Subscribers: grandinj, llvm-commits, spatel
Differential Revision: https://reviews.llvm.org/D35451
llvm-svn: 308111
This fold hit the trifecta:
1. It was untested.
2. It oversteps (multiuse is not checked, so increases instruction count).
3. It is incomplete (doesn't work for vectors).
llvm-svn: 308102
that appears to exhibit non-determinism and is flaking on the bots
pretty consistently.
r308078: [ThinLTO] Ensure we always select the same function copy to import
r308079: Require asserts in new test that uses debug flag
llvm-svn: 308095
objects to start at the same address
As discussed on the ML, there's consensus that this is what the implementations
do and it seems sensible.
llvm-svn: 308090
function to every defined function known to LLVM as a library function.
LLVM can introduce calls to these functions either by replacing other
library calls or by recognizing patterns (such as memset_pattern or
vector math patterns) and replacing those with calls. When these library
functions are actually defined in the module, we need to have reference
edges to them initially so that we visit them during the CGSCC walk in
the right order and can effectively rebuild the call graph afterward.
This was discovered when building code with Fortify enabled as that is
a common case of both inline definitions of library calls and
simplifications of code into calling them.
This can in extreme cases of LTO-ing with libc introduce *many* more
reference edges. I discussed a bunch of different options with folks but
all of them are unsatisfying. They either make the graph operations
substantially more complex even when there are *no* defined libfuncs, or
they introduce some other complexity into the callgraph. So this patch
goes with the simplest possible solution of actual synthetic reference
edges. If this proves to be a memory problem, I'm happy to implement one
of the clever techniques to save memory here.
llvm-svn: 308088
If the `long-calls` feature flags is enabled, disable use of the `jal`
instruction. Instead of that call a function by by first loading its
address into a register, and then using the contents of that register.
Differential revision: https://reviews.llvm.org/D35168
llvm-svn: 308087
The type needs to be casted back to the original argument type.
Fixes an assert that for some reason is only run when
using -debug.
Includes an additional combine to avoid test regressions
from having conversions mixed with multiple Assert[SZ]ext
nodes. On subtargets where i16 is legal, this was producing an i32
register with an i16 AssertZExt, truncated to i16 with another i8
AssertZExt.
t2: i32,ch = CopyFromReg t0, Register:i32 %vreg0
t3: i16 = truncate t2
t5: i16 = AssertZext t3, ValueType:ch:i8
t6: i8 = truncate t5
t7: i32 = zero_extend t6
llvm-svn: 308082
Currently, for code like below,
===
inner_map = bpf_map_lookup_elem(outer_map, &port_key);
if (!inner_map) {
inner_map = &fallback_map;
}
===
the compiler generates (pseudo) code like the below:
===
I1: r1 = bpf_map_lookup_elem(outer_map, &port_key);
I2: r2 = 0
I3: if (r1 == r2)
I4: r6 = &fallback_map
I5: ...
===
During kernel verification process, After I1, r1 holds a state
map_ptr_or_null. If I3 condition is not taken
(path [I1, I2, I3, I5]), supposedly r1 should become map_ptr.
Unfortunately, kernel does not recognize this pattern
and r1 remains map_ptr_or_null at insn I5. This will cause
verificaiton failure later on.
Kernel, however, is able to recognize pattern "if (r1 == 0)"
properly and give a map_ptr state to r1 in the above case.
LLVM here generates suboptimal code which causes kernel verification
failure. This patch fixes the issue by changing BPF insn pattern
matching and lowering to generate proper codes if the righthand
parameter of the above condition is a constant. A test case
is also added.
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 308080
Summary:
Check if the first eligible callee is under the instruction threshold.
Checking this on the first eligible callee ensures that we don't end
up selecting different callees to import when we invoke this routine
with different thresholds due to reaching the callee via paths that
are shallower or hotter (when there are multiple copies, i.e. with
weak or linkonce linkage). We don't want to leave the decision of which
copy to import up to the backend.
Reviewers: mehdi_amini
Subscribers: inglorion, fhahn, llvm-commits
Differential Revision: https://reviews.llvm.org/D35436
llvm-svn: 308078
Now, getUserCost() only checks the src and dst types of EXT to decide it is free
or not. This change first checks the types, then calls isExtFreeImpl(), and
check if EXT can form ExtLoad at last. Currently, only AArch64 has customized
implementation of isExtFreeImpl() to check if EXT can be folded into its use.
Differential Revision: https://reviews.llvm.org/D34458
llvm-svn: 308076
This fixes a minor bug in insertion to a reachable node that caused
DominatorTree.InsertDeleteExhaustive flakiness. The patch also adds
a new testcase for this exact failure.
llvm-svn: 308074
The DominatorTree.InsertDeleteExhaustive uses a RNG with a
constant seed to generate different sequences of updates. The test
fails on some buildbots and this patch disables it for now.
llvm-svn: 308070
With this change, libFuzzer will ignore any arguments after a sigil
argument, but it will preserve these arguments at the end of the
command line when launching subprocesses. Using this, its possible to
handle positional and single-dash arguments to the program under test
by discarding everything up to -ignore_remaining_args=1 in
LLVMFuzzerInitialize.
llvm-svn: 308069
Summary:
This patch implements incremental edge deletions.
It also makes DominatorTreeBase store a pointer to the parent function. The parent function is needed to perform full rebuilts during some deletions, but it is also used to verify that inserted and deleted edges come from the same function.
Reviewers: dberlin, davide, grosser, sanjoy, brzycki
Reviewed By: dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35342
llvm-svn: 308062
Restricting register class to PointerRegClass for memory operands.
Also fix the PointerRegClass for AArch64 from GPR64 to GPR64sp, since
XZR cannot hold a memory pointer while SP is.
Fixes PR33134.
Differential Revision: https://reviews.llvm.org/D34999
llvm-svn: 308060
Summary:
This patch is the first step in reducing HW prefetcher instruction tag
collisions in inner loops for Falkor. It adds a pass that annotates IR
loads with metadata to indicate that they are known to be strided loads,
and adds a target lowering hook that translates this metadata to a
target-specific MachineMemOperand flag.
A follow on change will use this MachineMemOperand flag to re-write
instructions to reduce tag collisions.
Reviewers: mcrosier, t.p.northover
Subscribers: aemerson, rengolin, mgorny, javed.absar, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D34963
llvm-svn: 308059
Summary:
This patch introduces incremental edge insertions based on the Depth Based Search algorithm.
Insertions should work for both dominators and postdominators.
Reviewers: dberlin, grosser, davide, sanjoy, brzycki
Reviewed By: dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D35341
llvm-svn: 308054
Summary:
When checking for memory dependencies between calls using MemorySSA,
handle cases where the calls have no MemoryAccess associated with them
because the AA analysis being used has determined that the call does not
read/write memory.
Fixes PR33756
Reviewers: dberlin, davide
Subscribers: mcrosier, llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D35317
llvm-svn: 308051
Add the following pattern to TryToUnfoldSelectInCurrBB()
bb:
%p = phi [0, %bb1], [1, %bb2], [0, %bb3], [1, %bb4], ...
%c = cmp %p, 0
%s = select %c, trueval, falseval
The Select in the above pattern will be unfolded and then jump-threaded. The
current implementation does not allow CMP in the middle of PHI and Select.
Differential Revision: https://reviews.llvm.org/D34762
llvm-svn: 308050