DeadArgumentElimination pass can replace one LLVM function with another,
invalidating a pointer stored in debug info metadata entry for this function.
To fix this, we collect debug info descriptors for functions before
running a DeadArgumentElimination pass and "patch" pointers in metadata nodes
if we replace a function.
llvm-svn: 165490
When the CFG contains a loop with multiple entry blocks, the traces
computed by MachineTraceMetrics don't always have the same nice
properties. Loop back-edges are normally excluded from traces, but
MachineLoopInfo doesn't recognize loops with multiple entry blocks, so
those back-edges may be included.
Avoid asserting when that happens by adding an isEarlierInSameTrace()
function that accurately determines if a dominating block is part of the
same trace AND is above the currrent block in the trace.
llvm-svn: 165434
Vector compare using altivec 'vcmpxxx' instructions have as third argument
a vector register instead of CR one, different from integer and float-point
compares. This leads to a failure in code generation, where 'SelectSETCC'
expects a DAG with a CR register and gets vector register instead.
This patch changes the behavior by just returning a DAG with the
vector compare instruction based on the type. The patch also adds a testcase
for all vector types llvm defines.
It also included a fix on signed 5-bits predicates printing, where
signed values were not handled correctly as signed (char are unsigned by
default for PowerPC). This generates 'vspltisw' (vector splat)
instruction with SIM out of range.
llvm-svn: 165419
are in fact identity operations. We detect these and kill their
partitions so that even splitting is unaffected by them. This is
particularly important because Clang relies on emitting identity memcpy
operations for struct copies, and these fold away to constants very
often after inlining.
Fixes the last big performance FIXME I have on my plate.
llvm-svn: 165285
Make sure functions located in user specified text sections (via the
section attribute) are located together with the default text sections.
Otherwise, for large object files, the relocations for call instructions
are more likely to be out of range. This becomes even more likely in the
presence of LTO.
rdar://12402636
llvm-svn: 165254
a) frame setup instructions define the prologue
b) we shouldn't change our location mid-stream
Add a test to make sure that the stack adjustment stays within
the prologue.
llvm-svn: 165250
We conservatively only check the first use to avoid walking long use chains.
This catches the common case of having both a load and a store to a pointer
supplied by a PHI node.
llvm-svn: 165232
cpyDest can be mutated in some cases, which would then cause a crash later if
indeed the memory was underaligned. This brought down several buildbots, so
I guess the underaligned case is much more common than I thought!
llvm-svn: 165228
Currently, we re-visit allocas when something changes about the way they
might be *split* to allow better scalarization to take place. However,
we weren't handling the case when the *promotion* is what would change
the behavior of SROA. When an address derived from an alloca is stored
into another alloca, we consider the first to have escaped. If the
second is ever promoted to an SSA value, we will suddenly be able to run
the SROA pass on the first alloca.
This patch adds explicit support for this form if iteration. When we
detect a store of a pointer derived from an alloca, we flag the
underlying alloca for reprocessing after promotion. The logic works hard
to only do this when there is definitely going to be promotion and it
might remove impediments to the analysis of the alloca.
Thanks to Nick for the great test case and Benjamin for some sanity
check review.
llvm-svn: 165223
was less aligned than the old. In the testcase this results in an overaligned
memset: the memset alignment was correct for the original memory but is too much
for the new memory. Fix this by either increasing the alignment of the new
memory or bailing out if that isn't possible. Should fix the gcc-4.7 self-host
buildbot failure.
llvm-svn: 165220
Sorry for this being broken so long. =/
As part of this, switch all of the existing tests to be Little Endian,
which is the behavior I was asserting in them anyways! Add in a new
big-endian test that checks the interesting behavior there.
Another part of this is to tighten the rules abotu when we perform the
full-integer promotion. This logic now rejects cases where there fully
promoted integer is a non-multiple-of-8 bitwidth or cases where the
loads or stores touch bits which are in the allocated space of the
alloca but are not loaded or stored when accessing the integer. Sadly,
these aren't really observable today as the rest of the pass will
already ensure the invariants hold. However, the latter situation is
likely to become a potential concern in the future.
Thanks to Benjamin and Duncan for early review of this patch. I'm still
looking into whether there are further endianness issues, please let me
know if anyone sees BE failures persisting past this.
llvm-svn: 165219
macro instruction (li) in the assembler.
We have identified three possible expansions depending on
the size of immediate operand:
1) for 0 ≤ j ≤ 65535.
li d,j =>
ori d,$zero,j
2) for −32768 ≤ j < 0.
li d,j =>
addiu d,$zero,j
3) for any other value of j that is representable as a 32-bit integer.
li d,j =>
lui d,hi16(j)
ori d,d,lo16(j)
All of the above have been implemented in ths patch.
Contributer: Vladimir Medic
llvm-svn: 165199
.set option
The patch implements following options
at - lets the assembler use the $at register for macros,
but generates warnings if the source program uses $at
noat - let source programs use $at without issuingwarnings.
noreorder - prevents the assembler from reordering machine
language instructions.
nomacro - causes the assembler to print a warning whenever
an assembler operation generates more than one
machine language instruction.
macro - lets the assembler generate multiple machine instructions
from a single assembler instruction
reorder - lets the assembler reorder machine language
instructions to improve performance
The above variants are parsed and their boolean values set or unset.
The code to actually use them will come later.
Following options are not implemented yet:
nomips16
nomicromips
move
nomove
Contributer: Vladimir Medic
llvm-svn: 165194
in the Intel syntax.
The MC layer supports emitting in the Intel syntax, but this would require the
inline assembly MachineInstr to be lowered to an MCInst before emission. This
is potential future work, but for now emitting directly from the MachineInstr
suffices.
llvm-svn: 165173
multiple stores with a single load. We create the wide loads and stores (and their chains)
before we remove the scalar loads and stores and fix the DAG chain. We attempted to merge
loads with a different chain. When that happened, the assumption that it is safe to RAUW
broke and a cycle was introduced.
llvm-svn: 165148
is not profitable in many cases because modern processors perform multiple stores
in parallel and merging stores prior to merging requires extra work. We handle two main cases:
1. Store of multiple consecutive constants:
q->a = 3;
q->4 = 5;
In this case we store a single legal wide integer.
2. Store of multiple consecutive loads:
int a = p->a;
int b = p->b;
q->a = a;
q->b = b;
In this case we load/store either ilegal vector registers or legal wide integer registers.
llvm-svn: 165125
a memcpy to reflect that '0' has a different meaning when applied to
a load or store. Now we correctly use underaligned loads and stores for
the test case added.
llvm-svn: 165101
necessary during rewriting. As part of this, fix a real think-o here
where we might have left off an alignment specification when the address
is in fact underaligned. I haven't come up with any way to trigger this,
as there is always some other factor that reduces the alignment, but it
certainly might have been an observable bug in some way I can't think
of. This also slightly changes the strategy for placing explicit
alignments on loads and stores to only do so when the alignment does not
match that required by the ABI. This causes a few redundant alignments
to go away from test cases.
I've also added a couple of tests that really push on the alignment that
we end up with on loads and stores. More to come here as I try to fix an
underlying bug I have conjectured and produced test cases for, although
it's not clear if this bug is the one currently hitting dragonegg's
gcc47 bootstrap.
llvm-svn: 165100
Enable the pass by default for targets that request it, and change the
-enable-early-ifcvt to the opposite -disable-early-ifcvt.
There are still some x86 regressions when enabling early if-conversion
because of the missing machine models. Disable the pass for x86 until
machine models are added.
llvm-svn: 165075
X86DAGToDAGISel::PreprocessISelDAG(), isel is moving load inside
callseq_start / callseq_end so it can be folded into a call. This can
create a cycle in the DAG when the call is glued to a copytoreg. We
have been lucky this hasn't caused too many issues because the pre-ra
scheduler has special handling of call sequences. However, it has
caused a crash in a specific tailcall case.
rdar://12393897
llvm-svn: 165072
scheduled for processing on the worklist eventually gets deleted while
we are processing another alloca, fixing the original test case in
PR13990.
To facilitate this, add a remove_if helper to the SetVector abstraction.
It's not easy to use the standard abstractions for this because of the
specifics of SetVectors types and implementation.
Finally, a nice small test case is included. Thanks to Benjamin for the
fantastic reduced test case here! All I had to do was delete some empty
basic blocks!
llvm-svn: 165065
JoinVals::pruneValues() calls LIS->pruneValue() to avoid conflicts when
overlapping two different values. This produces a set of live range end
points that are used to reconstruct the live range (with SSA update)
after joining the two registers.
When a value is pruned twice, the set of end points was insufficient:
v1 = DEF
v1 = REPLACE1
v1 = REPLACE2
KILL v1
The end point at KILL would only reconstruct the live range from
REPLACE2 to KILL, leaving the range REPLACE1-REPLACE2 dead.
Add REPLACE2 as an end point in this case so the full live range is
reconstructed.
This fixes PR13999.
llvm-svn: 165056
This adds 'elf' as a recognized target triple environment value and overrides the default generated object format on Windows platforms if that value is present. This patch also enables MCJIT tests on Windows using the new environment value.
llvm-svn: 165030