and better control the abstraction. Rename the type
to MVT. To update out-of-tree patches, the main
thing to do is to rename MVT::ValueType to MVT, and
rewrite expressions like MVT::getSizeInBits(VT) in
the form VT.getSizeInBits(). Use VT.getSimpleVT()
to extract an MVT::SimpleValueType for use in switch
statements (you will get an assertion failure if VT is
an extended value type - these shouldn't exist after
type legalization).
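For illustration, a minimal before/after sketch of the new style (hedged:
the surrounding function is made up; only the MVT calls are from this
change):

    #include "llvm/CodeGen/ValueTypes.h"
    using namespace llvm;

    // Before: MVT::ValueType VT; ... MVT::getSizeInBits(VT) ...
    // After:  MVT is a first-class type with member functions.
    static bool isSmallInt(MVT VT) {
      if (VT.getSizeInBits() > 64)   // was: MVT::getSizeInBits(VT)
        return false;
      switch (VT.getSimpleVT()) {    // asserts on extended value types
      case MVT::i8:
      case MVT::i16:
      case MVT::i32:
        return true;
      default:
        return false;
      }
    }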
This results in a small speedup of codegen and no
new testsuite failures (x86-64 linux).
llvm-svn: 52044
over-shift-right should return -1. So here the value should be sign-extended
when the bit width is larger than 64.
test case: llvm/test/ExecutionEngine/2008-06-05-APInt-OverAShr.ll
llvm-svn: 51999
work and how to replace them with individual values. Also, when trying to
replace an aggregate that is used by load or store with a single (large)
integer, don't crash (but don't replace the aggregate either).
Also adds a testcase for both structs and arrays.
llvm-svn: 51997
are the same as in unpacked structs, only field
positions differ. This only matters for structs
containing x86 long double or an apint; it may
cause backwards compatibility problems if someone
has bitcode containing a packed struct with a
field of one of those types.
The issue is that only 10 bytes are needed to
hold an x86 long double: the store size is 10
bytes, but the ABI size is 12 or 16 bytes (linux/
darwin) which comes from rounding the store size
up by the alignment. Because it seemed silly not
to pack an x86 long double into 10 bytes in a
packed struct, this is what was done. I now
think this was a mistake. Reserving the ABI size
for an x86 long double field even in a packed
struct makes things more uniform: the ABI size is
now always used when reserving space for a type.
This means that developers are less likely to
make mistakes. It also makes life easier for the
CBE which otherwise could not represent all LLVM
packed structs (PR2402).
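For reference, a minimal sketch of the two sizes involved, assuming today's
DataLayout API (the commit itself predates it and used
TargetData/getABITypeSize):

    #include "llvm/IR/DataLayout.h"
    #include "llvm/IR/LLVMContext.h"
    #include "llvm/IR/Type.h"
    using namespace llvm;

    // x86 long double: 10 bytes of actual data, but the ABI size is the
    // store size rounded up to the alignment (12 or 16 bytes).
    void x86LongDoubleSizes(LLVMContext &Ctx, const DataLayout &DL) {
      Type *LD = Type::getX86_FP80Ty(Ctx);
      uint64_t StoreSize = DL.getTypeStoreSize(LD); // 10
      uint64_t ABISize   = DL.getTypeAllocSize(LD); // 12 or 16, per target
      (void)StoreSize; (void)ABISize;
    }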
Front-end people might need to adjust the way
they create LLVM structs - see following change
to llvm-gcc.
llvm-svn: 51928
and insertvalue and extractvalue instructions.
First-class array values are not trivial because C doesn't
support them. The approach I took here is to wrap all arrays
in structs. Feedback is welcome.
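For illustration, a hedged sketch of the wrapping idea (the struct name
below is hypothetical, not the CBE's real mangling): C arrays cannot be
assigned or returned by value, but a struct containing one can:

    /* Hypothetical CBE-style output: wrapping restores value semantics. */
    struct l_array_4_uint { unsigned int array[4]; };

    struct l_array_4_uint identity(struct l_array_4_uint a) {
      return a; /* legal for the struct, illegal for a bare C array */
    }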
The 2007-01-15-NamedArrayType.ll test needed to be modified
because it has a "not grep" for a string that now exists,
because array types now have associated struct types, and
those struct types have names.
llvm-svn: 51881
is longer than the second one) should stop after finding one. The added break
statement guarantees this. It also changes the difference between offsets to
the absolute value of that difference in the condition.
llvm-svn: 51875
out of instcombine into a new file in libanalysis. This also teaches
ComputeNumSignBits about the number of sign bits in a ConstantInt.
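A minimal sketch of what that means, assuming the APInt API:

    #include "llvm/ADT/APInt.h"
    using namespace llvm;

    // The number of sign bits is the count of leading bits equal to the
    // sign bit (including the sign bit itself); exact for a constant.
    unsigned signBitsOfMinus16() {
      APInt C(32, -16, /*isSigned=*/true); // 0xFFFFFFF0
      return C.getNumSignBits();           // 28 leading ones
    }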
llvm-svn: 51863
the conditions for performing the transform when only the
function declaration is available: no longer allow turning
i32 into i64 for example. Only allow changing between
pointer types, and between pointer types and integers of
the same size. For return values ptr -> intptr was already
allowed; I added ptr -> ptr and intptr -> ptr while there.
As shown by a recent objc testcase, changing the way
parameters/return values are passed can be fatal when calling
code written in assembler that directly manipulates call
arguments and return values unless the transform has no
impact on the way they are passed at the codegen level.
While it is possible to imagine an ABI that treats integers
of pointer size differently to pointers, I don't think LLVM
supports any, so the transform should now be safe while still
being useful.
llvm-svn: 51834
we did not truncate the value down to i1 with (x&1). This caused a problem
when the computation of x was nontrivial, for example, "add i1 1, 1" would
return 2 instead of 0.
This makes the testcase compile into:
...
llvm_cbe_t = (((llvm_cbe_r == 0u) + (llvm_cbe_r == 0u))&1);
llvm_cbe_u = (((unsigned int )(bool )llvm_cbe_t));
...
instead of:
...
llvm_cbe_t = ((llvm_cbe_r == 0u) + (llvm_cbe_r == 0u));
llvm_cbe_u = (((unsigned int )(bool )llvm_cbe_t));
...
This fixes a miscompilation of mediabench/adpcm/rawdaudio/rawdaudio and
403.gcc with the CBE, regressions from LLVM 2.2. Tanya, please pull
this into the release branch.
llvm-svn: 51813
index for the input pattern in terms of the output pattern. Instead
keep track of how many fixed operands the input pattern actually
has, and have the input matching code pass the output-emitting
function that index value. This simplifies the code, disentangles
variable_ops from the support for predication operations, and
makes variable_ops more robust.
llvm-svn: 51808
insertvalue and extractvalue to use constant indices instead of
Value* indices. And begin updating LangRef.html.
There's definitely more to come here, but I'm checking this
basic support in now to make it available to people who are
interested.
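A hedged sketch of the constant-index form, written against today's
IRBuilder names (the helper itself is made up):

    #include "llvm/IR/IRBuilder.h"
    using namespace llvm;

    // Unlike getelementptr, extractvalue/insertvalue take plain unsigned
    // constants as indices rather than Value* operands.
    Value *replaceField0(IRBuilder<> &B, Value *Agg, Value *NewElt) {
      Value *Old = B.CreateExtractValue(Agg, 0);  // read field 0
      (void)Old;
      return B.CreateInsertValue(Agg, NewElt, 0); // write field 0
    }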
llvm-svn: 51806
cases due to an isel deficiency already noted in
lib/Target/X86/README.txt, but they can be matched in this fold-call.ll
testcase, for example.
This is interesting mainly because it exposes a tricky tblgen bug;
tblgen was incorrectly computing the starting index for variable_ops
in the case of a complex pattern.
llvm-svn: 51706
definitions. This adds a new construct, "discard", for indicating
that a named node in the input matching pattern is to be discarded,
instead of corresponding to a node in the output pattern. This
allows tblgen to know where the arguments for the variable_ops are
supposed to begin.
This fixes "rdar://5791600", whatever that is ;-).
llvm-svn: 51699
the one case that ADCE catches that normal DCE doesn't: non-induction variable
loop computations.
This implementation handles this problem without using postdominators.
llvm-svn: 51668
instruction to execute. This can be used for transformations (like two-address
conversion) to remat an instruction instead of generating a "move"
instruction. The idea is to decrease the live ranges and register pressure and
all that jazz.
llvm-svn: 51660
code generator would do something like this:
f64 = load f32 <anyext>, f32mem
v2f64 = insertelt undef, %0, 0
v2f64 = insertelt %1, 0.0, 1
into
v2f64 = vzext_load f32mem
which on x86 is movsd, when you really wanted a cvtss2sd/movsd pair.
llvm-svn: 51624
the section or the visibility from one global
value to another: copyAttributesFrom. This is
particularly useful for duplicating functions:
previously this was done by explicitly copying
each attribute in turn at each place where a
new function was created out of an old one, with
the result that obscure attributes were regularly
forgotten (like the collector or the section).
Hopefully now everything is uniform and nothing
is forgotten.
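A minimal usage sketch, assuming today's Function::Create signature (the
helper name is illustrative):

    #include "llvm/IR/Function.h"
    #include "llvm/IR/Module.h"
    using namespace llvm;

    // Create a new function from an old one and carry over section,
    // visibility, and friends in a single call.
    Function *cloneShell(Function *Old, Module &M) {
      Function *New = Function::Create(Old->getFunctionType(),
                                       Old->getLinkage(), Old->getName(), &M);
      New->copyAttributesFrom(Old); // section, visibility, alignment, ...
      return New;
    }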
llvm-svn: 51567
Running /Users/void/llvm/llvm.src/test/CodeGen/X86/dg.exp ...
FAIL: /Users/void/llvm/llvm.src/test/CodeGen/X86/2007-11-30-LoadFolding-Bug.ll
Failed with exit(1) at line 1
while running: llvm-as < /Users/void/llvm/llvm.src/test/CodeGen/X86/2007-11-30-LoadFolding-Bug.ll | llc -march=x86 -mattr=+sse2 -stats |& grep {1 .*folded into instructions}
child process exited abnormally
Make this conditional for now.
llvm-svn: 51563
Analysis/ConstantFolding to fold ConstantExprs, then make instcombine use it
to try to fold constant expressions on void instructions using targetdata.
Also extend the icmp(inttoptr, inttoptr) folding to handle the case where
int size != ptr size.
llvm-svn: 51559
The SimplifyCFG pass looks at basic blocks that contain only phi nodes,
followed by an unconditional branch. In a lot of cases, such a block (BB) can
be merged into its successor (Succ).
This merging is performed by TryToSimplifyUncondBranchFromEmptyBlock. It does
this by taking all phi nodes in the successor block Succ and expanding them to
include the predecessors of BB. Furthermore, any phi nodes in BB are moved to
Succ and expanded to include the predecessors of Succ as well.
Before attempting this merge, CanPropagatePredecessorsForPHIs checks to see if
all phi nodes can be properly merged. All functional changes are made to
this function; only comments were updated in
TryToSimplifyUncondBranchFromEmptyBlock.
In the original code, CanPropagatePredecessorsForPHIs looks quite convoluted,
more like a stack of checks added to handle different kinds of situations
than a comprehensive check. In particular, the first check in the function did
some value checking for the case that BB and Succ have a common predecessor,
while the last check in the function simply rejected all cases where BB and
Succ have a common predecessor. The first check was still useful in the case
that BB did not contain any phi nodes at all, though, so it was not completely
useless.
Now, CanPropagatePredecessorsForPHIs is restructured to look a lot more
similar to the code that actually performs the merge. Both functions now look
at the same phi nodes in about the same order. Any conflicts (phi nodes with
different values for the same source) that could arise from merging or moving
phi nodes are detected. If no conflicts are found, the merge can happen.
Apart from restructuring the checks, there are two main changes in
functionality.
Firstly, the old code rejected blocks with common predecessors in most cases.
The new code performs some extra checks so common predecessors can be handled
in a lot of cases. Wherever common predecessors still pose problems, the
blocks are left untouched.
Secondly, the old code rejected the merge when values (phi nodes) from BB were
used anywhere other than in Succ. However, there does not seem to be any
situation that would require this check; moreover, this can be proven.
Consider that BB is a block containing a single phi node "%a" and a branch
to Succ. Now, since the definition of %a will dominate all of its uses, BB
will dominate all blocks that use %a. Furthermore, since the branch from BB to
Succ is unconditional, Succ will also dominate all uses of %a.
Now, assume that one predecessor X of Succ is not dominated by BB (and thus not
dominated by Succ). Since at least one use of %a (but in reality all of them)
is reachable from Succ, you could end up at a use of %a without passing
through its definition in BB (by coming from X through Succ). This is a
contradiction, meaning that our original assumption is wrong. Thus, all
predecessors of Succ must also be dominated by BB (and thus also by Succ).
This means that moving the phi node %a from BB to Succ does not pose any
problems when the two blocks are merged, and any use checks are not needed.
llvm-svn: 51478
and bitcode support for the extractvalue and insertvalue
instructions and constant expressions.
Note that this does not yet include CodeGen support.
llvm-svn: 51468
live interval to infinity if the instruction being rewritten is an
original remat def instruction. We were only checking against the clone
of the remat def, which doesn't actually appear in the IR at all.
llvm-svn: 51440
BB1:
vr1025 = copy vr1024
..
BB2:
vr1024 = op
= op vr1025
<loop eventually branch back to BB1>
Even though vr1025 is copied from vr1024, it's not safe to coalesce them since the live range of vr1025 intersects the def of vr1024. This happens when vr1025 is assigned the value of the previous iteration of vr1024 in the loop.
llvm-svn: 51394
they aren't in the header file, systems with a <string> header file that isn't
64-bit clean shouldn't warn if #including Path.h and specifying
-Wshorten-64-to-32.
llvm-svn: 51393
1. The "JITState" object creates a PassManager with the ModuleProvider that the
JIT is created with. If the ModuleProvider is removed and deleted, the
PassManager is invalid.
2. The Global maps in the JIT were not invalidated when a ModuleProvider was
removed. This could lead to a case where the Module would be freed, and a
new Module with Globals at the same addresses could return invalid results.
llvm-svn: 51384
ScalarEvolution::deleteValueFromRecords on it before doing the
replaceAllUsesWith, because ScalarEvolution looks at the instruction's
users to find SCEV references to the instruction's SCEV object in its
internal maps.
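A hedged sketch of that ordering, using today's name forgetValue for what
was then deleteValueFromRecords:

    #include "llvm/Analysis/ScalarEvolution.h"
    #include "llvm/IR/Instruction.h"
    using namespace llvm;

    // Purge cached SCEVs while the instruction's user list is still
    // intact, i.e. before replaceAllUsesWith rewrites it.
    void eraseWithSCEV(ScalarEvolution &SE, Instruction *I, Value *Repl) {
      SE.forgetValue(I);            // must come first
      I->replaceAllUsesWith(Repl);
      I->eraseFromParent();
    }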
Move all of LSR's loop-related state clearing after processing the loop
and before cleaning up dead PHI nodes. This eliminates all of LSR's SCEV
references just before the calls to ScalarEvolution::deleteValueFromRecords
so that when ScalarEvolution drops its own SCEV references, the reference
counts will reach zero and the SCEVs will be deleted immediately.
These changes fix some compiler aborts involving ScalarEvolution holding
onto and reusing SCEV objects for instructions that have been deleted.
No regression test unfortunately; because the symptoms were due to
dangling pointers, reduced testcases ended up being fairly arbitrary.
llvm-svn: 51359
If the local spiller optimization turns an instruction into an identity copy, it will be removed. If the output register happens to be dead (and the source is obviously killed), transfer the kill / dead information to the last use / def in the same MBB.
llvm-svn: 51306
replaced is a PHI. This prevents it from inserting uses before defs
in the case that it isn't a PHI and it depends on other instructions
later in the block. This fixes the 447.dealII regression on x86-64.
llvm-svn: 51292
type and the other operand is a constant into integer comparisons.
This happens surprisingly frequently (e.g. 10 times in 471.omnetpp), in
code like this:
%tmp8283 = sitofp i32 %tmp82 to double
%tmp1013 = fcmp ult double %tmp8283, 0.0
Clearly comparing tmp82 against i32 0 is cheaper here.
This also triggers 8 times in gobmk, including this one:
%tmp375376 = sitofp i32 %tmp375 to double
%tmp377 = fcmp ogt double %tmp375376, 8.150000e+01
which is comparing an integer against 81.5 :).
llvm-svn: 51268
intersecting bits. This triggers all over the place, for example in lencode,
with adds of stuff like:
%tmp580 = mul i32 %tmp579, 2
%tmp582 = and i32 %b8, 1
and
%tmp28 = shl i32 %abs.i, 1
%sign.0 = select i1 %tmp23, i32 1, i32 0
and
%tmp344 = shl i32 %tmp343, 2
%tmp346 = and i32 %tmp96, 3
etc.
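A minimal sketch of the underlying condition for the constant case, assuming
the APInt API (the real transform reasons about known bits of non-constant
values):

    #include "llvm/ADT/APInt.h"
    using namespace llvm;

    // "a + b" can be turned into "a | b" exactly when no bit is set in
    // both operands, since then the addition can never produce a carry.
    bool addIsOr(const APInt &A, const APInt &B) {
      return (A & B) == 0; // no intersecting bits
    }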
llvm-svn: 51263