- Avoid attempting stride-reuse in the case that there are users that
aren't addresses. In that case, there will be places where the
multiplications won't be folded away, so it's better to try to
strength-reduce them.
- Several SSE intrinsics have operands that strength-reduction can
treat as addresses. The previous item makes this more visible, as
any non-address use of an IV can inhibit stride-reuse.
- Make ValidStride aware of whether there's likely to be a base
register in the address computation. This prevents it from thinking
that things like stride 9 are valid on x86 when the base register is
already occupied.
Also, XFAIL the 2007-08-10-LEA16Use32.ll test; the new logic to avoid
stride-reuse elimintes the LEA in the loop, so the test is no longer
testing what it was intended to test.
llvm-svn: 43231
operations so they work right for integers with funky
bit-widths. For example, consider extending i48 to i64
on a 32 bit machine. The i64 result is expanded to 2 x i32.
We know that the i48 operand will be promoted to i64, then
also expanded to 2 x i32. If we had the expanded promoted
operand to hand, then expanding the result would be trivial.
Unfortunately at this stage we can only get hold of the
promoted operand. So instead we kind of hand-expand, doing
explicit shifting and truncating to get the top and bottom
halves of the i64 operand into 2 x i32, which are then used
to expand the result. This is harmless, because when the
promoted operand is finally expanded all this bit fiddling
turns into trivial operations which are eliminated either
by the expansion code itself or the DAG combiner.
llvm-svn: 43223
Turn a store folding instruction into a load folding instruction. e.g.
xorl %edi, %eax
movl %eax, -32(%ebp)
movl -36(%ebp), %eax
orl %eax, -32(%ebp)
=>
xorl %edi, %eax
orl -36(%ebp), %eax
mov %eax, -32(%ebp)
This enables the unfolding optimization for a subsequent instruction which will
also eliminate the newly introduced store instruction.
llvm-svn: 43192
asserts in later checks rather than producing
the ordinary load it is supposed to. Avoid all
such hassles by directly returning an ordinary
load in this case.
llvm-svn: 43174
To do this it is necessary to add a "always inline" argument to the
memcpy node. For completeness I have also added this node to memmove
and memset. I have also added getMem* functions, because the extra
argument makes it cumbersome to use getNode and because I get confused
by it :-)
llvm-svn: 43172
types. This is needed for SIGN_EXTEND_INREG at least.
It is not clear if this is correct for other operations.
On the other hand, for the various load/store actions
it seems to correct to return the type action, as is
currently done.
Also, it seems that SelectionDAG::getValueType can be
called for extended value types; introduce a map for
holding these, since we don't really want to extend
the vector to be 2^32 pointers long!
Generalize DAGTypeLegalizer::PromoteResult_TRUNCATE
and DAGTypeLegalizer::PromoteResult_INT_EXTEND to handle
the various funky possibilities that apints introduce,
for example that you can promote to a type that needs
to be expanded.
llvm-svn: 43071
codegen support. This should have no effect on codegen
for other types. Debatable bits: (1) the use (abuse?)
of a set in SDNode::getValueTypeList; (2) the length of
getTypeToTransformTo, which maybe should be refactored
with a non-inline part for extended value types.
llvm-svn: 43030
was stored to the acutal stack slot before the parameters were
lowered to their stack slot. This could cause arguments to be
overwritten by the return address if the called function had less
parameters than the caller function. The update should remove the
last failing test case of llc-beta: SPASS.
llvm-svn: 43027
unconditionally creating an i64 bitcast. With the future legalizer
design, operation legalization can't introduce new nodes with illegal
types.
This fixes the rest of olden on ppc32.
llvm-svn: 43005
integer conversion. In some such cases this makes us one or two orders
of magnitude faster than NetBSD's libc. Glibc seems to have a similar
fast path.
Also, tighten up some upper bounds to save a bit of memory.
llvm-svn: 42984
getTypeToExpandTo. The difference is that
getTypeToExpandTo gives the final result of expansion
(eg: i128 -> i32 on a 32 bit machine) while
getTypeToTransformTo does just one step (i128 -> i64).
llvm-svn: 42982
take a deleted nodes vector, instead of requiring it.
One more significant change: Implement the start of a legalizer that
just works on types. This legalizer is designed to run before the
operation legalizer and ensure just that the input dag is transformed
into an output dag whose operand and result types are all legal, even
if the operations on those types are not.
This design/impl has the following advantages:
1. When finished, this will *significantly* reduce the amount of code in
LegalizeDAG.cpp. It will remove all the code related to promotion and
expansion as well as splitting and scalarizing vectors.
2. The new code is very simple, idiomatic, and modular: unlike
LegalizeDAG.cpp, it has no 3000 line long functions. :)
3. The implementation is completely iterative instead of recursive, good
for hacking on large dags without blowing out your stack.
4. The implementation updates nodes in place when possible instead of
deallocating and reallocating the entire graph that points to some
mutated node.
5. The code nicely separates out handling of operations with invalid
results from operations with invalid operands, making some cases
simpler and easier to understand.
6. The new -debug-only=legalize-types option is very very handy :),
allowing you to easily understand what legalize types is doing.
This is not yet done. Until the ifdef added to SelectionDAGISel.cpp is
enabled, this does nothing. However, this code is sufficient to legalize
all of the code in 186.crafty, olden and freebench on an x86 machine. The
biggest issues are:
1. Vectors aren't implemented at all yet
2. SoftFP is a mess, I need to talk to Evan about it.
3. No lowering to libcalls is implemented yet.
4. Various operations are missing etc.
5. There are FIXME's for stuff I hax0r'd out, like softfp.
Hey, at least it is a step in the right direction :). If you'd like to help,
just enable the #ifdef in SelectionDAGISel.cpp and compile code with it. If
this explodes it will tell you what needs to be implemented. Help is
certainly appreciated.
Once this goes in, we can do three things:
1. Add a new pass of dag combine between the "type legalizer" and "operation
legalizer" passes. This will let us catch some long-standing isel issues
that we miss because operation legalization often obfuscates the dag with
target-specific nodes.
2. We can rip out all of the type legalization code from LegalizeDAG.cpp,
making it much smaller and simpler. When that happens we can then
reimplement the core functionality left in it in a much more efficient and
non-recursive way.
3. Once the whole legalizer is non-recursive, we can implement whole-function
selectiondags maybe...
llvm-svn: 42981
Make two changes:
1) only xform "store of f32" if i32 is a legal type for the target.
2) only xform "store of f64" if either i64 or i32 are legal for the target.
3) if i64 isn't legal, manually lower to 2 stores of i32 instead of letting a
later pass of legalize do it. This is ugly, but helps future changes I'm
about to commit.
llvm-svn: 42980
the source register will be coalesced to the super register of the LHS. Properly
merge in the live ranges of the resulting coalesced interval that were part of
the original source interval to the live interval of the super-register.
llvm-svn: 42961
Turn this:
movswl %ax, %eax
movl %eax, -36(%ebp)
xorl %edi, -36(%ebp)
into
movswl %ax, %eax
xorl %edi, %eax
movl %eax, -36(%ebp)
by unfolding the load / store xorl into an xorl and a store when we know the
value in the spill slot is available in a register. This doesn't change the
number of instructions but reduce the number of times memory is accessed.
Also unfold some load folding instructions and reuse the value when similar
situation presents itself.
llvm-svn: 42947
from user input strings.
Such conversions are more intricate and subtle than they may appear;
it is unlikely I have got it completely right first time. I would
appreciate being informed of any bugs and incorrect roundings you
might discover.
llvm-svn: 42912
(almost) a register copy. However, it always coalesced to the register of the
RHS (the super-register). All uses of the result of a EXTRACT_SUBREG are sub-
register uses which adds subtle complications to load folding, spiller rewrite,
etc.
llvm-svn: 42899
Fix DecomposeSimpleLinearExpr to handle simple constants better.
Don't nuke gep(bitcast(allocation)) if the bitcast(allocation) will
fold the allocation. This fixes PR1728 and Instcombine/malloc3.ll
llvm-svn: 42891
Factor out the code that expands the "nasty scalar code" for unrolling
vectors into a separate routine, teach it how to handle mixed
vector/scalar operands, as seen in powi, and use it for several operators,
including sin, cos, powi, and pow.
Add support in SplitVectorOp for fpow, fpowi and for several unary
operators.
llvm-svn: 42884
enabled by passing -tailcallopt to llc. The optimization is
performed if the following conditions are satisfied:
* caller/callee are fastcc
* elf/pic is disabled OR
elf/pic enabled + callee is in module + callee has
visibility protected or hidden
llvm-svn: 42870
No compile-time support for constant operations yet,
just format transformations. Make readers and
writers work. Split constants into 2 doubles in
Legalize.
llvm-svn: 42865
- Added a function to hold the stack location
where GP must be stored during LowerCALL
- AsmPrinter now emits directives based on
relocation type
- PIC_ set to default relocation type (same as GCC)
llvm-svn: 42779
use ISD::{S,U}DIVREM and ISD::{S,U}MUL_HIO. Move the lowering code
associated with these operators into target-independent in LegalizeDAG.cpp
and TargetLowering.cpp.
llvm-svn: 42762
Check if one of the two results unneeded so see if a simpler operator
could bs used. Also check to see if each of the two computations could be
simplified if they were split into separate operators. Factor out the code
that calls visit() so that it can be used for this purpose.
llvm-svn: 42759
arbitrary range of bits embedded in the middle of another bignum.
This kind of operation is desirable in many cases of software
floating point, e.g. converting bignum integers to floating point
numbers of fixed precision (you want to extract the precision most
significant bits).
Elsewhere, add an assertion, and exit the shift functions early if
the shift count is zero.
llvm-svn: 42745