of all sizes from i1 to i256. The code is not
always that great, for example (x86)
movw %di, %ax
movw %ax, i17_s
where the store could be directly from %di.
llvm-svn: 53677
simply does the atomic.cmp.swap on the larger type,
which means it blows away whatever is sitting in
the bytes just after the memory location, i.e.
causes a buffer overflow. This really requires
target specific code, which is why LegalizeTypes
doesn't try to handle this case generically. The
existing (wrong) code in LegalizeDAG will go away
automatically once the type legalization code is
removed from LegalizeDAG so I'm leaving it there
for the moment. Meanwhile, don't test for this
feature.
llvm-svn: 53669
return values that are still (partially) live. Instead of updating all uses of
a call instruction after removing some elements, it now just rebuilds the
original struct (With undef gaps where the unused values were) and leaves it to
instcombine to clean this up.
The added testcase still fails currently, but this is due to instcombine which
isn't good enough yet. I will fix that part next.
llvm-svn: 53608
In LegalizeDAG the value is zero-extended to
the new type before byte swapping. It doesn't
matter how the extension is done since the new
bits are shifted off anyway after the swap, so
extend by any old rubbish bits. This results
in the final assembler for the testcase being
one line shorter.
llvm-svn: 53604
the second half of link-global-to-func.ll and causes some minor changes in
messages.
There are two TODOs here. First, this causes a regression in
2008-07-06-AliasWeakDest.ll, which is now failing (so I xfailed it). Anton,
I would really appreciate it if you could take a look at this. It should be
a matter of adding proper alias support to GetLinkageResult, and was probably
already a latent bug that would manifest with globals.
The second todo is to reimplement LinkAlias in the same pattern as
function and global linking. This should be pretty straight-forward for
someone who knows aliases, but isn't a requirement for correctness.
llvm-svn: 53548
(replacing a function with a global). This is needed when building
llvm itself with LTO on darwin, because of the EXPLICIT_SYMBOL hack
in lib/system/DynamicLibrary.cpp.
Implementation of linking the other way will need to wait for a
cleanup of LinkFunctionProtos.
llvm-svn: 53546
8 %reg1024<def> = IMPLICIT_DEF
12 %reg1024<def> = INSERT_SUBREG %reg1024<kill>, %reg1025, 2
The live range [12, 14) are not part of the r1024 live interval since it's defined by an implicit def. It will not conflicts with live interval of r1025. Now suppose both registers are spilled, you can easily see a situation where both registers are reloaded before the INSERT_SUBREG and both target registers that would overlap.
llvm-svn: 53503
was using the algorithm for folding unsigned comparisons which is
completely wrong. This has been broken since the signless types change.
llvm-svn: 53444
This cause a regression in InstCombine/JavaCompare, which was doing the right
thing on accident. To handle the missed case, generalize the comparisons based
on masked bits a little bit to handle comparisons against the max value. For
example, we can now xform (slt i32 (and X, 4), 4) -> (setne i32 (and X, 4), 4)
llvm-svn: 53443
Rewrite the DeadArgumentElimination pass, to use a more explicit tracking of
dependencies between return values and/or arguments. Also make the handling of
arguments and return values the same.
The pass now looks properly inside returned structs, but only at the first
level (ie, not inside nested structs).
This version fixed a few more bugs and was cleaned up a bit. It now passes all
of LLVM's testing, and should still pass SPEC2006. There is still a minor bug
with regard to returning nested structs. Since there is currently nothing that
emits such IR, I will fix that in a seperate commit (partly because it requires
a non-trivial fix).
llvm-svn: 53400
1) evaluate [v]fcmp true/false with undefs to true or false instead
of undef.
2) fix vector comparisons with undef to return a vector result instead
of i1
3) fix vector comparisons with evaluatable results to return vector
true/false instead of i1 true/false (PR2529)
llvm-svn: 53220
getTargetNode and SelectNodeTo to reduce duplication, and to
make some of the getTargetNode code available to SelectNodeTo.
Use SelectNodeTo instead of getTargetNode in several new
interesting cases, as it mutates nodes in place instead of
creating new ones.
This triggers some scheduling behavior differences due to nodes
being presented to the scheduler in a different order. Some of the
arbitrary scheduling decisions it makes are now arbitrarily made
differently. This is visible in CodeGen/PowerPC/LargeAbsoluteAddr.ll,
where a trivial scheduling difference led to a trivial register
allocation difference.
llvm-svn: 53203
1. LSR runOnLoop is always returning false regardless if any transformation is made.
2. AddUsersIfInteresting can create new instructions that are added to DeadInsts. But there is a later early exit which prevents them from being freed.
llvm-svn: 53193
Added abstract class MemSDNode for any Node that have an associated MemOperand
Changed atomic.lcs => atomic.cmp.swap, atomic.las => atomic.load.add, and
atomic.lss => atomic.load.sub
llvm-svn: 52706
test (doesn't work for any MMX vector types, it's
not me). Rewritten to use v2i16 which is generic
and going to stay that way; I think that preserves
the point of the test.
llvm-svn: 52692
,------.
| |
| v
| t2 = phi ... t1 ...
| |
| v
| t1 = ...
| ... = ... t1 ...
| |
`------'
where there is a use in a PHI node that's a predecessor to the defining
block. We don't want to mark all predecessors as having the value "alive" in
this case. Also, the assert was too restrictive and didn't handle this case.
llvm-svn: 52655
in the presence of out-of-loop users of in-loop values and the trip
count is not a known multiple of the unroll count, and to be a bit
simpler overall. This fixes PR2253.
llvm-svn: 52645
structures. Its default threshold is to promote things that are
smaller than 128 bytes, which is sane. However, it is not sane
to do this for things that turn into 128 *registers*. Add a cap
on the number of registers introduced, defaulting to 128/4=32.
llvm-svn: 52611
This is a fixed version that no longer uses multimap::equal_range, which
resulted in a pointer invalidation problem.
Also, DAE::InspectedFunctions was not really necessary, so it got removed.
Lastly, this version no longer applies the extra arg hack on functions who did
not have any arguments to start with.
llvm-svn: 52532
shuffle could be skipped. The check is invalid because the loop index i
doesn't correspond to the element actually inserted. The correct check is
already done a few lines earlier, for whether the element is already in
the right spot, so this shouldn't have any effect on the codegen for
code that was already correct.
llvm-svn: 52486