Avoid a fall-through in the (stubby) treatment of the longjmp intrinsic
call which causes llc & lli to core-dump.
Add a sort-of treatment of cast double to ulong. I am not really sure
what a user should expect to see upon casting a negative FP value to
unsigned long long. But with what is given here, I was able to write
a program that could cast -123.456 to ulong and back and get -123.0,
which seems like a step in the right direction. GCC seems to give you
0. I don't know if I'd consider that useful.
These cases were coming up in GNU coreutils-5.0.
llvm-svn: 7205
to clone the subgraph reachable from a set of root nodes, into the
current graph, merging the global nodes into those in the current graph.
(2) Added DSGraph::updateFromGlobalGraph() to rematerialize nodes from the
globals graph into the current graph in both BU and TD passes.
(3) Added hash_set<const GlobalValue*> InlinedGlobals: a set of globals to
track which globals have been inlined into the current graph from
callers or callees. In the TD pass, such globals are up-to-date and
do not need to be rematerialized from the GlobalsGraph.
(4) Added StripIncompleteBit/KeepIncompleteBit to remove incomplete bit
when cloning nodes into the globals graph.
llvm-svn: 7190
after all callees are inlined into the current graph.
NOTE: There's also a major bug fix for the BU pass in DataStructure.cpp,
which ensures that resolvable indirect calls are not moved out to the
globals graph, so that they are eventually inlined (if possible).
llvm-svn: 7189
after all callers are inlined into the current graph.
(2) Optimize the way a graph is inlined into its callees in the TD phase:
(a) Use DSGraph::cloneReachableSubgraph to clone only a subgraph at
each call site, for faster inlining.
(b) Clone separately for the same callee at different call sites,
since only the reachable subgraph is being cloned, not the entire
caller graph.
llvm-svn: 7188
and (2) faster inlining by cloning only reachable nodes. In particular:
(1) Added DSGraph::cloneReachableSubgraph and DSGraph::cloneReachableNodes
to clone the subgraph reachable from a set of root nodes, into the
current graph, merging the global nodes into thos in the current graph.
The TD pass now uses this for faster inlining, and so does the
next function.
(2) Added DSGraph::updateFromGlobalGraph() to rematerialize nodes from the
globals graph into the current graph in both BU and TD passes.
(3) `I' flags are removed from all nodes in the globals graph, because they
are difficult to maintain correctly and are not needed anyway.
(4) Aux. function calls are only removed to the globals graph if they
will never be resovled. (This is what fixed gap.) The immediate
reason is that if we took these out of a function (and moved them to
the globals graph) we would need to rematerialize these nodes into the
function graph for every function in the BU pass. The longer term
problem is that we would need to find a way to remove them from the
globals graph iff they have been resolved on all paths through the
call graph.
llvm-svn: 7187
now works in instructions which require a 2-bit or 3-bit INTcc code.
Incidentally, that means that the representation of INTcc registers is now the
same in both integer and FP instructions. Thus, code became much simpler and
cleaner.
llvm-svn: 7185
Also, placed DEBUG() guards around debug information so that the generated file
is much smaller and hence should be faster to preprocess/compile.
llvm-svn: 7180
masking and shifting operands directly into their place in the instruction,
instead of the old-fashioned way of ORing in each bit separately.
llvm-svn: 7179
allow, i.e. make a sequence of instructions to enable an indirect call using
jump-and-link and 2 temporary registers (which we save and ultimately restore).
Warning: if the delay slot of a function call is used to do meaningful work and
not just a NOP, this behavior is incorrect. However, the Sparc backend does not
yet utilize the delay slots effectively, so it is not necessary to make an
overly complicated algorithm for something that's not used.
llvm-svn: 7178
* FP double registers are now coded correctly
* Removed function which converted registers based on register types, it was
broken (because regTypes are broken)
llvm-svn: 7175
Specifically, this updates libtool to version 1.5 and adds the following:
- Added the -only-static option that we added in our previous libtool.
- Modified the autoconf macros so that libtool uses the -G option when
linking on Solaris. This allows libraries with global variables with
constructors to automatically run those constructors when the
library is dlopened().
llvm-svn: 7171
remembered in valuesStoredInFunction, but never traced at function return,
and that's too late to be finding the error anyway).
Stores trace both the value and the address being stored to,
but after some experience I think only values should be traced.
The pointer hash table just fills up far too quickly if every
store address were traced.
llvm-svn: 7169
out the entire llvm disassembly for the function at global constant-output
time, which caused the assembler to barf in 164.gzip. This fixes that
particular problem (though 164.gzip has other problems with X86 llc.)
llvm-svn: 7168
Fhourstones, McCat-vor, and many others...)
Printer.cpp: Print implicit uses for AddRegFrm instructions. Break gas
bug workarounds up into separate stanzas of code for each bug. Add new
workarounds for fild and fistp.
X86InstrInfo.def: Add O_ST0 implicit uses for more FP instrs where they
obviously apply. Also add PrintImplUses flags for FP instrs where they
are necessary for gas to understand the output.
llvm-svn: 7165
(1) Cannot use ANDN(ot), ORN, and XORN for boolean ops, only bitwise ops.
(2) Conditional move instructions must distinguish signed and unsigned
condition codes, e.g., MOVLE vs. MOVLEU.
(3) Conditional-move-on-register was using the cond-move-on-cc opcodes,
which produces a valid-looking instruction with bogus registers!
(4) Here's a really cute one: dividing-by-2^k for negative numbers needs to
add 2^k-1 before shifting, not add 1 after shifting. Sadly, these
are the same when k=0 so our poor test case worked fine.
(5) Casting between signed and unsigned values was not correct:
completely reimplemented.
(6) Zero-extension on unsigned values was bogus: I was only doing the
SRL and not the SLLX before it. Don't know WHAT I was thinking!
(7) And the most important class of changes: Sign-extensions on signed values.
Signed values are not sign-extended after ordinary operations,
so they must be sign-extended before the following cases:
-- passing to an external or unknown function
-- returning from a function
-- using as operand 2 of DIV or REM
-- using as either operand of condition-code setting operation
(currently only SUBCC), with smaller than 32-bit operands
Also, a couple of improvements:
(1) Fold cast-to-bool into Not(bool). Need to do this for And, Or, XOR also.
(2) Convert SetCC-Const into a conditional-move-on-register (case 41)
if the constant is 0. This was only being done for branch-on-SetCC-Const
when the branch is folded with the SetCC-Const.
llvm-svn: 7159
(1) An int CC live range must be spilled if there are any interferences,
even if no other "neighbour" in the interf. graph has been allocated
that reg. yet. This is actually true of any class with only one reg!
(2) SparcIntCCRegClass::colorIGNode sets the color even if the LR must
be spilled so that the machine-independent spill code doesn't have to
make the machine-dependent decision of which CC name to use based on
operand type: %xcc or %icc. (These are two halves of the same
register.)
(3) LR->isMarkedForSpill() is no longer the same as LR->hasColor().
These should never have been the same, and this is necessary now for #2.
(4) All RDCCR and WRCCR instructions are directly generated with the
phony number for %ccr so that EmitAssembly/EmitBinary doesn't have to
deal with this.
llvm-svn: 7152
(1) An int CC live range must be spilled if there are any interferences,
even if no other "neighbour" in the interf. graph has been allocated
that reg. yet. This is actually true of any class with only one reg!
(2) SparcIntCCRegClass::colorIGNode sets the color even if the LR must
be spilled so that the machine-independent spill code doesn't have to
make the machine-dependent decision of which CC name to use based on
operand type: %xcc or %icc. (These are two halves of the same register.)
(3) LR->isMarkedForSpill() is no longer the same as LR->hasColor().
These should never have been the same, and this is necessary now for #2.
(4) All RDCCR and WRCCR instructions are directly generated with the
phony number for %ccr so that EmitAssembly/EmitBinary doesn't have to
deal with this.
llvm-svn: 7151