because saving i1 and i2 to their ``designated'' stack slots corrupts unknown
memory in other functions, standard libraries, and worse.
In addition, this has the benefit of improving JIT performance because we
eliminate writing out 4 instructions in CompilationCallback() and 2 loads and 2
stores.
llvm-svn: 7653
2. Handle fp-to-uint conversions directly here instead of relying on
a pre-transformation to replace them with the 2-step conversion.
3. Use size rather than explicitly checking types when deciding what
opcodes to use, wherever possible. This is less error prone (the
bug fix above was not the first time!).
4. Float-to-pointer casts should now work, though this hasn't been tested.
llvm-svn: 7645
* Doxygen-ified comments
* Added capability to make far calls (i.e., beyond 30 bits in CALL instr)
which implies that we need to delete function references that were added by
the call to addFunctionReference() because the actual call instruction is 10
instructions away (thanks to 64-bit address construction)
* Cleaned up code that generates far jumps by using an array+loop
SparcV9CodeEmitter.h:
* Explained more of the side-effects of emitFarCall()
llvm-svn: 7639
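A minimal sketch of the near/far decision behind the far-call change above. It assumes the CALL displacement is a signed 30-bit count of 4-byte instruction words relative to the address of the call itself. The function name fitsInCallDisp30 is hypothetical and is not taken from SparcV9CodeEmitter.

```cpp
#include <cstdint>

// Hypothetical helper: returns true if a CALL at address `pc` can reach
// `target` directly, i.e. the word displacement fits in a signed 30-bit field.
// Assumes both addresses are 4-byte aligned and two's-complement arithmetic.
bool fitsInCallDisp30(uint64_t pc, uint64_t target) {
  int64_t byteDisp = static_cast<int64_t>(target - pc);
  int64_t wordDisp = byteDisp / 4;               // instructions are 4 bytes wide
  const int64_t limit = int64_t(1) << 29;        // signed 30-bit range: [-2^29, 2^29)
  return wordDisp >= -limit && wordDisp < limit;
}
```

When this check fails, the target address has to be built in a register and reached through an indirect jump, which is why the far-call sequence is several instructions long.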
so get rid of the def/use parameters that were getting passed in.
**** This now changes the semantics of these methods to preserve the flags,
not clobber them!
llvm-svn: 7602
* Use .zero to emit padding between struct elements
* Emit .comm symbols when we can; this dramatically reduces the amount of gunk we have to print
* Print global variable identifiers next to initializer more nicely.
llvm-svn: 7551
* Fix bug in the createNOP method, which was not marking the operands of the
generated XCHG as useanddef. I don't think this method is actually used,
so it wasn't breaking anything, but it should be fixed anyway...
llvm-svn: 7539
Note that some generated operators (like &, | or ^) may
not be supported by the assembler -- but if they've got
this far, it's better to generate them and let the assembler decide.
llvm-svn: 7476
since it is *necessary* for correct code generation. Only optional
transformations belong in the PreOpts pass (which needs to be renamed
from PreSelection to PreOpts).
llvm-svn: 7474
that depends on machine register size.
Moved insertCallerSavingCode() to PhyRegAlloc and
moved isRegVolatile and modifiedByCall to TargetRegInfo: they are all
machine independent. Remove several dead functions.
llvm-svn: 7392
spilling values used by an instruction in the delay slot of the branch
(which will eventually be moved before the branch).
2. Bug fix: Delete the delay slot instr, not the branch instr, when
moving delay slot instr. out!!!!
3. Code to insert caller-saves moved here from SparcRegInfo:
it is now machine-independent.
llvm-svn: 7389
machine-independent.
Fix problem with using branch operand reg. as temp. reg. when
spilling values used by an instruction in the delay slot of the branch
(which will eventually be moved before the branch).
llvm-svn: 7385
now handle far calls (i.e., beyond the 30-bit limit in call instructions).
* As a side-effect, this allows us to unify and clean up the mmap() call and
code around it.
llvm-svn: 7381
This is used by bugpoint -- when code is compiled to a shared object to be
JITted, it must use the JIT's lazy resolution method to find function addresses,
because some functions will not be available at .so load time, as they are in
the bytecode file.
llvm-svn: 7363
Single and FP double reg types (which share the same reg class).
Now all methods marking/finding unused regs consider the regType
within the reg class, and SparcFloatRegClass specializes this code.
(2) Remove machine-specific regalloc. methods that are no longer needed.
In particular, arguments and return value from a call do not need
machine-specific code for allocation.
(3) Rename TargetRegInfo::getRegType variants to avoid unintentional
overloading when an include file is omitted.
llvm-svn: 7334
info (since multiple reg types may share the same reg class).
(2) Remove machine-specific regalloc. methods that are no longer needed.
In particular, arguments and return value from a call do not need
machine-specific code for allocation.
(3) Rename TargetRegInfo::getRegType variants to avoid unintentional
overloading when an include file is omitted.
llvm-svn: 7329
Mangler.cpp: Constify parameter to makeNameProper, and use const_iterator.
Make Count an unsigned int, and use utostr().
Don't name parameters things that start with underscore.
Mangler.h: All of the above, and also: Add Emacs mode-line. Include <set>.
llvm-svn: 7301
of codes. For example,
short kernel (short t1) {
t1 >>= 8; t1 <<= 8;
return t1;
}
became:
short %kernel(short %t1.1) {
%tmp.3 = shr short %t1.1, ubyte 8 ; <short> [#uses=1]
%tmp.5 = cast short %tmp.3 to int ; <int> [#uses=1]
%tmp.7 = shl int %tmp.5, ubyte 8 ; <int> [#uses=1]
%tmp.8 = cast int %tmp.7 to short ; <short> [#uses=1]
ret short %tmp.8
}
before, now it becomes:
short %kernel(short %t1.1) {
%tmp.3 = shr short %t1.1, ubyte 8 ; <short> [#uses=1]
%tmp.8 = shl short %tmp.3, ubyte 8 ; <short> [#uses=1]
ret short %tmp.8
}
which will become:
short %kernel(short %t1.1) {
%tmp.3 = and short %t1.1, 0xFF00
ret short %tmp.3
}
This implements cast-set.ll:test4 and test5
llvm-svn: 7290
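The last step above (shr 8 followed by shl 8 on a short becoming an and with 0xFF00) can be checked exhaustively. This is a small sketch, not part of the commit; the function names are made up for illustration, and it assumes that right-shifting a negative signed value is an arithmetic shift, as it is on common compilers.

```cpp
#include <cstdint>

// Shift right by 8 (arithmetic, like shr on a signed short), then shift left
// by 8 and truncate back to 16 bits.
uint16_t shiftPair(int16_t t) {
  int32_t hi = t >> 8;  // assumed arithmetic shift for negative t
  return static_cast<uint16_t>(static_cast<uint32_t>(hi) << 8);
}

// The single-instruction form: mask off the low byte.
uint16_t masked(int16_t t) {
  return static_cast<uint16_t>(static_cast<uint16_t>(t) & 0xFF00);
}

// Compares both forms for every 16-bit input.
bool equivalentForAllShorts() {
  for (int v = -32768; v <= 32767; ++v) {
    int16_t t = static_cast<int16_t>(v);
    if (shiftPair(t) != masked(t)) return false;
  }
  return true;
}
```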
doFinalization too except that would have made them shadow, not override,
the parent class :-P.
Allow *any* constant cast expression between pointers and longs,
or vice-versa, or any widening (not just same-size) conversion that
isLosslesslyConvertibleTo approves. This fixes oopack.
llvm-svn: 7288
Printer::doFinalization() out in the cold. Now we pass in a TargetMachine
to Printer's constructor and get the TargetData from the TargetMachine.
Don't pass TargetMachine or MRegisterInfo objects around in the Printer.
Constify TargetData references.
X86.h: Update comment and prototype of createX86CodePrinterPass().
X86TargetMachine.cpp: Update callers of createX86CodePrinterPass().
llvm-svn: 7275
Stop passing ostreams around: we already have one perfectly good ostream
and we can all share it.
Stop stashing a pointer to TargetData in the Pass object, because that will
lead to a crash if there are no functions in the module (ouch!) Instead,
use addRequired() and getAnalysis(), like we always should have done.
Move the check for ConstantExpr up before the check for isPrimitiveType,
because we need to be able to catch e.g. ubyte (cast bool false to ubyte),
whose type is primitive but which is nevertheless a ConstantExpr, by calling
our specialized handler instead of the AsmWriter. Letting the AsmWriter handle
it resulted in assembler errors when we tried to output something like
".byte (cast bool false to ubyte)".
GC some unused variable declarations.
llvm-svn: 7265
IC: (X ^ C1) | C2 --> (X | C2) ^ (C1&~C2)
We are now guaranteed that all 'or's will be inside of 'and's, and all 'and's
will be inside of 'xor's, if the second operands are constants.
llvm-svn: 7264
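The rewrite (X ^ C1) | C2 --> (X | C2) ^ (C1&~C2) holds bit by bit: where C2 has a 1 both sides are 1, and where C2 has a 0 both sides are X ^ C1. A small sketch that checks this exhaustively on 8-bit values; the function names are illustrative only and are not from the instruction combiner.

```cpp
#include <cstdint>

// Original form: xor with C1, then or with C2.
uint8_t orOfXor(uint8_t x, uint8_t c1, uint8_t c2) {
  return static_cast<uint8_t>((x ^ c1) | c2);
}

// Canonicalized form: the 'or' is moved inside the 'xor'.
uint8_t xorOfOr(uint8_t x, uint8_t c1, uint8_t c2) {
  return static_cast<uint8_t>((x | c2) ^ (c1 & ~c2));
}

// Checks every combination of three 8-bit operands.
bool identityHoldsForAll8BitValues() {
  for (int x = 0; x < 256; ++x)
    for (int c1 = 0; c1 < 256; ++c1)
      for (int c2 = 0; c2 < 256; ++c2)
        if (orOfXor(x, c1, c2) != xorOfOr(x, c1, c2)) return false;
  return true;
}
```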
Avoid a fall-through in the (stubby) treatment of the longjmp intrinsic
call which causes llc & lli to core-dump.
Add a sort-of treatment of cast double to ulong. I am not really sure
what a user should expect to see upon casting a negative FP value to
unsigned long long. But with what is given here, I was able to write
a program that could cast -123.456 to ulong and back and get -123.0,
which seems like a step in the right direction. GCC seems to give you
0. I don't know if I'd consider that useful.
These cases were coming up in GNU coreutils-5.0.
llvm-svn: 7205
after all callees are inlined into the current graph.
NOTE: There's also a major bug fix for the BU pass in DataStructure.cpp,
which ensures that resolvable indirect calls are not moved out to the
globals graph, so that they are eventually inlined (if possible).
llvm-svn: 7189
after all callers are inlined into the current graph.
(2) Optimize the way a graph is inlined into its callees in the TD phase:
(a) Use DSGraph::cloneReachableSubgraph to clone only a subgraph at
each call site, for faster inlining.
(b) Clone separately for the same callee at different call sites,
since only the reachable subgraph is being cloned, not the entire
caller graph.
llvm-svn: 7188
and (2) faster inlining by cloning only reachable nodes. In particular:
(1) Added DSGraph::cloneReachableSubgraph and DSGraph::cloneReachableNodes
to clone the subgraph reachable from a set of root nodes, into the
current graph, merging the global nodes into those in the current graph.
The TD pass now uses this for faster inlining, and so does the
next function.
(2) Added DSGraph::updateFromGlobalGraph() to rematerialize nodes from the
globals graph into the current graph in both BU and TD passes.
(3) `I' flags are removed from all nodes in the globals graph, because they
are difficult to maintain correctly and are not needed anyway.
(4) Aux. function calls are only removed to the globals graph if they
will never be resolved. (This is what fixed gap.) The immediate
reason is that if we took these out of a function (and moved them to
the globals graph) we would need to rematerialize these nodes into the
function graph for every function in the BU pass. The longer term
problem is that we would need to find a way to remove them from the
globals graph iff they have been resolved on all paths through the
call graph.
llvm-svn: 7187
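The idea in item (1), cloning only what is reachable from a set of roots, can be sketched on a generic graph. Everything here is a hypothetical stand-in: Node, CloneResult and cloneReachable are not the DSGraph API, and the merging of global nodes described above is not modelled.

```cpp
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical graph node; not the real DSNode.
struct Node {
  int id = 0;
  std::vector<Node*> succs;
};

// Owns the clones and records which original each clone came from.
struct CloneResult {
  std::vector<std::unique_ptr<Node>> owned;
  std::unordered_map<const Node*, Node*> oldToNew;
};

// Clones every node reachable from `roots` and nothing else.
CloneResult cloneReachable(const std::vector<const Node*>& roots) {
  CloneResult result;
  std::vector<const Node*> worklist(roots.begin(), roots.end());
  // Pass 1: create one clone per reachable node.
  while (!worklist.empty()) {
    const Node* n = worklist.back();
    worklist.pop_back();
    if (result.oldToNew.count(n)) continue;  // already cloned
    auto copy = std::make_unique<Node>();
    copy->id = n->id;
    result.oldToNew[n] = copy.get();
    result.owned.push_back(std::move(copy));
    for (const Node* s : n->succs) worklist.push_back(s);
  }
  // Pass 2: rewire the clones' edges to point at clones.
  for (auto& entry : result.oldToNew)
    for (const Node* s : entry.first->succs)
      entry.second->succs.push_back(result.oldToNew.at(s));
  return result;
}
```

Nodes that are not reachable from the roots are never visited, which is where the saving over cloning the entire graph comes from.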
now works in instructions which require a 2-bit or 3-bit INTcc code.
Incidentally, that means that the representation of INTcc registers is now the
same in both integer and FP instructions. Thus, code became much simpler and
cleaner.
llvm-svn: 7185
allow, i.e. make a sequence of instructions to enable an indirect call using
jump-and-link and 2 temporary registers (which we save and ultimately restore).
Warning: if the delay slot of a function call is used to do meaningful work and
not just a NOP, this behavior is incorrect. However, the Sparc backend does not
yet utilize the delay slots effectively, so it is not necessary to make an
overly complicated algorithm for something that's not used.
llvm-svn: 7178
* FP double registers are now coded correctly
* Removed function which converted registers based on register types, it was
broken (because regTypes are broken)
llvm-svn: 7175
remembered in valuesStoredInFunction, but never traced at function return,
and that's too late to be finding the error anyway).
Stores trace both the value and the address being stored to,
but after some experience I think only values should be traced.
The pointer hash table just fills up far too quickly if every
store address were traced.
llvm-svn: 7169
out the entire llvm disassembly for the function at global constant-output
time, which caused the assembler to barf in 164.gzip. This fixes that
particular problem (though 164.gzip has other problems with X86 llc.)
llvm-svn: 7168
Fhourstones, McCat-vor, and many others...)
Printer.cpp: Print implicit uses for AddRegFrm instructions. Break gas
bug workarounds up into separate stanzas of code for each bug. Add new
workarounds for fild and fistp.
X86InstrInfo.def: Add O_ST0 implicit uses for more FP instrs where they
obviously apply. Also add PrintImplUses flags for FP instrs where they
are necessary for gas to understand the output.
llvm-svn: 7165
(1) Cannot use ANDN(ot), ORN, and XORN for boolean ops, only bitwise ops.
(2) Conditional move instructions must distinguish signed and unsigned
condition codes, e.g., MOVLE vs. MOVLEU.
(3) Conditional-move-on-register was using the cond-move-on-cc opcodes,
which produces a valid-looking instruction with bogus registers!
(4) Here's a really cute one: dividing-by-2^k for negative numbers needs to
add 2^k-1 before shifting, not add 1 after shifting. Sadly, these
are the same when k=0 so our poor test case worked fine.
(5) Casting between signed and unsigned values was not correct:
completely reimplemented.
(6) Zero-extension on unsigned values was bogus: I was only doing the
SRL and not the SLLX before it. Don't know WHAT I was thinking!
(7) And the most important class of changes: Sign-extensions on signed values.
Signed values are not sign-extended after ordinary operations,
so they must be sign-extended before the following cases:
-- passing to an external or unknown function
-- returning from a function
-- using as operand 2 of DIV or REM
-- using as either operand of condition-code setting operation
(currently only SUBCC), with smaller than 32-bit operands
Also, a couple of improvements:
(1) Fold cast-to-bool into Not(bool). Need to do this for And, Or, XOR also.
(2) Convert SetCC-Const into a conditional-move-on-register (case 41)
if the constant is 0. This was only being done for branch-on-SetCC-Const
when the branch is folded with the SetCC-Const.
llvm-svn: 7159
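Fix (4) above, dividing by 2^k with a shift, can be sketched as follows. The names divPow2 and divPow2AddAfter are hypothetical, and the code assumes that right-shifting a negative signed value is an arithmetic shift. The goal is to match C's '/' operator, which rounds toward zero.

```cpp
#include <cstdint>

// Correct form: for negative x, add 2^k - 1 before shifting so the result
// rounds toward zero instead of toward negative infinity.
int64_t divPow2(int64_t x, unsigned k) {
  int64_t bias = (x < 0) ? ((int64_t(1) << k) - 1) : 0;
  return (x + bias) >> k;  // assumed arithmetic shift
}

// The form described above as wrong: shift first, then add 1 for negative x.
int64_t divPow2AddAfter(int64_t x, unsigned k) {
  int64_t q = x >> k;
  return (x < 0) ? q + 1 : q;
}
```

For example, with x = -4 and k = 1 the biased form yields -2, while the add-after form yields -1.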
(1) An int CC live range must be spilled if there are any interferences,
even if no other "neighbour" in the interf. graph has been allocated
that reg. yet. This is actually true of any class with only one reg!
(2) SparcIntCCRegClass::colorIGNode sets the color even if the LR must
be spilled so that the machine-independent spill code doesn't have to
make the machine-dependent decision of which CC name to use based on
operand type: %xcc or %icc. (These are two halves of the same
register.)
(3) LR->isMarkedForSpill() is no longer the same as LR->hasColor().
These should never have been the same, and this is necessary now for #2.
(4) All RDCCR and WRCCR instructions are directly generated with the
phony number for %ccr so that EmitAssembly/EmitBinary doesn't have to
deal with this.
llvm-svn: 7152
(1) An int CC live range must be spilled if there are any interferences,
even if no other "neighbour" in the interf. graph has been allocated
that reg. yet. This is actually true of any class with only one reg!
(2) SparcIntCCRegClass::colorIGNode sets the color even if the LR must
be spilled so that the machine-independent spill code doesn't have to
make the machine-dependent decision of which CC name to use based on
operand type: %xcc or %icc. (These are two halves of the same register.)
(3) LR->isMarkedForSpill() is no longer the same as LR->hasColor().
These should never have been the same, and this is necessary now for #2.
(4) All RDCCR and WRCCR instructions are directly generated with the
phony number for %ccr so that EmitAssembly/EmitBinary doesn't have to
deal with this.
llvm-svn: 7151
correct: empirically, "regType" is wrong for a number of registers. Thus, one
can only rely on the "regClass" to figure out what kind of register one is
dealing with.
This change switches to using only "regClass" and adds a few extra DEBUG() print
statements and a few clean-ups in comments and code, mostly minor.
llvm-svn: 7103