The immediate offset of the non-writeback i8 form (encoding T4) allows
negative offsets only. The positive offset form of the encoding is the
LDRT instruction. Immediate offsets in the range [0,255] use encoding T3
instead.
llvm-svn: 139254
In some cases such as interpreters using indirectbr, the CFG can be very
complicated, and live range splitting may be forced to insert a large
number of phi-defs. When that happens, traceSiblingValue can spend a
lot of time zipping around in the CFG looking for defs and reloads.
This patch causes more information to be cached in SibValues, and the
cached values are used to terminate searches early. This speeds up
spilling by 20x in one interpreter test case. For more typical code,
this is just a 10% speedup of spilling.
llvm-svn: 139247
There is no 16-bit wide encoding, so the .w suffix isn't needed (indeed, isn't
documented as allowed). Also add the missing '!' token on the _UPD
variant.
llvm-svn: 139243
Choose 32-bit vs. 16-bit encoding when there's no .w suffix in post-processing
as match classes are insufficient to handle the context-sensitiveness of
the writeback operand's legality for the 16-bit encodings.
llvm-svn: 139242
duplicate tests are eliminated (for example if the two functions both have
a catch clause catching the same type, ensure the redundant one is removed).
Note that it would probably be safe to say that eh.typeid.for is 'const',
but since two calls to it with the same argument can give different results
(but only if the calls are in different functions), it seems more correct to
mark it only 'pure'; this doesn't get in the way of the optimization.
llvm-svn: 139236
(The fix for the related failures on x86 is going to be nastier because we actually need Acquire memoperands attached to the atomic load instrs, etc.)
llvm-svn: 139221
with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons. Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all"). Patch mostly by
Nadav Rotem.
llvm-svn: 139159
Now the 'S' instructions, e.g. ADDS, treat S bit as optional operand as well.
Also fix isel hook to correctly set the optional operand.
rdar://10073745
llvm-svn: 139157
Even if there's no mode switch performed, the .code directive should still
be sent to the output streamer. Otherwise, for example, an output asm stream
is not equivalent to the input stream which generated it (a dependency on
the input target triple arm vs. thumb is introduced which was not originally
there).
llvm-svn: 139155
init.trampoline and adjust.trampoline intrinsics, into two intrinsics
like in GCC. While having one combined intrinsic is tempting, it is
not natural because typically the trampoline initialization needs to
be done in one function, and the result of adjust trampoline is needed
in a different (nested) function. To get around this llvm-gcc hacks the
nested function lowering code to insert an additional parent variable
holding the adjust.trampoline result that can be accessed from the child
function. Dragonegg doesn't have the luxury of tweaking GCC code, so it
stored the result of adjust.trampoline in the memory GCC set aside for
the trampoline itself (this is always available in the child function),
and set up some new memory (using an alloca) to hold the trampoline.
Unfortunately this breaks Go which allocates trampoline memory on the
heap and wants to use it even after the parent has exited (!). Rather
than doing even more hacks to get Go working, it seemed best to just use
two intrinsics like in GCC. Patch mostly by Sanjoy Das.
llvm-svn: 139140