- This fixed a number of bugs in if-converter, tail merging, and post-allocation
scheduler. If-converter now runs branch folding / tail merging first to
maximize if-conversion opportunities.
- Also changed the t2IT instruction slightly. It now defines the ITSTATE
register which is read by instructions in the IT block.
- Added Thumb2 specific hazard recognizer to ensure the scheduler doesn't
change the instruction ordering in the IT block (since IT mask has been
finalized). It also ensures no other instructions can be scheduled between
instructions in the IT block.
This is not yet enabled.
llvm-svn: 106344
basic tests.
This has been well tested on Darwin but not elsewhere.
It should work provided the linker correctly resolves
B.W <label in other function>
which it has not seen before, at least from llvm-based
compilers. I'm leaving the arm-tail-calls switch in
until I see if there's any problems because of that;
it might need to be disabled for some environments.
llvm-svn: 106299
replacing the overly conservative checks that I had introduced recently to
deal with correctness issues. This makes a pretty noticable difference
in our testcases where reg_sequences are used. I've updated one test to
check that we no longer emit the unnecessary subreg moves.
llvm-svn: 105991
i64 and f64 types, but now it also handle Neon vector types, so the f64 result
of VMOVDRR may need to be converted to a Neon type. Radar 8084742.
llvm-svn: 105845
optimization level.
This only really affects llc for now because both the llvm-gcc and clang front
ends override the default register allocator. I intend to remove that code later.
llvm-svn: 104904
copying VFP subregs. This exposed a bunch of dead code in the *spill-q.ll
tests, so I tweaked those tests to keep that code from being optimized away.
Radar 7872877.
llvm-svn: 104415
so that it will continue to test what it was meant to test when I commit a
separate change for better support of BUILD_VECTOR and VECTOR_SHUFFLE for Neon.
Fix a DAG combiner crash exposed by this test change.
llvm-svn: 104380
definitions of the virtual register.
This happens when spilling the registers produced by REG_SEQUENCE:
%reg1047:5<def>, %reg1047:6<def>, %reg1047:7<def> = VLD3d8 %reg1033, 0, pred:14, pred:%reg0
The rewriter would spill the register multiple times, dead store elimination
tried to keep up, but ended up cutting the branch it was sitting on.
llvm-svn: 104321
operand on the left, the interesting operand is on the right. This
fixes a bug where LSR was failing to recognize ICmpZero uses,
which led it to be unable to reverse the induction variable in the
attached testcase.
Delete test/CodeGen/X86/stack-color-with-reg-2.ll, because its test
is extremely fragile and hard to meaningfully update.
llvm-svn: 104262
The trouble arises when the result of a vector cmp + sext is then and'ed with all ones. Instcombine will turn it into a vector cmp + zext, dag combiner will miss turning it into a vsetcc and hell breaks loose after that.
Teach dag combine to turn a vector cpm + zest into a vsetcc + and 1. This fixes rdar://7923010.
llvm-svn: 104094
While that approach works wonders for register pressure, it tends to break
everything.
This should unbreak the arm-linux builder and fix a number of miscompilations.
llvm-svn: 103946
beneficial cases. See the changes in test/CodeGen/X86/tail-opts.ll and
test/CodeGen/ARM/ifcvt2.ll for details.
The fix is to change HashEndOfMBB to hash at most one instruction,
instead of trying to apply heuristics about when it will be profitable to
consider more than one instruction. The regular tail-merging heuristics
are already prepared to handle the same cases, and they're more precise.
Also, make test/CodeGen/ARM/ifcvt5.ll and
test/CodeGen/Thumb2/thumb2-branch.ll slightly more complex so that they
continue to test what they're intended to test.
And, this eliminates the problem in
test/CodeGen/Thumb2/2009-10-15-ITBlockBranch.ll, the testcase from
PR5204. Update it accordingly.
llvm-svn: 102907
the intrinsics. The reason for those i8* types is that the intrinsics are
overloaded on the vector type and we don't have a way to declare an intrinsic
where one argument is an overloaded vector type and another argument is a
pointer to the vector element type. The bitcasts added here will match what
the frontend will typically generate when these intrinsics are used.
llvm-svn: 101840
instructions to help disassembly.
We also changed the output of the addressing modes to omit the '+' from the
assembler syntax #+/-<imm> or +/-<Rm>. See, for example, A8.6.57/58/60.
And modified test cases to not expect '+' in +reg or #+num. For example,
; CHECK: ldr.w r9, [r7, #28]
llvm-svn: 98745
U test/CodeGen/ARM/tls2.ll
U test/CodeGen/ARM/arm-negative-stride.ll
U test/CodeGen/ARM/2009-10-30.ll
U test/CodeGen/ARM/globals.ll
U test/CodeGen/ARM/str_pre-2.ll
U test/CodeGen/ARM/ldrd.ll
U test/CodeGen/ARM/2009-10-27-double-align.ll
U test/CodeGen/Thumb2/thumb2-strb.ll
U test/CodeGen/Thumb2/ldr-str-imm12.ll
U test/CodeGen/Thumb2/thumb2-strh.ll
U test/CodeGen/Thumb2/thumb2-ldr.ll
U test/CodeGen/Thumb2/thumb2-str_pre.ll
U test/CodeGen/Thumb2/thumb2-str.ll
U test/CodeGen/Thumb2/thumb2-ldrh.ll
U utils/TableGen/TableGen.cpp
U utils/TableGen/DisassemblerEmitter.cpp
D utils/TableGen/RISCDisassemblerEmitter.h
D utils/TableGen/RISCDisassemblerEmitter.cpp
U Makefile.rules
U lib/Target/ARM/ARMInstrNEON.td
U lib/Target/ARM/Makefile
U lib/Target/ARM/AsmPrinter/ARMInstPrinter.cpp
U lib/Target/ARM/AsmPrinter/ARMAsmPrinter.cpp
U lib/Target/ARM/AsmPrinter/ARMInstPrinter.h
D lib/Target/ARM/Disassembler
U lib/Target/ARM/ARMInstrFormats.td
U lib/Target/ARM/ARMAddressingModes.h
U lib/Target/ARM/Thumb2ITBlockPass.cpp
llvm-svn: 98640
(RISCDisassemblerEmitter) which emits the decoder functions for ARM and Thumb,
and the disassembler core which invokes the decoder function and builds up the
MCInst based on the decoded Opcode.
Added sub-formats to the NeonI/NeonXI instructions to further refine the NEONFrm
instructions to help disassembly.
We also changed the output of the addressing modes to omit the '+' from the
assembler syntax #+/-<imm> or +/-<Rm>. See, for example, A8.6.57/58/60.
And modified test cases to not expect '+' in +reg or #+num. For example,
; CHECK: ldr.w r9, [r7, #28]
llvm-svn: 98637
an undef value. This is only going to come up for bugpoint-reduced tests --
correct programs will not access memory at undefined addresses -- so it's not
worth the effort of doing anything more aggressive.
llvm-svn: 97745
greater-than-or-equal SELECT_CCs to NEON vmin/vmax instructions. This is
only allowed when UnsafeFPMath is set or when at least one of the operands
is known to be nonzero.
llvm-svn: 97065
branch in ARM v4 code, since it gets clobbered by the return address before
it is used. Instead of adding a new register class containing all the GPRs
except LR, just use the existing tGPR class.
llvm-svn: 96360
bug fixes, and with improved heuristics for analyzing foreign-loop
addrecs.
This change also flattens IVUsers, eliminating the stride-oriented
groupings, which makes it easier to work with.
llvm-svn: 95975
legalization even when the IR-level optimizer has removed dead phis, such
as when the high half of an i64 value is unused on a 32-bit target.
I had to adjust a few test cases that had dead phis.
This is a partial fix for Radar 7627077.
llvm-svn: 95816
only run for x86 with fastisel. I've found it being very effective in
eliminating some obvious dead code as result of formal parameter lowering
especially when tail call optimization eliminated the need for some of the loads
from fixed frame objects. It also shrinks a number of the tests. A couple of
tests no longer make sense and are now eliminated.
llvm-svn: 95493
Even if they are suported by the core, they can be disabled
(this is just a configuration bit inside some register).
Allow unaligned memops on darwin and conservatively disallow them otherwise.
llvm-svn: 94889
This new version is much more aggressive about doing "full" reduction in
cases where it reduces register pressure, and also more aggressive about
rewriting induction variables to count down (or up) to zero when doing so
reduces register pressure.
It currently uses fairly simplistic algorithms for finding reuse
opportunities, but it introduces a new framework allows it to combine
multiple strategies at once to form hybrid solutions, instead of doing
all full-reduction or all base+index.
llvm-svn: 94061
adding an "i" to the suffix, indicating that the elements are integers, is
accepted but not part of the standard syntax. This helps us pass a few more
of the Neon tests from gcc.
llvm-svn: 93677
different BlockAddress labels, but nothing semantically important.
Add a FIXME that BlockAddress codegen is broken if the LLVM BB has
an empty name (e.g. strip was run).
llvm-svn: 93303
The change in SelectionDAGBuilder is needed to allow using bitcasts to convert
between f64 (the default type for ARM "d" registers) and 64-bit Neon vector
types. Radar 7457110.
llvm-svn: 91649