- Allocate MachineMemOperands and MachineMemOperand lists in MachineFunctions.
This eliminates MachineInstr's std::list member and allows the data to be
created by isel and live for the remainder of codegen, avoiding a lot of
copying and unnecessary translation. This also shrinks MemSDNode.
- Delete MemOperandSDNode. Introduce MachineSDNode which has dedicated
fields for MachineMemOperands.
- Change MemSDNode to have a MachineMemOperand member instead of its own
fields with the same information. This introduces some redundancy, but
it's more consistent with what MachineInstr will eventually want.
- Ignore alignment when searching for redundant loads for CSE, but remember
the greatest alignment.
Target-specific code which previously used MemOperandSDNodes with generic
SDNodes now use MemIntrinsicSDNodes, with opcodes in a designated range
so that the SelectionDAG framework knows that MachineMemOperand information
is available.
llvm-svn: 82794
of the defs are processed.
Also fix a implicit_def propagation bug: a implicit_def of a physical register
should be applied to uses of the sub-registers.
llvm-svn: 82616
we pushed the beginning of the interval back 1, so the
interval would overlap with inputs that die. We were
also pushing the end of the interval back 1, though,
which means the earlyclobber didn't overlap with other
output operands. Don't do this. PR 4964.
llvm-svn: 82342
The gist of this is if source of some of the copies that feed into a phi join is defined by the phi join, we'd like to eliminate them. However, if any of the non-identity source overlaps the live interval of the phi join then the coalescer won't be able to coalesce them. The early coalescer's job is to eliminate the identity copies by partially-coalescing the two live intervals.
llvm-svn: 81796
a new class, MachineInstrIndex, which hides arithmetic details from
most clients. This is a step towards allowing the register allocator
to update/insert code during allocation.
llvm-svn: 81040
MachineInstr and MachineOperand. This required eliminating a
bunch of stuff that was using DOUT, I hope that bill doesn't
mind me stealing his fun. ;-)
llvm-svn: 79813
- Some clients which used DOUT have moved to DEBUG. We are deprecating the
"magic" DOUT behavior which avoided calling printing functions when the
statement was disabled. In addition to being unnecessary magic, it had the
downside of leaving code in -Asserts builds, and of hiding potentially
unnecessary computations.
llvm-svn: 77019
This adds location info for all llvm_unreachable calls (which is a macro now) in
!NDEBUG builds.
In NDEBUG builds location info and the message is off (it only prints
"UREACHABLE executed").
llvm-svn: 75640
Make llvm_unreachable take an optional string, thus moving the cerr<< out of
line.
LLVM_UNREACHABLE is now a simple wrapper that makes the message go away for
NDEBUG builds.
llvm-svn: 75379
as an (index,bool) pair. The bool flag records whether the kill is a
PHI kill or not. This code will be used to enable splitting of live
intervals containing PHI-kills.
A slight change to live interval weights introduced an extra spill
into lsr-code-insertion (outside the critical sections). The test
condition has been updated to reflect this.
llvm-svn: 75097
Note, isUndef marker must be placed even on implicit_def def operand or else the scavenger will not ignore it. This is necessary because -O0 path does not use liveintervalanalysis, it treats implicit_def just like any other def.
llvm-svn: 74601
The register allocator, when it allocates a register to a virtual register defined by an implicit_def, can allocate any physical register without worrying about overlapping live ranges. It should mark all of operands of the said virtual register so later passes will do the right thing.
This is not the best solution. But it should be a lot less fragile to having the scavenger try to track what is defined by implicit_def.
llvm-svn: 74518
entries as there are basic blocks in the function. LiveVariables::getVarInfo
creates a VarInfo struct for every register in the function, leading to
quadratic space use. This patch changes the BitVector to a SparseBitVector,
which doesn't help the worst-case memory use but does reduce the actual use in
very long functions with short-lived variables.
llvm-svn: 72426
VirtRegMap keeps track of allocations so it knows what's not used. As a horrible hack, the stack coloring can color spill slots with *free* registers. That is, it replace reload and spills with copies from and to the free register. It unfold instructions that load and store the spill slot and replace them with register using variants.
Not yet enabled. This is part 1. More coming.
llvm-svn: 70787
This fixes a very subtle bug. vr defined by an implicit_def is allowed overlap with any register since it doesn't actually modify anything. However, if it's used as a two-address use, its live range can be extended and it can be spilled. The spiller must take care not to emit a reload for the vn number that's defined by the implicit_def. This is both a correctness and performance issue.
llvm-svn: 69743
%reg1498<def> = MOV32rm %reg1024, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
%reg1506<def> = MOV32rm %reg1024, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
%reg1486<def> = MOV32rr %reg1506
%reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
%reg1510<def> = MOV32rm %reg1024, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
=>
%reg1498<def> = MOV32rm %reg2036, 1, %reg0, 12, %reg0, Mem:LD(4,4) [sunkaddr39 + 0]
%reg1506<def> = MOV32rm %reg2037, 1, %reg0, 8, %reg0, Mem:LD(4,4) [sunkaddr42 + 0]
%reg1486<def> = MOV32rr %reg1506
%reg1486<def> = XOR32rr %reg1486, %reg1498, %EFLAGS<imp-def,dead>
%reg1510<def> = MOV32rm %reg2038, 1, %reg0, 4, %reg0, Mem:LD(4,4) [sunkaddr45 + 0]
From linearscan's point of view, each of reg2036, 2037, and 2038 are separate registers, each is "killed" after a single use. The reloaded register is available and it's often clobbered right away. e.g. In thise case reg1498 is allocated EAX while reg2036 is allocated RAX. This means we end up with multiple reloads from the same stack slot in the same basic block.
Now linearscan recognize there are other reloads from same SS in the same BB. So it'll "downgrade" RAX (and its aliases) after reg2036 is allocated until the next reload (reg2037) is done. This greatly increase the likihood reloads from SS are reused.
This speeds up sha1 from OpenSSL by 5.8%. It is also an across the board win for SPEC2000 and 2006.
llvm-svn: 69585