Put a some code back to handle buggy behavior that GVN expects: it wants
loads to depend on each other, and accesses to depend on their allocations.
llvm-svn: 60240
Document the Dirty value more precisely, use it for the uninitialized
DepResultTy value. Change reverse mappings to be from an instruction*
instead of DepResultTy, and stop tracking other forms. This makes it more
clear that we only care about the instruction cases.
Eliminate a DepResultTy,bool pair by using Dirty in the local case as well,
shrinking the map and simplifying the code.
This speeds up GVN by ~3% on 403.gcc.
llvm-svn: 60232
query. This makes it crystal clear what cases can escape from MemDep that
the clients have to handle. This also gives the clients a nice simplified
interface to it that is easy to poke at.
This patch also makes DepResultTy and MemoryDependenceAnalysis::DepType
private, yay.
llvm-svn: 60231
of a pointer/int pair instead of a manually bitmangled pointer.
This forces clients to think a little more about checking the
appropriate pieces and will be useful for internal
implementation improvements later.
I'm not particularly happy with this. After going through this
I don't think that the clients of memdep should be exposed to
the internal type at all. I'll fix this in a subsequent commit.
This has no functionality change.
llvm-svn: 60230
properly updates the reverse dependency map when it installs updated
dependencies for instructions that depend on the removed instruction.
llvm-svn: 60222
wrappers around the interesting code and use an obscure iterator
abstraction that dates back many many years.
Move EraseDeadInstructions to Transforms/Utils and name it
RecursivelyDeleteTriviallyDeadInstructions.
llvm-svn: 60191
1. Make it fold blocks separated by an unconditional branch. This enables
jump threading to see a broader scope.
2. Make jump threading able to eliminate locally redundant loads when they
feed the branch condition of a block. This frequently occurs due to
reg2mem running.
3. Make jump threading able to eliminate *partially redundant* loads when
they feed the branch condition of a block. This is common in code with
lots of loads and stores like C++ code and 255.vortex.
This implements thread-loads.ll and rdar://6402033.
Per the fixme's, several pieces of this should be moved into Transforms/Utils.
llvm-svn: 60148
the conditional for the BRCOND statement. For instance, it will generate:
addl %eax, %ecx
jo LOF
instead of
addl %eax, %ecx
; About 10 instructions to compare the signs of LHS, RHS, and sum.
jl LOF
llvm-svn: 60123
performance in most cases on the Grawp tester, but does speed some
things up (like shootout/hash by 15%). This also doesn't impact
compile time in a noticable way on the Grawp tester.
It also, of course, gets the testcase it was designed for right :)
llvm-svn: 60120
heuristic: the value is already live at the new memory operation if
it is used by some other instruction in the memop's block. This is
cheap and simple to compute (moreso than full liveness).
This improves the new heuristic even more. For example, it cuts two
out of three new instructions out of 255.vortex:DbmFileInGrpHdr,
which is one of the functions that the heuristic regressed. This
overall eliminates another 40 instructions from 403.gcc and visibly
reduces register pressure in 255.vortex (though this only actually
ends up saving the 2 instructions from the whole program).
llvm-svn: 60084
phrased in terms of liveness instead of as a horrible hack. :)
In pratice, this doesn't change the generated code for either
255.vortex or 403.gcc, but it could cause minor code changes in
theory. This is framework for coming changes.
llvm-svn: 60082
-enable-smarter-addr-folding to llc) that gives CGP a better
cost model for when to sink computations into addressing modes.
The basic observation is that sinking increases register
pressure when part of the addr computation has to be available
for other reasons, such as having a use that is a non-memory
operation. In cases where it works, it can substantially reduce
register pressure.
This code is currently an overall win on 403.gcc and 255.vortex
(the two things I've been looking at), but there are several
things I want to do before enabling it by default:
1. This isn't doing any caching of results, so it is much slower
than it could be. It currently slows down release-asserts llc
by 1.7% on 176.gcc: 27.12s -> 27.60s.
2. This doesn't think about inline asm memory operands yet.
3. The cost model botches the case when the needed value is live
across the computation for other reasons.
I'll continue poking at this, and eventually turn it on as llcbeta.
llvm-svn: 60074
optimize addressing modes. This allows us to optimize things like isel-sink2.ll
into:
movl 4(%esp), %eax
cmpb $0, 4(%eax)
jne LBB1_2 ## F
LBB1_1: ## TB
movl $4, %eax
ret
LBB1_2: ## F
movzbl 7(%eax), %eax
ret
instead of:
_test:
movl 4(%esp), %eax
cmpb $0, 4(%eax)
leal 4(%eax), %eax
jne LBB1_2 ## F
LBB1_1: ## TB
movl $4, %eax
ret
LBB1_2: ## F
movzbl 3(%eax), %eax
ret
This shrinks (e.g.) 403.gcc from 1133510 to 1128345 lines of .s.
Note that the 2008-10-16-SpillerBug.ll testcase is dubious at best, I doubt
it is really testing what it thinks it is.
llvm-svn: 60068
(a) Remove conditionally removed code in SelectXAddr. Basically, hope for the
best that the A-form and D-form address predicates catch everything before
the code decides to emit a X-form address.
(b) Expand vector store test cases to include the usual suspects.
llvm-svn: 60034
can recursively match things) and scales by 0 by ignoring them.
This triggers once in 403.gcc, saving 1 (!!!!) instruction in the
whole huge app.
llvm-svn: 60013
into a new AddressingModeMatcher class. This makes it easier
to reason about and reduces passing around of stuff, but has
no functionality change.
llvm-svn: 60012
introduce any new spilling; it just uses unused registers.
Refactor the SUnit topological sort code out of the RRList scheduler and
make use of it to help with the post-pass scheduler.
llvm-svn: 59999
(a) Slight rethink on i64 zero/sign/any extend code - use a shuffle to
directly zero-extend i32 to i64, but use rotates and shifts for
sign extension. Also ensure unified register consistency.
(b) Add new test harness for i64 operations: i64ops.ll
llvm-svn: 59970
(a) Improve the extract element code: there's no need to do gymnastics with
rotates into the preferred slot if a shuffle will do the same thing.
(b) Rename a couple of SPUISD pseudo-instructions for readability and better
semantic correspondence.
(c) Fix i64 sign/any/zero extension lowering.
llvm-svn: 59965
(this doesn't happen that often, since most code
does not use illegal types) then follow it by a
DAG combiner run that is allowed to generate
illegal operations but not illegal types. I didn't
modify the target combiner code to distinguish like
this between illegal operations and illegal types,
so it will not produce illegal operations as well
as not producing illegal types.
llvm-svn: 59960
value. It must now be as if the pointer were allocated and has not escaped to
the caller. Thanks to Dan Gohman for pointing out the error in the original
and helping devise this definition.
llvm-svn: 59940
indicate functions that allocate, such as operator new, or list::insert. The
actual definition is slightly less strict (for now).
No changes to the bitcode reader/writer, asm printer or verifier were needed.
llvm-svn: 59934
"It simplifies the type legalization part a bit, and produces better code by
teaching SelectionDAG about the extra bits in an i8 SADDO/UADDO node. In
essence, I spontaneously decided that on x86 this i8 boolean result would be
either 0 or 1, and on other platforms 0/1 or 0/-1, depending on whether the
platform likes it's boolean zero extended or sign extended."
llvm-svn: 59864
g++ -m32 -c -g -DIN_GCC -W -Wall -Wwrite-strings -Wmissing-format-attribute -fno-common -mdynamic-no-pic -DHAVE_CONFIG_H -Wno-unused -DTARGET_NAME=\"i386-apple-darwin9.5.0\" -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include -DENABLE_LLVM -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/../llvm.src/include -D_DEBUG -D_GNU_SOURCE -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS -I. -I. -I../../llvm-gcc.src/gcc -I../../llvm-gcc.src/gcc/. -I../../llvm-gcc.src/gcc/../include -I./../intl -I../../llvm-gcc.src/gcc/../libcpp/include -I../../llvm-gcc.src/gcc/../libdecnumber -I../libdecnumber -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.obj/include -I/Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvm.src/include ../../llvm-gcc.src/gcc/llvm-types.cpp -o llvm-types.o
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemCpy(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1496: error: 'memcpy_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemMove(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1512: error: 'memmove_i64' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp: In member function 'void TreeToLLVM::EmitMemSet(llvm::Value*, llvm::Value*, llvm::Value*, unsigned int)':
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i32' is not a member of 'llvm::Intrinsic'
../../llvm-gcc.src/gcc/llvm-convert.cpp:1528: error: 'memset_i64' is not a member of 'llvm::Intrinsic'
make[3]: *** [llvm-convert.o] Error 1
make[3]: *** Waiting for unfinished jobs....
rm fsf-funding.pod gcov.pod gfdl.pod cpp.pod gpl.pod gcc.pod
make[2]: *** [all-stage1-gcc] Error 2
make[1]: *** [stage1-bubble] Error 2
make: *** [all] Error 2
llvm-svn: 59809
"ISD::ADDO". ISD::ADDO is lowered into a target-independent form that does the
addition and then checks if the result is less than one of the operands. (If it
is, then there was an overflow.)
llvm-svn: 59779
schedulers. This doesn't have much immediate impact because
targets that use these schedulers by default don't yet provide
pipeline information.
This code also didn't have the benefit of register pressure
information. Also, removing it will avoid problems with list-burr
suddenly starting to do latency-oriented scheduling on x86 when we
start providing pipeline data, which would increase spilling.
llvm-svn: 59775
some of the latency computation logic out of the SDNode
ScheduleDAG code into a TargetInstrItineraries helper method
to help with this.
llvm-svn: 59761
the RR scheduler actually does look at latency values, but it
doesn't use a hazard recognizer so it has no way to know when
a no-op is needed, as opposed to just stalling and incrementing
the cycle count.
llvm-svn: 59759
MachineInstr scheduling DAG, meaning they implicitly depend on all
preceding defs. This fixes Benchmarks/Shootout-C++/except and
Regression/C++/EH/simple_rethrow in
-relocation-model=pic -disable-post-RA-scheduler=false
mode.
llvm-svn: 59747
(a) Remove moved file (SPUAsmPrinter.cpp) to make svn happy.
(b) Remove truncated stores that will never be used.
(c) Add initial support for __muldi3 as a libcall.
llvm-svn: 59734