case:
int C[100];
int foo() {
return C[4];
}
We now codegen:
foo:
mov %EAX, DWORD PTR [C + 16]
ret
instead of:
foo:
mov %EAX, OFFSET C
mov %EAX, DWORD PTR [%EAX + 16]
ret
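The saving comes from the address of C[4] being a constant: the symbol C plus a
fixed byte offset. A minimal C sketch of that arithmetic follows. The helper
name element_offset is hypothetical and exists only to show the constant that
gets folded into the addressing mode; the offset is 16 only on targets where
int is 4 bytes, as on 32-bit x86.

```c
#include <assert.h>
#include <stddef.h>

int C[100];

/* Byte offset of C[index] from the start of C. Hypothetical helper used
   only for illustration; it is not part of the patch. */
size_t element_offset(size_t index) {
    assert(index < 100);
    return (size_t)((char *)&C[index] - (char *)C);
}

/* The function from the example above: one load from C + offset. */
int foo(void) {
    return C[4];
}
```

Because element_offset(4) depends on nothing but sizeof(int), the backend can
emit it as part of the memory operand instead of computing it at run time.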
Other impressive features may be coming later.
This patch is contributed by Jeff Cohen!
llvm-svn: 17011
useful when you have a reference like:
int A[100];
void foo() { A[10] = 1; }
In this case, &A[10] is a single constant and should be treated as such.
Only MO_GlobalAddress and MO_ExternalSymbol are allowed to use this field, no
other operand type is.
This is another fine patch contributed by Jeff Cohen!!
llvm-svn: 17007
The problem occurred when trying to reload this instruction:
MOV32mr %reg2326, 8, %reg2297, 4, %reg2295
The value of reg2326 was available in EBX, so it was reused from there, instead
of reloading it into EDX.
The value of reg2297 was available in EDX, so it was reused from there, instead
of reloading it into EDI.
The value of reg2295 was not available, so we tried reloading it into EBX, its
assigned register. However, we checked and saw that we already reloaded
something into EBX, so we chose what reg2326 was assigned to (EDX) and reloaded
into that register instead.
Unfortunately EDX had already been used by reg2297, so reloading into EDX
clobbered the value used by the reg2297 operand, breaking the program.
The fix for this is to check that the newly picked register is ok. In this
case we now find that EDX is already used and try using EDI, which succeeds.
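The check can be sketched in C roughly as follows. This is an illustration
only: the register enum, the used[] array, and pick_reload_reg are
hypothetical names and do not mirror the spiller's real data structures.

```c
#include <assert.h>
#include <stdbool.h>

enum { EAX, EBX, ECX, EDX, ESI, EDI, NUM_REGS };

/* Pick a physical register for a reload. used[r] is true when r already
   holds an operand of the instruction being rewritten. Returns -1 when no
   candidate is free. All names are assumptions made for this sketch. */
int pick_reload_reg(int assigned, const int *fallbacks, int num_fallbacks,
                    const bool used[NUM_REGS]) {
    assert(assigned >= 0 && assigned < NUM_REGS);
    if (!used[assigned])
        return assigned;
    for (int i = 0; i < num_fallbacks; ++i) {
        /* The fix: verify each newly picked register before using it. */
        if (!used[fallbacks[i]])
            return fallbacks[i];
    }
    return -1;
}
```

With EBX and EDX marked as used and the fallbacks tried in the order EDX, EDI,
the function skips EDX and returns EDI, matching the behavior described above.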
llvm-svn: 17006
This transformation fires a few dozen times across the testsuite.
For example, int test2(int X) { return X ^ 0x0FF00FF0; }
Old:
_test2:
lis r2, 4080
ori r2, r2, 4080
xor r3, r3, r2
blr
New:
_test2:
xoris r3, r3, 4080
xori r3, r3, 4080
blr
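The two sequences compute the same value because 0x0FF00FF0 is
(4080 << 16) | 4080, and xor can be applied to the high and low halves
separately. A small C model of both sequences, with function names chosen
here for illustration:

```c
#include <stdint.h>

/* Old sequence: materialize the 32-bit constant, then xor with it. */
uint32_t test2_old(uint32_t x) {
    uint32_t r2 = (uint32_t)4080 << 16; /* lis   r2, 4080     */
    r2 |= 4080;                         /* ori   r2, r2, 4080 */
    return x ^ r2;                      /* xor   r3, r3, r2   */
}

/* New sequence: xor the high and low 16-bit halves directly. */
uint32_t test2_new(uint32_t x) {
    x ^= (uint32_t)4080 << 16;          /* xoris r3, r3, 4080 */
    x ^= 4080;                          /* xori  r3, r3, 4080 */
    return x;
}
```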
llvm-svn: 17004
addPassesToEmitMachineCode()
* Add support for registers and constants in getMachineOpValue()
This enables running "int main() { ret 0 }" via the PowerPC JIT.
llvm-svn: 16983
* Add implementation of getMachineOpValue() for generated code emitter
* Convert assert()s in unimplemented functions to abort()s so that non-debug
builds fail predictably
* Add file header comments
llvm-svn: 16981
and 64-bit code emitters that cannot share code unless we use virtual
functions
* Identify components being built by tablegen with more detail by assigning them
to PowerPC, PPC32, or PPC64 more specifically; also avoids seeing 'building
PowerPC XYZ' messages twice, where one is for PPC32 and one for PPC64
llvm-svn: 16980
to go in. This patch allows us to compute the trip count of loops controlled
by values loaded from constant arrays. The canonical example of this is
strlen when passed a constant argument:
int i;
for (i = 0; "constantstring"[i]; ++i) ;
return i;
In this case, it will compute that the loop executes 14 times, which means
that the exit value of i is 14. Because of this, the loop gets DCE'd and
we are happy. This also applies to anything that does similar things, e.g.
loops like this:
const float Array[] = { 0.1, 2.1, 3.2, 23.21 };
for (int i = 0; Array[i] < 20; ++i)
and is actually fairly general.
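The idea can be sketched in C as a brute-force evaluation of the exit
condition over the known array contents. This is only an illustration of the
principle, not the analysis code itself; both function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Number of iterations of `for (i = 0; s[i]; ++i)` over a constant array
   of `size` bytes, or -1 if the loop does not exit inside the array. */
int trip_count_string(const char *s, size_t size) {
    assert(s != NULL);
    for (size_t i = 0; i < size; ++i)
        if (!s[i])
            return (int)i;
    return -1;
}

/* Number of iterations of `for (i = 0; a[i] < bound; ++i)` over a constant
   array of n floats, or -1 if the condition never fails inside the array. */
int trip_count_float_lt(const float *a, size_t n, float bound) {
    assert(a != NULL);
    for (size_t i = 0; i < n; ++i)
        if (!(a[i] < bound))
            return (int)i;
    return -1;
}
```

For "constantstring" the first function yields 14, and for the float array
above with a bound of 20 the second yields 3, since 23.21 is the first
element that fails the test.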
The problem with this is that it almost never triggers. The reason is that
we run indvars and the loop optimizer only at compile time, which is before
things like strlen and strcpy have been inlined into the program from libc.
Because of this, it is almost never used (it triggers twice in specint2k).
I'm committing it because it DOES work, may be useful in the future, and
doesn't slow us down at all. If/when we start running the loop optimizer
at link-time (-O4?) this will be very nice indeed :)
llvm-svn: 16926
pointer recurrences into expressions from this:
%P_addr.0.i.0 = phi sbyte* [ getelementptr ([8 x sbyte]* %.str_1, int 0, int 0), %entry ], [ %inc.0.i, %no_exit.i ]
%inc.0.i = getelementptr sbyte* %P_addr.0.i.0, int 1 ; <sbyte*> [#uses=2]
into this:
%inc.0.i = getelementptr sbyte* getelementptr ([8 x sbyte]* %.str_1, int 0, int 0), int %inc.0.i.rec
Actually create something nice, like this:
%inc.0.i = getelementptr [8 x sbyte]* %.str_1, int 0, int %inc.0.i.rec
llvm-svn: 16924
well as a vector of constant*'s. It turns out that this is more efficient
and all of the clients want to do that, so we should cater to them.
llvm-svn: 16923
First, it allows SRA of globals that have embedded arrays, implementing
GlobalOpt/globalsra-partial.llx. This comes up infrequently, but does allow,
for example, deleting several stores to dead parts of globals in dhrystone.
Second, this implements GlobalOpt/malloc-promote-*.llx, which is the
following nifty transformation:
Basically if a global pointer is initialized with malloc, and we can tell
that the program won't notice, we transform this:
struct foo *FooPtr;
...
FooPtr = malloc(sizeof(struct foo));
...
FooPtr->A FooPtr->B
Into:
struct foo FooPtrBody;
...
FooPtrBody.A FooPtrBody.B
This comes up occasionally, for example, the 'disp' global in 183.equake (where
the xform speeds the CBE version of the program up from 56.16s to 52.40s (7%)
on apoc), and the 'desired_accept', 'fixLRBT', 'macroArray', & 'key_queue'
globals in 300.twolf (speeding it up from 22.29s to 21.55s (3.4%)).
The nice thing about this xform is that it exposes the resulting global to
global variable optimization and makes alias analysis easier in addition to
eliminating a few loads.
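A self-contained C version of the before and after forms is sketched below.
The struct fields, the stored values, and the function names are assumptions
made for illustration; the sketch also assumes the pointer is never freed,
reassigned, or compared in a way the program could observe.

```c
#include <stdlib.h>

struct foo { int A; int B; };

/* Before: a global pointer initialized once from malloc. Every access
   first loads FooPtr and then loads or stores through it. */
struct foo *FooPtr;

int sum_before(void) {
    FooPtr = malloc(sizeof(struct foo));
    if (!FooPtr)
        return -1; /* allocation failure, outside the scope of the sketch */
    FooPtr->A = 3;
    FooPtr->B = 4;
    return FooPtr->A + FooPtr->B;
}

/* After: the heap object becomes a global, so the load of the pointer
   disappears and the fields are addressed directly. */
struct foo FooPtrBody;

int sum_after(void) {
    FooPtrBody.A = 3;
    FooPtrBody.B = 4;
    return FooPtrBody.A + FooPtrBody.B;
}
```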
llvm-svn: 16916
first element of an array, return a GEP instead of a cast. This allows us
to transparently fold this:
int* getelementptr (int* cast ([100 x int]* %Gbody to int*), int 40)
into this:
int* getelementptr ([100 x int]* %Gbody, int 0, int 40)
llvm-svn: 16911
still optimize away all of the indirect calls and loads, etc from it.
This turns code like this:
if (G != 0)
G();
into
if (G != 0)
ActualCallee();
This triggers a couple of times in gcc and libstdc++.
llvm-svn: 16901
Deal with allocating stack space for outgoing args and copying them into the
correct stack slots (at least, we can copy <=32-bit int args).
We now correctly generate ADJCALLSTACK* instructions.
llvm-svn: 16881
stored to, but are stored at variable indexes. This occurs at least in
176.gcc, but probably others, and we should handle it for completeness.
llvm-svn: 16876
has a large number of users. Instead, just keep track of whether we're
making changes as we do so.
This patch makes no functionality changes.
llvm-svn: 16874
we know that all uses of the global will trap if the pointer contained is
null. In this case, we forward substitute the stored value to any uses.
This has the effect of devirtualizing trivial globals in trivial cases. For
example, 164.gzip contains this:
gzip.h:extern int (*read_buf) OF((char *buf, unsigned size));
bits.c: read_buf = file_read;
deflate.c: lookahead = read_buf((char*)window,
deflate.c: n = read_buf((char*)window+strstart+lookahead, more);
Since read_buf has to point to file_read at every use, we just replace
the calls through read_buf with a direct call to file_read.
This occurs in several benchmarks, including 176.gcc and 164.gzip. Direct
calls are good and stuff.
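A reduced C sketch of the pattern follows. The body of file_read is a stub
invented for this example (the real gzip function reads from a file), and
init, fill_indirect, and fill_direct are hypothetical names.

```c
#include <string.h>

/* Stub standing in for gzip's file_read: zero-fills the buffer and
   reports that `size` bytes were read. */
static int file_read(char *buf, unsigned size) {
    memset(buf, 0, size);
    return (int)size;
}

/* The global function pointer, stored to exactly once. */
int (*read_buf)(char *buf, unsigned size);

void init(void) {
    read_buf = file_read;
}

/* Before: a call through the pointer. It traps if read_buf is null, so at
   any call that executes successfully it must point to file_read. */
int fill_indirect(char *buf, unsigned n) {
    return read_buf(buf, n);
}

/* After: the stored value is forward substituted into the call. */
int fill_direct(char *buf, unsigned n) {
    return file_read(buf, n);
}
```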
llvm-svn: 16871
the -sse* options (to avoid misleading people).
Also, the stack alignment of the target doesn't depend on whether SSE is
eventually implemented, so remove a comment.
llvm-svn: 16860