1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 14:32:51 +01:00
Commit Graph

129 Commits

Author SHA1 Message Date
Nate Begeman
4106e81966 Correct a documention link
llvm-svn: 20840
2005-03-26 01:28:05 +00:00
Nate Begeman
613e54d5f0 Fix an incorrect argument being passed to BuildMI for indirect calls.
llvm-svn: 20821
2005-03-24 23:34:38 +00:00
Nate Begeman
4584ca0554 Commit Gabor Greif's patch to use iterators in lowering intrinsics.
llvm-svn: 20816
2005-03-24 20:07:16 +00:00
Chris Lattner
ad07b1bc54 eliminate dead variables, patch contributed by Gabor Greif!
llvm-svn: 20812
2005-03-24 17:32:20 +00:00
Nate Begeman
833c1d0994 Implement more of the PPC32 Pattern ISel:
1) dynamic stack alloc
2) loads
3) shifts
4) subtract
5) immediate form of add, and, or, xor
6) change flag from -pattern-isel to -enable-ppc-pattern-isel

Remove dead arguments from getGlobalBaseReg in the simple ISel

llvm-svn: 20810
2005-03-24 06:28:42 +00:00
Misha Brukman
04d6d0666d We may be adding functions to the Module during initialization, so
conservatively, it's modified

llvm-svn: 20735
2005-03-21 19:22:14 +00:00
Chris Lattner
4b688a1c70 This mega patch converts us from using Function::a{iterator|begin|end} to
using Function::arg_{iterator|begin|end}.  Likewise Module::g* -> Module::global_*.

This patch is contributed by Gabor Greif, thanks!

llvm-svn: 20597
2005-03-15 04:54:21 +00:00
Chris Lattner
496d623600 Fix a crash handling 'undef bool', fixing an llc crash on 186.crafty
llvm-svn: 20523
2005-03-08 22:53:09 +00:00
Chris Lattner
c032990335 Fix Regression/CodeGen/PowerPC/2005-01-14-UndefLong.ll
llvm-svn: 19557
2005-01-14 20:22:02 +00:00
Chris Lattner
b0b49268c4 Fix: Regression/CodeGen/PowerPC/2005-01-14-SetSelectCrash.ll
llvm-svn: 19555
2005-01-14 19:31:00 +00:00
Chris Lattner
93fc4bd9cb This hunk:
-  unsigned TrueValue = getReg(TrueVal, BB, BB->begin());
+  unsigned TrueValue = getReg(TrueVal);

Fixes the PPC regressions from last night.

The other hunk is just a clarity improvement.

llvm-svn: 19263
2005-01-02 23:07:31 +00:00
Chris Lattner
ad63a0d6a4 Fix a FIXME: Select instructions on longs were miscompiled.
While we're at it, improve codegen of select instructions.  For this
testcase:

int %test(bool %C, int %A, int %B) {
  %D = select bool %C, int %A, int %B
  ret int %D
}

We used to generate this code:

_test:
        cmpwi cr0, r3, 0
        bne .LBB_test_2 ;
.LBB_test_1:    ;
        b .LBB_test_3   ;
.LBB_test_2:    ;
        or r5, r4, r4
.LBB_test_3:    ;
        or r3, r5, r5
        blr

Now we emit:

_test:
        cmpwi cr0, r3, 0
        bne .LBB_test_2 ;
.LBB_test_1:    ;
        or r4, r5, r5
.LBB_test_2:    ;
        or r3, r4, r4
        blr

-Chris

llvm-svn: 19214
2005-01-01 16:10:12 +00:00
Chris Lattner
2231d21dad Fix several bugs in 'op x, imm' handling. Foremost is that we now emit
addi r3, r3, -1
instead of
   addi r3, r3, 1

for 'sub int X, 1'.

Secondarily, this fixes several cases where we could crash given an unsigned
constant.  And fixes a couple of minor missed optimization cases, such as
xor X, ~0U -> not X

llvm-svn: 18379
2004-11-30 07:30:20 +00:00
Chris Lattner
1e093bfb2b Fix CodeGen/PowerPC/2004-11-30-shr-var-crash.ll
llvm-svn: 18376
2004-11-30 06:40:04 +00:00
Chris Lattner
629965fbe0 Fix test/Regression/CodeGen/PowerPC/2004-11-29-ShrCrash.ll
llvm-svn: 18374
2004-11-30 06:36:11 +00:00
Chris Lattner
23a2a6e5d3 Fix test/Regression/CodeGen/PowerPC/2004-11-30-shift-crash.ll
llvm-svn: 18371
2004-11-30 06:29:10 +00:00
Nate Begeman
6048139b1f Remove the ISel->AsmPrinter link via the TargetMachine that was put in
place to help bring up the PowerPC back end on Darwin.  This code is no
longer serves any purpose now that the AsmPrinter does the right thing
all the time printing GlobalValues.  --Cruft.

llvm-svn: 18267
2004-11-27 04:45:11 +00:00
Nate Begeman
6405f5e9b3 Enable optimization suggested by Chris Lattner to not emit reloc stubs for
static global variables whose addresses are taken.  This allows us to
convert the following code for taking the address of a static function foo

        addis r2, r30, ha16(Ll1__2E_foo_2$non_lazy_ptr-"L00001$pb")
        lwz r3, lo16(Ll1__2E_foo_2$non_lazy_ptr-"L00001$pb")(r2)

which also includes linker stub code emitted at the end of the .s file not
shown here, and replace it with this:

        addis r2, r30, ha16(l1__2E_foo_2-"L00001$pb")
        la r3, lo16(l1__2E_foo_2-"L00001$pb")(r2)

which in addition to not needing linker help, also has no load instruction.
For those not up on PowerPC mnemonics, la is shorthand for add immediate.

llvm-svn: 18239
2004-11-25 07:09:01 +00:00
Nate Begeman
e9b752c4e3 Add the same optimization that we do loading from fixed alloca slots to
storing to fixed alloca slots.

llvm-svn: 18221
2004-11-24 21:53:14 +00:00
Chris Lattner
5ac6f7a36d Simplify code a bit
llvm-svn: 18146
2004-11-23 06:05:44 +00:00
Chris Lattner
1b163867c6 LA is really addi. Be consistent with operand ordering to avoid confusing the code emitter
llvm-svn: 18138
2004-11-23 05:54:25 +00:00
Nate Begeman
7ec36ad70f Fix Shootout-C++/wc, which was broken by my recent changes to emit fewer
reg-reg copies.  The necessary conditions for this bug are a GEP that is
used outside the basic block in which it is defined, whose components
other than the pointer are all constant zero, and where the use is
selected before the definition (backwards branch to successsor block).

llvm-svn: 18084
2004-11-21 05:14:06 +00:00
Nate Begeman
83cded0ecb Eliminate another 6k register copies that the register allocator would just
coalesce out of hbd.  Speeds up compilation by 2% (0.6s)

llvm-svn: 17987
2004-11-19 08:01:16 +00:00
Nate Begeman
de1fd6a162 Generate fewer reg-reg copies for the register allocator to deal with.
This eliminates over 2000 in hbd alone.

llvm-svn: 17973
2004-11-19 02:06:40 +00:00
Nate Begeman
567d30174a Eliminate another common source of moves that the register allocator
shouldn't be forced to coalesce for us: folded GEP operations.  This too
fires thousands of times across the testsuite.

llvm-svn: 17947
2004-11-18 07:22:46 +00:00
Nate Begeman
3e1aaef2b5 When accessing the base register for global variables, use the register
directly rather than making a copy for the register allocator to coalesce.
This kills thousands of live intervals across the testsuite.

llvm-svn: 17946
2004-11-18 06:51:29 +00:00
Nate Begeman
7e254235e2 Clean up and fix cast codegen by removing cases that are handled elsewhere,
and properly emitting signed short to unsigned int.  This fixes the last
regression vs. the CBE, MultiSource/Applications/hbd.

llvm-svn: 17942
2004-11-18 04:56:53 +00:00
Nate Begeman
a0c15f3ffd Put int the getReg cast optimization from x86 so that we generate fewer
move instructions for the register allocator to coalesce.

llvm-svn: 17608
2004-11-08 02:25:40 +00:00
Nate Begeman
a7541b19fc Disable bogus cast elimination when the cast is used by a setcc instruction.
llvm-svn: 17583
2004-11-07 20:23:42 +00:00
Nate Begeman
bc8bc24d28 Thanks to sabre for pointing out that we were incorrectly codegen'ing
int test(int x) { return 32768 - x; }

Fixed by teaching the function that checks a constant's validity to be used
as an immediate argument about subtract-from instructions.

llvm-svn: 17476
2004-11-04 19:43:18 +00:00
Nate Begeman
113f516f6b Fix treecc. Also fix a latent bug in emitBinaryConstOperation that would
allow and const, 0 to be incorrectly codegen'd into a rlwinm instruction.

llvm-svn: 17234
2004-10-26 03:48:25 +00:00
Nate Begeman
4b5ed899fd Implement more complete and correct codegen for bitfield inserts, as tested
by the recently committed rlwimi.ll test file.  Also commit initial code
for bitfield extract, although it is turned off until fully debugged.

llvm-svn: 17207
2004-10-24 10:33:30 +00:00
Nate Begeman
91ef127999 Kill casts from integer types to unsigned byte, when the cast was only used
as the shift amount operand to a shift instruction.  This was causing us to
emit unnecessary clear operations for code such as:
int foo(int x) { return 1 << x; }

llvm-svn: 17175
2004-10-23 00:50:23 +00:00
Reid Spencer
019621a1ea Adjust to changes in Makefile.rules
llvm-svn: 17167
2004-10-22 21:02:08 +00:00
Nate Begeman
d7cbf1d28e Don't clear or sign extend bool->int. This fires a few dozen times on the test suite
llvm-svn: 17147
2004-10-20 21:55:41 +00:00
Nate Begeman
f9aac7846c Implement bitfield insert by recognizing the following pattern:
1. optional shift left
2. and x, immX
3. and y, immY
4. or z, x, y
==> rlwimi z, x, y, shift, mask begin, mask end

where immX == ~immY and immX is a run of set bits. This transformation
fires 32 times on voronoi, once on espresso, and probably several
dozen times on external benchmarks such as gcc.

To put this in terms of actual code generated for
struct B { unsigned a : 3; unsigned b : 2; };
void storeA (struct B *b, int v) { b->a = v;}
void storeB (struct B *b, int v) { b->b = v;}

Old:
_storeA:
        rlwinm r2, r4, 0, 29, 31
        lwz r4, 0(r3)
        rlwinm r4, r4, 0, 0, 28
        or r2, r4, r2
        stw r2, 0(r3)
        blr

_storeB:
        rlwinm r2, r4, 3, 0, 28
        rlwinm r2, r2, 0, 27, 28
        lwz r4, 0(r3)
        rlwinm r4, r4, 0, 29, 26
        or r2, r2, r4
        stw r2, 0(r3)
        blr

New:
_storeA:
        lwz r2, 0(r3)
        rlwimi r2, r4, 0, 29, 31
        stw r2, 0(r3)
        blr

_storeB:
        lwz r2, 0(r3)
        rlwimi r2, r4, 3, 27, 28
        stw r2, 0(r3)
        blr

llvm-svn: 17078
2004-10-17 05:19:20 +00:00
Nate Begeman
d4c970aa3d Finally fix one of the oldest FIXMEs in the PowerPC backend: correctly
flag rotate left word immediate then mask insert (rlwimi) as a two-address
instruction, and update the ISel usage of the instruction accordingly.

This will allow us to properly schedule rlwimi, and use it to efficiently
codegen bitfield operations.

llvm-svn: 17068
2004-10-16 20:43:38 +00:00
Chris Lattner
3662abfd5a ADd support for undef and unreachable
llvm-svn: 17050
2004-10-16 18:13:47 +00:00
Nate Begeman
d8183bd297 Better codegen of binary integer ops with 32 bit immediate operands.
This transformation fires a few dozen times across the testsuite.

For example, int test2(int X) { return X ^ 0x0FF00FF0; }
Old:
_test2:
        lis r2, 4080
        ori r2, r2, 4080
        xor r3, r3, r2
        blr

New:
_test2:
        xoris r3, r3, 4080
        xori r3, r3, 4080
        blr

llvm-svn: 17004
2004-10-15 00:50:19 +00:00
Nate Begeman
dfefd2f3fc Implement logical and with an immediate that consists of a contiguous block
of one or more 1 bits (may wrap from least significant bit to most
significant bit) as the rlwinm rather than andi., andis., or some longer
instructons sequence.

int andn4(int z) { return z & -4; }
int clearhi(int z) { return z & 0x0000FFFF; }
int clearlo(int z) { return z & 0xFFFF0000; }
int clearmid(int z) { return z & 0x00FFFF00; }
int clearwrap(int z) { return z & 0xFF0000FF; }

_andn4:
        rlwinm r3, r3, 0, 0, 29
        blr

_clearhi:
        rlwinm r3, r3, 0, 16, 31
        blr

_clearlo:
        rlwinm r3, r3, 0, 0, 15
        blr

_clearmid:
        rlwinm r3, r3, 0, 8, 23
        blr

_clearwrap:
        rlwinm r3, r3, 0, 24, 7
        blr

llvm-svn: 16832
2004-10-08 02:49:24 +00:00
Nate Begeman
370b1b7a9a Several fixes and enhancements to the PPC32 backend.
1. Fix an illegal argument to getClassB when deciding whether or not to
   sign extend a byte load.

2. Initial addition of isLoad and isStore flags to the instruction .td file
   for eventual use in a scheduler.

3. Rewrite of how constants are handled in emitSimpleBinaryOperation so
   that we can emit the PowerPC shifted immediate instructions far more
   often.  This allows us to emit the following code:

int foo(int x) { return x | 0x00F0000; }

_foo:
.LBB_foo_0:     ; entry
        ; IMPLICIT_DEF
        oris r3, r3, 15
        blr

llvm-svn: 16826
2004-10-07 22:30:03 +00:00
Chris Lattner
38fbf09104 Correct some typeos
llvm-svn: 16770
2004-10-06 16:28:24 +00:00
Nate Begeman
79d42a185e Turning on fsel code gen now that we can do so would be good.
llvm-svn: 16765
2004-10-06 11:03:30 +00:00
Nate Begeman
7b4fe83ba8 Implement floating point select for lt, gt, le, ge using the powerpc fsel
instruction.

Now, rather than emitting the following loop out of bisect:
.LBB_main_19:	; no_exit.0.i
	rlwinm r3, r2, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f2, f2, f1
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fcmpu cr0, f1, f4
	bge .LBB_main_64	; no_exit.0.i
.LBB_main_63:	; no_exit.0.i
	b .LBB_main_65	; no_exit.0.i
.LBB_main_64:	; no_exit.0.i
	fmr f2, f1
.LBB_main_65:	; no_exit.0.i
	addi r3, r2, 1
	rlwinm r3, r3, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f4, f4, f1
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f5, lo16(.CPI_main_1-"L00000$pb")(r3)
	fcmpu cr0, f1, f5
	bge .LBB_main_67	; no_exit.0.i
.LBB_main_66:	; no_exit.0.i
	b .LBB_main_68	; no_exit.0.i
.LBB_main_67:	; no_exit.0.i
	fmr f4, f1
.LBB_main_68:	; no_exit.0.i
	fadd f1, f2, f4
	addis r3, r30, ha16(.CPI_main_2-"L00000$pb")
	lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3)
	fmul f1, f1, f2
	rlwinm r3, r2, 3, 0, 28
	lfdx f2, r3, r28
	fadd f4, f2, f1
	fcmpu cr0, f4, f0
	bgt .LBB_main_70	; no_exit.0.i
.LBB_main_69:	; no_exit.0.i
	b .LBB_main_71	; no_exit.0.i
.LBB_main_70:	; no_exit.0.i
	fmr f0, f4
.LBB_main_71:	; no_exit.0.i
	fsub f1, f2, f1
	addi r2, r2, -1
	fcmpu cr0, f1, f3
	blt .LBB_main_73	; no_exit.0.i
.LBB_main_72:	; no_exit.0.i
	b .LBB_main_74	; no_exit.0.i
.LBB_main_73:	; no_exit.0.i
	fmr f3, f1
.LBB_main_74:	; no_exit.0.i
	cmpwi cr0, r2, -1
	fmr f16, f0
	fmr f17, f3
	bgt .LBB_main_19	; no_exit.0.i

We emit this instead:
.LBB_main_19:	; no_exit.0.i
	rlwinm r3, r2, 3, 0, 28
	lfdx f1, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f2, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f2, f2, f1
	fsel f1, f1, f1, f2
	addi r3, r2, 1
	rlwinm r3, r3, 3, 0, 28
	lfdx f2, r3, r27
	addis r3, r30, ha16(.CPI_main_1-"L00000$pb")
	lfd f4, lo16(.CPI_main_1-"L00000$pb")(r3)
	fsub f4, f4, f2
	fsel f2, f2, f2, f4
	fadd f1, f1, f2
	addis r3, r30, ha16(.CPI_main_2-"L00000$pb")
	lfd f2, lo16(.CPI_main_2-"L00000$pb")(r3)
	fmul f1, f1, f2
	rlwinm r3, r2, 3, 0, 28
	lfdx f2, r3, r28
	fadd f4, f2, f1
	fsub f5, f0, f4
	fsel f0, f5, f0, f4
	fsub f1, f2, f1
	addi r2, r2, -1
	fsub f2, f1, f3
	fsel f3, f2, f3, f1
	cmpwi cr0, r2, -1
	fmr f16, f0
	fmr f17, f3
	bgt .LBB_main_19	; no_exit.0.i

llvm-svn: 16764
2004-10-06 09:53:04 +00:00
Nate Begeman
65376f660e Generate better code by being far less clever when it comes to the select instruction. Don't create overlapping register lifetimes
llvm-svn: 16580
2004-09-29 05:00:31 +00:00
Nate Begeman
a8b079e16a improve Type::BoolTy codegen by eliminating unnecessary clears and sign extends
llvm-svn: 16578
2004-09-29 03:45:33 +00:00
Nate Begeman
dc50ea0d82 To go along with sabre's improved InstCombining, improve recognition of
integers that we can use as immediate values in instructions.

Example from yacr2:
-       lis r10, -1
-       ori r10, r10, 65535
-       add r28, r28, r10
+       addi r28, r28, -1
        addi r7, r7, 1
        addi r9, r9, 1
        b .LBB_main_9   ; loopentry.1.i214

llvm-svn: 16566
2004-09-29 02:35:05 +00:00
Nate Begeman
921a44443d Correct some BuildMI arguments for the upcoming simple scheduler
llvm-svn: 16519
2004-09-27 05:08:17 +00:00
Nate Begeman
75f0d35dc6 Fix the last of the major PPC GEP folding deficiencies. This will allow
the ISel to use indexed and non-zero immediate offsets for GEPs that have
more than one use.  This is common for instruction sequences such as a load
followed by a modify and store to the same address.

llvm-svn: 16493
2004-09-23 05:31:33 +00:00
Nate Begeman
61d1797c03 add optimized code sequences for setcc x, 0
llvm-svn: 16478
2004-09-22 04:40:25 +00:00