1. optional shift left
2. and x, immX
3. and y, immY
4. or z, x, y
==> rlwimi z, x, y, shift, mask begin, mask end
where immX == ~immY and immX is a run of set bits. This transformation
fires 32 times on voronoi, once on espresso, and probably several
dozen times on external benchmarks such as gcc.
To put this in terms of actual code generated for
struct B { unsigned a : 3; unsigned b : 2; };
void storeA (struct B *b, int v) { b->a = v;}
void storeB (struct B *b, int v) { b->b = v;}
Old:
_storeA:
rlwinm r2, r4, 0, 29, 31
lwz r4, 0(r3)
rlwinm r4, r4, 0, 0, 28
or r2, r4, r2
stw r2, 0(r3)
blr
_storeB:
rlwinm r2, r4, 3, 0, 28
rlwinm r2, r2, 0, 27, 28
lwz r4, 0(r3)
rlwinm r4, r4, 0, 29, 26
or r2, r2, r4
stw r2, 0(r3)
blr
New:
_storeA:
lwz r2, 0(r3)
rlwimi r2, r4, 0, 29, 31
stw r2, 0(r3)
blr
_storeB:
lwz r2, 0(r3)
rlwimi r2, r4, 3, 27, 28
stw r2, 0(r3)
blr
llvm-svn: 17078
flag rotate left word immediate then mask insert (rlwimi) as a two-address
instruction, and update the ISel usage of the instruction accordingly.
This will allow us to properly schedule rlwimi, and use it to efficiently
codegen bitfield operations.
llvm-svn: 17068
This transformation fires a few dozen times across the testsuite.
For example, int test2(int X) { return X ^ 0x0FF00FF0; }
Old:
_test2:
lis r2, 4080
ori r2, r2, 4080
xor r3, r3, r2
blr
New:
_test2:
xoris r3, r3, 4080
xori r3, r3, 4080
blr
llvm-svn: 17004
of one or more 1 bits (may wrap from least significant bit to most
significant bit) as the rlwinm rather than andi., andis., or some longer
instructons sequence.
int andn4(int z) { return z & -4; }
int clearhi(int z) { return z & 0x0000FFFF; }
int clearlo(int z) { return z & 0xFFFF0000; }
int clearmid(int z) { return z & 0x00FFFF00; }
int clearwrap(int z) { return z & 0xFF0000FF; }
_andn4:
rlwinm r3, r3, 0, 0, 29
blr
_clearhi:
rlwinm r3, r3, 0, 16, 31
blr
_clearlo:
rlwinm r3, r3, 0, 0, 15
blr
_clearmid:
rlwinm r3, r3, 0, 8, 23
blr
_clearwrap:
rlwinm r3, r3, 0, 24, 7
blr
llvm-svn: 16832
1. Fix an illegal argument to getClassB when deciding whether or not to
sign extend a byte load.
2. Initial addition of isLoad and isStore flags to the instruction .td file
for eventual use in a scheduler.
3. Rewrite of how constants are handled in emitSimpleBinaryOperation so
that we can emit the PowerPC shifted immediate instructions far more
often. This allows us to emit the following code:
int foo(int x) { return x | 0x00F0000; }
_foo:
.LBB_foo_0: ; entry
; IMPLICIT_DEF
oris r3, r3, 15
blr
llvm-svn: 16826
integers that we can use as immediate values in instructions.
Example from yacr2:
- lis r10, -1
- ori r10, r10, 65535
- add r28, r28, r10
+ addi r28, r28, -1
addi r7, r7, 1
addi r9, r9, 1
b .LBB_main_9 ; loopentry.1.i214
llvm-svn: 16566
the ISel to use indexed and non-zero immediate offsets for GEPs that have
more than one use. This is common for instruction sequences such as a load
followed by a modify and store to the same address.
llvm-svn: 16493
Move include/Config and include/Support into include/llvm/Config,
include/llvm/ADT and include/llvm/Support. From here on out, all LLVM
public header files must be under include/llvm/.
llvm-svn: 16137
cast fp->bool
cast ulong->fp
algebraic right shift long by non-constant value
These changes tested across most of the test suite. Fixes Regression/casts
llvm-svn: 16081
hurt a lot in others. Instead, improve branching version of SetCC and
Select instructions. The old code will be in CVS should we ever need to
dig it up again.
llvm-svn: 15979
Change int->float cast code to put conversion constants in constant pool.
Shorten code sequence for constant pool fp loads.
Remove LOADLoDirect/LOADLoIndirect psuedo instructions and tweak asmwriter
llvm-svn: 15913
that have a frame pointer. This change fixes Burg. In addition, make
the necessary changes to floating point code gen and constant loading after
Chris Lattner's fixes to the asm writer. These changes fix MallocBench/gs
llvm-svn: 15873
Darwin. Also, change asm printer to output proper stubs for external
functions whose address is passed as an argument to aid in bugpointing.
llvm-svn: 15721
Replace STDX (store 64 bit int indexed) with STFDX (store double indexed)
Fix latent bug in indexed load generation
Generate indexed loads and stores in many more cases
llvm-svn: 15626
Use a PowerPC specific prolog epilog inserter to control where spilled
callee save regs are placed on the stack.
Get rid of implicit return address stack slot, save return address reg
(LR) in appropriate slot
Improve code generated for functions that don't have calls or access
globals
Note from Chris: PowerPCPEI will eventually be eliminated, once the
functionality is merged into CodeGen/PrologEpilogInserter.cpp
llvm-svn: 15536
* Implemented GEP folding
* Dynamically output global address stuff once per function
* Fix casting fp<->short/byte
Patch contributed by Nate Begeman.
llvm-svn: 15237
* Don't allow negative immediates to users of unsigned immediates
* Fix long compares
* Support <const int>, op as a potential immediate candidate
* Fix sign extension of short and byte loads
* Fix and improve integer casts
* Fix passing of doubles as vararg functions
Patch contributed by Nate Begeman.
llvm-svn: 15109
* Generation of opcodes that take 16 bit immediates
* Rewrote multiply to be correct for 64 bit values
* Rewrote all the long handling to be correct for PowerPC
* Fix visitSelectInst() to define the upper register of the pair of regs
representing a long value
Patch contributed by Nate Begeman.
llvm-svn: 15083
* Fix functions that take more than 32 bytes of args
* Alignment of doubles in structs is 4 bytes, not 8
* Fix passing long args: rN = hi, rN+1 = lo
* Rewrite signed divide
* Rewrite Intrinsic::returnaddress
Patch courtesy of Nate Begeman.
llvm-svn: 15036
* Fn args passed in registers are now recorded as used by the call instruction
`-> asm printer updated to not print out those registers with the call instr
* Stack frame layout in prolog/epilog fixed, spills and vararg fns now work
* float/double to signed int codegen now correct
* various single precision float codegen bugs fixed
* const integer multiply codegen fixed
* select and setcc blocks inserted into the correct place in machine CFG
* load of integer constant code optimized
All of Shootout tests now work. Great thanks to Nate Begeman for the patch!
llvm-svn: 15014
a funky way to "use" R0 for a 0-valued operand
* Add IMPLICIT_DEFs for incoming function arguments via registers to help the
register allocator not clobber those registers
* Implement comparisons with longs
* Teach emitSelectOperation() to fold the SetCC operation
Patch contributed by Nate Begeman
llvm-svn: 14901
The large diff is because of indentation of a whole region
* Fix querying predecessor blocks in SelectPHINodes(), thanks to Brian (v8)
* Add support for external functions malloc() and free()
* Fix some code indentation
Remember, kids: It's not plagiarism if you "creatively borrow" from your
sources. It's called "research"!
llvm-svn: 14723
* Make visitSetCondInst() share condition-generating code with EmitComparison()
* There are 13 FPRs for function-passing arguments, not 8
* Do not rely on registers being sequential, use an array lookup
* In unimplemented switch cases, send an error and abort instead of silent
fall-through
* Add doInitialization() for adding function prototypes for external math fns
* Minor changes: fix indentation, spacing, code clarity
llvm-svn: 14653
* If SetCondInst is folded into BranchInst (and it is the only user), do not
emit code for SetCondInst
* Fix assembly opcodes in comments in visitSetCondInst()
* Fix codegen of conditional branches
llvm-svn: 14643
* Use the SetCC handling code in the format of Brian's V8
* Add FIXMEs where calls to functions are being made without adding them to the
Module first... they cause missing symbols at assembly-time.
llvm-svn: 14553
for the function
* Registers aren't necessarily sequential wrt their enums, don't rely on it
when emitting function arguments into sequential registers
* Remove X86-specific comments about AL/BL/AH/BH/EDX/etc
* Add an abort() for an unimplemented signed right shift
* The src operand for a GEP was never emitted! Fixed.
* We can skip zero-valued GEP indices as they are no-ops.
"Hello, World!" now works.
llvm-svn: 14505