1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 20:23:11 +01:00
Commit Graph

5725 Commits

Author SHA1 Message Date
Chris Lattner
800f720e7b Add an instruction selector capable of selecting 'ret void'
llvm-svn: 11973
2004-02-29 00:27:00 +00:00
Alkis Evlogimenos
6815402082 SHLD and SHRD take 32-bit operands but an 8-bit immediate. Rename them
to denote this fact.

llvm-svn: 11972
2004-02-28 23:46:44 +00:00
Alkis Evlogimenos
e8dac99a43 Floating point loads/stores act on memory operands. Rename them to
denote this fact.

llvm-svn: 11971
2004-02-28 23:42:35 +00:00
Alkis Evlogimenos
1d71a15be9 Rename instruction templates to be easier to the human eye to
parse. The name is now I (operand size)*. For example:

Im32 -> instruction with 32-bit memory operands.

Im16i8 -> instruction with 16-bit memory operands and 8 bit immediate
          operands.

llvm-svn: 11970
2004-02-28 23:09:03 +00:00
Alkis Evlogimenos
6038a89025 Uncomment instructions that take both an immediate and a memory
operand but their sizes differ.

llvm-svn: 11969
2004-02-28 22:06:59 +00:00
Alkis Evlogimenos
f208a0fd81 Each instruction now has both an ImmType and a MemType. This describes
the size of the immediate and the memory operand on instructions that
use them. This resolves problems with instructions that take both a
memory and an immediate operand but their sizes differ (i.e. ADDmi32b).

llvm-svn: 11967
2004-02-28 22:02:05 +00:00
Brian Gaeke
d551a8b5fe Fix typo in comment
llvm-svn: 11966
2004-02-28 21:55:18 +00:00
Chris Lattner
a854ddd528 Implement switch->br and br->switch folding by ripping out the switch->switch
and br->br code and generalizing it.  This allows us to compile code like this:

int test(Instruction *I) {
  if (isa<CastInst>(I))
    return foo(7);
  else if (isa<BranchInst>(I))
    return foo(123);
  else if (isa<UnwindInst>(I))
    return foo(1241);
  else if (isa<SetCondInst>(I))
    return foo(1);
  else if (isa<VAArgInst>(I))
    return foo(42);
  return foo(-1);
}

into:

int %_Z4testPN4llvm11InstructionE("struct.llvm::Instruction"* %I) {
entry:
        %tmp.1.i.i.i.i.i.i.i = getelementptr "struct.llvm::Instruction"* %I, long 0, ubyte 4            ; <uint*> [#uses=1]
        %tmp.2.i.i.i.i.i.i.i = load uint* %tmp.1.i.i.i.i.i.i.i          ; <uint> [#uses=2]
        %tmp.2.i.i.i.i.i.i = seteq uint %tmp.2.i.i.i.i.i.i.i, 27                ; <bool> [#uses=0]
        switch uint %tmp.2.i.i.i.i.i.i.i, label %endif.0 [
                 uint 27, label %then.0
                 uint 2, label %then.1
                 uint 5, label %then.2
                 uint 14, label %then.3
                 uint 15, label %then.3
                 uint 16, label %then.3
                 uint 17, label %then.3
                 uint 18, label %then.3
                 uint 19, label %then.3
                 uint 32, label %then.4
        ]
...

As well as handling the cases in 176.gcc and many other programs more effectively.

llvm-svn: 11964
2004-02-28 21:28:10 +00:00
Chris Lattner
3583890ab7 Change this so that LLC actually tries to run the code generator, though it will
immediately abort due to lack of an instruction selector. :)

llvm-svn: 11963
2004-02-28 20:21:45 +00:00
Chris Lattner
529a354ea4 SparcV8 now builds.
llvm-svn: 11960
2004-02-28 19:54:00 +00:00
Chris Lattner
5effdb67b7 fine grainify namespacification
llvm-svn: 11959
2004-02-28 19:53:18 +00:00
Chris Lattner
3852b0c3b8 Finegrainify namespacification
llvm-svn: 11958
2004-02-28 19:52:49 +00:00
Chris Lattner
88268605ec Tab completion is our friend.
llvm-svn: 11957
2004-02-28 19:45:39 +00:00
Chris Lattner
013aa47975 Clean up rules
llvm-svn: 11956
2004-02-28 19:43:40 +00:00
Chris Lattner
d2bb7e91b0 Bring this directory into "it actually compiles" land
llvm-svn: 11955
2004-02-28 19:37:18 +00:00
Chris Lattner
3f70429d28 Fix multiple inclusion problem
llvm-svn: 11954
2004-02-28 19:31:32 +00:00
Chris Lattner
69ab9e0840 if there is already a prototype for malloc/free, use it, even if it's incorrect.
Do not just inject a new prototype.

llvm-svn: 11951
2004-02-28 18:51:45 +00:00
Alkis Evlogimenos
977dbaadf7 Do not generate instructions with mismatched memory/immediate sized
operands. The X86 backend doesn't handle them properly right now.

llvm-svn: 11944
2004-02-28 06:01:43 +00:00
Chris Lattner
7872171767 Rename AddUsesToWorkList -> AddUsersToWorkList, as that is what it does.
Create a new AddUsesToWorkList method
optimize memmove/set/cpy of zero bytes to a noop.

llvm-svn: 11941
2004-02-28 05:22:00 +00:00
Chris Lattner
192d8413d3 Turn 'free null' into nothing
llvm-svn: 11940
2004-02-28 04:57:37 +00:00
Misha Brukman
0b846ae65c Right, it's really Extractor, not Extraction.
llvm-svn: 11939
2004-02-28 03:37:58 +00:00
Misha Brukman
f14fbb1a0b A pass that uses the generic CodeExtractor to rip out *every* loop in every
function, as long as the loop isn't the only one in that function. This should
help debugging passes easier with BugPoint.

llvm-svn: 11936
2004-02-28 03:33:01 +00:00
Misha Brukman
26e90f8776 A generic code extractor: given a list of BasicBlocks, it will rip them out into
a new function, taking care of inputs and outputs.

llvm-svn: 11935
2004-02-28 03:26:20 +00:00
Alkis Evlogimenos
84f00e93f7 Further comment updates.
llvm-svn: 11933
2004-02-28 03:20:31 +00:00
Alkis Evlogimenos
edbe362160 Update comments.
llvm-svn: 11932
2004-02-28 03:12:31 +00:00
Alkis Evlogimenos
0f91ce52a0 My previous commit broke the jit. The shift instructions always take
an 8-bit immediate. So mark the shifts that take immediates as taking
an 8-bit argument. The rest with the implicit use of CL are marked
appropriately.

A bug still exists:

def SHLDmri32  : I2A8 <"shld", 0xA4, MRMDestMem>, TB;           // [mem32] <<= [mem32],R32 imm8

The immediate in the above instruction is 8-bit but the memory
reference is 32-bit. The printer prints this as an 8-bit reference
which confuses the assembler. Same with SHRDmri32.

llvm-svn: 11931
2004-02-28 02:56:26 +00:00
Brian Gaeke
6afa0813d2 Turn off the SparcV9MachineCodeDestructionPass for now, because it's buggy
llvm-svn: 11930
2004-02-27 21:15:40 +00:00
Brian Gaeke
0e74ff91a0 Correct DestroyMachineFunction's getPassName
llvm-svn: 11929
2004-02-27 21:01:14 +00:00
Chris Lattner
9e71c09ff5 Only clone global nodes between graphs if both graphs have the global.
llvm-svn: 11928
2004-02-27 20:05:15 +00:00
Chris Lattner
138a7dfb62 ADD MORE FUNCTIONS!
llvm-svn: 11927
2004-02-27 20:04:48 +00:00
Alkis Evlogimenos
ace6d81654 Fix argument size for SHL, SHR, SAR, SHLD and SHRD families of
instructions.

llvm-svn: 11923
2004-02-27 19:46:30 +00:00
Alkis Evlogimenos
839c70f45d Fix encoding of ADD and SUB family of instructions. Also rearrange
them so that they are consistent with AND, XOR, etc...

llvm-svn: 11922
2004-02-27 18:57:00 +00:00
Alkis Evlogimenos
56d357aa23 Rename MRMS[0-7]{r,m} to MRM[0-7]{r,m}.
llvm-svn: 11921
2004-02-27 18:55:12 +00:00
Chris Lattner
d06b64c941 setcond instructions don't have aliasing implications.
llvm-svn: 11919
2004-02-27 18:09:25 +00:00
Chris Lattner
644af802c9 Fix Regression/Assembler/2004-02-27-SelfUseAssertError.ll
llvm-svn: 11913
2004-02-27 17:28:25 +00:00
Alkis Evlogimenos
5ac109957f Add memory operand folding support for the SETcc family of
instructions.

llvm-svn: 11907
2004-02-27 16:13:37 +00:00
Alkis Evlogimenos
0742b93bb9 Add memory operand folding support for SHLD and SHRD instructions.
llvm-svn: 11905
2004-02-27 15:03:18 +00:00
Alkis Evlogimenos
b1f67f6741 Add memory operand folding support for SHL, SHR and SAR, SHLD instructions.
llvm-svn: 11903
2004-02-27 09:28:43 +00:00
Alkis Evlogimenos
cf49d13ed2 Rename SHL, SHR, SAR, SHLD and SHLR instructions to make them
consistent with the rest and also pepare for the addition of their
memory operand variants.

llvm-svn: 11902
2004-02-27 06:57:05 +00:00
Chris Lattner
ffae67bae8 Implement test/Regression/Transforms/InstCombine/canonicalize_branch.ll
This is a really minor thing, but might help out the 'switch statement induction'
code in simplifycfg.

llvm-svn: 11900
2004-02-27 06:27:46 +00:00
Alkis Evlogimenos
ddfd27ff97 Rename member function to be consistent with the rest.
llvm-svn: 11898
2004-02-27 06:11:15 +00:00
Alkis Evlogimenos
63fda9c474 Make spiller push stores right after the definition of a register so
that they are as far away from the loads as possible.

llvm-svn: 11895
2004-02-27 04:51:35 +00:00
Alkis Evlogimenos
3093de5bfb Fix crash caused by passing register 0 to
MRegisterInfo::isPhysicalRegister().

llvm-svn: 11894
2004-02-27 01:52:34 +00:00
Alkis Evlogimenos
cff0fc180c Clear maps right after basic block is processed.
llvm-svn: 11892
2004-02-26 23:22:23 +00:00
John Criswell
0b01bff060 Fixes for PR258 and PR259.
Functions with linkonce linkage are declared with weak linkage.
Global floating point constants used to represent unprintable values
(such as NaN and infinity) are declared static so that they don't interfere
with other CBE generated translation units.

llvm-svn: 11884
2004-02-26 22:20:58 +00:00
Chris Lattner
d359fbf991 Be a good little compiler and handle direct calls efficiently, even if there
are beastly ConstantPointerRefs in the way...

llvm-svn: 11883
2004-02-26 22:07:22 +00:00
Alkis Evlogimenos
b15631fcfa Uncomment assertions that register# != 0 on calls to
MRegisterInfo::is{Physical,Virtual}Register. Apply appropriate fixes
to relevant files.

llvm-svn: 11882
2004-02-26 22:00:20 +00:00
Chris Lattner
07c3941266 Since LLVM uses structure type equivalence, it isn't useful to keep around
multiple type names for the same structural type.  Make DTE eliminate all
but one of the type names

llvm-svn: 11879
2004-02-26 20:02:23 +00:00
Chris Lattner
4aff6ec077 Use a map instead of annotations
llvm-svn: 11875
2004-02-26 08:02:17 +00:00
Chris Lattner
5924de460e remove obsolete comment
llvm-svn: 11872
2004-02-26 07:59:22 +00:00
Chris Lattner
aa6f7cb4e4 Make sure that at least one virtual method is defined in a .cpp file to avoid
having the compiler emit RTTI and vtables to EVERY translation unit.

llvm-svn: 11871
2004-02-26 07:24:18 +00:00
Chris Lattner
e07d786aa6 turn things like:
if (X == 0 || X == 2)

...where the comparisons and branches are in different blocks... into a switch
instruction.  This comes up a lot in various programs, and works well with
the switch/switch merging code I checked earlier.  For example, this testcase:

int switchtest(int C) {
  return C == 0 ? f(123) :
         C == 1 ? f(3123) :
         C == 4 ? f(312) :
         C == 5 ? f(1234): f(444);
}

is converted into this:
        switch int %C, label %cond_false.3 [
                 int 0, label %cond_true.0
                 int 1, label %cond_true.1
                 int 4, label %cond_true.2
                 int 5, label %cond_true.3
        ]

instead of a whole bunch of conditional branches.

Admittedly the code is ugly, and incomplete.  To be complete, we need to add
br -> switch merging and switch -> br merging.  For example, this testcase:

struct foo { int Q, R, Z; };
#define A (X->Q+X->R * 123)
int test(struct foo *X) {
  return A  == 123 ? X1() :
        A == 12321 ? X2():
        (A == 111 || A == 222) ? X3() :
        A == 875 ? X4() : X5();
}

Gets compiled to this:
        switch int %tmp.7, label %cond_false.2 [
                 int 123, label %cond_true.0
                 int 12321, label %cond_true.1
                 int 111, label %cond_true.2
                 int 222, label %cond_true.2
        ]
...
cond_false.2:           ; preds = %entry
        %tmp.52 = seteq int %tmp.7, 875         ; <bool> [#uses=1]
        br bool %tmp.52, label %cond_true.3, label %cond_false.3

where the branch could be folded into the switch.

This kind of thing occurs *ALL OF THE TIME*, especially in programs like
176.gcc, which is a horrible mess of code.  It contains stuff like *shudder*:

#define SWITCH_TAKES_ARG(CHAR) \
  (   (CHAR) == 'D' \
   || (CHAR) == 'U' \
   || (CHAR) == 'o' \
   || (CHAR) == 'e' \
   || (CHAR) == 'u' \
   || (CHAR) == 'I' \
   || (CHAR) == 'm' \
   || (CHAR) == 'L' \
   || (CHAR) == 'A' \
   || (CHAR) == 'h' \
   || (CHAR) == 'z')

and

#define CONST_OK_FOR_LETTER_P(VALUE, C)                 \
  ((C) == 'I' ? SMALL_INTVAL (VALUE)                    \
   : (C) == 'J' ? SMALL_INTVAL (-(VALUE))               \
   : (C) == 'K' ? (unsigned)(VALUE) < 32                \
   : (C) == 'L' ? ((VALUE) & 0xffff) == 0               \
   : (C) == 'M' ? integer_ok_for_set (VALUE)            \
   : (C) == 'N' ? (VALUE) < 0                           \
   : (C) == 'O' ? (VALUE) == 0                          \
   : (C) == 'P' ? (VALUE) >= 0                          \
   : 0)

and

#define LEGITIMIZE_ADDRESS(X,OLDX,MODE,WIN)                     \
{                                                               \
  if (GET_CODE (X) == PLUS && CONSTANT_ADDRESS_P (XEXP (X, 1))) \
    (X) = gen_rtx (PLUS, SImode, XEXP (X, 0),                   \
                   copy_to_mode_reg (SImode, XEXP (X, 1)));     \
  if (GET_CODE (X) == PLUS && CONSTANT_ADDRESS_P (XEXP (X, 0))) \
    (X) = gen_rtx (PLUS, SImode, XEXP (X, 1),                   \
                   copy_to_mode_reg (SImode, XEXP (X, 0)));     \
  if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 0)) == MULT)   \
    (X) = gen_rtx (PLUS, SImode, XEXP (X, 1),                   \
                   force_operand (XEXP (X, 0), 0));             \
  if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 1)) == MULT)   \
    (X) = gen_rtx (PLUS, SImode, XEXP (X, 0),                   \
                   force_operand (XEXP (X, 1), 0));             \
  if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 0)) == PLUS)   \
    (X) = gen_rtx (PLUS, Pmode, force_operand (XEXP (X, 0), NULL_RTX),\
                   XEXP (X, 1));                                \
  if (GET_CODE (X) == PLUS && GET_CODE (XEXP (X, 1)) == PLUS)   \
    (X) = gen_rtx (PLUS, Pmode, XEXP (X, 0),                    \
                   force_operand (XEXP (X, 1), NULL_RTX));      \
  if (GET_CODE (X) == SYMBOL_REF || GET_CODE (X) == CONST       \
           || GET_CODE (X) == LABEL_REF)                        \
    (X) = legitimize_address (flag_pic, X, 0, 0);               \
  if (memory_address_p (MODE, X))                               \
    goto WIN; }

and others.  These macros get used multiple times of course.  These are such
lovely candidates for macros, aren't they?  :)

This code also nicely handles LLVM constructs that look like this:

  if (isa<CastInst>(I))
   ...
  else if (isa<BranchInst>(I))
   ...
  else if (isa<SetCondInst>(I))
   ...
  else if (isa<UnwindInst>(I))
   ...
  else if (isa<VAArgInst>(I))
   ...

where the isa can obviously be a dyn_cast as well.  Switch instructions are a
good thing.

llvm-svn: 11870
2004-02-26 07:13:46 +00:00
Chris Lattner
ac94c441b6 No need to clear the map here, it will always be empty
llvm-svn: 11868
2004-02-26 05:21:21 +00:00
Chris Lattner
9e55e31b2d Fix typo
llvm-svn: 11864
2004-02-26 03:45:03 +00:00
Chris Lattner
7990e4dcd0 The node doesn't have to be _no_ node flags, it just has to be complete and
not have any globals.

llvm-svn: 11863
2004-02-26 03:43:43 +00:00
Chris Lattner
948fffa8a2 Add _more_ functions
llvm-svn: 11862
2004-02-26 03:43:08 +00:00
Chris Lattner
6a3796eaf9 Fix some warnings, some of which were spurious, and some of which were real
bugs.  Thanks Brian!

llvm-svn: 11859
2004-02-26 01:20:02 +00:00
Misha Brukman
3d1720cdb9 Instructions to call and return from functions.
llvm-svn: 11858
2004-02-26 00:37:12 +00:00
Chris Lattner
9fd0c48f80 Two changes:
1. Functions do not make things incomplete, only variables
 2. Constant global variables no longer need to be marked incomplete, because
    we are guaranteed that the initializer for the global will be in the
    graph we are hacking on now.  This makes resolution of indirect calls happen
    a lot more in the bu pass, supports things like vtables and the C counterparts
    (giant constant arrays of function pointers), etc...

Testcase here: test/Regression/Analysis/DSGraph/constant_globals.ll

llvm-svn: 11852
2004-02-25 23:36:08 +00:00
Chris Lattner
7d273bc532 When building local graphs, clone the initializer for constant globals into each
local graph that uses the global.

llvm-svn: 11850
2004-02-25 23:31:02 +00:00
Alkis Evlogimenos
af42cbf42f Fix bugs found with recent addition of assertions in
MRegisterInfo::is{Physical,Virtual}Register.

llvm-svn: 11849
2004-02-25 23:21:52 +00:00
Chris Lattner
9fe3bf296d Simplify the dead node elimination stuff
Make the incompleteness marker faster by looping directly over the globals
instead of over the scalars to find the globals

Fix a bug where we didn't mark a global incomplete if it didn't have any
outgoing edges.  This wouldn't break any current clients but is still wrong.

llvm-svn: 11848
2004-02-25 23:08:00 +00:00
Chris Lattner
a9f67b5ab8 Add a bunch more functions
llvm-svn: 11847
2004-02-25 23:06:40 +00:00
Chris Lattner
d99d965f8c Try harder to get symbol info
llvm-svn: 11846
2004-02-25 23:06:30 +00:00
Brian Gaeke
aba4159be8 Represent va_list in interpreter as a (ec-stack-depth . var-arg-index)
pair, and look up varargs in the execution stack every time, instead of
just pushing iterators (which can be invalidated during callFunction())
around.  (union GenericValue now has a "pair of uints" member, to support
this mechanism.) Fixes Bug 234.

llvm-svn: 11845
2004-02-25 23:01:48 +00:00
Brian Gaeke
4f0a829a68 Great sparc renaming fallout IV: Sparc --> SparcV9.
llvm-svn: 11844
2004-02-25 22:09:36 +00:00
Alkis Evlogimenos
2caa729f02 Remove asssert since it is breaking cases that it shouldn't.
llvm-svn: 11841
2004-02-25 22:01:06 +00:00
Alkis Evlogimenos
f1516015af Add DenseMap template and actually use it for for mapping virtual regs
to objects.

llvm-svn: 11840
2004-02-25 21:55:45 +00:00
Chris Lattner
2a13dd5706 My faith in programmers has been found to be totally misplaced. One would
assume that if they don't intend to write to a global variable, that they
would mark it as constant.  However, there are people that don't understand
that the compiler can do nice things for them if they give it the information
it needs.

This pass looks for blatently obvious globals that are only ever read from.
Though it uses a trivially simple "alias analysis" of sorts, it is still able
to do amazing things to important benchmarks.  253.perlbmk, for example,
contains several ***GIANT*** function pointer tables that are not marked
constant and should be.  Marking them constant allows the optimizer to turn
a whole bunch of indirect calls into direct calls.  Note that only a link-time
optimizer can do this transformation, but perlbmk does have several strings
and other minor globals that can be marked constant by this pass when run
from GCCAS.

176.gcc has a ton of strings and large tables that are marked constant, both
at compile time (38 of them) and at link time (48 more).  Other benchmarks
give similar results, though it seems like big ones have disproportionally
more than small ones.

This pass is extremely quick and does good things.  I'm going to enable it
in gccas & gccld.  Not bad for 50 SLOC.

llvm-svn: 11836
2004-02-25 21:34:36 +00:00
Misha Brukman
6a13621948 SparcV8 regs are really 32-bit, not 64! Thanks, Chris.
llvm-svn: 11835
2004-02-25 21:03:02 +00:00
Misha Brukman
f12c1e5a55 Clean up the tablegen descriptions for SparcV8.
llvm-svn: 11834
2004-02-25 21:02:21 +00:00
Misha Brukman
c8801eb5be Fix the SparcV8 register definitions that were imported from PPC template.
llvm-svn: 11833
2004-02-25 21:00:05 +00:00
Misha Brukman
a4b3e0f01b SparcV8 has different types of instructions, but F1 is only used for CALL.
llvm-svn: 11832
2004-02-25 20:52:20 +00:00
Chris Lattner
ccae3f6d60 Add an assertion
llvm-svn: 11830
2004-02-25 19:37:44 +00:00
Chris Lattner
7c05e5d4d8 Fix failures in 099.go due to the cfgsimplify pass creating switch instructions
where there did not used to be any before

llvm-svn: 11829
2004-02-25 19:30:19 +00:00
Brian Gaeke
5166390fd2 SparcV8 skeleton
llvm-svn: 11828
2004-02-25 19:28:19 +00:00
Brian Gaeke
c6de948cd1 Great renaming part II: Sparc --> SparcV9 (also includes command-line options and Makefiles)
llvm-svn: 11827
2004-02-25 19:08:12 +00:00
Brian Gaeke
965df0b91b Great renaming: Sparc --> SparcV9
llvm-svn: 11826
2004-02-25 18:44:15 +00:00
Chris Lattner
4f09004dff Add a bunch more functions used by perlbmk
llvm-svn: 11824
2004-02-25 17:43:20 +00:00
Chris Lattner
04f116953d Fix incorrect debug code
llvm-svn: 11821
2004-02-25 15:15:04 +00:00
Chris Lattner
ab9628ad18 Teach the instruction selector how to transform 'array' GEP computations into X86
scaled indexes.  This allows us to compile GEP's like this:

int* %test([10 x { int, { int } }]* %X, int %Idx) {
        %Idx = cast int %Idx to long
        %X = getelementptr [10 x { int, { int } }]* %X, long 0, long %Idx, ubyte 1, ubyte 0
        ret int* %X
}

Into a single address computation:

test:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
        lea %EAX, DWORD PTR [%EAX + 8*%ECX + 4]
        ret

Before it generated:
test:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
        shl %ECX, 3
        add %EAX, %ECX
        lea %EAX, DWORD PTR [%EAX + 4]
        ret

This is useful for things like int/float/double arrays, as the indexing can be folded into
the loads&stores, reducing register pressure and decreasing the pressure on the decode unit.
With these changes, I expect our performance on 256.bzip2 and gzip to improve a lot.  On
bzip2 for example, we go from this:

10665 asm-printer           - Number of machine instrs printed
   40 ra-local              - Number of loads/stores folded into instructions
 1708 ra-local              - Number of loads added
 1532 ra-local              - Number of stores added
 1354 twoaddressinstruction - Number of instructions added
 1354 twoaddressinstruction - Number of two-address instructions
 2794 x86-peephole          - Number of peephole optimization performed

to this:
9873 asm-printer           - Number of machine instrs printed
  41 ra-local              - Number of loads/stores folded into instructions
1710 ra-local              - Number of loads added
1521 ra-local              - Number of stores added
 789 twoaddressinstruction - Number of instructions added
 789 twoaddressinstruction - Number of two-address instructions
2142 x86-peephole          - Number of peephole optimization performed

... and these types of instructions are often in tight loops.

Linear scan is also helped, but not as much.  It goes from:

8787 asm-printer           - Number of machine instrs printed
2389 liveintervals         - Number of identity moves eliminated after coalescing
2288 liveintervals         - Number of interval joins performed
3522 liveintervals         - Number of intervals after coalescing
5810 liveintervals         - Number of original intervals
 700 spiller               - Number of loads added
 487 spiller               - Number of stores added
 303 spiller               - Number of register spills
1354 twoaddressinstruction - Number of instructions added
1354 twoaddressinstruction - Number of two-address instructions
 363 x86-peephole          - Number of peephole optimization performed

to:

7982 asm-printer           - Number of machine instrs printed
1759 liveintervals         - Number of identity moves eliminated after coalescing
1658 liveintervals         - Number of interval joins performed
3282 liveintervals         - Number of intervals after coalescing
4940 liveintervals         - Number of original intervals
 635 spiller               - Number of loads added
 452 spiller               - Number of stores added
 288 spiller               - Number of register spills
 789 twoaddressinstruction - Number of instructions added
 789 twoaddressinstruction - Number of two-address instructions
 258 x86-peephole          - Number of peephole optimization performed

Though I'm not complaining about the drop in the number of intervals.  :)

llvm-svn: 11820
2004-02-25 07:00:55 +00:00
Chris Lattner
dccf14825c * Make the previous patch more efficient by not allocating a temporary MachineInstr
to do analysis.

*** FOLD getelementptr instructions into loads and stores when possible,
    making use of some of the crazy X86 addressing modes.

For example, the following C++ program fragment:

struct complex {
    double re, im;
    complex(double r, double i) : re(r), im(i) {}
};
inline complex operator+(const complex& a, const complex& b) {
    return complex(a.re+b.re, a.im+b.im);
}
complex addone(const complex& arg) {
    return arg + complex(1,0);
}

Used to be compiled to:
_Z6addoneRK7complex:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
***     mov %EDX, %ECX
        fld QWORD PTR [%EDX]
        fld1
        faddp %ST(1)
***     add %ECX, 8
        fld QWORD PTR [%ECX]
        fldz
        faddp %ST(1)
***     mov %ECX, %EAX
        fxch %ST(1)
        fstp QWORD PTR [%ECX]
***     add %EAX, 8
        fstp QWORD PTR [%EAX]
        ret

Now it is compiled to:
_Z6addoneRK7complex:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %ECX, DWORD PTR [%ESP + 8]
        fld QWORD PTR [%ECX]
        fld1
        faddp %ST(1)
        fld QWORD PTR [%ECX + 8]
        fldz
        faddp %ST(1)
        fxch %ST(1)
        fstp QWORD PTR [%EAX]
        fstp QWORD PTR [%EAX + 8]
        ret

Other programs should see similar improvements, across the board.  Note that
in addition to reducing instruction count, this also reduces register pressure
a lot, always a good thing on X86.  :)

llvm-svn: 11819
2004-02-25 06:13:04 +00:00
Chris Lattner
10d08a2955 Add a helper to create an addressing mode given all of the pieces.
llvm-svn: 11818
2004-02-25 06:01:07 +00:00
Chris Lattner
c0e2bc0250 add an inefficient way of folding structure and constant array indexes together
into a single LEA instruction.  This should improve the code generated for
things like X->A.B.C[12].D.

The bigger benefit is still coming though.  Note that this uses an LEA instruction
instead of an add, giving the register allocator more freedom.  We should probably
never generate ADDri32's.

llvm-svn: 11817
2004-02-25 03:45:50 +00:00
Chris Lattner
969f90db77 Implement special case for storing an immediate into memory so that we don't need
an intermediate register.

llvm-svn: 11816
2004-02-25 02:56:58 +00:00
Chris Lattner
9036c86b14 Add support for 'rename'
llvm-svn: 11813
2004-02-24 22:17:00 +00:00
Chris Lattner
57ee51ae0b Make the verifier a little more explicit about this problem.
llvm-svn: 11811
2004-02-24 22:06:07 +00:00
Chris Lattner
d9652be664 Add support for remove, fwrite, and fread
Also fix problem where we didn't check to see if a node pointer was null.
Though fclose(null) doesn't make a lot of sense, 300.twolf does it.

llvm-svn: 11810
2004-02-24 22:02:48 +00:00
Brian Gaeke
eae0364189 FunctionLiveVarInfo.h moved: include/llvm/CodeGen -> lib/Target/Sparc/LiveVar
llvm-svn: 11804
2004-02-24 19:46:00 +00:00
Chris Lattner
9da41150e8 Fix some unexpected fallout from the config.h changes. Because the CBE no
longer was getting this #include, it always fell back on the less precise
floating point initializer values, causing some testsuite failures.

llvm-svn: 11803
2004-02-24 18:34:10 +00:00
Chris Lattner
fc15346b60 Fix a faulty optimization on FP values
llvm-svn: 11801
2004-02-24 18:10:14 +00:00
Chris Lattner
7845e4f7f0 If a block is made dead, make sure to promptly remove it.
llvm-svn: 11799
2004-02-24 16:09:21 +00:00
Alkis Evlogimenos
6d7150e9bb Move machine code rewriter and spiller outside the register
allocator.

The implementation is completely rewritten and now employs several
optimizations not exercised before. For example for 164.gzip we have
997 loads and 699 stores vs the 1221 loads and 880 stores we have
before.

llvm-svn: 11798
2004-02-24 08:58:30 +00:00
Chris Lattner
d678669018 Implement SimplifyCFG/switch_switch_fold.ll
This case occurs many times in various benchmarks, especially when combined
with the previous patch.  This allows it to get stuff like:
  if (X == 4 || X == 3)
    if (X == 5 || X == 8)

and

switch (X) {
case 4: case 5: case 6:
  if (X == 4 || X == 5)

llvm-svn: 11797
2004-02-24 07:23:58 +00:00
Alkis Evlogimenos
042f01039b Add predicates for checking if a virtual register has a physical
register mapping or a stack slot mapping.

llvm-svn: 11795
2004-02-24 06:30:36 +00:00
Chris Lattner
1293e1d00c Rearrange code a bit
llvm-svn: 11793
2004-02-24 05:54:22 +00:00
Chris Lattner
e5db7dc4c6 Implement: test/Regression/Transforms/SimplifyCFG/switch_create.ll
This turns code like this:
  if (X == 4 | X == 7)
and
  if (X != 4 & X != 7)
into switch instructions.

llvm-svn: 11792
2004-02-24 05:38:11 +00:00
Alkis Evlogimenos
0d0db88889 Make enum private as it is an implementation detail.
llvm-svn: 11782
2004-02-23 23:49:40 +00:00
Alkis Evlogimenos
9344a740be Remove '4Virt' from member function names as it is obvious.
llvm-svn: 11781
2004-02-23 23:47:10 +00:00
Alkis Evlogimenos
d192266264 Refactor VirtRegMap out of RegAllocLinearScan as the first part of bug
251 (providing a generic machine code rewriter/spiller).

llvm-svn: 11780
2004-02-23 23:08:11 +00:00
Chris Lattner
78800ae270 Generate much more efficient code in programs like pifft
llvm-svn: 11775
2004-02-23 21:46:58 +00:00
Chris Lattner
7fa6519e07 Fix a small typeo in my checkin last night that broke vortex and other programs :(
llvm-svn: 11774
2004-02-23 21:46:42 +00:00
Chris Lattner
253f77f2a7 Fix InstCombine/2004-02-23-ShiftShiftOverflow.ll
Also, turn 'shr int %X, 1234' into 'shr int %X, 31'

llvm-svn: 11768
2004-02-23 20:30:06 +00:00
Alkis Evlogimenos
34f28e5d3f Add number of spilled registers statistic.
llvm-svn: 11759
2004-02-23 18:45:32 +00:00
Chris Lattner
82e1a3657d Fix bugs in finegrainification
llvm-svn: 11758
2004-02-23 18:40:08 +00:00
Chris Lattner
1bf9dde4a1 Finegrainify namespacification
llvm-svn: 11757
2004-02-23 18:38:20 +00:00
Alkis Evlogimenos
82a1d7d30e Use MachineBasicBlock::getParent().
llvm-svn: 11756
2004-02-23 18:36:38 +00:00
Alkis Evlogimenos
2863fbd178 Remove implementation of default constructor as it is useless now.
llvm-svn: 11755
2004-02-23 18:28:35 +00:00
Alkis Evlogimenos
9b103024ef Refactor rewinding code for finding the first terminator of a basic
block into MachineBasicBlock::getFirstTerminator().

This also fixes a bug in the implementation of the above in both
RegAllocLocal and InstrSched, where instructions where added after the
terminator if the basic block's only instruction was a terminator (it
shouldn't matter for RegAllocLocal since this case never occurs in
practice).

llvm-svn: 11748
2004-02-23 18:14:48 +00:00
Chris Lattner
40e15a6000 Simplify code a bit, don't go off the end of the block, now that the current
block we are in might be empty

llvm-svn: 11744
2004-02-23 07:42:19 +00:00
Chris Lattner
28e4e925eb We were forgetting to add FP_REG_KILL instructions to basic blocks which will
eventually get an assignment due to elimination of PHIs.

llvm-svn: 11743
2004-02-23 07:29:45 +00:00
Chris Lattner
74418a30aa Implement cast.ll::test14/15
llvm-svn: 11742
2004-02-23 07:16:20 +00:00
Chris Lattner
a65e5e3df1 Refactor some code. In the mul - setcc folding case, we really care about
whether this is the sign bit or not, so check unsigned comparisons as well.

llvm-svn: 11740
2004-02-23 06:38:22 +00:00
Alkis Evlogimenos
50598d1135 Improved PhysRegTracker interface. RegAlloc lazily allocates the register tracker using a std::auto_ptr
llvm-svn: 11738
2004-02-23 06:10:13 +00:00
Chris Lattner
9ecc3fc3c1 Implement mul.ll:test11
llvm-svn: 11737
2004-02-23 06:00:11 +00:00
Chris Lattner
51b37305d9 Implement "strength reduction" of X <= C and X >= C
llvm-svn: 11735
2004-02-23 05:47:48 +00:00
Chris Lattner
c31a2e26ab Implement InstCombine/mul.ll:test10, which is a case that occurs when dealing
with "predication"

llvm-svn: 11734
2004-02-23 05:39:21 +00:00
Alkis Evlogimenos
99af6ca36b Simplify iterator usage now that we have next(). Also don't pass iterators by reference now that MachineInstr* are in an ilist
llvm-svn: 11732
2004-02-23 04:12:30 +00:00
Chris Lattner
b200638dc4 Work around a gas bug. Print '-9223372036854775808' as unsigned.
llvm-svn: 11729
2004-02-23 03:27:05 +00:00
Chris Lattner
85f13fae06 Implement cast fp -> bool
llvm-svn: 11728
2004-02-23 03:21:41 +00:00
Chris Lattner
795ca35cde Stop passing iterators around by reference now that we have ilists!
Implement cast Type::ULongTy -> double

llvm-svn: 11726
2004-02-23 03:10:10 +00:00
Alkis Evlogimenos
976f485826 Some code cleanups from Chris
llvm-svn: 11724
2004-02-23 01:57:39 +00:00
Alkis Evlogimenos
1525e120a6 Fix comments in PhysRegTracker and rename isPhysRegAvail to isRegAvail to be consistent with the other two
llvm-svn: 11723
2004-02-23 01:25:05 +00:00
Chris Lattner
f9acb33dfd Add a new cmove instruction
llvm-svn: 11722
2004-02-23 01:16:05 +00:00
Alkis Evlogimenos
ee3ef42726 Move LiveIntervals.h up to be the first included header
llvm-svn: 11721
2004-02-23 01:01:21 +00:00
Alkis Evlogimenos
ba2b9aec71 Pull PhysRegTracker out of RegAllocLinearScan as it can be used by other allocators as well
llvm-svn: 11720
2004-02-23 00:53:31 +00:00
Alkis Evlogimenos
850bd0819f Move LiveIntervals.h to lib/CodeGen since it shouldn't be exposed to other parts of the compiler
llvm-svn: 11719
2004-02-23 00:50:15 +00:00
Chris Lattner
cf8db3e8aa Only insert FP_REG_KILL instructions in MachineBasicBlocks that actually
use FP instructions.  This reduces the number of instructions inserted in
176.gcc (for example) from 58074 to 101 (it doesn't use much FP, which
is typical).  This reduction speeds up the entire code generator.  In the
case of 176.gcc, llc went from taking 31.38s to 24.78s.  The passes that
sped up the most are the register allocator and the 2 live variable analysis
passes, which sped up 2.3, 1.3, and 1.5s respectively.  The asmprinter
pass also sped up because it doesn't print the instructions in comments :)

Note that this patch is likely to expose latent bugs in machine code passes,
because now basicblock can be empty, where they were never empty before.  I
cleaned out regalloclocal, but who knows about linscan :)

llvm-svn: 11717
2004-02-22 19:47:26 +00:00
Chris Lattner
5485375e5d Another bug fix for empty MBB's
llvm-svn: 11716
2004-02-22 19:37:31 +00:00
Alkis Evlogimenos
7f7d70a53c Move MOTy::UseType enum into MachineOperand. This eliminates the
switch statements in the constructors and simplifies the
implementation of the getUseType() member function. You will have to
specify defs using MachineOperand::Def instead of MOTy::Def though
(similarly for Use and UseAndDef).

llvm-svn: 11715
2004-02-22 19:23:26 +00:00
Chris Lattner
56a7886c8a Fix a bug where we were implicitly assuming that there would be at least
one terminator instruction in each basic block.

llvm-svn: 11714
2004-02-22 19:08:15 +00:00
Chris Lattner
cc9a188e0a Reduce the number of pointless copies inserted due to constant pointer refs.
Also, make an assertion actually fireable!

llvm-svn: 11713
2004-02-22 17:35:42 +00:00
Chris Lattner
ed03319931 Fix bug in previous checkout: leave the iterator at the first instruction
AFTER the GEP that was emitted.  :(

llvm-svn: 11712
2004-02-22 17:05:38 +00:00
Chris Lattner
ade64c9839 Completely rewrite how getelementptr instructions are expanded. This has two
(minor) benefits right now:

1. An extra dummy MOVrr32 is gone.  This move would often be coallesced by
   both allocators anyway.
2. The code now uses the gep_type_iterator to walk the gep, which should future
   proof it a bit.  It still assumes that array indexes are Longs though.

These don't really justify rewriting the code.  The big benefit will come later
though.

llvm-svn: 11710
2004-02-22 07:04:00 +00:00
Alkis Evlogimenos
6998610eda When folding memory operands in machine instructions be careful to
leave register operands with the same use/def flags as the original
instruction.

llvm-svn: 11709
2004-02-22 06:54:26 +00:00
Chris Lattner
63b79422f3 Fix a soon-to-be-missing #include
llvm-svn: 11707
2004-02-22 06:26:17 +00:00
Chris Lattner
727748b382 Get all instruction definitions
llvm-svn: 11706
2004-02-22 06:25:38 +00:00
Chris Lattner
3392d316e9 Wow this is out of date. When we have _real_ code generator documentation,
this should be folded into it.

llvm-svn: 11705
2004-02-22 05:53:54 +00:00
Alkis Evlogimenos
ba33a0ab9b Print basic block boundaries in machine instruction debug output.
llvm-svn: 11704
2004-02-22 05:46:04 +00:00
Chris Lattner
69bb1545d1 Implement Transforms/InstCombine/cast.ll:test13, a case which occurs in a
hot 164.gzip loop.

llvm-svn: 11702
2004-02-22 05:25:17 +00:00
Chris Lattner
cf8afa52b8 The two address pass cannot handle two addr instructions where one incoming
value is a physreg and one is a virtreg.  For this reason, disable copy folding
entirely for physregs.  Also, use the new isMoveInstr target hook which gives us
folding of FP moves as well.

llvm-svn: 11700
2004-02-22 04:44:58 +00:00
Alkis Evlogimenos
32d12d31ae Abstract merging of ranges away from number of slots per instruction.
Also make it less aggressive as the current implementation breaks in
some cases.

llvm-svn: 11696
2004-02-22 04:05:13 +00:00
Chris Lattner
573441bfbd Use isNull instead of getNode() to test for existence of a node, this is cheaper.
FIX MAJOR BUG, whereby we didn't merge null edges correctly. Correcting this
fixes poolallocation on 175.vpr, and possibly others.

llvm-svn: 11695
2004-02-22 00:53:54 +00:00
Chris Lattner
7bff00313e Fix an iterator invalidation problem which was causing some nodes to not be
correctly merged over!

llvm-svn: 11693
2004-02-21 22:28:26 +00:00
Chris Lattner
458704f675 Use handy method
llvm-svn: 11692
2004-02-21 22:27:31 +00:00
Misha Brukman
642275dc4e `cat' is usually in /bin, not /usr/bin, at least on our systems.
llvm-svn: 11690
2004-02-21 21:51:41 +00:00
Chris Lattner
7448a9867b When printing a stack trace, demangle it if possible. Since we are potentially
in a signal handler, allocating memory or doing other unsafe things is bad,
which means we should do it in a different process.

llvm-svn: 11689
2004-02-21 21:06:19 +00:00
Alkis Evlogimenos
e39c21cc93 Make 'fold' statistic's description the same in both allocators.
llvm-svn: 11687
2004-02-21 18:07:33 +00:00
Chris Lattner
dacc6a7448 Instead of cloning the globals for main into the globals graph at the end of
BU propagation, clone the globals into the GG of EACH FUNCTION that finishes
processing!  The GlobalsGraph *must* include all globals and effects from
all functions in the program.  Fixing this makes pool allocation work better
on 175.vpr, but it still ultimately crashes.

llvm-svn: 11686
2004-02-21 00:30:28 +00:00
Chris Lattner
0f9200e5a5 There is no need to merge the globals graph into the function graphs at the
end of the BU and CBU passes.  The globals will be marked incomplete, so it
doesn't matter if they are missing some info, and merging isn't guaranteed
to bring everything in anyway!

llvm-svn: 11684
2004-02-20 23:52:15 +00:00