mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
Commit Graph

731 Commits

Author SHA1 Message Date
Duncan Sands
3a0d757bd5 Make invokes of inline asm legal. Teach codegen
how to lower them (with no attempt made to be
efficient, since they should only occur for
unoptimized code).

llvm-svn: 45108
2007-12-17 18:08:19 +00:00
David Greene
d85bd2805d GLIBCXX_DEBUG fix. std::vector<>::end() is invalidated by erase.
llvm-svn: 45101
2007-12-17 17:42:03 +00:00
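
A generic illustration of the pattern such a fix addresses (not the actual code touched by this commit): when erasing from a std::vector, use the iterator that erase returns and re-evaluate end() on every loop test instead of caching it.

  #include <vector>

  // Remove zero entries without ever holding a stale end() iterator.
  void removeZeros(std::vector<int> &V) {
    for (std::vector<int>::iterator I = V.begin(); I != V.end(); /* advance below */) {
      if (*I == 0)
        I = V.erase(I);   // erase invalidates I and end(); use the returned iterator
      else
        ++I;              // end() is re-evaluated by the loop condition each time
    }
  }
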
Christopher Lamb
a608afb52e Change the PointerType API for creating pointer types. The old functionality of PointerType::get() has become PointerType::getUnqual(), which returns a pointer in the generic address space. The new prototype of PointerType::get() requires both a type and an address space.
llvm-svn: 45082
2007-12-17 01:12:55 +00:00
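
A minimal sketch of the two calls against the LLVM C++ API of that era (the header path and the Type::Int8Ty constant are assumptions, not taken from this commit):

  #include "llvm/DerivedTypes.h"
  using namespace llvm;

  void pointerTypes() {
    const Type *Int8Ty = Type::Int8Ty;                      // i8
    PointerType *Generic = PointerType::getUnqual(Int8Ty);  // i8* in the generic (0) address space
    PointerType *AS1     = PointerType::get(Int8Ty, 1);     // i8 addrspace(1)*
    (void)Generic; (void)AS1;
  }
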
Duncan Sands
830037ab2d Revert this part of r45073 until the verifier is
changed not to reject invoke of inline asm.

llvm-svn: 45077
2007-12-16 21:01:21 +00:00
Duncan Sands
bf62f62058 Make instcombine promote inline asm calls to 'nounwind'
calls.  Remove special casing of inline asm from the
inliner.  There is a potential problem: the verifier
rejects invokes of inline asm (not sure why).  If an
asm call is not marked "nounwind" in some .ll, and
instcombine is not run, but the inliner is run, then
an illegal module will be created.  This is bad but
I'm not sure what the best approach is.  I'm tempted
to remove the check in the verifier...

llvm-svn: 45073
2007-12-16 15:51:49 +00:00
Chris Lattner
5ca42cd342 Fix PR1850 by removing an unsafe transformation from VMCore/ConstantFold.cpp.
Reimplement the xform in Analysis/ConstantFolding.cpp where we can use
TargetData to validate that it is safe.  While I'm in there, fix some const
correctness issues and generalize the interface to the "operand folder".

llvm-svn: 44817
2007-12-10 22:53:04 +00:00
Gordon Henriksen
5d201e0bcc Adding a collector name attribute to Function in the IR. These
methods are new to Function:

  bool hasCollector() const;
  const std::string &getCollector() const;
  void setCollector(const std::string &);
  void clearCollector();

The assembly representation is as such:

  define void @f() gc "shadow-stack" { ...

The implementation uses an on-the-side table to map Functions to 
collector names, such that there is no overhead. A StringPool is 
further used to unique collector names, which are extremely
likely to be unique per process.

llvm-svn: 44769
2007-12-10 03:18:06 +00:00
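
A hedged sketch of how a client might use these accessors; the helper function is hypothetical, and only the four Function methods and the "shadow-stack" name come from this commit:

  #include "llvm/Function.h"
  using namespace llvm;

  // Associate F with the shadow-stack collector unless it already has one.
  void tagForShadowStack(Function &F) {
    if (!F.hasCollector())
      F.setCollector("shadow-stack");

    if (F.hasCollector() && F.getCollector() == "shadow-stack") {
      // a code generator could emit shadow-stack frame maps here
    }
    // F.clearCollector() would drop the association again.
  }
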
Duncan Sands
1e2e4972ff Rather than having special rules like "intrinsics cannot
throw exceptions", just mark intrinsics with the nounwind
attribute.  Likewise, mark intrinsics as readnone/readonly
and get rid of special aliasing logic (which didn't use
anything more than this anyway).

llvm-svn: 44544
2007-12-03 20:06:50 +00:00
Duncan Sands
3602011bec Fix PR1146: parameter attributes are no longer part of
the function type, instead they belong to functions
and function calls.  This is an updated and slightly
corrected version of Reid Spencer's original patch.
The only known problem is that auto-upgrading of
bitcode files doesn't seem to work properly (see
test/Bitcode/AutoUpgradeIntrinsics.ll).  Hopefully
a bitcode guru (who might that be? :) ) will fix it.

llvm-svn: 44359
2007-11-27 13:23:08 +00:00
Owen Anderson
43d4a82d4b Make LoopInfoBase more generic, in preparation for having MachineLoopInfo. This involves a small interface change.
llvm-svn: 44348
2007-11-27 03:43:35 +00:00
Anton Korobeynikov
79e4c423ea Fix indent
llvm-svn: 43941
2007-11-09 12:34:20 +00:00
Anton Korobeynikov
369f4381ea Forgot to commit the users part of the value mapper interface
llvm-svn: 43940
2007-11-09 12:27:04 +00:00
Anton Korobeynikov
45ee4e7e7c And delete this one
llvm-svn: 43939
2007-11-09 12:22:04 +00:00
Gordon Henriksen
4d157a1bc6 Finishing initial docs for all transformations in Passes.html.
Also cleaned up some comments in source files.

llvm-svn: 43674
2007-11-04 16:15:04 +00:00
Dan Gohman
19d88d511b Add std:: to sort calls.
llvm-svn: 43652
2007-11-02 22:24:01 +00:00
Dan Gohman
26c8800fbd Change illegal uses of ++ to uses of STLExtras.h's next function.
llvm-svn: 43651
2007-11-02 22:22:02 +00:00
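
The specific call sites are not named in the commit, so this is one representative pattern only; it assumes next() in STLExtras.h is the usual copy-and-advance helper:

  #include "llvm/ADT/STLExtras.h"
  #include <vector>

  // Problematic: reading and incrementing the same iterator in one
  // expression, e.g. process(*I, *++I), has unspecified evaluation order.
  // Preferable: advance a copy with llvm::next and leave I untouched.
  void pairwise(std::vector<int> &V) {
    for (std::vector<int>::iterator I = V.begin(), E = V.end();
         I != E && llvm::next(I) != E; ++I) {
      int Cur  = *I;
      int Next = *llvm::next(I);   // copy of I advanced by one; I itself unchanged
      (void)Cur; (void)Next;
    }
  }
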
Duncan Sands
eb464e976f Executive summary: getTypeSize -> getTypeStoreSize / getABITypeSize.
The meaning of getTypeSize was not clear - clarifying it is important
now that we have x86 long double and arbitrary precision integers.
The issue with long double is that it requires 80 bits, and this is
not a multiple of its alignment.  This gives a primitive type for
which getTypeSize differed from getABITypeSize.  For arbitrary precision
integers it is even worse: there is the minimum number of bits needed to
hold the type (eg: 36 for an i36), the maximum number of bits that will
be overwritten when storing the type (40 bits for i36) and the ABI size
(i.e. the storage size rounded up to a multiple of the alignment; 64 bits
for i36).

This patch removes getTypeSize (not really - it is still there but
deprecated to allow for a gradual transition).  Instead there is:

(1) getTypeSizeInBits - a number of bits that suffices to hold all
values of the type.  For a primitive type, this is the minimum number
of bits.  For an i36 this is 36 bits.  For x86 long double it is 80.
This corresponds to gcc's TYPE_PRECISION.

(2) getTypeStoreSizeInBits - the maximum number of bits that is
written when storing the type (or read when reading it).  For an
i36 this is 40 bits, for an x86 long double it is 80 bits.  This
is the size alias analysis is interested in (getTypeStoreSize
returns the number of bytes).  There doesn't seem to be anything
corresponding to this in gcc.

(3) getABITypeSizeInBits - this is getTypeStoreSizeInBits rounded
up to a multiple of the alignment.  For an i36 this is 64, for an
x86 long double this is 96 or 128 depending on the OS.  This is the
spacing between consecutive elements when you form an array out of
this type (getABITypeSize returns the number of bytes).  This is
TYPE_SIZE in gcc.

Since successive elements in a SequentialType (arrays, pointers
and vectors) need to be aligned, the spacing between them will be
given by getABITypeSize.  This means that the size of an array
is the length times the getABITypeSize.  It also means that GEP
computations need to use getABITypeSize when computing offsets.
Furthermore, if an alloca allocates several elements at once then
these too need to be aligned, so the size of the alloca has to be
the number of elements multiplied by getABITypeSize.  Logically
speaking this doesn't have to be the case when allocating just
one element, but it is simpler to also use getABITypeSize in this
case.  So allocas and mallocs should use getABITypeSize.  Finally,
since gcc's only notion of size is that given by getABITypeSize, if
you want to output assembler etc the same as gcc then getABITypeSize
is the size you want.

Since a store will overwrite no more than getTypeStoreSize bytes,
and a read will read no more than that many bytes, this is the
notion of size appropriate for alias analysis calculations.

In this patch I have corrected all type size uses except some of
those in ScalarReplAggregates, lib/Codegen, lib/Target (the hard
cases).  I will get around to auditing these too at some point,
but I could do with some help.

Finally, I made one change which I think wise but others might
consider pointless and suboptimal: in an unpacked struct the
amount of space allocated for a field is now given by the ABI
size rather than getTypeStoreSize.  I did this because every
other place that reserves memory for a type (eg: alloca) now
uses getABITypeSize, and I didn't want to make an exception
for unpacked structs, i.e. I did it to make things more uniform.
This only affects structs containing long doubles and arbitrary
precision integers.  If someone wants to pack these types more
tightly they can always use a packed struct.

llvm-svn: 43620
2007-11-01 20:53:16 +00:00
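
A short query sketch for the i36 numbers above, assuming these methods live on TargetData as in the LLVM tree of that era (the header paths and IntegerType::get(unsigned) are assumptions):

  #include "llvm/Target/TargetData.h"
  #include "llvm/DerivedTypes.h"
  using namespace llvm;

  void typeSizes(const TargetData &TD) {
    const Type *I36 = IntegerType::get(36);
    uint64_t Precision = TD.getTypeSizeInBits(I36);       // 36: bits needed to hold any value
    uint64_t StoreBits = TD.getTypeStoreSizeInBits(I36);  // 40: bits a store may overwrite
    uint64_t ABIBits   = TD.getABITypeSizeInBits(I36);    // 64: store size rounded up to the alignment
    uint64_t Stride    = TD.getABITypeSize(I36);          // 8 bytes: array/GEP/alloca spacing
    (void)Precision; (void)StoreBits; (void)ABIBits; (void)Stride;
  }
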
Chris Lattner
891066cfff Fix PR1752 and LoopSimplify/2007-10-28-InvokeCrash.ll: terminators
can have uses too.  Wouldn't it be nice if invoke didn't exist? :)

llvm-svn: 43426
2007-10-29 02:30:37 +00:00
Anton Korobeynikov
bcee4726bf Reg2Mem cleanup and optimizations:
 - enable demotion of phi instructions to the stack
 - create alloca instructions in the entry block

llvm-svn: 43208
2007-10-21 23:05:16 +00:00
Owen Anderson
4b407757d0 Move Split<...>() into DomTreeBase. This should make the #include's of DominatorInternals.h
in CodeExtractor and LoopSimplify unnecessary.

Hartmut, could you confirm that this fixes the issues you were seeing?

llvm-svn: 43115
2007-10-18 05:13:52 +00:00
Hartmut Kaiser
4cbb4f081b Fixed linker errors (unresolved externals: split<>(...)) when compiling with VC++. Please review.
llvm-svn: 43081
2007-10-17 18:37:09 +00:00
Devang Patel
f4411aa165 Fix comment.
llvm-svn: 42048
2007-09-17 20:07:40 +00:00
Chris Lattner
cc315726f7 Merge DenseMapKeyInfo & DenseMapValueInfo into DenseMapInfo
Add a new DenseMapInfo::isEqual method to allow clients to redefine
the equality predicate used when probing the hash table.

llvm-svn: 42042
2007-09-17 18:34:04 +00:00
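
A hedged sketch of what a client-side specialization looks like after the merge; MyKey and its hash are made up for illustration, and the isPod() member is assumed from the DenseMap of that era:

  #include "llvm/ADT/DenseMap.h"

  struct MyKey { unsigned ID; };

  namespace llvm {
  template <> struct DenseMapInfo<MyKey> {
    static inline MyKey getEmptyKey()     { MyKey K = { ~0U };     return K; }
    static inline MyKey getTombstoneKey() { MyKey K = { ~0U - 1 }; return K; }
    static unsigned getHashValue(const MyKey &K) { return K.ID * 37U; }
    static bool isEqual(const MyKey &LHS, const MyKey &RHS) {
      return LHS.ID == RHS.ID;   // the probing equality predicate clients can now redefine
    }
    static bool isPod() { return true; }
  };
  }

  // llvm::DenseMap<MyKey, int> M;  // picks up the specialization above
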
Devang Patel
2cebc6f649 Insert cloned loop basic blocks before original loop header.
llvm-svn: 41713
2007-09-04 20:46:35 +00:00
David Greene
8cda5af2e7 Update GEP constructors to use an iterator interface to fix
GLIBCXX_DEBUG issues.

llvm-svn: 41697
2007-09-04 15:46:09 +00:00
Anton Korobeynikov
5b49f44609 Silence warning while compiling with gcc 4.2
llvm-svn: 41676
2007-09-02 22:11:14 +00:00
David Greene
5b85021be8 Update InvokeInst to work like CallInst
llvm-svn: 41506
2007-08-27 19:04:21 +00:00
Anton Korobeynikov
3dffac0c59 Don't promote volatile loads/stores. This is needed (for example) to handle setjmp/longjmp properly.
This fixes PR1520.

llvm-svn: 41461
2007-08-26 21:43:30 +00:00
Devang Patel
f06e667e9c Use SmallVector instead of std::vector.
llvm-svn: 41207
2007-08-21 00:31:24 +00:00
Devang Patel
fded73828f When one branch of a condition is eliminated, the head of the other
branch is not necessarily the immediate dominator of the merge block in all cases.

llvm-svn: 41144
2007-08-17 21:59:16 +00:00
Devang Patel
1dd44d7501 Break infinite loop.
llvm-svn: 41091
2007-08-14 23:59:17 +00:00
Devang Patel
d1d0316041 If NewBB dominates DestBB then DestBB is not part of NewBB's dominance frontier.
llvm-svn: 41051
2007-08-13 21:59:17 +00:00
Devang Patel
d412a2a0ed Add utility to clone loops.
llvm-svn: 40997
2007-08-10 17:59:47 +00:00
Chris Lattner
bf64e878e6 remove some dead lines
llvm-svn: 40859
2007-08-06 06:21:06 +00:00
Chris Lattner
e562e9bdb0 rewrite the code used to construct pruned SSA form with the IDF method.
In the old way, we computed and inserted phi nodes for the whole IDF of 
the definitions of the alloca, then computed which ones were dead and
removed them.

In the new method, we first compute the region where the value is live,
and use that information to only insert phi nodes that are live.  This
eliminates the need to compute liveness later, and stops the algorithm
from inserting a bunch of phis which it then later removes.

This speeds up the testcase in PR1432 from 2.00s to 0.15s (14x) in a
release build and 6.84s->0.50s (14x) in a debug build.

llvm-svn: 40825
2007-08-04 22:50:14 +00:00
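
A compact sketch of the liveness idea described above; this is not the PromoteMemToReg code itself, and the DefBlocks/UseBlocks sets and header paths are assumptions:

  #include "llvm/BasicBlock.h"
  #include "llvm/Support/CFG.h"        // pred_begin / pred_end
  #include "llvm/ADT/SmallPtrSet.h"
  #include "llvm/ADT/SmallVector.h"
  using namespace llvm;

  // DefBlocks: blocks that store to the alloca.
  // UseBlocks: blocks that read it before any store in the same block.
  // Phi nodes then only need to be inserted at IDF blocks that are live-in.
  static void computeLiveInBlocks(const SmallPtrSet<BasicBlock*, 32> &DefBlocks,
                                  const SmallPtrSet<BasicBlock*, 32> &UseBlocks,
                                  SmallPtrSet<BasicBlock*, 32> &LiveInBlocks) {
    SmallVector<BasicBlock*, 64> Worklist(UseBlocks.begin(), UseBlocks.end());
    while (!Worklist.empty()) {
      BasicBlock *BB = Worklist.back();
      Worklist.pop_back();
      if (LiveInBlocks.count(BB))
        continue;                      // already known to be live-in
      LiveInBlocks.insert(BB);
      // Live-in to BB means live-out of every predecessor that does not
      // define the value itself; keep walking upwards from those.
      for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI)
        if (!DefBlocks.count(*PI))
          Worklist.push_back(*PI);
    }
  }
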
Chris Lattner
b7d4ef6ca6 Factor out a whole bunch of code into its own method.
llvm-svn: 40824
2007-08-04 21:14:29 +00:00
Chris Lattner
d4a88d77d4 Use getNumPreds(BB) instead of computing them manually. This is a very small but
measurable speedup.

llvm-svn: 40823
2007-08-04 21:06:15 +00:00
Chris Lattner
6b9dca62dd Change the rename pass to be "tail recursive", only adding N-1 successors
to the worklist, and handling the last one with a 'tail call'.  This speeds
up PR1432 from 2.0578s to 2.0012s (2.8%)

llvm-svn: 40822
2007-08-04 20:40:27 +00:00
Chris Lattner
c1d2c2bdc8 cache computation of #preds for a BB. This speeds up
mem2reg from 2.0742->2.0522s on PR1432.

llvm-svn: 40821
2007-08-04 20:24:50 +00:00
Chris Lattner
8335a86536 reserve operand space for phi nodes when we insert them.
llvm-svn: 40820
2007-08-04 20:14:34 +00:00
Chris Lattner
32d9e4ba5c use continue to avoid nesting, no functionality change.
llvm-svn: 40819
2007-08-04 20:07:06 +00:00
Chris Lattner
a97ceae263 Promoting allocas with the 'single store' fastpath is
faster than with the 'local to a block' fastpath.  This speeds
up PR1432 from 2.1232 to 2.0686s (2.6%)

llvm-svn: 40818
2007-08-04 20:03:23 +00:00
Chris Lattner
479e3fa267 When PromoteLocallyUsedAllocas promoted allocas, it didn't remember
to increment NumLocalPromoted, and didn't actually delete the
dead alloca, leading to an extra iteration of mem2reg.

llvm-svn: 40817
2007-08-04 20:01:43 +00:00
Chris Lattner
bd506a8e12 std::map -> DenseMap
llvm-svn: 40816
2007-08-04 19:52:20 +00:00
Chris Lattner
9748fa5c6f fix a logic bug where we wouldn't promote single store allocas if the
stored value was a non-instruction value.  Doh.

This increases the # of single store allocas from 8982 to 9026, and
speeds up mem2reg on the testcase in PR1432 from 2.17 to 2.13s.

llvm-svn: 40813
2007-08-04 02:45:02 +00:00
Chris Lattner
3f971fdbd5 When we do the single-store optimization, delete both the store
and the alloca so they don't get reprocessed.

This speeds up PR1432 from 2.20s to 2.17s.

llvm-svn: 40812
2007-08-04 02:38:38 +00:00
Chris Lattner
c38b2a2473 Three improvements:
1. Check for revisiting a block before checking domination, which is faster.
2. If the stored value isn't an instruction, we don't have to check for domination.
3. If we have a value used in the same block more than once, make sure to remove the
   block from the UsingBlocks vector.  Not doing so forces us to go through the slow
   path for the alloca.

The combination of these improvements increases the number of allocas on the fastpath
from 8935 to 8982 on PR1432.  This speeds it up from 2.90s to 2.20s (31%)

llvm-svn: 40811
2007-08-04 02:32:22 +00:00
Chris Lattner
fe6a3e2fb4 switch from using a std::set to using a SmallPtrSet. This speeds up the
testcase in PR1432 from 6.33s to 2.90s (2.22x)

llvm-svn: 40810
2007-08-04 02:21:22 +00:00
Chris Lattner
9b45ad1f5c In mem2reg, when handling the single-store case, make sure to remove
a using block from the list if we handle it.  Not doing this caused us
to not be able to promote (with the fast path) allocas which have uses (whoops).

This increases the # allocas hitting this fastpath from 4042 to 8935 on the
testcase in PR1432, speeding up mem2reg by 2.6x

llvm-svn: 40809
2007-08-04 02:15:24 +00:00
Chris Lattner
b5af2cf90d split rewriting of single-store allocas into its own
method.

llvm-svn: 40806
2007-08-04 01:47:41 +00:00