mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
12919 Commits

Author SHA1 Message Date
Owen Anderson
285891eccf Enhance both TargetLibraryInfo and SelectionDAGBuilder so that the latter can use the former to prevent the formation of libm SDNodes when -fno-builtin is passed.
llvm-svn: 146193
2011-12-08 22:15:21 +00:00
Devang Patel
9680ebaff4 Refactor. No intentional functionality change.
llvm-svn: 146187
2011-12-08 21:48:01 +00:00
Chad Rosier
d0a0415340 Add rather verbose stats for fast-isel failures.
llvm-svn: 146186
2011-12-08 21:37:10 +00:00
Devang Patel
edfacfabb8 Filter "sink to" candidate blocks sooner. This avoids unnecessary computation to determine whether the block dominates all uses or not.
llvm-svn: 146184
2011-12-08 21:33:23 +00:00
Owen Anderson
d003a613e7 Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise.
llvm-svn: 146171
2011-12-08 19:32:14 +00:00
Evan Cheng
320b2be38c Make MachineInstr instruction property queries more flexible. This change allows
clients to decide whether to look inside bundled instructions and whether
the query should return true if any / all bundled instructions have the
queried property.

llvm-svn: 146168
2011-12-08 19:23:10 +00:00
Nadav Rotem
341b30a457 Fix a bug in the integer-promotion of bitcast operations on vector types.
We must not issue a bitcast operation for integer-promotion of vector types, because the
location of the values in the vector may be different.

llvm-svn: 146150
2011-12-08 13:10:01 +00:00
Pete Cooper
5e48c1e8e3 Reverting r145899 as it breaks clang self-hosting
llvm-svn: 146136
2011-12-08 03:24:10 +00:00
Eli Friedman
e68dd964f7 Make sure we correctly set LiveRegGens when a call is unscheduled. <rdar://problem/10460321>. No testcase because this is very sensitive to scheduling.
llvm-svn: 146087
2011-12-07 22:24:28 +00:00
Eli Friedman
333928a702 Fix an assertion in the scheduler. PR11386. No testcase included because it's rather delicate.
llvm-svn: 146083
2011-12-07 22:06:02 +00:00
Nick Lewycky
9139ac9fdb These global variables aren't thread-safe, STATISTIC is. Andy Trick tells me
that he isn't using these any more, so just delete them.

llvm-svn: 146076
2011-12-07 21:35:59 +00:00
Jakub Staszak
a8a18f2cf5 Remove unneeded semicolon.
Skip two lookups in BlockChain.

llvm-svn: 146053
2011-12-07 19:46:10 +00:00
Evan Cheng
1acd685d87 Add bundle aware API for querying instruction properties and switch the code
generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.

For properties like mayLoad / mayStore, look into the bundle and if any of the
bundled instructions has the property it would return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.

llvm-svn: 146026
2011-12-07 07:15:52 +00:00
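Note on r146026: the any/all query semantics described in this message can be modelled in a few lines. This is a minimal Python sketch, not LLVM's C++ API; the names `QueryType`, `Instr`, and `has_property` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class QueryType(Enum):
    ANY_IN_BUNDLE = "any"  # true if at least one bundled instruction has the property
    ALL_IN_BUNDLE = "all"  # true only if every bundled instruction has the property


@dataclass
class Instr:
    # Hypothetical stand-ins for two of the properties named in the commit message.
    may_load: bool = False
    is_predicable: bool = False


def has_property(bundle, prop, query):
    """Query a boolean property across the instructions of one bundle."""
    flags = [getattr(mi, prop) for mi in bundle]
    if query is QueryType.ANY_IN_BUNDLE:
        return any(flags)
    return all(flags)


# A two-instruction bundle: only the first instruction loads and is predicable.
bundle = [Instr(may_load=True, is_predicable=True), Instr()]
print(has_property(bundle, "may_load", QueryType.ANY_IN_BUNDLE))
print(has_property(bundle, "is_predicable", QueryType.ALL_IN_BUNDLE))
```

For a single, non-bundled instruction both query types give the same answer, which matches the message's statement that non-bundle instructions behave as before.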
Eli Friedman
e74e55c372 Zap unnecessary isIntDivCheap() check. PR11485. No testcase because this doesn't affect any in-tree target.
llvm-svn: 146015
2011-12-07 03:55:52 +00:00
Jakob Stoklund Olesen
c1f4115eb1 Add missing check.
llvm-svn: 146004
2011-12-07 01:08:22 +00:00
Eli Friedman
9e8d557cd1 Support vector bitcasts in the AsmPrinter. PR11495.
llvm-svn: 146001
2011-12-07 00:50:54 +00:00
Jakob Stoklund Olesen
e612fdbbab Add MachineOperand IsInternalRead flag.
This flag is used when bundling machine instructions.  It indicates
whether the operand reads a value defined inside or outside its bundle.

llvm-svn: 145997
2011-12-07 00:22:07 +00:00
Eli Friedman
5545db0906 Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494.
llvm-svn: 145996
2011-12-07 00:11:56 +00:00
Jakub Staszak
f1b60daf50 Remove unneeded type.
llvm-svn: 145995
2011-12-07 00:08:00 +00:00
Jakub Staszak
e4104abf3c - Remove unneeded #includes.
- Remove unused types/fields.
- Add some constantness.

llvm-svn: 145993
2011-12-06 23:59:33 +00:00
Evan Cheng
5061553f9d First chunk of MachineInstr bundle support.
1. Added opcode BUNDLE
2. Taught MachineInstr class to deal with bundled MIs
3. Changed MachineBasicBlock iterator to skip over bundled MIs; added an iterator to walk all the MIs
4. Taught MachineBasicBlock methods about bundled MIs

llvm-svn: 145975
2011-12-06 22:12:01 +00:00
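Note on r145975: point 3 above describes two ways of walking a basic block, one that skips instructions inside a bundle and one that visits everything. A rough Python model follows, assuming a hypothetical `inside_bundle` flag on each instruction; the real MachineBasicBlock iterators are C++ and differ in detail.

```python
from dataclasses import dataclass


@dataclass
class MI:
    opcode: str
    inside_bundle: bool = False  # hypothetical flag: belongs to the preceding BUNDLE


def top_level_instrs(block):
    """Default walk: bundled instructions are hidden behind their BUNDLE header."""
    return [mi for mi in block if not mi.inside_bundle]


def all_instrs(block):
    """Explicit walk over every instruction, bundled or not."""
    return list(block)


block = [MI("ADD"), MI("BUNDLE"), MI("LOAD", True), MI("MUL", True), MI("RET")]
print([mi.opcode for mi in top_level_instrs(block)])
print([mi.opcode for mi in all_instrs(block)])
```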
Jakob Stoklund Olesen
5fb70d560d Pretty-print basic block alignment.
llvm-svn: 145965
2011-12-06 21:08:39 +00:00
Sebastian Pop
182ae6a6fa use space star instead of star space
llvm-svn: 145944
2011-12-06 17:34:16 +00:00
Sebastian Pop
cb55bb22ab add missing point at the end of sentences
llvm-svn: 145943
2011-12-06 17:34:11 +00:00
Evan Cheng
91ae428cc0 Fix some minor misuse of MachineBasicBlock iterator.
llvm-svn: 145903
2011-12-06 02:49:06 +00:00
Pete Cooper
61ffb8fcc5 Removed isWinToJoinCrossClass from the register coalescer.
The new register allocator is much more able to split back up ranges too constrained by register classes.

Fixes <rdar://problem/10466609>

llvm-svn: 145899
2011-12-06 02:06:50 +00:00
Lang Hames
a7f56028f8 Kill off the LoopSplitter. It's not being used or maintained.
llvm-svn: 145897
2011-12-06 01:57:59 +00:00
Lang Hames
29572733bb Update PBQP's analysis usage to reflect the requirements of the inline spiller.
llvm-svn: 145893
2011-12-06 01:45:57 +00:00
Jakob Stoklund Olesen
e53ed273d9 Use logarithmic units for basic block alignment.
This was actually a bit of a mess. TLI.setPrefLoopAlignment was clearly
documented as taking log2(bytes) units, but the x86 target would still
set a preferred loop alignment of '16'.

CodePlacementOpt passed this number on to the basic block, and
AsmPrinter interpreted it as bytes.

Now both MachineFunction and MachineBasicBlock use logarithmic
alignments.

Obviously, MachineConstantPool still measures alignments in bytes, so we
can emulate the thrill of using as.

llvm-svn: 145889
2011-12-06 01:26:19 +00:00
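Note on r145889: the unit mix-up described in this message is easy to see with numbers. The following Python sketch uses hypothetical helper names and only illustrates the log2(bytes) convention; it is not the LLVM code.

```python
def alignment_in_bytes(log2_align: int) -> int:
    """Convert a logarithmic alignment to bytes: 4 means 2**4 = 16 bytes."""
    return 1 << log2_align


def log2_alignment(byte_align: int) -> int:
    """Convert a power-of-two byte alignment to logarithmic units."""
    if byte_align <= 0 or byte_align & (byte_align - 1):
        raise ValueError("alignment must be a positive power of two")
    return byte_align.bit_length() - 1


# A 16-byte loop alignment is expressed as 4 in log2 units.
print(log2_alignment(16))
# Passing the byte count 16 where log2 units are expected asks for 2**16 bytes.
print(alignment_in_bytes(16))
```

Keeping one unit on both sides of an interface avoids the situation the message describes, where one component wrote a byte count into a field documented as log2 units.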
Nadav Rotem
1a91e4381d Add support for vectors of pointers.
llvm-svn: 145801
2011-12-05 06:29:09 +00:00
Eric Christopher
5697266013 Add inline subprogram names to the name lookup table since they may
not get there any other way.

llvm-svn: 145789
2011-12-04 06:02:38 +00:00
Anton Korobeynikov
e2277de6a7 Emit the ctors in the proper order on ARM/EABI.
Maybe some targets should use this as well.

Patch by Evgeniy Stepanov!

llvm-svn: 145781
2011-12-03 23:49:37 +00:00
Benjamin Kramer
ab489e4c97 Simplify code. No functionality change.
-3% on ARMDisassembler.cpp.

llvm-svn: 145773
2011-12-03 16:18:22 +00:00
Nick Lewycky
7d0d3c2d58 Move global variables in TargetMachine into new TargetOptions class. As an API
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.

One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.

llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Hal Finkel
9ffa9b3c7f make sure ScheduleDAGInstrs::EmitSchedule does not crash when the first instruction in Sequence is a Noop
llvm-svn: 145677
2011-12-02 04:58:07 +00:00
Dylan Noblesmith
9678a90d8f CodeGen: fix CMake build
Missing file from r145629.

llvm-svn: 145634
2011-12-01 21:49:23 +00:00
Anshuman Dasgupta
f754c8bf8e Add a deterministic finite automaton based packetizer for VLIW architectures
llvm-svn: 145629
2011-12-01 21:10:21 +00:00
Chad Rosier
0ff2f46d12 If fast-isel fails, remove dead instructions generated during the failed
attempt.  

llvm-svn: 145425
2011-11-29 19:40:47 +00:00
Daniel Dunbar
4e00f5f8fd build/CMake: Finish removal of add_llvm_library_dependencies.
llvm-svn: 145420
2011-11-29 19:25:30 +00:00
Bill Wendling
0f9a3f91ae On MachO, the pointer to the personality function should always be in the
non_lazy_symbol_pointers section (__IMPORT,__pointers). Ignore the 'hidden' part
since that will place it in the wrong section.
<rdar://problem/10443720>

llvm-svn: 145356
2011-11-29 01:43:20 +00:00
Eli Friedman
473a76a0df Make SelectionDAG::InferPtrAlignment use llvm::ComputeMaskedBits instead of duplicating the logic for globals. Make llvm::ComputeMaskedBits handle GlobalVariables slightly more aggressively, to match what InferPtrAlignment knew how to do.
llvm-svn: 145304
2011-11-28 22:48:22 +00:00
Evan Cheng
1435aa5fdc Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead.
Conservatively return zero when the GV neither specifies an alignment nor is
initialized. Previously it returned the ABI alignment for the type of the GV. However, if
the type is a "packed" type, then the under-specified alignment is attached to
the load / store instructions. In that case, the alignment of the type cannot be
trusted.
rdar://10464621

llvm-svn: 145300
2011-11-28 22:37:34 +00:00
Evan Cheng
567aa3dfb3 DAG combine should not increase alignment of loads / stores with alignment less
than ABI alignment. These are loads / stores from / to "packed" data structures.
Their alignments are intentionally under-specified.

rdar://10301431

llvm-svn: 145273
2011-11-28 20:42:56 +00:00
Chad Rosier
a512a13c2c 80-column.
llvm-svn: 145267
2011-11-28 19:59:09 +00:00
Bill Wendling
8beab76b07 Remove dead llvm.eh.sjlj.dispatchsetup intrinsic.
llvm-svn: 145263
2011-11-28 19:23:13 +00:00
Chandler Carruth
9ab5855a19 Prevent rotating the blocks of a loop (and thus getting a backedge to be
fallthrough) in cases where we might fail to rotate an exit to an outer
loop onto the end of the loop chain.

Having *some* rotation, but not performing this rotation, is the primary
fix of the performance regression with -enable-block-placement for
Olden/em3d (a whopping 30% regression). Still working on reducing the
test case that actually exercises this and the new rotation strategy out
of this code, but I want to check if this regresses other test cases
first as that may indicate it isn't the correct fix.

llvm-svn: 145195
2011-11-27 20:18:00 +00:00
Chandler Carruth
bb4c250613 Take two on rotating the block ordering of loops. My previous attempt
was centered around the premise of laying out a loop in a chain, and
then rotating that chain. This is good for preserving contiguous layout,
but bad for actually making sane rotations. In order to keep it safe,
I had to essentially make it impossible to rotate deeply nested loops.
The information needed to correctly reason about a deeply nested loop is
actually available -- *before* we layout the loop. We know the inner
loops are already fused into chains, etc. We lose information the moment
we actually lay out the loop.

The solution was the other alternative for this algorithm I discussed
with Benjamin and some others: rather than rotating the loop
after-the-fact, try to pick a profitable starting block for the loop's
layout, and then use our existing layout logic. I was worried about the
complexity of this "pick" step, but it turns out such complexity is
needed to handle all the important cases I keep teasing out of benchmarks.

This is, I'm afraid, a bit of a work-in-progress. It is still
misbehaving on some likely important cases I'm investigating in Olden.
It also isn't really tested. I'm going to try to craft some interesting
nested-loop test cases, but it's likely to be extremely time consuming
and I don't want to go there until I'm sure I'm testing the correct
behavior. Sadly I can't come up with a way of getting simple, fine
grained test cases for this logic. We need complex loop structures to
even trigger much of it.

llvm-svn: 145183
2011-11-27 13:34:33 +00:00
Chandler Carruth
c4043847df Fix an impressive type-o / spell-o Duncan noticed.
llvm-svn: 145181
2011-11-27 10:32:16 +00:00
Chandler Carruth
e6374e6953 Rework a bit of the implementation of loop block rotation to not rely so
heavily on AnalyzeBranch. That routine doesn't behave as we want given
that rotation occurs mid-way through re-ordering the function. Instead
merely check that there are not unanalyzable branching constructs
present, and then reason about the CFG via successor lists. This
actually simplifies my mental model for all of this as well.

The concrete result is that we now will rotate more loop chains. I've
added a test case from Olden highlighting the effect. There is still
a bit more to do here though in order to regain all of the performance
in Olden.

llvm-svn: 145179
2011-11-27 09:22:53 +00:00
Chandler Carruth
0d073febb6 Introduce a loop block rotation optimization to the new block placement
pass. This is designed to achieve one of the important optimizations
that the old code placement pass did, but more simply.

This is a somewhat rough and *very* conservative version of the
transform. We could get a lot fancier here if there are profitable cases
to do so. In particular, this only looks for a single pattern, it
insists that the loop backedge being rotated away is the last backedge
in the chain, and it doesn't provide any means of doing better in-loop
placement due to the rotation. However, it appears that it will handle
the important loops I am finding in the LLVM test suite.

llvm-svn: 145158
2011-11-27 00:38:03 +00:00
Benjamin Kramer
d861d825f2 Move code into anonymous namespaces.
llvm-svn: 145154
2011-11-26 23:01:57 +00:00
Chandler Carruth
f6e96b54f8 Fix a silly use-after-free issue. A much earlier version of this code
needed lots of fanciness around retaining a reference to a Chain's slot in
the BlockToChain map, but that's all gone now. We can just go directly
to allocating the new chain (which will update the mapping for us) and
using it.

Somewhat gross mechanically generated test case replicates the issue
Duncan spotted when actually testing this out.

llvm-svn: 145120
2011-11-24 11:23:15 +00:00
Chandler Carruth
1d3f68ffd0 When adding blocks to the list of those which no longer have any CFG
conflicts, we should only be adding the first block of the chain to the
list, lest we try to merge into the middle of that chain. Most of the
places we were doing this we already happened to be looking at the first
block, but there is no reason to assume that, and in some cases it was
clearly wrong.

I've added a couple of tests here. One already worked, but I like having
an explicit test for it. The other is reduced from a test case Duncan
reduced for me and used to crash. Now it is handled correctly.

llvm-svn: 145119
2011-11-24 08:46:04 +00:00
Chandler Carruth
dcf04fc35a Relax an invariant that block placement was trying to assert a bit
further. This invariant just wasn't going to work in the face of
unanalyzable branches; we need to be resilient to the phenomenon of
chains poking into a loop and poking out of a loop. In fact, we already
were, we just needed to not assert on it.

This was found during a bootstrap with block placement turned on.

llvm-svn: 145100
2011-11-23 10:35:36 +00:00
Chandler Carruth
c175221b2d Handle the case of a no-return invoke correctly. It actually still has
successors, they just are all landing pad successors. We handle this the
same way as no successors. Comments attached for the next person to wade
through here and another lovely test case courtesy of Benjamin Kramer's
bugpoint reduction.

llvm-svn: 145098
2011-11-23 08:23:54 +00:00
Bob Wilson
8722a53e52 Enable stack protectors for all arrays, not just char arrays. rdar://5875909
Patch by Bill Wendling.

llvm-svn: 145097
2011-11-23 07:13:56 +00:00
Jakob Stoklund Olesen
2b76f5685c Fix PR11422.
This was a bug in keeping track of the available domains when merging
domain values.

The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr
to the integer domain which is only available in AVX2.

Also add an assertion to catch future attempts at emitting AVX2
instructions.

llvm-svn: 145096
2011-11-23 04:03:08 +00:00
Chandler Carruth
a357b0a5ee Fix a crash in block placement due to an inner loop that happened to be
reversed in the function's original ordering, and we happened to
encounter it while handling an outer unnatural CFG structure.

Thanks to the test case reduced from GCC's source by Benjamin Kramer.
This may also fix a crasher in gzip that Duncan reduced for me, but
I haven't yet gotten to testing that one.

llvm-svn: 145094
2011-11-23 03:03:21 +00:00
Chandler Carruth
59f0abf50e Fix a devilish miscompile exposed by block placement. The
updateTerminator code didn't correctly handle EH terminators in one very
specific case. AnalyzeBranch would find no terminator instruction, and
so the fallback in updateTerminator is to assume fallthrough. This is
correct, but the destination of the fallthrough was assumed to be the
first successor.

This is *almost always* true, but in certain cases the loop
transformations will cause the landing pad to be the first successor!
Instead of this brittle logic, actually look through the successors for
a non-landing-pad successor, and assert if more than one is found.

This will hopefully fix some (if not all) of the self host miscompiles
with block placement. Thanks to Benjamin Kramer for reporting, Nick
Lewycky for an initial stab at a reduction, and Duncan for endless
advice on EH (which I know nothing about) as well as reviewing the
actual fix.

llvm-svn: 145062
2011-11-22 13:13:16 +00:00
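Note on r145062: the fix replaces "take the first successor" with a search for the one successor that is not a landing pad. A minimal Python sketch of that search follows; `Block` and `fallthrough_target` are hypothetical names, and the real updateTerminator code is C++.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    is_landing_pad: bool = False


def fallthrough_target(successors):
    """Return the unique non-landing-pad successor, or None if there is none."""
    normal = [s for s in successors if not s.is_landing_pad]
    # Mirrors the assertion mentioned in the commit message.
    assert len(normal) <= 1, "more than one non-landing-pad successor"
    return normal[0] if normal else None


lpad = Block("lpad", is_landing_pad=True)
cont = Block("cont")

# The landing pad happens to be listed first; the first-successor rule would pick it.
print(fallthrough_target([lpad, cont]).name)
```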
Chandler Carruth
63f2a37e6e Fix an obvious omission in the SelectionDAGBuilder where we were
dropping weights on the floor for invokes. This was impeding my writing
further test cases for invoke when interacting with probabilities and
block placement.

No test case as there doesn't appear to be a way to test this stuff. =/
Suggestions for a test case of course welcome. I hope to be able to add
test cases that indirectly cover this eventually by adding probabilities
to the exceptional edge and reordering blocks as a result.

llvm-svn: 145060
2011-11-22 11:37:46 +00:00
Rafael Espindola
b45f6ea527 If a register is both an early clobber and part of a tied use, handle the use
before the clobber so that we copy the value if needed.

Fixes pr11415.

llvm-svn: 145056
2011-11-22 06:27:18 +00:00
Chandler Carruth
aac8e5082a The logic for breaking the CFG in the presence of hot successors didn't
properly account for the *global* probability of the edge being taken.
This manifested as a very large number of unconditional branches to
blocks being merged against the CFG even though they weren't
particularly hot within the CFG.

The fix is to check whether the edge being merged is both locally hot
relative to other successors for the source block, and globally hot
compared to other (unmerged) predecessors of the destination block.

This introduces a new crasher on GCC single-source, but it's currently
behind a flag, and Ben has offered to work on the reduction. =]

llvm-svn: 145010
2011-11-20 11:22:06 +00:00
Chandler Carruth
f24d3f8fc7 Move the handling of unanalyzable branches out of the loop-driven chain
formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.

Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fallback on the
incoming ordering.

Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can cause major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).

llvm-svn: 144994
2011-11-19 10:26:02 +00:00
Devang Patel
a0973b0c53 DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or add DISignedSubrange.
llvm-svn: 144937
2011-11-17 23:43:15 +00:00
Chad Rosier
2673f8862f When fast iseling a GEP, accumulate the offset rather than emitting a series of
ADDs.  MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD.  Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, with too low a threshold we'll miss opportunities to
coalesce ADDs.
rdar://10412592

llvm-svn: 144886
2011-11-17 07:15:58 +00:00
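Note on r144886: the accumulate-then-flush idea can be sketched as below. This is a simplified Python model that only handles constant offsets; the function name and the threshold value 2048 used in the example are assumptions for illustration, not taken from the commit message.

```python
def lower_gep_offsets(offsets, max_offs):
    """Accumulate constant offsets and emit one ADD per flushed total.

    Returns the list of immediates that would be added, in order.
    """
    adds = []
    total = 0
    for off in offsets:
        total += off
        if total >= max_offs:
            # The running total is getting large: emit it now and start over.
            adds.append(total)
            total = 0
    if total:
        adds.append(total)
    return adds


# Small offsets coalesce into a single ADD.
print(lower_gep_offsets([4, 8, 16], max_offs=2048))
# A large running total is flushed early, leaving a second, smaller ADD.
print(lower_gep_offsets([1500, 1000, 8], max_offs=2048))
```

A lower threshold flushes more often and so coalesces fewer ADDs, which is trade-off (3) in the message.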
Eli Friedman
51adc2ea5a Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
llvm-svn: 144863
2011-11-16 23:50:22 +00:00
Chad Rosier
36cc01dbd3 Add fast-isel stats to determine who's doing all the work, the
target-independent selector or the target-specific selector.

llvm-svn: 144833
2011-11-16 21:05:28 +00:00
Chad Rosier
bf857d6eaa Fix the stats collection for fast-isel. The failed count was only accounting
for a single miss and not all predecessor instructions that get selected by
the selection DAG instruction selector.  This is still not exact (e.g., it
overstates misses when folded/dead instructions are present), but it is a step in
the right direction.

llvm-svn: 144832
2011-11-16 21:02:08 +00:00
Evan Cheng
5242b6aaa1 Disable expensive two-address optimizations at -O0. rdar://10453055
llvm-svn: 144806
2011-11-16 18:44:48 +00:00
Evan Cheng
cbfacfdb63 Disable the assertion again. Looks like fastisel is still generating bad kill markers.
llvm-svn: 144804
2011-11-16 18:32:14 +00:00
Evan Cheng
2b239cbcf6 Sink codegen optimization level into MCCodeGenInfo alongside relocation model
and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.

llvm-svn: 144788
2011-11-16 08:38:26 +00:00
Bob Wilson
dbab14b8ea Record landing pads with a SmallSetVector to avoid multiple entries.
There may be many invokes that share one landing pad, and the previous code
would record the landing pad once for each invoke.  Besides the wasted
effort, a pair of volatile loads gets inserted every time the landing pad is
processed.  The rest of the code can get optimized away when a landing pad
is processed repeatedly, but the volatile loads remain, resulting in code like:

LBB35_18:
Ltmp483:
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r4, [r7, #-72]
        ldr     r2, [r7, #-68]

llvm-svn: 144787
2011-11-16 07:57:21 +00:00
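Note on r144787: a set-vector keeps insertion order while rejecting duplicates, which is why each landing pad is processed once no matter how many invokes share it. The Python class below is a small stand-in for that behaviour, not LLVM's SmallSetVector; its name and methods are illustrative.

```python
class SetVector:
    """Insertion-ordered container that ignores repeated elements."""

    def __init__(self):
        self._seen = set()
        self._items = []

    def insert(self, item):
        """Add item if it is new; return True when it was actually inserted."""
        if item in self._seen:
            return False
        self._seen.add(item)
        self._items.append(item)
        return True

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


landing_pads = SetVector()
# Three invokes, two of which unwind to the same landing pad.
for pad in ["lpad.a", "lpad.b", "lpad.a"]:
    landing_pads.insert(pad)
print(list(landing_pads))
```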
Bob Wilson
45ab17a709 Update the SP in the SjLj jmpbuf whenever it changes. <rdar://problem/10444602>
This same basic code was in the older version of the SjLj exception handling,
but it was removed in the recent revisions to that code.  It needs to be there.

llvm-svn: 144782
2011-11-16 07:12:00 +00:00
Evan Cheng
ab9e2ad9c4 Revert r144568 now that r144730 has fixed the fast-isel kill marker bug.
llvm-svn: 144776
2011-11-16 04:55:01 +00:00
Evan Cheng
65d5df1165 If the 2addr instruction has other kills, don't move it below any other uses since we don't want to extend other live ranges.
llvm-svn: 144772
2011-11-16 03:47:42 +00:00
Evan Cheng
27c17a65d1 RescheduleKillAboveMI() must backtrack to before the rescheduled DBG_VALUE instructions. rdar://10451185
llvm-svn: 144771
2011-11-16 03:33:08 +00:00
Evan Cheng
d756c3ec65 Process all uses first before defs to accurately capture register liveness. rdar://10449480
llvm-svn: 144770
2011-11-16 03:05:12 +00:00
Eli Friedman
1f3d774ba4 CONCAT_VECTORS can have more than two operands. PR11389.
llvm-svn: 144768
2011-11-16 02:52:39 +00:00
Eli Friedman
ce0cea66b9 Add a couple asserts so it will be easier to debug if we accidentally pass indexed loads/stores to the legalizer.
llvm-svn: 144767
2011-11-16 02:43:15 +00:00
Owen Anderson
48a129b50e Rename MVT::untyped to MVT::Untyped to match similar nomenclature.
llvm-svn: 144747
2011-11-16 01:02:57 +00:00
Eric Christopher
c9b63af4bb Stabilize the output of the dwarf accelerator tables. Fixes a comparison
failure during bootstrap with it turned on.

llvm-svn: 144731
2011-11-15 23:37:17 +00:00
Chad Rosier
71f1bbe1e7 GEPs with all zero indices are trivially coalesced by fast-isel. For example,
%arrayidx135 = getelementptr inbounds [4 x [4 x [4 x [4 x i32]]]]* %M0, i32 0, i64 0
%arrayidx136 = getelementptr inbounds [4 x [4 x [4 x i32]]]* %arrayidx135, i32 0, i64 %idxprom134

Prior to this commit, the GEP instruction that defines %arrayidx136 thought that 
%arrayidx135 was a trivial kill.  The GEP that defines %arrayidx135 doesn't 
generate any code and thus %M0 gets folded into the second GEP.  Thus, we need
to look through GEPs with all zero indices.
rdar://10443319

llvm-svn: 144730
2011-11-15 23:34:05 +00:00
Pete Cooper
8441c08e0b Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Devang Patel
11c550b1e5 Insert modified DBG_VALUE into LiveDbgValueMap.
llvm-svn: 144696
2011-11-15 21:03:58 +00:00
Rafael Espindola
95f4e0c409 We currently use a callback to handle an IL pass deleting a BB that still
has a reference to it. Unfortunately, that doesn't work for codegen passes
since we don't get notified of MBB's being deleted (the original BB stays).

Use that fact to our advantage and after printing a function, check if
any of the IL BBs corresponds to a symbol that was not printed. This fixes
pr11202.

llvm-svn: 144674
2011-11-15 19:08:46 +00:00
Benjamin Kramer
a2f57dee6d Remove all remaining uses of Value::getNameStr().
llvm-svn: 144648
2011-11-15 16:27:03 +00:00
Benjamin Kramer
3eeef2e739 Twinify GraphWriter a little bit.
llvm-svn: 144647
2011-11-15 16:26:38 +00:00
Jakob Stoklund Olesen
7b29a27e02 Check all overlaps when looking for used registers.
A function using any RC alias is enough to enable the ExeDepsFix pass.

llvm-svn: 144636
2011-11-15 08:20:43 +00:00
Jay Foad
e81b476d52 Make use of MachinePointerInfo::getFixedStack.
llvm-svn: 144635
2011-11-15 07:51:13 +00:00
Jay Foad
ce22ec7def Remove some unnecessary includes of PseudoSourceValue.h.
llvm-svn: 144634
2011-11-15 07:50:46 +00:00
Evan Cheng
40f68c968e Set SeenStore to true to prevent loads from being moved; also eliminates a non-deterministic behavior.
llvm-svn: 144628
2011-11-15 06:26:51 +00:00
Chandler Carruth
fdcba17bec Rather than trying to use the loop block sequence *or* the function
block sequence when recovering from unanalyzable control flow
constructs, *always* use the function sequence. I'm not sure why I ever
went down the path of trying to use the loop sequence, it is
fundamentally not the correct sequence to use. We're trying to preserve
the incoming layout in the cases of unreasonable control flow, and that
is only encoded at the function level. We already have a filter to
select *exactly* the sub-set of blocks within the function that we're
trying to form into a chain.

The resulting code layout is also significantly better because of this.
In several places we were ending up with completely unreasonable control
flow constructs due to the ordering chosen by the loop structure for its
internal storage. This change removes a completely wasteful vector of
basic blocks, saving memory allocation in the common case even though it
costs us CPU in the fairly rare case of unnatural loops. Finally, it
fixes the latest crasher reduced out of GCC's single source. Thanks
again to Benjamin Kramer for the reduction, my bugpoint skills failed at
it.

llvm-svn: 144627
2011-11-15 06:26:43 +00:00
Jakob Stoklund Olesen
2709f65821 Break false dependencies before partial register updates.
Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.

The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previous N instructions.

The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.

llvm-svn: 144602
2011-11-15 01:15:30 +00:00
Jakob Stoklund Olesen
0c310642e2 Track register ages more accurately.
Keep track of the last instruction to define each register individually
instead of per DomainValue.  This lets us track more accurately when a
register was last written.

Also track register ages across basic blocks.  When entering a new
basic block, use the least stale predecessor def as a worst case
estimate for register age.

The register age is used to arbitrate between conflicting domains. The
most recently defined register wins.

llvm-svn: 144601
2011-11-15 01:15:25 +00:00
Evan Cheng
95e735afa7 Avoid dereferencing off the beginning of lists.
llvm-svn: 144569
2011-11-14 21:11:15 +00:00
Evan Cheng
ba11ee300a At -O0, multiple uses of a virtual register in the same BB are being marked
"kill". This looks like a bug upstream. Since that's going to take some time
to understand, loosen the assertion and disable the optimization when
multiple kills are seen.

llvm-svn: 144568
2011-11-14 21:02:09 +00:00
Evan Cheng
f19d257488 Teach two-address pass to re-schedule two-address instructions (or the kill
instructions of the two-address operands) in order to avoid inserting copies.
This fixes the few regressions introduced when the two-address hack was
disabled (without regressing the improvements).
rdar://10422688

llvm-svn: 144559
2011-11-14 19:48:55 +00:00
Jakob Stoklund Olesen
6035535c96 Fix early-clobber handling in shrinkToUses.
I broke this in r144515, it affected most ARM testers.

<rdar://problem/10441389>

llvm-svn: 144547
2011-11-14 18:45:38 +00:00
Chandler Carruth
9e6d173b9e It helps to deallocate memory as well as allocate it. =] This actually
cleans up all the chains allocated during the processing of each
function so that for very large inputs we don't just grow memory usage
without bound.

llvm-svn: 144533
2011-11-14 10:57:23 +00:00
Chandler Carruth
06afac4924 Remove an over-eager assert that was firing on one of the ARM regression
tests when I forcibly enabled block placement.

It is apparently possible for an unanalyzable block to fallthrough to
a non-loop block. I don't actually believe this is correct, I believe
that 'canFallThrough' is returning true needlessly for the code
construct, and I've left a bit of a FIXME on the verification code to
try to track down why this is coming up.

Anyways, removing the assert doesn't degrade the correctness of the algorithm.

llvm-svn: 144532
2011-11-14 10:55:53 +00:00
Chandler Carruth
a1475d9b6b Begin chipping away at one of the biggest quadratic-ish behaviors in
this pass. We're leaving already merged blocks on the worklist, and
scanning them again and again only to determine each time through that
indeed they aren't viable. We can instead remove them once we're going
to have to scan the worklist. This is the easy way to implement removing
them. If this remains on the profile (as I somewhat suspect it will), we
can get a lot more clever here, as the worklist's order is essentially
irrelevant. We can use swapping and fold the two loops to reduce
overhead even when there are many blocks on the worklist but only a few
of them are removed.

llvm-svn: 144531
2011-11-14 09:46:33 +00:00
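The worklist cleanup described in r144531 above can be sketched roughly as follows. This is an illustration only, not the code from the commit: `Block`, `AlreadyMerged`, `pruneMergedBlocks` and `pruneMergedBlocksUnordered` are hypothetical names. The first function is the "easy way" (erase-remove); the second is the swap-based variant the message hints at, which relies on the worklist order being irrelevant.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a basic block on the placement worklist.
struct Block {
  int Id;
  bool AlreadyMerged; // true once the block has been merged into a chain
};

// Easy way: drop every already-merged block in one pass, keeping order.
void pruneMergedBlocks(std::vector<Block *> &Worklist) {
  Worklist.erase(std::remove_if(Worklist.begin(), Worklist.end(),
                                [](const Block *B) { return B->AlreadyMerged; }),
                 Worklist.end());
}

// Swap-based variant: order is not preserved, but each removal is O(1).
void pruneMergedBlocksUnordered(std::vector<Block *> &Worklist) {
  for (std::size_t I = 0; I != Worklist.size();) {
    if (Worklist[I]->AlreadyMerged) {
      std::swap(Worklist[I], Worklist.back());
      Worklist.pop_back(); // re-examine slot I, which now holds a new block
    } else {
      ++I;
    }
  }
}
```

Either form avoids rescanning dead entries on later passes; the swap form only pays off if the caller really does not depend on worklist order.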
Chandler Carruth
f89087744e Under the hood, MBPI is doing a linear scan of every successor every
time it is queried to compute the probability of a single successor.
This makes computing the probability of every successor of a block in
sequence... really really slow. ;] This switches to a linear walk of the
successors rather than a quadratic one. One of several quadratic
behaviors slowing this pass down.

I'm not really thrilled with moving the sum code into the public
interface of MBPI, but I don't (at the moment) have ideas for a better
interface. The direction I'm thinking in for a better interface is to
have MBPI actually retain much more state and make *all* of these
queries cheap. That's a lot of work, and would require invasive changes.
Until then, this seems like the least bad (ie, least quadratic)
solution. Suggestions welcome.

llvm-svn: 144530
2011-11-14 09:12:57 +00:00
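As a rough illustration of the change in r144530 above: if the weight sum is recomputed for every successor query, asking for all N successors costs O(N^2); summing once and reusing the total makes it O(N). The sketch below assumes plain 32-bit edge weights in a vector; `EdgeProb` and `allSuccessorProbabilities` are hypothetical names, not the MBPI interface.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical probability as an unreduced fraction Numerator/Denominator.
struct EdgeProb {
  uint32_t Numerator;
  uint64_t Denominator;
};

// Compute the probability of every successor with a single summing pass.
std::vector<EdgeProb>
allSuccessorProbabilities(const std::vector<uint32_t> &Weights) {
  uint64_t Sum = 0;
  for (uint32_t W : Weights)
    Sum += W; // one linear walk over the successors

  std::vector<EdgeProb> Probs;
  Probs.reserve(Weights.size());
  for (uint32_t W : Weights)
    Probs.push_back(EdgeProb{W, Sum}); // reuse the sum, no rescan
  return Probs;
}
```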
Chandler Carruth
09418993f8 Reuse the logic in getEdgeProbability within getHotSucc in order to
correctly handle blocks whose successor weights sum to more than
UINT32_MAX. This is slightly less efficient, but the entire thing is
already linear on the number of successors. Calling it within any hot
routine is a mistake, and indeed no one is calling it. It also
simplifies the code.

llvm-svn: 144527
2011-11-14 08:55:59 +00:00
Chandler Carruth
462bb16130 Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on
the sum of the edge weights not overflowing uint32, and crashed when
they did. This is generally safe as BranchProbabilityInfo tries to
provide this guarantee. However, the CFG can get modified during codegen
in a way that grows the *sum* of the edge weights. This doesn't seem
unreasonable (imagine just adding more blocks all with the default
weight of 16), but it is hard to come up with a case that actually
triggers 32-bit overflow. Fortunately, the single-source GCC build is
good at this. The solution isn't very pretty, but it's no worse than the
previous code. We're already summing all of the edge weights on each
query, we can sum them, check for an overflow, compute a scale, and sum
them again.

I've included a *greatly* reduced test case out of the GCC source that
triggers it. It's a pretty lame test, as it clearly is just barely
triggering the overflow. I'd like to have something that is much more
definitive, but I don't understand the fundamental pattern that triggers
an explosion in the edge weight sums.

The buggy code is duplicated within this file. I'll collapse them into
a single implementation in a subsequent commit.

llvm-svn: 144526
2011-11-14 08:50:16 +00:00
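The "sum, check for overflow, compute a scale, sum again" approach from r144526 above might look something like the sketch below. It is a minimal illustration under assumptions: the function name `scaledWeightSum` and the exact choice of scale are mine, not taken from the commit.

```cpp
#include <cstdint>
#include <vector>

// Returns the sum of the (possibly scaled-down) weights, guaranteed to fit
// in 32 bits. Scale receives the divisor applied to each weight (1 if the
// raw sum already fits).
uint32_t scaledWeightSum(const std::vector<uint32_t> &Weights,
                         uint32_t &Scale) {
  // First pass: sum in 64 bits so the sum itself cannot wrap.
  uint64_t Sum = 0;
  for (uint32_t W : Weights)
    Sum += W;

  Scale = 1;
  if (Sum <= UINT32_MAX)
    return static_cast<uint32_t>(Sum);

  // Overflow: pick a divisor large enough that the scaled sum fits.
  Scale = static_cast<uint32_t>(Sum / UINT32_MAX) + 1;

  // Second pass: sum the scaled weights.
  uint64_t Scaled = 0;
  for (uint32_t W : Weights)
    Scaled += W / Scale;
  return static_cast<uint32_t>(Scaled);
}
```

Callers would then divide each individual weight by the same `Scale` before forming a probability, so numerator and denominator stay consistent.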
Jakob Stoklund Olesen
25e009690c Use getVNInfoBefore() when it makes sense.
llvm-svn: 144517
2011-11-14 01:39:36 +00:00
Chandler Carruth
b7f21af176 Teach machine block placement to cope with unnatural loops. These don't
get loop info structures associated with them, and so we need some way
to make forward progress selecting and placing basic blocks. The
technique used here is pretty brutal -- it just scans the list of blocks
looking for the first unplaced candidate. It keeps placing blocks like
this until the CFG becomes tractable.

The cost is somewhat unfortunate, it requires allocating a vector of all
basic block pointers eagerly. I have some ideas about how to simplify
and optimize this, but I'm trying to get the logic correct first.

Thanks to Benjamin Kramer for the reduced test case out of GCC. Sadly
there are other bugs that GCC is tickling that I'm reducing and working
on now.

llvm-svn: 144516
2011-11-14 00:00:35 +00:00
Jakob Stoklund Olesen
44da32bc7f Use kill slots instead of the previous slot in shrinkToUses.
It's more natural to use the actual end points.

llvm-svn: 144515
2011-11-13 23:53:25 +00:00
Chandler Carruth
1d9335bbb2 Cleanup some 80-columns violations and poor formatting. These snuck by
when I was reading through the code for style.

llvm-svn: 144513
2011-11-13 22:50:09 +00:00
Jakob Stoklund Olesen
4255046343 Terminate all dead defs at the dead slot instead of the 'next' slot.
This makes no difference for normal defs, but early clobber dead defs
now look like:

  [Slot_EarlyClobber; Slot_Dead)

instead of:

  [Slot_EarlyClobber; Slot_Register).

Live ranges for normal dead defs look like:

  [Slot_Register; Slot_Dead)

as before.

llvm-svn: 144512
2011-11-13 22:42:13 +00:00
Jakob Stoklund Olesen
170f021d76 Simplify early clobber slots a bit.
llvm-svn: 144507
2011-11-13 22:05:42 +00:00
Chandler Carruth
7e39d0c643 Enhance the assertion mechanisms in place to make it easier to catch
when we fail to place all the blocks of a loop. Currently this is
happening for unnatural loops, and this logic helps more immediately
point to the problem.

llvm-svn: 144504
2011-11-13 21:39:51 +00:00
Jakob Stoklund Olesen
9b34607bdf Rename SlotIndexes to match how they are used.
The old naming scheme (load/use/def/store) can be traced back to an old
linear scan article, but the names don't match how slots are actually
used.

The load and store slots are not needed after the deferred spill code
insertion framework was deleted.

The use and def slots don't make any sense because we are using
half-open intervals as is customary in C code, but the names suggest
closed intervals.  In reality, these slots were used to distinguish
early-clobber defs from normal defs.

The new naming scheme also has 4 slots, but the names match how the
slots are really used.  This is a purely mechanical renaming, but some
of the code makes a lot more sense now.

llvm-svn: 144503
2011-11-13 20:45:27 +00:00
Chandler Carruth
bbc3bddb16 Teach MBP to force-merge layout successors for blocks with unanalyzable
branches that also may involve fallthrough. In the case of blocks with
no fallthrough, we can still re-order the blocks profitably. For example
instruction decoding will in some cases continue past an indirect jump,
making laying out its most likely successor there profitable.

Note, no test case. I don't know how to write a test case that exercises
this logic, but it matches the described desired semantics in
discussions with Jakob and others. If anyone has a nice example of IR
that will trigger this, that would be lovely.

Also note, there are still assertion failures in real world code with
this. I'm digging into those next, now that I know this isn't the cause.

llvm-svn: 144499
2011-11-13 12:17:28 +00:00
Chandler Carruth
07f8871a86 Hoist another gross nested loop into a helper method.
llvm-svn: 144498
2011-11-13 11:42:26 +00:00
Chandler Carruth
e85aeac703 Add a missing doxygen comment for a helper method.
llvm-svn: 144497
2011-11-13 11:34:55 +00:00
Chandler Carruth
535c0683ea Hoist a nested loop into its own method.
llvm-svn: 144496
2011-11-13 11:34:53 +00:00
Chandler Carruth
e67c92282f Rewrite #3 of machine block placement. This is based somewhat on the
second algorithm, but only loosely. It is more heavily based on the last
discussion I had with Andy. It continues to walk from the inner-most
loop outward, but there is a key difference. With this algorithm we
ensure that as we visit each loop, the entire loop is merged into
a single chain. At the end, the entire function is treated as a "loop",
and merged into a single chain. This chain forms the desired sequence of
blocks within the function. Switching to a single algorithm removes my
biggest problem with the previous approaches -- they had different
behavior depending on which system triggered the layout. Now there is
exactly one algorithm and one basis for the decision making.

The other key difference is how the chain is formed. This is based
heavily on the idea Andy mentioned of keeping a worklist of blocks that
are viable layout successors based on the CFG. Having this set allows us
to consistently select the best layout successor for each block. It is
expensive though.

The code here remains very rough. There is a lot that needs to be done
to clean up the code, and to make the runtime cost of this pass much
lower. Very much WIP, but this was a giant chunk of code and I'd rather
folks see it sooner than later. Everything remains behind a flag of
course.

I've added a couple of tests to exercise the issues that this iteration
was motivated by: loop structure preservation. I've also fixed one test
that was exhibiting the broken behavior of the previous version.

llvm-svn: 144495
2011-11-13 11:20:44 +00:00
NAKAMURA Takumi
fca0bc8d54 Prune more RALinScan. RALinScan was also here!
llvm-svn: 144487
2011-11-13 01:33:10 +00:00
Jakob Stoklund Olesen
989b255462 More dead code elimination in VirtRegMap.
This thing is looking a lot like a virtual register map now.

llvm-svn: 144486
2011-11-13 01:23:34 +00:00
Jakob Stoklund Olesen
eb77ff7f7d Stop tracking spill slot uses in VirtRegMap.
Nobody cared, StackSlotColoring scans the instructions to find used stack
slots.

llvm-svn: 144485
2011-11-13 01:23:30 +00:00
Jakob Stoklund Olesen
edaed81556 Remove dead code and data from VirtRegMap.
Most of this stuff was supporting the old deferred spill code insertion
mechanism.  Modern spillers just edit machine code in place.

llvm-svn: 144484
2011-11-13 01:02:04 +00:00
Jakob Stoklund Olesen
27c2431c70 Stop tracking unused registers in VirtRegMap.
The information was only used by the register allocator in
StackSlotColoring.

llvm-svn: 144482
2011-11-13 00:39:45 +00:00
Jakob Stoklund Olesen
3eaaa93104 Remove the -color-ss-with-regs option.
It was off by default.

The new register allocators don't have the problems that made it
necessary to reallocate registers during stack slot coloring.

llvm-svn: 144481
2011-11-13 00:31:23 +00:00
Jakob Stoklund Olesen
c7672e78ef Delete VirtRegRewriter.
And there was much rejoicing.

llvm-svn: 144480
2011-11-13 00:16:01 +00:00
Jakob Stoklund Olesen
b54c411fc1 Switch PBQP to VRM's trivial rewriter.
The very complicated VirtRegRewriter is going away.

llvm-svn: 144479
2011-11-13 00:02:24 +00:00
Jakob Stoklund Olesen
5a265aeb70 Delete the old spilling framework from LiveIntervalAnalysis.
This is dead code, all register allocators use InlineSpiller.

llvm-svn: 144478
2011-11-12 23:57:05 +00:00
Jakob Stoklund Olesen
d0ddec5771 Delete the 'standard' spiller which used the old spilling framework.
The current register allocators all use the inline spiller.

llvm-svn: 144477
2011-11-12 23:29:02 +00:00
Jakob Stoklund Olesen
92abfb4cd7 Switch PBQP to the modern InlineSpiller framework.
It is worth noting that the old spiller would split live ranges around
basic blocks. The new spiller doesn't do that.

PBQP should do its own live range splitting with
SplitEditor::splitSingleBlock() if desired.  See
RAGreedy::tryBlockSplit().

llvm-svn: 144476
2011-11-12 23:17:52 +00:00
Jakob Stoklund Olesen
78902f9088 Delete the linear scan register allocator.
RegAllocGreedy has been the default for six months now.

Deleting RegAllocLinearScan makes it possible to also delete
VirtRegRewriter and clean up the spiller code.

llvm-svn: 144475
2011-11-12 22:39:45 +00:00
Rafael Espindola
5f6b14719f The dwarf standard says that the only differences between an out-of-line
instance and a concrete inlined instance are the use of DW_TAG_subprogram
instead of DW_TAG_inlined_subroutine and who owns the tree.

We were also omitting DW_AT_inline from the abstract roots. To fix this,
make sure we mark abstract instance roots with DW_AT_inline even when
we have only out-of-line instances referring to them with DW_AT_abstract_origin.

FileCheck is not a very good tool for tests like this, maybe we should add
a -verify mode to llvm-dwarfdump.

llvm-svn: 144441
2011-11-12 01:57:54 +00:00
Eli Friedman
8563e57e38 Don't try to form pre/post-indexed loads/stores until after LegalizeDAG runs. Fixes PR11029.
llvm-svn: 144438
2011-11-12 00:35:34 +00:00
Eli Friedman
e1ea21fd5d Some cleanup and bulletproofing for node replacement in LegalizeDAG. To maintain LegalizeDAG invariants, whenever a node is replaced, we must attempt to delete it, and if it still
has uses after it is replaced (which can happen in rare cases due to CSE), we must revisit it.

llvm-svn: 144432
2011-11-11 23:58:27 +00:00
Nicolas Geoffray
98ef58406c Add a custom safepoint method, in order for language implementers to decide which machine instruction gets to be a safepoint.
llvm-svn: 144399
2011-11-11 18:32:52 +00:00
Eric Christopher
6b4d25e3f1 Initialize variable.
llvm-svn: 144360
2011-11-11 03:16:32 +00:00
Eric Christopher
8b9a227517 If we have a DIE with an AT_specification use that instead of the normal
addr DIE when adding to the dwarf accelerator tables.

llvm-svn: 144354
2011-11-11 01:55:22 +00:00
Rafael Espindola
e7024f983a Check in getOrCreateSubprogramDIE if a declaration exists and if so output
it first.

This is a more general fix to pr11300.

llvm-svn: 144324
2011-11-10 22:34:29 +00:00
Eric Christopher
204348c454 Make types and namespaces take multiple DIEs for the accelerator tables
as well.

llvm-svn: 144319
2011-11-10 21:47:55 +00:00
Eric Christopher
71b5af9cbe Move type handling to make sure we get all created types that aren't
forward decls and have names into the dwarf accelerator types table.

llvm-svn: 144306
2011-11-10 19:52:58 +00:00
Eric Christopher
60be243dc5 Rework adding function names to the dwarf accelerator tables, allow
multiple dies per function and support C++ basenames.

llvm-svn: 144304
2011-11-10 19:25:34 +00:00
Evan Cheng
4760ff0763 Use a bigger hammer to fix PR11314 by disabling the "forcing two-address
instruction lower optimization" in the pre-RA scheduler.

The optimization, rather the hack, was done before MI use-list was available.
Now we should be able to implement it in a better way, perhaps in the
two-address pass until a MI scheduler is available.

Now that the scheduler has to backtrack to handle call sequences, adding
artificial scheduling constraints is just not safe. Furthermore, the hack
is not taking all the other scheduling decisions into consideration so it's just
as likely to pessimize code. So I view disabling this optimization as goodness
regardless of PR11314.

llvm-svn: 144267
2011-11-10 07:43:16 +00:00
Jakob Stoklund Olesen
bc48cd34b6 Strip old implicit operands after foldMemoryOperand.
The TII.foldMemoryOperand hook preserves implicit operands from the
original instruction.  This is not what we want when those implicit
operands refer to the register being spilled.

Implicit operands referring to other registers are preserved.

This fixes PR11347.

llvm-svn: 144247
2011-11-10 00:17:03 +00:00
Eli Friedman
b01f15653c Add check so we don't try to perform an impossible transformation. Fixes issue from PR11319.
llvm-svn: 144216
2011-11-09 22:25:12 +00:00
Benjamin Kramer
71e3909e9c Add comments.
llvm-svn: 144194
2011-11-09 18:16:11 +00:00
Duncan Sands
2934a0eaeb Speculatively revert commit 144124 (djg) in the hope that the 32 bit
dragonegg self-host buildbot will recover (it is complaining about object
files differing between different build stages).  Original commit message:

Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling sequence
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.

llvm-svn: 144188
2011-11-09 14:20:48 +00:00
Benjamin Kramer
64e040aef6 Take advantage of the zero byte in StringMap when emitting dwarf stringpool entries.
llvm-svn: 144184
2011-11-09 12:12:04 +00:00
Devang Patel
d5aa583bf8 Remove extra ';'
llvm-svn: 144172
2011-11-09 06:20:49 +00:00
Eric Christopher
c916a28d7f Remove the pubnames section, no one consumes it.
llvm-svn: 144169
2011-11-09 05:24:07 +00:00
Jakob Stoklund Olesen
1239fed1e2 Collapse DomainValues across loop back-edges.
During the initial RPO traversal of the basic blocks, remember the ones
that are incomplete because of back-edges from predecessors that haven't
been visited yet.

After the initial RPO, revisit all those loop headers so the incoming
DomainValues on the back-edges can be properly collapsed.

This will properly fix execution domains on software pipelined code,
like the included test case.

llvm-svn: 144151
2011-11-09 01:06:56 +00:00
Jakob Stoklund Olesen
06121a40f3 Link to the live DomainValue after merging.
When merging two uncollapsed DomainValues, place a link to the active
DomainValue from the passive DomainValue.  This allows old stale
references to the passive DomainValue to be updated to point to the
active DomainValue.

The new resolve() function finds the active DomainValue and updates the
pointer.

This change makes old live-out lists more useful since they may contain
uncollapsed DomainValues that have since been merged into other
DomainValues.

llvm-svn: 144149
2011-11-09 00:06:18 +00:00
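The link-following idea in r144149 above can be sketched as below. This is a simplified illustration, not the committed code: the type `LinkedValue` and its `Next` field are hypothetical stand-ins, and reference counting is deliberately left out.

```cpp
// Hypothetical stand-in for a DomainValue that may have been merged.
struct LinkedValue {
  // Non-null once this value has been merged into another (active) value.
  LinkedValue *Next = nullptr;
};

// Follow the links from a possibly stale pointer to the active value,
// update the caller's pointer in place, and return the active value.
LinkedValue *resolve(LinkedValue *&DV) {
  LinkedValue *Live = DV;
  while (Live && Live->Next)
    Live = Live->Next;
  DV = Live; // the stale reference now points at the active value
  return Live;
}
```

With this shape, an old live-out list can keep pointers to merged values and fix them up lazily the next time they are read.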
Jakob Stoklund Olesen
2567efc0d4 Track reference count independently from clear().
This allows clear() to be called on a DomainValue with references.

llvm-svn: 144147
2011-11-08 23:26:00 +00:00
Jakob Stoklund Olesen
b1238449b6 Call release() directly when cleaning up the remaining DomainValues.
There is no need to involve the LiveRegs array and kill() any longer.

llvm-svn: 144133
2011-11-08 22:05:17 +00:00
Jakob Stoklund Olesen
e9a83f1bba Rename all methods to follow style guide.
No functional change.

llvm-svn: 144132
2011-11-08 21:57:47 +00:00
Jakob Stoklund Olesen
e908f69bcb Handle reference counts in one function: release().
This new function will decrement the reference count, and collapse a
domain value when the last reference is gone.

This simplifies DomainValue reference counting, and decouples it from
the LiveRegs array.

llvm-svn: 144131
2011-11-08 21:57:44 +00:00
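A minimal sketch of the single-function reference counting described in r144131 above, assuming a plain counter and a collapse step that just sets a flag. `CountedValue`, `Refs`, `Collapsed` and `collapse` are hypothetical names used for illustration only.

```cpp
#include <cassert>

// Hypothetical stand-in for a reference-counted domain value.
struct CountedValue {
  unsigned Refs = 0;      // number of outstanding references
  bool Collapsed = false; // set once the value has been collapsed
};

// Placeholder for the real collapse logic.
void collapse(CountedValue &DV) { DV.Collapsed = true; }

// Drop one reference; collapse the value when the last one goes away.
void release(CountedValue *DV) {
  assert(DV && DV->Refs && "releasing a value with no references");
  if (--DV->Refs == 0)
    collapse(*DV);
}
```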
Eric Christopher
eba4abd2aa Also add the linkage name to the name accelerator tables if it exists
and is different than the normal name.

llvm-svn: 144130
2011-11-08 21:56:23 +00:00
Dan Gohman
b6cf7c4e94 Add a hack to the scheduler to disable pseudo-two-address dependencies in
basic blocks containing calls. This works around a problem in which
these artificial dependencies can get tied up in calling sequence
scheduling in a way that makes the graph unschedulable with the current
approach of using artificial physical register dependencies for calling
sequences. This fixes PR11314.

llvm-svn: 144124
2011-11-08 21:29:06 +00:00
Jakob Stoklund Olesen
4cc5284596 Clear old DomainValue after merging.
The old value may still be referenced by some live-out list, and we
don't want to collapse those instructions twice.

This fixes the "Can only swizzle VMOVD" assertion in some armv7 SPEC
builds.

<rdar://problem/10413292>

llvm-svn: 144117
2011-11-08 20:57:04 +00:00
Eric Christopher
9fc74dc3f8 Add the base ObjC method name to the names lookup table as well.
llvm-svn: 144105
2011-11-08 19:16:01 +00:00
Lang Hames
ee7de1cff0 Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported.
Add support for trimming constants to GetDemandedBits. This fixes some funky
constant generation that occurs when stores are expanded for targets that don't
support unaligned stores natively.

llvm-svn: 144102
2011-11-08 18:56:23 +00:00
Pete Cooper
224434deec Added invariant field to the DAG.getLoad method and changed all calls.
When this field is true it means that the load is from a constant (run-time or compile-time) and so can be hoisted from loops or moved around other memory accesses.

llvm-svn: 144100
2011-11-08 18:42:53 +00:00
Eric Christopher
49aa01035b A few more places where we can avoid multiple size queries.
llvm-svn: 144099
2011-11-08 18:38:40 +00:00
Eric Christopher
d7f22c64e0 Don't evaluate Data.size() on every iteration.
llvm-svn: 144095
2011-11-08 18:22:25 +00:00
Eli Friedman
741d364aa9 Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
Re-commit of r144034, with an extra fix so that RemoveDeadNode doesn't blow up.

llvm-svn: 144055
2011-11-08 01:25:24 +00:00
Eli Friedman
8d138bf571 Revert r144034 while I try to track down a crash.
llvm-svn: 144044
2011-11-07 23:53:20 +00:00
Bill Wendling
93c08673af This code is dead, what with the new EH model and the auto-upgraders in place.
Delete!

llvm-svn: 144043
2011-11-07 23:36:48 +00:00
Jakob Stoklund Olesen
9380d5daff Kill and collapse outstanding DomainValues.
DomainValues that are only used by "don't care" instructions are now
collapsed to the first possible execution domain after all basic blocks
have been processed.  This typically means the PS domain on x86.

For example, the vsel_i64 and vsel_double functions in sse2-blend.ll are
completely collapsed to the PS domain instead of containing a mix of
execution domains created by isel.

llvm-svn: 144037
2011-11-07 23:08:21 +00:00
Eli Friedman
c1bb1b2b09 Add a bunch of calls to RemoveDeadNode in LegalizeDAG, so legalization doesn't get confused by CSE later on. Fixes PR11318.
llvm-svn: 144034
2011-11-07 22:51:10 +00:00
Eric Christopher
37cd1659cb Add all completed and named types to the dwarf type accelerator tables.
llvm-svn: 144027
2011-11-07 22:11:16 +00:00
Jakob Stoklund Olesen
c1846a4d5e Use a reverse post order instead of a DFS order.
The enterBasicBlock() function is combining live-out values from
predecessor blocks.  The RPO traversal means that more predecessors
have been visited when that happens, only back-edges are missing.

llvm-svn: 144025
2011-11-07 21:59:29 +00:00
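To illustrate the property r144025 above relies on: a reverse post order visits a block only after all of its predecessors, except those reached through back-edges. The sketch below computes an RPO over a CFG given as successor lists indexed by block number; the representation and the names `postOrder` and `reversePostOrder` are assumptions made for the example, not LLVM's iterators.

```cpp
#include <algorithm>
#include <vector>

// Depth-first walk that appends each block after all of its successors.
static void postOrder(int Block, const std::vector<std::vector<int>> &Succs,
                      std::vector<bool> &Seen, std::vector<int> &Out) {
  Seen[Block] = true;
  for (int S : Succs[Block])
    if (!Seen[S])
      postOrder(S, Succs, Seen, Out);
  Out.push_back(Block);
}

// Reverse post order of the blocks reachable from Entry.
std::vector<int> reversePostOrder(const std::vector<std::vector<int>> &Succs,
                                  int Entry) {
  std::vector<bool> Seen(Succs.size(), false);
  std::vector<int> Order;
  postOrder(Entry, Succs, Seen, Order);
  std::reverse(Order.begin(), Order.end());
  return Order;
}
```

For a CFG with edges 0->1, 1->2, 1->3 and a back-edge 2->1, block 1 is visited before block 2, so the only live-out information missing when entering block 1 is the one arriving over the back-edge.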
Eric Christopher
cc8024b134 Move the hash function to using and taking a StringRef.
llvm-svn: 144024
2011-11-07 21:49:35 +00:00
Eric Christopher
e655ddfde1 Simple destructor to delete the hash data we created earlier.
llvm-svn: 144023
2011-11-07 21:49:28 +00:00
Jakob Stoklund Olesen
d9a8ce3f67 Extract two methods. No functional change.
llvm-svn: 144020
2011-11-07 21:40:27 +00:00
Jakob Stoklund Olesen
7b9ab07c3d MBB doesn't need to be a class member.
llvm-svn: 144015
2011-11-07 21:23:42 +00:00
Jakob Stoklund Olesen
1d5ebd6c03 Fix pass name after the source was moved.
llvm-svn: 144014
2011-11-07 21:23:39 +00:00
Eric Christopher
2b26dd515a Use StringRef::startswith to do some string comparisons.
llvm-svn: 143982
2011-11-07 18:53:23 +00:00
Eric Christopher
1132780088 Avoid the use of a local temporary for comment twines.
llvm-svn: 143974
2011-11-07 18:34:47 +00:00
Eric Christopher
27787a743e Allow for the case where the name of the subprogram is "".
Fixes a self-host error.

llvm-svn: 143970
2011-11-07 18:10:17 +00:00
Richard Osborne
87ed868306 Don't introduce custom nodes after legalization in TargetLowering::BuildSDIV()
and TargetLowering::BuildUDIV(). Fixes PR11283

llvm-svn: 143964
2011-11-07 17:09:05 +00:00
Eric Christopher
fddc6980b7 Remove unnecessary addition to API. Replace with something much simpler.
llvm-svn: 143925
2011-11-07 09:38:42 +00:00
Eric Christopher
9374b7505a Add new files to cmake.
llvm-svn: 143924
2011-11-07 09:37:06 +00:00
Eric Christopher
f9c4db49bd Add the support code to enable the dwarf accelerator tables. Upcoming patches
to fix the types section (all types, not just global types), and testcases.

The code to do the final emission is disabled by default.

llvm-svn: 143923
2011-11-07 09:24:32 +00:00
Eric Christopher
1635d9449b Add a new dwarf accelerator table prototype with the goal of replacing
the pubnames and pubtypes tables. LLDB can currently use this format
and a full spec is forthcoming and submission for standardization is planned.

A basic summary:

The dwarf accelerator tables are an indirect hash table optimized
for null lookup rather than access to known data. They are output into
an on-disk format that looks like this:

.-------------.
|  HEADER     |
|-------------|
|  BUCKETS    |
|-------------|
|  HASHES     |
|-------------|
|  OFFSETS    |
|-------------|
|  DATA       |
`-------------'

where the header contains a magic number, version, type of hash function,
the number of buckets, total number of hashes, and room for a special
struct of data and the length of that struct.

The buckets contain an index (e.g. 6) into the hashes array. The hashes
section contains all of the 32-bit hash values in contiguous memory, and
the offsets contain the offset into the data area for the particular
hash.

For a lookup example, we could hash a function name and take it modulo the
number of buckets giving us our bucket. From there we take the bucket value
as an index into the hashes table and look at each successive hash as long
as the hash value is still the same modulo result (bucket value) as earlier.
If we have a match we look at that same entry in the offsets table and
grab the offset in the data for our final match.

llvm-svn: 143921
2011-11-07 09:18:42 +00:00
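The lookup walk described in r143921 above can be sketched as follows. This is an in-memory illustration under assumptions, not the on-disk format or the committed code: the struct `AccelTableSketch`, the function `lookupOffset`, and the `EmptyBucket` sentinel for buckets with no hashes are all hypothetical, and the caller is assumed to have hashed the name already.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed marker for a bucket that contains no hashes.
const uint32_t EmptyBucket = UINT32_MAX;

// Hypothetical in-memory view of the BUCKETS / HASHES / OFFSETS sections.
struct AccelTableSketch {
  std::vector<uint32_t> Buckets; // index of the bucket's first hash
  std::vector<uint32_t> Hashes;  // all hashes, grouped by bucket
  std::vector<uint32_t> Offsets; // Offsets[i] is the data offset for Hashes[i]
};

// Returns true and sets Offset if Hash is present in the table.
bool lookupOffset(const AccelTableSketch &T, uint32_t Hash, uint32_t &Offset) {
  if (T.Buckets.empty())
    return false;
  const uint32_t NumBuckets = static_cast<uint32_t>(T.Buckets.size());
  const uint32_t Bucket = Hash % NumBuckets;

  uint32_t I = T.Buckets[Bucket];
  if (I == EmptyBucket)
    return false; // fast rejection: nothing hashes to this bucket

  // Walk successive hashes while they still belong to the same bucket.
  for (; I < T.Hashes.size() && T.Hashes[I] % NumBuckets == Bucket; ++I) {
    if (T.Hashes[I] == Hash) {
      Offset = T.Offsets[I]; // same index in the offsets table
      return true;
    }
  }
  return false;
}
```

A hash match only says the data entry is a candidate; a real consumer would still compare the name stored at the returned offset, since distinct names can share a 32-bit hash.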
Eric Christopher
64ea0f378b Expose a way to get the beginning of the dwarf string section.
llvm-svn: 143920
2011-11-07 09:18:38 +00:00
Eric Christopher
e0fc702988 Fix up comment.
llvm-svn: 143919
2011-11-07 09:18:35 +00:00
Eric Christopher
07bba823ab Typo.
llvm-svn: 143918
2011-11-07 09:18:32 +00:00
Benjamin Kramer
4c8932e3b8 Add an option to pad an uleb128 to MCObjectWriter and remove the uleb128 encoding from the DWARF asm printer.
As a side effect we now print dwarf ulebs with .ascii directives.

llvm-svn: 143809
2011-11-05 11:52:44 +00:00
Benjamin Kramer
3c2ba1a51a Add more PRI.64 macros for MSVC and use them throughout the codebase.
llvm-svn: 143799
2011-11-05 08:57:40 +00:00
Pete Cooper
c76608c80c Added missing &. Fixes <rdar://problem/10393723>
llvm-svn: 143753
2011-11-04 23:49:14 +00:00
Rafael Espindola
a022f4813e Emit declarations before definitions if they are available. This causes DW_AT_specification to
point back in the file in the included testcase. Fixes PR11300.

llvm-svn: 143726
2011-11-04 19:00:29 +00:00
Dan Gohman
a5f382da8b Reapply r143206, with fixes. Disallow physical register lifetimes
across calls, and only check for nested dependences on the special
call-sequence-resource register.

llvm-svn: 143660
2011-11-03 21:49:52 +00:00
Pete Cooper
ad3d5b2eee Reverted r143600 - selector reference change
llvm-svn: 143646
2011-11-03 20:47:50 +00:00
Daniel Dunbar
3760ebeebb build: Add initial cut at LLVMBuild.txt files.
llvm-svn: 143634
2011-11-03 18:53:17 +00:00
Pete Cooper
c8a657a2b2 Treat objc selector reference globals as invariant so that MachineLICM can hoist them out of loops. Fixes <rdar://problem/6027699>
llvm-svn: 143600
2011-11-03 00:56:36 +00:00
Bill Wendling
4001685a66 An array of chars of length 8 will also cause the stack protector to be inserted
into the function. Reflect that here so that the array will be placed next to
the SP.
<rdar://problem/10128329>

llvm-svn: 143590
2011-11-02 23:20:58 +00:00
Nick Lewycky
691d7f80c2 Don't emit a directory entry for the value in DW_AT_comp_dir, that is always
implied by directory index zero.

llvm-svn: 143570
2011-11-02 20:55:33 +00:00
Chandler Carruth
f95461f23b Begin collecting some of the statistics for block placement discussed on
the mailing list. Suggestions for other statistics to collect would be
awesome. =]

Currently these are implemented as a separate pass guarded by a separate
flag. I'm not thrilled by that, but I wanted to be able to collect the
statistics for the old code placement as well as the new in order to
have a point of comparison. I'm planning on folding them into the single
pass if / when there is only one pass of interest.

llvm-svn: 143537
2011-11-02 07:17:12 +00:00
Jakob Stoklund Olesen
7b1107fe0e Update split candidate correctly when interference cache is full.
No test case, spotted by inspection.

llvm-svn: 143407
2011-11-01 00:02:31 +00:00
Nadav Rotem
7bfd1f069d Cleanup. Document. Make sure that this build_vector optimization only runs before the op legalizer and that the used type is legal.
llvm-svn: 143358
2011-10-31 20:08:25 +00:00
Benjamin Kramer
eb62811647 Silence compiler warning.
llvm-svn: 143308
2011-10-30 08:39:55 +00:00
Nadav Rotem
6c79131e39 Add a new DAGCombine optimization for BUILD_VECTOR.
If all of the inputs are zero/any_extended, create a new simple BV
which can be further optimized by other BV optimizations.

llvm-svn: 143297
2011-10-29 21:23:04 +00:00
Dan Gohman
826cec9a4b Revert r143206, as there are still some failing tests.
llvm-svn: 143262
2011-10-29 00:41:52 +00:00