1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00
Commit Graph

55924 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
40eb30013e Avoid folding ADD instructions with FI operands.
PEI can't handle the pseudo-instructions. This can be removed when the
pseudo-instructions are replaced by normal predicated instructions.

Fixes PR13628.

llvm-svn: 162130
2012-08-17 20:55:34 +00:00
Akira Hatanaka
4e1b032521 Add stub methods for mips assembly matcher.
Patch by Vladimir Medic.

llvm-svn: 162124
2012-08-17 20:16:42 +00:00
Benjamin Kramer
4e9e4d1818 MemoryBuiltins: Properly guard ObjectSizeOffsetVisitor against cycles in the IR.
The previous fix only checked for simple cycles, use a set to catch longer
cycles too.

Drop the broken check from the ObjectSizeOffsetEvaluator. The BoundsChecking
pass doesn't have to deal with invalid IR like InstCombine does.

llvm-svn: 162120
2012-08-17 19:26:41 +00:00
Bill Wendling
0569e9a6f3 Change the linker_private_weak_def_auto' linkage to linkonce_odr_auto_hide' to
make it more consistent with its intended semantics.

The `linker_private_weak_def_auto' linkage type was meant to automatically hide
globals which never had their addresses taken. It has nothing to do with the
`linker_private' linkage type, which outputs the symbols with a `l' (ell) prefix
among other things.

The intended semantic is more like the `linkonce_odr' linkage type.

Change the name of the linkage type to `linkonce_odr_auto_hide'. And therefore
changing the semantics so that it produces the correct output for the linker.

Note: The old linkage name `linker_private_weak_def_auto' will still parse but
is not a synonym for `linkonce_odr_auto_hide'. This should be removed in 4.0.
<rdar://problem/11754934>

llvm-svn: 162114
2012-08-17 18:33:14 +00:00
Rafael Espindola
fccad41366 Assert that dominates is not given a multiple edge. Finding out if we have
multiple edges between two blocks is linear. If the caller is iterating all
edges leaving a BB that would be a square time algorithm. It is more efficient
to have the callers handle that case.

Currently the only callers are:
* GVN: already avoids the multiple edge case.
* Verifier: could only hit this assert when looking at an invalid invoke. Since
it already rejects the invoke, just avoid computing the dominance for it.

llvm-svn: 162113
2012-08-17 18:21:28 +00:00
Jakob Stoklund Olesen
36d81e300e Add comment, clean up code. No functional change.
llvm-svn: 162107
2012-08-17 16:59:09 +00:00
Benjamin Kramer
ba78a8432b TargetLowering: Use the large shift amount during legalize types. The legalizer may call us with an overly large type.
llvm-svn: 162101
2012-08-17 15:54:21 +00:00
Jakob Stoklund Olesen
476a5d42a7 Use standard pattern for iterate+erase.
Increment the MBB iterator at the top of the loop to properly handle the
current (and previous) instructions getting erased.

This fixes PR13625.

llvm-svn: 162099
2012-08-17 14:38:59 +00:00
Benjamin Kramer
d431f3a1f2 Guard MemoryBuiltins against self-looping GEPs, which can occur in unreachable code due to constant propagation.
Fixes PR13621.

llvm-svn: 162098
2012-08-17 14:16:37 +00:00
Tim Northover
1de091468c Implement NEON domain switching for scalar <-> S-register vmovs on ARM
llvm-svn: 162094
2012-08-17 11:32:52 +00:00
Craig Topper
efc1bf9ee1 Use nested switch to select arguments to reduce calls to EmitPCMP.
llvm-svn: 162089
2012-08-17 07:15:56 +00:00
Craig Topper
8fa010b216 Make ReplaceATOMIC_BINARY_64 a static function. Use a nested switch to reduce to only a single call to it thus allowing it to be inlined by the compiler.
llvm-svn: 162088
2012-08-17 06:55:11 +00:00
Craig Topper
117916e06d Remove unnecessary include of ARMGenInstrInfo.inc.
llvm-svn: 162086
2012-08-17 06:21:09 +00:00
Jakob Stoklund Olesen
88217b055d Add ADD and SUB to the predicable ARM instructions.
It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.

Then the pseudo-instructions can go away.

llvm-svn: 162061
2012-08-16 23:21:55 +00:00
Jakob Stoklund Olesen
aca66722c2 Handle ARM MOVCC optimization in PeepholeOptimizer.
Use the target independent select analysis hooks.

llvm-svn: 162060
2012-08-16 23:14:20 +00:00
Jakob Stoklund Olesen
babff4afdb Add an MCID::Select flag and TII hooks for optimizing selects.
Select instructions pick one of two virtual registers based on a
condition, like x86 cmov. On targets like ARM that support predication,
selects can sometimes be eliminated by predicating the instruction
defining one of the operands.

Teach PeepholeOptimizer to recognize select instructions, and ask the
target to optimize them.

llvm-svn: 162059
2012-08-16 23:11:47 +00:00
Roman Divacky
b95259c849 Revert r162034, r162035 and r162037.
llvm-svn: 162039
2012-08-16 19:07:59 +00:00
Roman Divacky
831ddb548a Define and handle additional fixup kinds. By Adhemerval Zanella.
llvm-svn: 162037
2012-08-16 18:37:52 +00:00
Roman Divacky
3a41549e6a Fix typo and grammar. By Adhemerval Zanella.
llvm-svn: 162032
2012-08-16 18:19:29 +00:00
Rafael Espindola
d7ec990084 Teach GVN to reason about edges dominating uses. This allows it to handle cases
where some fact lake a=b dominates a use in a phi, but doesn't dominate the
basic block itself.

This feature could also be implemented by splitting critical edges, but at least
with the current algorithm reasoning about the dominance directly is faster.

The time for running "opt -O2" in the testcase in pr10584 is 1.003 times slower
and on gcc as a single file it is 1.0007 times faster.

llvm-svn: 162023
2012-08-16 15:09:43 +00:00
Jush Lu
767c82d4e0 [arm-fast-isel] Add support for fastcc.
Without fastcc support, the caller just falls through to CallingConv::C
for fastcc, but callee still uses fastcc, this inconsistency of calling
convention is a problem, and fastcc support can fix it.

llvm-svn: 162013
2012-08-16 05:15:53 +00:00
Anitha Boyapati
161fc750a1 Patch to enable FMA on bdver2 target. Make XOP feature enable FMA4 as well.
llvm-svn: 162012
2012-08-16 04:04:02 +00:00
Anitha Boyapati
5443ee0d76 (no commit message)
llvm-svn: 162010
2012-08-16 03:50:04 +00:00
Akira Hatanaka
623a561154 Add Android ABI to Mips backend to handle functions returning vectors of four
floats.

llvm-svn: 162008
2012-08-16 03:48:05 +00:00
Jakob Stoklund Olesen
55aee8b58a Fold predicable instructions into MOVCC / t2MOVCC.
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.

This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.

llvm-svn: 161994
2012-08-15 22:16:39 +00:00
Bill Wendling
1124a2da20 Remove dead flag.
llvm-svn: 161990
2012-08-15 21:18:10 +00:00
Sean Callanan
de18bba4c8 Fixed a problem in the JIT memory allocator where
allocations of executable memory would not be padded
to account for the size of the allocation header.
This resulted in undersized allocations, meaning that
when the allocation was written to later the next
allocation's header would be corrupted.

llvm-svn: 161984
2012-08-15 20:53:52 +00:00
Michael J. Spencer
b1010e4913 Properly test the LLVM_USE_RVALUE_REFERENCES macro.
llvm-svn: 161978
2012-08-15 19:16:27 +00:00
Michael J. Spencer
c500bff4cd [PathV2] Add mapped_file_region. Implementation for Windows and POSIX.
llvm-svn: 161976
2012-08-15 19:05:47 +00:00
Owen Anderson
903f25db0a Fix another roundToIntegral bug where very large values could become infinity. Problem and solution identified by Steve Canon.
llvm-svn: 161969
2012-08-15 18:28:45 +00:00
Evan Cheng
625c0ca5ee Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029
llvm-svn: 161962
2012-08-15 17:44:53 +00:00
Owen Anderson
a39979db7d Fix typo in comment.
llvm-svn: 161956
2012-08-15 16:42:53 +00:00
Jakob Stoklund Olesen
6639cea68f Add missing Rfalse operand to the predicated pseudo-instructions.
When predicating this instruction:

  Rd = ADD Rn, Rm

We need an extra operand to represent the value given to Rd when the
predicate is false:

  Rd = ADDCC Rfalse, Rn, Rm, pred

The Rd and Rfalse operands are different registers while in SSA form.
Rfalse is tied to Rd to make sure they get the same register during
register allocation.

Previously, Rd and Rn were tied, but that is not required.

Compare to MOVCC:

  Rd = MOVCC Rfalse, Rtrue, pred

llvm-svn: 161955
2012-08-15 16:17:24 +00:00
Bill Wendling
c0ce2ca906 Set the branch probability of branching to the 'normal' destination of an invoke
instruction to something absurdly high, while setting the probability of
branching to the 'unwind' destination to the bare minimum. This should set cause
the normal destination's invoke blocks to be moved closer to the invoke.

PR13612

llvm-svn: 161944
2012-08-15 12:22:35 +00:00
Kostya Serebryany
8af4402528 [asan] implement --asan-always-slow-path, which is a part of the improvement to handle unaligned partially OOB accesses. See http://code.google.com/p/address-sanitizer/issues/detail?id=100
llvm-svn: 161937
2012-08-15 08:58:58 +00:00
Owen Anderson
3b09e94409 Fix a problem with APFloat::roundToIntegral where it would return incorrect results for negative inputs to trunc. Add unit tests to verify this behavior.
llvm-svn: 161929
2012-08-15 05:39:46 +00:00
Michael Liao
cd290ba4fd fix infinite loop in instcombine with more than 4GB memcpy
- memcpy size is wrongly truncated into 32-bit and treat 8GB memcpy is
  0-sized memcpy
- as 0-sized memcpy/memset is already removed before SimplifyMemTransfer
  and SimplifyMemSet in visitCallInst, replace 0 checking with
  assertions.
- replace getZExtValue() with getLimitedValue() according to
  Eli Friedman

llvm-svn: 161923
2012-08-15 03:49:59 +00:00
Nick Lewycky
486ed972f1 Fix a typo that led to a failure to correctly verify bitcast instructions.
Patch by Stephen Hines!

llvm-svn: 161921
2012-08-15 02:37:07 +00:00
Richard Smith
e3b0fbab8a Fix undefined behavior: don't perform array indexing through a potentially null
pointer.

llvm-svn: 161919
2012-08-15 01:39:31 +00:00
Anton Korobeynikov
d13403fbd1 The names of VFP variants of half-to-float conversion instructions were
reversed. This leads to wrong codegen for float-to-half conversion
intrinsics which are used to support storage-only fp16 type.
NEON variants of same instructions are fine.

llvm-svn: 161907
2012-08-14 23:36:01 +00:00
Eric Christopher
47fee59c73 This needs braces. Spotted by Bill.
llvm-svn: 161906
2012-08-14 23:32:15 +00:00
Michael Liao
f763f96863 minor fix of X86ISD::VSEXT_MOVL dump
llvm-svn: 161902
2012-08-14 22:53:17 +00:00
Michael Liao
daebe04c2f fix PR11334
- FP_EXTEND only support extending from vectors with matching elements.
  This results in the scalarization of extending to v2f64 from v2f32,
  which will be legalized to v4f32 not matching with v2f64.
- add X86-specific VFPEXT supproting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover back the original
  extending from v4f32 to v2f64.
- test case is enhanced to include different vector width.

llvm-svn: 161894
2012-08-14 21:24:47 +00:00
Jim Grosbach
53796945f5 Switch the fixed-length disassembler to be table-driven.
Refactor the TableGen'erated fixed length disassemblmer to use a
table-driven state machine rather than a massive set of nested
switch() statements.

As a result, the ARM Disassembler (ARMDisassembler.cpp) builds much more
quickly and generates a smaller end result. For a Release+Asserts build on
a 16GB 3.4GHz i7 iMac w/ SSD:

Time to compile at -O2 (averaged w/ hot caches):
  Previous: 35.5s
  New:       8.9s

TEXT size:
  Previous: 447,251
  New:      297,661

Builds in 25% of the time previously required and generates code 66% of
the size.

Execution time of the disassembler is only slightly slower (7% disassembling
10 million ARM instructions, 19.6s vs 21.0s). The new implementation has
not yet been tuned, however, so the performance should almost certainly
be recoverable should it become a concern.

llvm-svn: 161888
2012-08-14 19:06:05 +00:00
Owen Anderson
0db4c468c5 Fix the construction of the magic constant for roundToIntegral to be 64-bit safe. Fixes c-torture/execute/990826-0.c
llvm-svn: 161885
2012-08-14 18:51:15 +00:00
Kostya Serebryany
06dbe5559a [asan] insert crash basic blocks inline as opposed to inserting them at the end of the function. This doesn't seem to fix or break anything, but is considered to be more friendly to downstream passes
llvm-svn: 161870
2012-08-14 14:04:51 +00:00
Craig Topper
e7ac4d1df1 Factor duplicate calls to getUNDEF in several functions.
llvm-svn: 161860
2012-08-14 08:18:43 +00:00
Craig Topper
a3795f6791 Re-factor intrinsic lowering to combine common parts of similar intrinsics. Reduces compiled code size a little bit.
llvm-svn: 161859
2012-08-14 07:43:25 +00:00
Craig Topper
6a2fe056ce Change greater than to greater than or equal so that an identical sized store to the same offset is treated as completing overwriting.
llvm-svn: 161857
2012-08-14 07:32:05 +00:00
Richard Smith
2e1fc547c0 Fix undefined behavior: binding null pointer to reference. No functionality change.
llvm-svn: 161853
2012-08-14 05:31:26 +00:00
Nadav Rotem
eb22b069bb During the CodeGenPrepare we often lower intrinsics (such as objsize)
and allow some optimizations to turn conditional branches into unconditional.
This commit adds a simple control-flow optimization which merges two consecutive
basic blocks which are connected by a single edge. This allows the codegen to
operate on larger basic blocks.

rdar://11973998

llvm-svn: 161852
2012-08-14 05:19:07 +00:00
Eric Christopher
36e95a157c Grammar.
llvm-svn: 161851
2012-08-14 05:13:29 +00:00
Eric Christopher
65f6159be4 Typo.
llvm-svn: 161826
2012-08-14 01:09:10 +00:00
Owen Anderson
c5b77c0317 Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
llvm-svn: 161807
2012-08-13 23:32:49 +00:00
Jakob Stoklund Olesen
f49206c2fd Transfer weights in transferSuccessorsAndUpdatePHIs().
llvm-svn: 161805
2012-08-13 23:13:25 +00:00
Jakob Stoklund Olesen
d25a03b9af Print out MachineBasicBlock successor weights when available.
llvm-svn: 161804
2012-08-13 23:13:23 +00:00
Nadav Rotem
a921b121e2 LICM uses AliasSet information to hoist and sink instructions. However, other passes, such as LoopRotate
may invalidate its AliasSet because SSAUpdater does not update the AliasSet properly.
This patch teaches SSAUpdater to notify AliasSet that it made changes.
The testcase in PR12901 is too big to be useful and I could not reduce it to a normal size. 

rdar://11872059 PR12901

llvm-svn: 161803
2012-08-13 23:06:54 +00:00
Nadav Rotem
685bd842e2 MemoryDependenceAnalysis attempts to find the first memory dependency for function calls.
Currently, if GetLocation reports that it did not find a valid pointer (this is the case for volatile load/stores),
we ignore the result. This patch adds code to handle the cases where we did not obtain a valid pointer.

rdar://11872864  PR12899

llvm-svn: 161802
2012-08-13 23:03:43 +00:00
Jakob Stoklund Olesen
33e364a3df Remove the TII::scheduleTwoAddrSource() hook.
It never does anything when running 'make check', and it get's in the
way of updating live intervals in 2-addr.

The hook was originally added to help form IT blocks in Thumb2 code
before register allocation, but the pass ordering has changed since
then, and we run if-conversion after register allocation now.

When the MI scheduler is enabled, there will be no less than two
schedulers between 2-addr and Thumb2ITBlockPass, so this hook is
unlikely to help anything.

llvm-svn: 161794
2012-08-13 21:52:57 +00:00
Manman Ren
159ae3b3bc ARM: enable struct byval for AAPCS-VFP.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 161789
2012-08-13 21:22:50 +00:00
Bill Wendling
59b0b53089 Whitespace cleanup.
llvm-svn: 161788
2012-08-13 21:20:43 +00:00
Jakob Stoklund Olesen
343ed1017e Count triangles and diamonds in early if-conversion.
llvm-svn: 161783
2012-08-13 21:03:27 +00:00
Jakob Stoklund Olesen
0e46305efe Delete dead typedef.
llvm-svn: 161782
2012-08-13 21:03:25 +00:00
Jakob Stoklund Olesen
8e6595639e Handle extra Tail predecessors in if-conversion.
It is still possible to if-convert if the tail block has extra
predecessors, but the tail phis must be rewritten instead of being
removed.

llvm-svn: 161781
2012-08-13 20:49:04 +00:00
Arnold Schwaighofer
dbdb2581b8 [Hexagon] Don't mark callee saved registers as clobbered by a tail call
This was causing unnecessary spills/restores of callee saved registers.

Fixes PR13572.

Patch by Pranav Bhandarkar!

llvm-svn: 161778
2012-08-13 19:54:01 +00:00
Nadav Rotem
03c4d5f036 Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user.
rdar://11876519

llvm-svn: 161775
2012-08-13 18:52:44 +00:00
Manman Ren
cb05c49c64 X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from
OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed
to a memory operand.

PR13576

llvm-svn: 161769
2012-08-13 18:29:41 +00:00
Eric Christopher
3aea549423 Add support for the %H output modifier.
Patch by Weiming Zhao.

llvm-svn: 161768
2012-08-13 18:18:52 +00:00
Manman Ren
c9f5387a5c X86: when auto-detecting the subtarget features, make sure use IsIntel to detect
Nehalem, Westmere and Sandy Bridge. AMD also has processor family 6.

llvm-svn: 161763
2012-08-13 17:26:46 +00:00
Kostya Serebryany
61dd5ba233 [asan] remove the code for --asan-merge-callbacks as it appears to be a bad idea. (partly related to Bug 13225)
llvm-svn: 161757
2012-08-13 14:08:46 +00:00
Tim Northover
b1f8be6cbe Use correct loads for vector types during extending-load operations.
Previously, we used VLD1.32 in all cases, however there are both 16 and 64-bit
accesses being selected, so we need to use an appropriate width load in those
cases.

llvm-svn: 161748
2012-08-13 09:06:31 +00:00
Craig Topper
4fc08044be Tidy up VSETCC lowering code a bit more by adding an llvm_unreachable and putting an a couple if conditions in a better order.
llvm-svn: 161746
2012-08-13 03:42:38 +00:00
Craig Topper
a438ea46bf Refactor code a bit to share commonalities. No functional change intended.
llvm-svn: 161745
2012-08-13 02:34:03 +00:00
Craig Topper
bb92d94049 Fix an unused variable warning from r161742.
llvm-svn: 161743
2012-08-13 01:26:45 +00:00
Craig Topper
1032fcf6da Remove the LowerMMXCONCAT_VECTORS function. It could never execute because there are no legal 64-bit vector types that could be used as inputs to a 128-bit concat_vectors. Remove a target specific SDNode and its patterns that become unused as a result.
llvm-svn: 161742
2012-08-13 01:23:55 +00:00
Nick Lewycky
5178e4f814 When emitting the PC range in an FDE, use the same data encoding for both ends
of the range. Fixes PR13581!

llvm-svn: 161739
2012-08-12 08:09:45 +00:00
Craig Topper
5a5ed2d691 Remove call to setOperationAction for SETCC of v4f32. SETCC returns an integer type not an FP type.
llvm-svn: 161738
2012-08-12 05:31:32 +00:00
Craig Topper
1292e1f43c Remove unnecessary call to setOperationAction for SETCC of v2i64 under SSE42. It was already called for the same under SSE2.
llvm-svn: 161737
2012-08-12 05:15:16 +00:00
Arnold Schwaighofer
c751a25aed Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM
architecture

It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7
thumb O3.

llvm-svn: 161736
2012-08-12 05:11:56 +00:00
Craig Topper
4d9cbceefd Change addTypeForNeon to use MVT instead of EVT so all the calls to getSimpleVT can be removed.
llvm-svn: 161735
2012-08-12 03:16:37 +00:00
Craig Topper
709114d67f Make replace many calls to getSizeInBits() with is128BitVector/is256BitVector
llvm-svn: 161734
2012-08-12 02:23:29 +00:00
Craig Topper
a52fcd0a14 Use MVT.isXBitVector instead of EVT.isXBitVector when setting up operation actions. Compiles to smaller code.
llvm-svn: 161733
2012-08-12 00:34:56 +00:00
Michael Liao
4b95cb463a fix PR13577, an issue introduced by r161687
- FCMOV only supports a subset of X86 conditions. Skip boolean
  simplification if X86 condition is not valid for FCMOV.
- add a minimal test case for PR13577.

llvm-svn: 161732
2012-08-11 23:47:06 +00:00
Craig Topper
93e2521659 Move setOperationAction for CONCAT_VECTORS for 256-bit vectors into loop since all 256-bit types are supported.
llvm-svn: 161730
2012-08-11 22:34:26 +00:00
Benjamin Kramer
28e133c8f2 MachineCSE: Hoist isConstantPhysReg out of the loop, it checks for overlaps already.
llvm-svn: 161729
2012-08-11 20:42:59 +00:00
Benjamin Kramer
b338b22e72 PR13578: Teach MachineCSE that instructions that use a constant register can be CSE'd safely.
This is common e.g. when doing rip-relative addressing on x86_64.

llvm-svn: 161728
2012-08-11 19:05:13 +00:00
Craig Topper
b7f7fa86ec Tidy up indentation. No functional change.
llvm-svn: 161727
2012-08-11 17:53:00 +00:00
Craig Topper
ba0c3ebe9e Fix a cast that was casting away 'const' unnecessarily
llvm-svn: 161726
2012-08-11 17:46:16 +00:00
Craig Topper
3929432178 Add a couple default: llvm_unreachable() to some switch statements. Fix a bad message in an existing llvm_unreachable.
llvm-svn: 161725
2012-08-11 17:44:14 +00:00
Manman Ren
9bd686f936 X86: when we are auto-detecting the subtarget features, make sure we turn on
FeatureFastUAMem for Nehalem, Westmere and Sandy Bridge.

FeatureFastUAMem is already on if we pass in nehalem or westmere as a command
argument.

rdar: 7252306
llvm-svn: 161717
2012-08-10 23:43:32 +00:00
Jakob Stoklund Olesen
887e16b508 Add a proper if-conversion cost model.
Detect when there is not enough available ILP, so if-conversion can't
speculate instructions for free.

Compute the lengthening of the critical path when inserting a select
instruction that depends on the condition as well as both sides of the
if.

Reject conversions that would stretch the critical path by more than
half a mispredict penalty.

llvm-svn: 161713
2012-08-10 22:27:31 +00:00
Jakob Stoklund Olesen
9a004cb5fd Give MachineTraceMetrics its own debug tag.
llvm-svn: 161712
2012-08-10 22:27:29 +00:00
Jakob Stoklund Olesen
e3d7dc035c Add more trace query functions.
Trace::getResourceLength() computes the number of cycles required to
execute the trace when ignoring data dependencies. The number can be
compared to the critical path to estimate the trace ILP.

Trace::getPHIDepth() computes the data dependency depth of a PHI in a
trace successor that isn't necessarily part of the trace.

llvm-svn: 161711
2012-08-10 22:27:27 +00:00
Eli Friedman
449495cd62 The normal edge of an invoke is not allowed to branch to a block with a
landingpad.  Enforce it in the verifier, and fix the regression tests to match.

llvm-svn: 161697
2012-08-10 20:55:20 +00:00
Manman Ren
500d45c3d9 ARM: enable struct byval for AAPCS.
This change is to be enabled in clang.

rdar://9877866
PR://13350

llvm-svn: 161693
2012-08-10 20:39:38 +00:00
Jakob Stoklund Olesen
43da54d467 Add getTPred() and getFPred() functions.
They identify the PHI predecessors in both diamonds and triangles.

llvm-svn: 161689
2012-08-10 20:19:17 +00:00
Jakob Stoklund Olesen
e5fcceb4c3 Include loop-carried dependencies when computing instr heights.
When a trace ends with a back-edge, include PHIs in the loop header in
the height computations. This makes the critical path through a loop
more accurate by including the latencies of the last instructions in the
loop.

llvm-svn: 161688
2012-08-10 20:11:38 +00:00
Michael Liao
97334a5c5f add X86-specific DAG optimization to simplify boolean test
- if a boolean test (X86ISD::CMP or X86ISD:SUB) checks a boolean value
  generated from X86ISD::SETCC, try to simplify the boolean value
  generation and checking by reusing the original EFLAGS with proper
  condition code
- add hooks to X86 specific SETCC/BRCOND/CMOV, the major 3 places
  consuming EFLAGS

part of patches fixing PR12312

llvm-svn: 161687
2012-08-10 19:58:13 +00:00
Rafael Espindola
1f2b548138 Constify some basic blocks, no functionality change.
llvm-svn: 161668
2012-08-10 15:55:25 +00:00
Michael Liao
81be965deb remove tailing whitespaces and test commit
llvm-svn: 161664
2012-08-10 14:39:24 +00:00
Rafael Espindola
522e1d0acf Move BasicBlockEdge to the cpp file. No functionality change.
llvm-svn: 161663
2012-08-10 14:05:55 +00:00
Joerg Sonnenberger
bb0db71d1b stdcxx's cstdio doesn't include stdio.h, but the code using PathV2.inc
includes both. Deal with feof and ferror potentially being macros.

llvm-svn: 161658
2012-08-10 10:56:09 +00:00
Joerg Sonnenberger
f07e1e10a6 Add some missing includes for the build against stdcxx.
llvm-svn: 161657
2012-08-10 10:53:56 +00:00
Pete Cooper
22f2513465 Fix crash when when do lto on Bullet. Dynamic GEPs in SROA were incorrectly being applied to all accesses to an alloca, not just the ones which read from the GEP. Thanks to Evan for reducing the test. rdar://11861001
llvm-svn: 161654
2012-08-10 03:26:36 +00:00
Jakob Stoklund Olesen
52532aa712 Update edge weights correctly in replaceSuccessor().
When replacing Old with New, it can happen that New is already a
successor. Add the old and new edge weights instead of creating a
duplicate edge.

llvm-svn: 161653
2012-08-10 03:23:27 +00:00
Rafael Espindola
8a1cdcb7fc Remove references to compression in llvm-ar. It has been a long time since we
switched from a bytecode+bzip2 to the current bitcode.

llvm-svn: 161651
2012-08-10 01:57:52 +00:00
Jakob Stoklund Olesen
5c0e68ae5a Reapply r161633-161634 "Partition use lists so defs always come before uses.""
No changes to these patches, MRI needed to be notified when changing
uses into defs and vice versa.

llvm-svn: 161644
2012-08-10 00:21:30 +00:00
Jakob Stoklund Olesen
b2f9b8589f Also update MRI use lists when changing a use to a def and vice versa.
This was the cause of the buildbot failures.

llvm-svn: 161643
2012-08-10 00:21:26 +00:00
Chad Rosier
b0454bf13e [ms-inline asm] Add a new Inline Asm Non-Standard Dialect attribute.
This new attribute is intended to be used by the backend to determine how
the inline asm string should be parsed/printed. This patch adds the 
ia_nsdialect attribute and also adds a test case to ensure the IR is
correctly parsed, but there is no functional change at this time.

The standard dialect is assumed to be AT&T.  Therefore, this attribute
should only be added to MS-style inline assembly statements, which use
the Intel dialect.  If we ever support more dialects we'll need to
add additional state to the attribute.

llvm-svn: 161641
2012-08-10 00:00:22 +00:00
Jakob Stoklund Olesen
7680d4d42c Revert r161633-161634 "Partition use lists so defs always come before uses."
These commits broke a number of buildbots.

llvm-svn: 161640
2012-08-09 23:31:36 +00:00
Jakob Stoklund Olesen
9594629bf1 Partition use lists so defs always come before uses.
This makes it possible to speed up def_iterator by stopping at the first
use. This makes def_empty() and getUniqueVRegDef() much faster when
there are many uses.

In a +Asserts build, LiveVariables is 100x faster in one case because
getVRegDef() has an assertion that would scan to the end of a
def_iterator chain.

Spill weight calculation is significantly faster (300x in one case)
because isTriviallyReMaterializable() calls MRI->isConstantPhysReg(%RIP)
which calls def_empty(%RIP).

llvm-svn: 161634
2012-08-09 22:49:46 +00:00
Jakob Stoklund Olesen
58b6c3edfa Don't use pointer-pointers for the register use lists.
Use a more conventional doubly linked list where the Prev pointers form
a cycle. This means it is no longer necessary to adjust the Prev
pointers when reallocating the VRegInfo array.

The test changes are required because the register allocation hint is
using the use-list order to break ties.

llvm-svn: 161633
2012-08-09 22:49:42 +00:00
Jakob Stoklund Olesen
c806311dcf Move use list management into MachineRegisterInfo.
Register MachineOperands are kept in linked lists accessible via MRI's
reg_iterator interfaces. The linked list management was handled partly
by MachineOperand methods, partly by MRI methods.

Move all of the list management into MRI, delete
MO::AddRegOperandToRegInfo() and MO::RemoveRegOperandFromRegInfo().

Be more explicit about handling the cases where an MRI pointer isn't
available.

llvm-svn: 161632
2012-08-09 22:49:37 +00:00
Eric Christopher
77ae8ee419 Remove getARMRegisterNumbering and replace with calls into
the register info for getEncodingValue. This builds on the
small patch of yesterday to set HWEncoding in the register
file.

One (deprecated) use was turned into a hard number to avoid
needing register info in the old JIT.

llvm-svn: 161628
2012-08-09 22:10:21 +00:00
Jakob Stoklund Olesen
74a24cbdaa Fix a future TwoAddressInstructionPass crash.
No test case, the crash only happens when the default use list order is
changed.

llvm-svn: 161627
2012-08-09 22:08:26 +00:00
Jakob Stoklund Olesen
c8bcc2518d Don't modify MO while use_iterator is still pointing to it.
llvm-svn: 161626
2012-08-09 22:08:24 +00:00
Chad Rosier
5efd936f43 [ms-inline asm] Extend the MC AsmParser API to match MCInsts (but not emit).
This new API will be used by clang to parse ms-style inline asms.

One goal of this project is to use this style of inline asm for targets other
then x86.  Therefore, this API needs to be implemented for non-x86 targets at
some point in the future.

llvm-svn: 161624
2012-08-09 22:04:55 +00:00
Jack Carter
5ce6f4b4e5 Another 32 to 64 bit sign extension bug.
The fields in the td definition were switched.

llvm-svn: 161607
2012-08-09 19:43:18 +00:00
Arnold Schwaighofer
f3d4d73157 Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!

llvm-svn: 161581
2012-08-09 15:25:52 +00:00
Nadav Rotem
63ebca3806 Fix the legalization of ExtLoad on ARM. ExpandUnalignedLoad did not properly
handle the cases where the memory value type was illegal. 
PR 13111. 

llvm-svn: 161565
2012-08-09 01:56:44 +00:00
Eric Christopher
286507043a This field isn't used anymore, use it with HWEncoding instead.
llvm-svn: 161564
2012-08-09 01:39:32 +00:00
Jim Grosbach
dfe45ef6c5 Move [SU]LEB128 encoding to a utility header.
These functions are very generic. There's no reason for them to
be tied to MCObjectWriter.

llvm-svn: 161545
2012-08-08 23:56:06 +00:00
Jakob Stoklund Olesen
74359693c4 Don't use getNextOperandForReg().
This way of using getNextOperandForReg() was unlikely to work as
intended. We don't give any guarantees about the order of operands in
the use-def chains, so looking only at operands following a given
operand in the chain doesn't make sense.

llvm-svn: 161542
2012-08-08 23:44:04 +00:00
Jakob Stoklund Olesen
a89342e961 Don't use getNextOperandForReg() in RAFast.
That particular optimization was probably premature anyway.

llvm-svn: 161541
2012-08-08 23:44:01 +00:00
Jakob Stoklund Olesen
98908a7b0c Deal with irreducible control flow when building traces.
We filter out MachineLoop back-edges during the trace-building PO
traversals, but it is possible to have CFG cycles that aren't natural
loops, and MachineLoopInfo doesn't include such cycles.

Use a standard visited set to detect such CFG cycles, and completely
ignore them when picking traces.

llvm-svn: 161532
2012-08-08 22:12:01 +00:00
Jakob Stoklund Olesen
83cdb2bdf4 Heed -stress-early-ifcvt.
llvm-svn: 161513
2012-08-08 18:24:23 +00:00
Jakob Stoklund Olesen
57743948ca Get the MispredictPenalty from MCSchedModel.
Thanks, Andy!

llvm-svn: 161507
2012-08-08 18:19:58 +00:00
Rafael Espindola
5d2595ed26 Typedefs and indentation fixes from the Andy Zhang/PAX macro argument patch.
Committing it first as it makes the "real" patch a lot easier to read.

llvm-svn: 161491
2012-08-08 14:51:03 +00:00
Anton Korobeynikov
4761e753ef Fix for .pdata and .xdata section attributes on COFF.
Patch by kai@redstar.de !

llvm-svn: 161487
2012-08-08 12:46:46 +00:00
Bill Wendling
33bd6a2d6c Add .pushsection', .popsection', and `.previous' directives to Darwin ASM.
There are situations where inline ASM may want to change the section -- for
instance, to create a variable in the .data section. However, it cannot do this
without (potentially) restoring to the wrong section. E.g.:

  asm volatile (".section __DATA, __data\n\t"
                ".globl _fnord\n\t"
                "_fnord: .quad 1f\n\t"
                ".text\n\t"
                "1:" :::);

This may be wrong if this is inlined into a function that has a "section"
attribute. The user should use `.pushsection' and `.popsection' here instead.

The addition of `.previous' is added for completeness.
<rdar://problem/12048387>

llvm-svn: 161477
2012-08-08 06:30:30 +00:00
Andrew Trick
75af469e99 Added MispredictPenalty to SchedMachineModel.
This replaces an existing subtarget hook on ARM and allows standard
CodeGen passes to potentially use the property.

llvm-svn: 161471
2012-08-08 02:44:16 +00:00
Andrew Trick
79c6fbc97e Minor cleanup of defaultDefLatency API
llvm-svn: 161470
2012-08-08 02:44:11 +00:00
Andrew Trick
749aa4269e whitespace
llvm-svn: 161469
2012-08-08 02:44:08 +00:00
Eli Friedman
a64c4c130d isAllocLikeFn is allowed to return true for functions which read memory; make
sure we account for that correctly in DeadStoreElimination.  Fixes a regression
from r158919.  PR13547.

llvm-svn: 161468
2012-08-08 02:17:32 +00:00
Jakob Stoklund Olesen
e89d9ba157 Revert "Fix a quadratic algorithm in MachineBranchProbabilityInfo."
It caused an assertion failure when compiling consumer-typeset.

llvm-svn: 161463
2012-08-08 01:10:31 +00:00
Manman Ren
967804ad0a X86: enable CSE between CMP and SUB
We perform the following:
1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.

Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled
by peephole now.

rdar://11873276

llvm-svn: 161462
2012-08-08 00:51:41 +00:00
Jakob Stoklund Olesen
924ff06fdf Don't scan physreg use-def chains looking for a PIC base.
We can't rematerialize a PIC base after register allocation anyway, and
scanning physreg use-def chains is very expensive in a function with
many calls.

<rdar://problem/12047515>

llvm-svn: 161461
2012-08-08 00:40:47 +00:00
Jakob Stoklund Olesen
10c3144381 Fix a quadratic algorithm in MachineBranchProbabilityInfo.
The getSumForBlock function was quadratic in the number of successors
because getSuccWeight would perform a linear search for an already known
iterator.

llvm-svn: 161460
2012-08-08 00:20:37 +00:00
Dan Gohman
df7f8afaf2 Avoid recomputing the unique exit blocks and their insert points when doing
multiple scalar promotions on a single loop. This also has the effect of
preserving the order of stores sunk out of loops, which is aesthetically
pleasing, and it happens to fix the testcase in PR13542, though it doesn't
fix the underlying problem.

llvm-svn: 161459
2012-08-08 00:00:26 +00:00
Jakob Stoklund Olesen
08414c1860 Skip tied operand pairs that already have the same register.
llvm-svn: 161454
2012-08-07 22:47:06 +00:00
Jakob Stoklund Olesen
3c221664e3 Add SelectionDAG::getTargetIndex.
This adds support for TargetIndex operands during isel. The meaning of
these (index, offset, flags) operands is entirely defined by the target.

llvm-svn: 161453
2012-08-07 22:37:05 +00:00
Bob Wilson
51c50d44b7 Fix a serious typo in InstCombine's optimization of comparisons.
An unsigned value converted to floating-point will always be greater than
a negative constant.  Unfortunately InstCombine reversed the check so that
unsigned values were being optimized to always be greater than all positive
floating-point constants.  <rdar://problem/12029145>

llvm-svn: 161452
2012-08-07 22:35:16 +00:00
Evan Cheng
96c6741fad X86 cmp lowering is looking past truncate on the condition node. It should only
do so when the high bits are known zero. This caused a subtle miscompilation.

rdar://12027825 

llvm-svn: 161451
2012-08-07 22:21:00 +00:00
Bill Wendling
06a580b5da For non-Darwin platforms, we want to generate stack protectors only for
character arrays. This is in line with what GCC does.
<rdar://problem/10529227>

llvm-svn: 161446
2012-08-07 20:59:05 +00:00
Jakob Stoklund Olesen
301af79343 Add a new kind of MachineOperand: MO_TargetIndex.
A target index operand looks a lot like a constant pool reference, but
it is completely target-defined. It contains the 8-bit TargetFlags, a
32-bit index, and a 64-bit offset. It is preserved by all code generator
passes.

TargetIndex operands can be used to carry target-specific information in
cases where immediate operands won't suffice.

llvm-svn: 161441
2012-08-07 18:56:39 +00:00
Andrew Kaylor
adda20daff Enable lazy compilation in MCJIT
llvm-svn: 161438
2012-08-07 18:33:00 +00:00
Jakob Stoklund Olesen
8836660866 Fix a couple of typos.
llvm-svn: 161437
2012-08-07 18:32:57 +00:00
Jakob Stoklund Olesen
438bc30c3d Add trace accessor methods, implement primitive if-conversion heuristic.
Compare the critical paths of the two traces through an if-conversion
candidate. If the difference is larger than the branch brediction
penalty, reject the if-conversion. If would never pay.

llvm-svn: 161433
2012-08-07 18:02:19 +00:00
Rafael Espindola
ccfbbaa11f The dominance computation already has logic for computing if an edge dominates
a use or a BB, but it is inline in the handling of the invoke instruction.

This patch refactors it so that it can be used in other cases. For example, in

define i32 @f(i32 %x) {
bb0:
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %bb2, label %bb1
bb1:
  br label %bb2
bb2:
  %cond = phi i32 [ %x, %bb0 ], [ 0, %bb1 ]
  %foo = add i32 %cond, %x
  ret i32 %foo
}

GVN should be able to replace %x with 0 in any use that is dominated by the
true edge out of bb0. In the above example the only such use is the one in
the phi.

llvm-svn: 161429
2012-08-07 17:30:46 +00:00
Hal Finkel
aa174abb14 Add a comment about mftb vs. mfspr on PPC.
Thanks to Alex Rosenberg for the suggestion.

llvm-svn: 161428
2012-08-07 17:04:20 +00:00
Alexey Samsonov
fa8f91a368 Fix the representation of debug line table in DebugInfo LLVM library,
and "instruction address -> file/line" lookup.

Instead of plain collection of rows, debug line table for compilation unit is now
treated as the number of row ranges, describing sequences (series of contiguous machine
instructions). The sequences are not always listed in the order of increasing
address, so previously used std::lower_bound() sometimes produced wrong results.
Now the instruction address lookup consists of two stages: finding the correct
sequence, and searching for address in range of rows for this sequence.

llvm-svn: 161414
2012-08-07 11:46:57 +00:00
Benjamin Kramer
b8389165be PR13095: Give an inline cost bonus to functions using byval arguments.
We give a bonus for every argument because the argument setup is not needed
anymore when the function is inlined. With this patch we interpret byval
arguments as a compact representation of many arguments. The byval argument
setup is implemented in the backend as an inline memcpy, so to model the
cost as accurately as possible we take the number of pointer-sized elements
in the byval argument and give a bonus of 2 instructions for every one of
those. The bonus is capped at 8 elements, which is the number of stores
at which the x86 backend switches from an expanded inline memcpy to a real
memcpy. It would be better to use the real memcpy threshold from the backend,
but it's not available via TargetData.

This change brings the performance of c-ray in line with gcc 4.7. The included
test case tries to reproduce the c-ray problem to catch regressions for this
benchmark early, its performance is dominated by the inline decision of a
specific call.

This only has a small impact on most code, more on x86 and arm than on x86_64
due to the way the ABI works. When building LLVM for x86 it gives a small
inline cost boost to virtually any function using StringRef or STL allocators,
but only a 0.01% increase in overall binary size. The size of gcc compiled by
clang actually shrunk by a couple bytes with this patch applied, but not
significantly.

llvm-svn: 161413
2012-08-07 11:13:19 +00:00
Chandler Carruth
ca6b087618 Fix PR13412, a nasty miscompile due to the interleaved
instsimplify+inline strategy.

The crux of the problem is that instsimplify was reasonably relying on
an invariant that is true within any single function, but is no longer
true mid-inline the way we use it. This invariant is that an argument
pointer != a local (alloca) pointer.

The fix is really light weight though, and allows instsimplify to be
resiliant to these situations: when checking the relation ships to
function arguments, ensure that the argumets come from the same
function. If they come from different functions, then none of these
assumptions hold. All credit to Benjamin Kramer for coming up with this
clever solution to the problem.

llvm-svn: 161410
2012-08-07 10:59:59 +00:00
Chandler Carruth
49d4e3f282 Add a much more conservative strategy for aligning branch targets.
Previously, MBP essentially aligned every branch target it could. This
bloats code quite a bit, especially non-looping code which has no real
reason to prefer aligned branch targets so heavily.

As Andy said in review, it's still a bit odd to do this without a real
cost model, but this at least has much more plausible heuristics.

Fixes PR13265.

llvm-svn: 161409
2012-08-07 09:45:24 +00:00
Manman Ren
5d43c19d9e MachineCSE: Update the heuristics for isProfitableToCSE.
If the result of a common subexpression is used at all uses of the candidate
expression, CSE should not increase the live range of the common subexpression.

rdar://11393714 and rdar://11819721

llvm-svn: 161396
2012-08-07 06:16:46 +00:00
Bill Wendling
69f9777937 Revert r161371. Removing the 'const' before Type is a "good thing".
--- Reverse-merging r161371 into '.':
U    include/llvm/Target/TargetData.h
U    lib/Target/TargetData.cpp

llvm-svn: 161394
2012-08-07 05:51:59 +00:00
Jack Carter
32420dd092 The define for 64 bit sign extension neglected to
initialize fields of the class that it used.

The result was nonsense code.

Before:
0000000000000000 <foo>:
   0:    00441100     0x441100
   4:    03e00008     jr    ra
   8:    00000000     nop

After:
0000000000000000 <foo>:
   0:    00041000     sll    v0,a0,0x0
   4:    03e00008     jr    ra
   8:    00000000     nop 

llvm-svn: 161377
2012-08-07 00:35:22 +00:00
Bill Wendling
dc532577fd Constify the Type parameter to some methods (which are const anyway).
llvm-svn: 161371
2012-08-07 00:26:35 +00:00
Andrew Trick
35b938c991 Allow x86 subtargets to use the GenericModel defined in X86Schedule.td.
This allows codegen passes to query properties like
InstrItins->SchedModel->IssueWidth. It also ensure's that
computeOperandLatency returns the X86 defaults for loads and "high
latency ops". This should have no significant impact on existing
schedulers because X86 defaults happen to be the same as global
defaults.

llvm-svn: 161370
2012-08-07 00:25:30 +00:00
Jack Carter
d768975885 Mips relocation R_MIPS_64 relocates a 64 bit double word.
I hit this in a very large program (spirit.cpp), but 
have not figured out how to make a small make check
test for it.

llvm-svn: 161366
2012-08-07 00:01:14 +00:00
Jack Carter
3f30c3effe The Mips64InstrInfo.td definitions DynAlloc64 LEA_ADDiu64
were using a class defined for 32 bit instructions and 
thus the instruction was for addiu instead of daddiu.

This was corrected by adding the instruction opcode as a 
field in the  base class to be filled in by the defs.

llvm-svn: 161359
2012-08-06 23:29:06 +00:00
Jack Carter
fdb00bef02 Mips relocations R_MIPS_HIGHER and R_MIPS_HIGHEST.
These 2 relocations gain access to the 
highest and the second highest 16 bits
of a 64 bit object.

R_MIPS_HIGHER %higher(A+S)
The %higher(x) function is [ (((long long) x + 0x80008000LL) >> 32) & 0xffff ]. 

R_MIPS_HIGHEST %highest(A+S)
The %highest(x) function is [ (((long long) x + 0x800080008000LL) >> 48) & 0xffff ]. 

llvm-svn: 161348
2012-08-06 21:26:03 +00:00
Hal Finkel
15265edebe MFTB on PPC64 should really be encoded using MFSPR.
The MFTB instruction itself is being phased out, and its functionality
is provided by MFSPR. According to the ISA docs, using MFSPR works on all known
chips except for the 601 (which did not have a timebase register anyway)
and the POWER3.

Thanks to Adhemerval Zanella for pointing this out!

llvm-svn: 161346
2012-08-06 21:21:44 +00:00
Eric Christopher
f5132794cd Add support for the OpenBSD for Bitrig.
Patch by David Hill.

llvm-svn: 161344
2012-08-06 20:52:18 +00:00
Roman Divacky
59eec94f55 Remove empty overrides of processFunctionBeforeFrameFinalized().
llvm-svn: 161328
2012-08-06 18:14:18 +00:00
Craig Topper
dc1b95e7de Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
llvm-svn: 161318
2012-08-06 06:22:36 +00:00
Craig Topper
e26b30c830 Remove custom inserter for MWAIT. It doesn't do anything that couldn't be represented in a pattern.
llvm-svn: 161306
2012-08-05 00:36:57 +00:00
Craig Topper
c716a3f554 Use a COPY node instead of an explicit MOVA opcode in the custom insterter for pcmpestrm/pcmpistrm. Allows the register allocator to handle it better and prevent wasted identity moves.
llvm-svn: 161305
2012-08-05 00:17:48 +00:00
Hal Finkel
aadd19de06 Add readcyclecounter lowering on PPC64.
On PPC64, this can be done with a simple TableGen pattern.
To enable this, I've added the (otherwise missing) readcyclecounter
SDNode definition to TargetSelectionDAG.td.

llvm-svn: 161302
2012-08-04 14:10:46 +00:00
Anton Korobeynikov
b0d4fe0a5e Skip impdef regs during eabi save/restore list emission to workaround PR11902
llvm-svn: 161301
2012-08-04 13:25:58 +00:00
Anton Korobeynikov
dca34647bc Recognize vst1.64 / vld1.64 with 3 and 4 regs as load from / store to stack stuff
(this corresponds by spilling/reloading regs in DTriple / DQuad reg classes).
No testcase, found by inspection.

llvm-svn: 161300
2012-08-04 13:22:14 +00:00
Anton Korobeynikov
6dd5c91aae Add stack spill / reload instructions for DTriple and DQuad register classes, which
were missed for no reason. This fixes PR13377

llvm-svn: 161299
2012-08-04 13:16:12 +00:00
Benjamin Kramer
c1a7151fa1 Postpone the deletion of the old name in StructType::setName to allow using a slice of the old name.
Fixes PR13522. Add a rudimentary unit test to exercise the behavior.

llvm-svn: 161296
2012-08-04 09:47:02 +00:00
Jakob Stoklund Olesen
d5b3babd6f Delete a dead variable.
TwoAddressInstructionPass doesn't remat any more.

llvm-svn: 161285
2012-08-04 00:04:03 +00:00
Jakob Stoklund Olesen
69611c470c TwoAddressInstructionPass refactoring: Extract another method.
llvm-svn: 161284
2012-08-03 23:57:58 +00:00
Bob Wilson
9f6e25017a Refactor and check "onlyReadsMemory" before optimizing builtins.
This patch is mostly just refactoring a bunch of copy-and-pasted code, but
it also adds a check that the call instructions are readnone or readonly.
That check was already present for sin, cos, sqrt, log2, and exp2 calls, but
it was missing for the rest of the builtins being handled in this code.

llvm-svn: 161282
2012-08-03 23:29:17 +00:00
Jakob Stoklund Olesen
7713568689 TwoAddressInstructionPass refactoring: Extract a method.
No functional change intended, except replacing a DenseMap with a
SmallDenseMap which should behave identically.

llvm-svn: 161281
2012-08-03 23:25:45 +00:00
Jakob Stoklund Olesen
643fdb449e Begin adding support for updating LiveIntervals in TwoAddressInstructionPass.
This is far from complete, and only changes behavior when the
-early-live-intervals flag is passed to llc.

llvm-svn: 161273
2012-08-03 22:58:34 +00:00
Akira Hatanaka
ebbe0eff91 1. Redo mips16 instructions to avoid multiple opcodes for same instruction.
Change these to patterns.
2. Add another 16 instructions.

Patch by Reed Kotler.

llvm-svn: 161272
2012-08-03 22:57:02 +00:00
Jakob Stoklund Olesen
14c88af5f2 Add an experimental -early-live-intervals option.
This option runs LiveIntervals before TwoAddressInstructionPass which
will eventually learn to exploit and update the analysis.

Eventually, LiveIntervals will run before PHIElimination, and we can get
rid of LiveVariables.

llvm-svn: 161270
2012-08-03 22:12:54 +00:00
Jakob Stoklund Olesen
1e192f98dd Delete merged physreg copies in joinReservedPhysReg().
Previously, the identity copy would survive through register allocation
before it was removed by the rewriter.

llvm-svn: 161269
2012-08-03 22:12:51 +00:00
Bob Wilson
7e2fb62620 Try to reduce the compile time impact of r161232.
The previous change caused fast isel to not attempt handling any calls to
builtin functions.  That included things like "printf" and caused some
noticable regressions in compile time.  I wanted to avoid having fast isel
keep a separate list of functions that had to be kept in sync with what the
code in SelectionDAGBuilder.cpp was handling.  I've resolved that here by
moving the list into TargetLibraryInfo.  This is somewhat redundant in
SelectionDAGBuilder but it will ensure that we keep things consistent.

llvm-svn: 161263
2012-08-03 21:26:24 +00:00
Bob Wilson
7d92f01d57 Fix memcmp code-gen to honor -fno-builtin.
I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp
in TargetLibraryInfo, so that it would use custom code for memcmp calls even
with -fno-builtin.  I also had to add a new -disable-simplify-libcalls option
to llc so that I could write a test for this.

llvm-svn: 161262
2012-08-03 21:26:18 +00:00
Jakob Stoklund Olesen
bf7a000191 Completely eliminate VNInfo flags.
The 'unused' state of a value number can be represented as an invalid
def SlotIndex. This also exposed code that shouldn't have been looking
at unused value VNInfos.

llvm-svn: 161258
2012-08-03 20:59:32 +00:00
Jakob Stoklund Olesen
9af03604de Fix a couple of loops that were processing unused value numbers.
Unused VNInfos should be left alone. Their def SlotIndex doesn't point
to anything.

llvm-svn: 161257
2012-08-03 20:59:29 +00:00
Matt Beaumont-Gay
acc7302f80 Silence unused variable warning in -asserts build
llvm-svn: 161256
2012-08-03 20:54:11 +00:00
Jakob Stoklund Olesen
38aec46f18 Eliminate the VNInfo::hasPHIKill() flag.
The only real user of the flag was removeCopyByCommutingDef(), and it
has been switched to LiveIntervals::hasPHIKill().

All the code changed by this patch was only concerned with computing and
propagating the flag.

llvm-svn: 161255
2012-08-03 20:19:44 +00:00
Jakob Stoklund Olesen
74dfaec8d7 Make the hasPHIKills flag a computed property.
The VNInfo::HAS_PHI_KILL is only half supported. We precompute it in
LiveIntervalAnalysis, but it isn't properly updated by live range
splitting and functions like shrinkToUses().

It is only used in one place: RegisterCoalescer::removeCopyByCommutingDef().

This patch changes that function to use a new LiveIntervals::hasPHIKill()
function that computes the flag for a given value number.

llvm-svn: 161254
2012-08-03 20:10:24 +00:00
Jakob Stoklund Olesen
df448a4b76 Delete dead function.
llvm-svn: 161242
2012-08-03 15:21:21 +00:00
Jakob Stoklund Olesen
95dc0b7b93 Don't delete dead code in TwoAddressInstructionPass.
This functionality was added before we started running
DeadMachineInstructionElim on all targets. It serves no purpose now.

llvm-svn: 161241
2012-08-03 15:11:57 +00:00
Gabor Greif
0c8708dd91 allow 'make CPPFLAGS=<something>' work again
this makes this hack a bit more bearable
for poor souls who need to pass custom
preprocessor flags to the build process

llvm-svn: 161240
2012-08-03 13:31:24 +00:00
Bob Wilson
d1eefbeac2 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

llvm-svn: 161232
2012-08-03 04:06:28 +00:00
Bob Wilson
627fc538a7 Add new getLibFunc method to TargetLibraryInfo.
This just provides a way to look up a LibFunc::Func enum value for a
function name.  Alphabetize the enums and function names so we can use a
binary search.

llvm-svn: 161231
2012-08-03 04:06:22 +00:00
Jush Lu
ca5d760bf2 [arm-fast-isel] Add support for shl, lshr, and ashr.
llvm-svn: 161230
2012-08-03 02:37:48 +00:00
Bill Wendling
58b2e7f9b0 Move the "findUsedStructTypes" functionality outside of the Module class.
The "findUsedStructTypes" method is very expensive to run. It needs to be
optimized so that LTO can run faster. Splitting this method out of the Module
class will help this occur. For instance, it can keep a list of seen objects so
that it doesn't process them over and over again.

llvm-svn: 161228
2012-08-03 00:30:35 +00:00
Eric Christopher
714e032a36 Add support for the ARM GHC calling convention, this patch was in 3.0,
but somehow managed to be dropped later.

Patch by Karel Gardas.

llvm-svn: 161226
2012-08-03 00:05:53 +00:00
Jim Grosbach
1663895970 ARM: Tidy up. Remove unused template parameters.
llvm-svn: 161222
2012-08-02 22:08:27 +00:00
Jim Grosbach
42ef3d66ab ARM: More InstAlias refactors to use #NAME#.
llvm-svn: 161220
2012-08-02 21:59:52 +00:00
Jim Grosbach
b3f7932a3d ARM: Refactor instaliases using TableGen support for #NAME#.
Now that TableGen supports references to NAME w/o it being explicitly
referenced in the definition's own name, use that to simplify
assembly InstAlias definitions in multiclasses.

llvm-svn: 161218
2012-08-02 21:50:41 +00:00
Manman Ren
2c236a30f6 X86 Peephole: fold loads to the source register operand if possible.
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.

rdar://10554090 and rdar://11873276

llvm-svn: 161207
2012-08-02 19:37:32 +00:00
Jim Grosbach
5cde605008 TableGen: Allow use of #NAME# outside of 'def' names.
Previously, def NAME values were only populated, and references to NAME
resolved, when NAME was referenced in the 'def' entry of the multiclass
sub-entry. e.g.,
multiclass foo<...> {
  def prefix_#NAME : ...
}

It's useful, however, to be able to reference NAME even when the default
def name is used. For example, when a multiclass has 'def : Pat<...>'
or 'def : InstAlias<...>' entries which refer to earlier instruction
definitions in the same multiclass. e.g.,
multiclass myMulti<RegisterClass rc> {
  def _r : myI<(outs rc:$d), (ins rc:$r), "r $d, $r", []>;

  def : InstAlias<\"wilma $r\", (!cast<Instruction>(NAME#\"_r\") rc:$r, rc:$r)>;
}

llvm-svn: 161198
2012-08-02 18:46:42 +00:00
Jakob Stoklund Olesen
a299894307 Compute the critical path length through a trace.
Whenever both instruction depths and instruction heights are known in a
block, it is possible to compute the length of the critical path as
max(depth+height) over the instructions in the block.

The stored live-in lists make it possible to accurately compute the
length of a critical path that bypasses the current (small) block.

llvm-svn: 161197
2012-08-02 18:45:54 +00:00
Akira Hatanaka
6b71c77901 Move the code that creates instances of MipsInstrInfo and MipsFrameLowering out
of MipsTargetMachine.cpp.

llvm-svn: 161191
2012-08-02 18:21:47 +00:00
Akira Hatanaka
a1ef6b9f53 Set transient stack alignment in constructor of MipsFrameLowering and re-enable
test o32_cc_vararg.ll.

llvm-svn: 161189
2012-08-02 18:15:13 +00:00
Jakob Stoklund Olesen
4be65d0d0c Verify regunit intervals along with virtreg intervals.
Don't cause regunit intervals to be computed just to verify them. Only
check the already cached intervals.

llvm-svn: 161183
2012-08-02 16:36:50 +00:00
Jakob Stoklund Olesen
ac79668cdf Avoid creating dangling physreg live ranges during DCE.
LiveRangeEdit::eliminateDeadDefs() can delete a dead instruction that
reads unreserved physregs. This would leave the corresponding regunit
live interval dangling because we don't have shrinkToUses() for physical
registers.

Fix this problem by turning the instruction into a KILL instead of
deleting it. This happens in a landing pad in
test/CodeGen/X86/2012-05-19-CoalescerCrash.ll:

  %vreg27<def,dead> = COPY %EDX<kill>; GR32:%vreg27

becomes:

  KILL %EDX<kill>

An upcoming fix to the machine verifier will catch problems like this by
verifying regunit live intervals.

This fixes PR13498. I am not including the test case from the PR since
we already have one exposing the problem once the verifier is fixed.

llvm-svn: 161182
2012-08-02 16:36:47 +00:00
Jakob Stoklund Olesen
802cd7688f Add report() functions that take a LiveInterval argument.
llvm-svn: 161178
2012-08-02 14:31:49 +00:00
Hongbin Zheng
b6bbdfa346 Implement the block_iterator of Region based on df_iterator.
llvm-svn: 161177
2012-08-02 14:20:02 +00:00
Nuno Lopes
a101ed3bc8 JIT::runFunction(): add a fast path for functions with a single argument that is a pointer.
llvm-svn: 161171
2012-08-02 12:09:32 +00:00
Jiangning Liu
b12a230a17 Support fpv4 for ARM Cortex-M4.
llvm-svn: 161163
2012-08-02 08:35:55 +00:00
Jiangning Liu
a806f72610 Fix #13035, a bug around Thumb instruction LDRD/STRD with negative #0 offset index issue.
llvm-svn: 161162
2012-08-02 08:29:50 +00:00
Jiangning Liu
69f0770c21 Fix #13138, a bug around ARM instruction DSB encoding and decoding issue.
llvm-svn: 161161
2012-08-02 08:21:27 +00:00
Jiangning Liu
e4c91e239f Fix #13241, a bug around shift immediate operand for ARM instruction ADR.
llvm-svn: 161159
2012-08-02 08:13:13 +00:00
Manman Ren
78b8d454cc X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276

llvm-svn: 161152
2012-08-02 00:56:42 +00:00
Jakob Stoklund Olesen
5848ce22b0 Extract some methods from verifyLiveIntervals.
No functional change.

llvm-svn: 161149
2012-08-02 00:20:20 +00:00
Jakob Stoklund Olesen
85840cd877 Also verify RegUnit intervals at uses.
llvm-svn: 161147
2012-08-01 23:52:40 +00:00
Manman Ren
1c73b43c93 X86: mark GATHER instructios as mayLoad
llvm-svn: 161143
2012-08-01 23:28:59 +00:00
Jakob Stoklund Olesen
b223807a3d Compute instruction heights through a trace.
The height on an instruction is the minimum number of cycles from the
instruction is issued to the end of the trace. Heights are computed for
all instructions in and below the trace center block.

The method for computing heights is different from the depth
computation. As we visit instructions in the trace bottom-up, heights of
used instructions are pushed upwards. This way, we avoid scanning long
use lists, looking for uses in the current trace.

At each basic block boundary, a list of live-in registers and their
minimum heights is saved in the trace block info. These live-in lists
are used when restarting depth computations on a trace that
converges with an already computed trace. They will also be used to
accurately compute the critical path length.

llvm-svn: 161138
2012-08-01 22:36:00 +00:00
Jim Grosbach
b60d4c1110 ARM: Remove redundant instalias.
llvm-svn: 161134
2012-08-01 20:33:05 +00:00
Jim Grosbach
32cbdce015 Clean up formatting.
llvm-svn: 161133
2012-08-01 20:33:02 +00:00
Jim Grosbach
c4b3246b76 Tidy up.
llvm-svn: 161132
2012-08-01 20:33:00 +00:00
Chad Rosier
078db59861 Whitespace.
llvm-svn: 161122
2012-08-01 18:39:17 +00:00
Eric Christopher
dde7784606 Temporarily revert c23b933d5f8be9b51a1d22e717c0311f65f87dcd. It's causing
failures in the debug testsuite and possibly PR13486.

llvm-svn: 161121
2012-08-01 18:19:01 +00:00
Nuno Lopes
a35e5d4af8 remove tabs from my previous commit.
Sorry, not used to this editor anymore.. XCode please come back; you're forgiven :)

llvm-svn: 161120
2012-08-01 17:13:28 +00:00
Nuno Lopes
491dd780f4 (hopefuly) fix the remaining cases where null wasnt expected (PR13497).
I'll commit a test to the clang tree.

llvm-svn: 161118
2012-08-01 16:58:51 +00:00
Jakob Stoklund Olesen
beae25c46c Add DataDep constructors. Explicitly check SSA form.
llvm-svn: 161115
2012-08-01 16:02:59 +00:00
Elena Demikhovsky
0fec7026d9 Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Nick Lewycky
e84df7229f Stay rational; don't assert trying to take the square root of a negative value.
If it's negative, the loop is already proven to be infinite. Fixes PR13489!

llvm-svn: 161107
2012-08-01 09:14:36 +00:00
Craig Topper
3f200a7638 Add more indirection to the disassembler tables to reduce amount of space used to store the operand types and encodings. Store only the unique combinations in a separate table and store indices in the instruction table. Saves about 32K of static data.
llvm-svn: 161101
2012-08-01 07:39:18 +00:00
Nick Kledzik
f4e550e004 Initial commit of new FileOutputBuffer support class.
Since the llvm::sys::fs::map_file_pages() support function it relies on
is not yet implemented on Windows, the unit tests for FileOutputBuffer 
are currently conditionalized to run only on unix.

llvm-svn: 161099
2012-08-01 02:29:50 +00:00
Akira Hatanaka
731add334a Implement MipsJITInfo::replaceMachineCodeForFunction.
No new test case is added.
This patch makes test JITTest.FunctionIsRecompiledAndRelinked pass on mips
platform.

Patch by Petar Jovanovic.

llvm-svn: 161098
2012-08-01 02:29:24 +00:00
Akira Hatanaka
7c93354aff Remove unused variable.
llvm-svn: 161095
2012-08-01 00:37:53 +00:00
Akira Hatanaka
c43e6b2166 Implement MipsSERegisterInfo::eliminateCallFramePseudoInstr. The function emits
instructions that decrement and increment the stack pointer before and after a
call when the function does not have a reserved call frame.

llvm-svn: 161093
2012-07-31 23:52:55 +00:00
Akira Hatanaka
24dddbed36 Add definitions of two subclasses of MipsRegisterInfo, Mips16RegisterInfo and
MipsSERegisterInfo.

llvm-svn: 161092
2012-07-31 23:41:32 +00:00
Akira Hatanaka
87fd9a992a Add definitions of two subclasses of MipsFrameLowering, Mips16FrameLowering and
MipsSEFrameLowering.

Implement MipsSEFrameLowering::hasReservedCallFrame. Call frames will not be
reserved if there is a call with a large call frame or there are variable sized
objects on the stack.

llvm-svn: 161090
2012-07-31 22:50:19 +00:00
Akira Hatanaka
45e66997d4 Add Mips16InstrInfo.cpp and MipsSEInstrInfo.cpp to CMakeLists.txt.
llvm-svn: 161083
2012-07-31 22:11:05 +00:00
Akira Hatanaka
46388c74c0 Add definitions of two subclasses of MipsInstrInfo, MipsInstrInfo (for mips16),
and MipsSEInstrInfo (for mips32/64).

llvm-svn: 161081
2012-07-31 21:49:49 +00:00
Akira Hatanaka
44f5fec97d Delete mips64 target machine classes. mips target machines can be used in place
of them.

llvm-svn: 161080
2012-07-31 21:39:17 +00:00
Akira Hatanaka
ad80f510bc Let PEI::calculateFrameObjectOffsets compute the final stack size rather than
computing it in MipsFrameLowering::emitPrologue.

llvm-svn: 161078
2012-07-31 21:28:49 +00:00
Akira Hatanaka
d43e99897c Expand DYNAMIC_STACKALLOC nodes rather than doing custom-lowering.
The frame object which points to the dynamically allocated area will not be
needed after changes are made to cease reserving call frames.

llvm-svn: 161076
2012-07-31 20:54:48 +00:00
Manman Ren
b9e9c55911 MachineSink: Sort the successors before trying to find SuccToSinkTo.
Use stable_sort instead of sort. Follow-up to r161062.

rdar://11980766

llvm-svn: 161075
2012-07-31 20:45:38 +00:00
Jakob Stoklund Olesen
3dc189f67b Compute instruction depths through the current trace.
Assuming infinite issue width, compute the earliest each instruction in
the trace can issue, when considering the latency of data dependencies.
The issue cycle is record as a 'depth' from the beginning of the trace.

This is half the computation required to find the length of the critical
path through the trace. Heights are next.

llvm-svn: 161074
2012-07-31 20:44:38 +00:00
Jakob Stoklund Olesen
b2c7febf35 Rename CT -> MTM. MachineTraceMetrics is abbreviated MTM.
llvm-svn: 161072
2012-07-31 20:25:13 +00:00
Akira Hatanaka
e1beddb7e8 Define ADJCALLSTACKDOWN/UP nodes. These nodes are emitted regardless of whether
or not it is in mips16 mode. Define MipsPseudo (mode-independant pseudo) and
PseudoSE (mips32/64 pseudo) classes.

llvm-svn: 161071
2012-07-31 19:13:07 +00:00
Akira Hatanaka
4a17cb84f3 Change name of class MipsInst to InstSE to distinguish it from mips16's
instruction class. SE stands for standard encoding.

llvm-svn: 161069
2012-07-31 18:55:01 +00:00
Akira Hatanaka
85ddbf2e38 When store nodes or memcpy nodes are created to copy the function call
arguments to the stack in MipsISelLowering::LowerCall, use stack pointer and
integer offset operands rather than frame object operands.

llvm-svn: 161068
2012-07-31 18:46:41 +00:00
Chad Rosier
9a4ff99710 [x86 frame lowering] In 32-bit mode, use ESI as the base pointer.
Previously, we were using EBX, but PIC requires the GOT to be in EBX before 
function calls via PLT GOT pointer.

llvm-svn: 161066
2012-07-31 18:29:21 +00:00
Akira Hatanaka
aab47c049b Fix type of LUXC1 and SUXC1. These instructions were incorrectly defined as
single-precision load and store.

Also avoid selecting LUXC1 and SUXC1 instructions during isel. It is incorrect
to map unaligned floating point load/store nodes to these instructions.

llvm-svn: 161063
2012-07-31 18:16:49 +00:00
Manman Ren
3d7e85d5b8 MachineSink: Sort the successors before trying to find SuccToSinkTo.
One motivating example is to sink an instruction from a basic block which has
two successors: one outside the loop, the other inside the loop. We should try
to sink the instruction outside the loop.

rdar://11980766

llvm-svn: 161062
2012-07-31 18:10:39 +00:00
Micah Villmow
122d115419 Conform to LLVM coding style.
llvm-svn: 161061
2012-07-31 18:07:43 +00:00