1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

54109 Commits

Author SHA1 Message Date
Derek Schuff
7fe1fbbe81 Revert r155745
llvm-svn: 155746
2012-04-27 23:37:41 +00:00
Derek Schuff
80bd01f406 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.

llvm-svn: 155745
2012-04-27 23:27:17 +00:00
Jakob Stoklund Olesen
2cb81f69d6 Track worst case alignment padding more accurately.
Previously, ARMConstantIslandPass would conservatively compute the
address of an aligned basic block as:

  RoundUpToAlignment(Offset + UnknownPadding)

This worked fine for the layout algorithm itself, but it could fool the
verify() function because it accounts for alignment padding twice: Once
when adding the worst case UnknownPadding, and again by rounding up the
fictional block offset. This meant that when optimizeThumb2Instructions
would shrink an instruction, the conservative distance estimate could
grow. That shouldn't be possible since the woorst case alignment padding
wss already included.

This patch drops the use of RoundUpToAlignment, and depends only on
worst case padding to compute conservative block offsets. This has the
weird effect that the computed offset for an aligned block may not be
aligned.

The important difference is that shrinking an instruction can never
cause the estimated distance between two instructions to grow. The
estimated distance is always larger than the real distance that only the
assembler knows.

<rdar://problem/11339352>

llvm-svn: 155744
2012-04-27 22:58:38 +00:00
Andrew Trick
cbe7b03dbe Temporarily revert r155668: Fix the SD scheduler to avoid gluing.
This definitely caused regression with ARM -mno-thumb.

llvm-svn: 155743
2012-04-27 22:55:59 +00:00
Craig Topper
5270dd7a71 Use 'unsigned' instead of 'int' in several places when retrieving number of vector elements.
llvm-svn: 155742
2012-04-27 22:54:43 +00:00
Chad Rosier
d627fcbf2a Add x86-specific DAG combine to simplify:
x == -y --> x+y == 0
 x != -y --> x+y != 0

On x86, the generated code goes from
   negl    %esi
   cmpl    %esi, %edi
   je    .LBB0_2
to
   addl    %esi, %edi
   je    .L4

This case is correctly handled for ARM with "cmn".

Patch by Manman Ren.
rdar://11245199
PR12545

llvm-svn: 155739
2012-04-27 22:33:25 +00:00
Michael J. Spencer
ff5aec2d9d [Support/YAMLParser] Fix ASan found bugs.
llvm-svn: 155735
2012-04-27 21:12:20 +00:00
Craig Topper
b06100424e Tidy up spacing.
llvm-svn: 155733
2012-04-27 21:05:09 +00:00
Hal Finkel
a565d03d78 Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.).
Target specific types should not be vectorized. As a practical matter,
these types are already register matched (at least in the x86 case),
and codegen does not always work correctly (at least in the ppc case,
and this is not worth fixing because ppc_fp128 is currently broken and
will probably go away soon).

llvm-svn: 155729
2012-04-27 19:34:00 +00:00
David Blaikie
296c942e88 Change recurse depth limit to uint32 to fix warning.
llvm-svn: 155727
2012-04-27 19:30:32 +00:00
Dan Gohman
1bc0d2e1bc Miscellaneous accumulated cleanups.
llvm-svn: 155725
2012-04-27 18:56:31 +00:00
Lang Hames
7d83af4ed0 Fix the order of the operands in the llvm.fma intrinsic patterns for ARM,
<rdar://problem/11325085>.

llvm-svn: 155724
2012-04-27 18:51:24 +00:00
Mon P Wang
85af068593 Add an early bailout to IsValueFullyAvailableInBlock from deeply nested blocks.
The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow
issues. <rdar://problem/11286839>.

llvm-svn: 155722
2012-04-27 18:09:28 +00:00
Dan Gohman
25a863dcf7 Reapply r155682, making constant folding more consistent, with a fix to work
properly with how the code handles all-undef PHI nodes.

llvm-svn: 155721
2012-04-27 17:50:22 +00:00
Richard Barton
f9237b25e6 Fix ARM assembly parsing for upper case condition codes on IT instructions.
llvm-svn: 155720
2012-04-27 17:34:01 +00:00
Benjamin Kramer
1380494168 X86: Don't emit conditional floating point moves on when targeting pre-pentiumpro architectures.
* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.

Fixes PR6679. Patch by Christoph Erhardt!

llvm-svn: 155704
2012-04-27 12:07:43 +00:00
Kostya Serebryany
c855c442c8 [asan] small optimization: do not emit "x+0" instructions
llvm-svn: 155701
2012-04-27 10:04:53 +00:00
Richard Barton
ca70156ab3 Refactor IT handling not to store the bottom bit of the condition code in the mask operand in the MCInst.
llvm-svn: 155700
2012-04-27 08:42:59 +00:00
NAKAMURA Takumi
a28147f072 Revert r155682, "Use ConstantExpr::getExtractElement when constant-folding vectors"
It broke stage2 build. stage1/clang sometimes crashed.

llvm-svn: 155699
2012-04-27 07:59:20 +00:00
Kostya Serebryany
0cd695bd39 [tsan] Atomic support for ThreadSanitizer, patch by Dmitry Vyukov
llvm-svn: 155698
2012-04-27 07:31:53 +00:00
Evan Cheng
f35523d08a Implement a bastardized ABI.
llvm-svn: 155686
2012-04-27 02:11:10 +00:00
Evan Cheng
594fb11f12 - thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't suppport 32-bit Thumb2
instructions.
- However, it does support dmb, dsb, isb, mrs, and msr.
rdar://11331541

llvm-svn: 155685
2012-04-27 01:27:19 +00:00
Dan Gohman
a72b2f97a6 Use ConstantExpr::getExtractElement when constant-folding vectors
instead of getAggregateElement. This has the advantage of being
more consistent and allowing higher-level constant folding to
procede even if an inner extract element cannot be folded.

Make ConstantFoldInstruction call ConstantFoldConstantExpression
on the instruction's operands, making it more consistent with 
ConstantFoldConstantExpression itself. This makes sure that
ConstantExprs get TargetData-aware folding before being handed
off as operands for further folding.

This causes more expressions to be folded, but due to a known
shortcoming in constant folding, this currently has the side effect
of stripping a few more nuw and inbounds flags in the non-targetdata
side of constant-fold-gep.ll. This is mostly harmless.

This fixes rdar://11324230.

llvm-svn: 155682
2012-04-27 00:54:36 +00:00
Jakob Stoklund Olesen
185c3797be Break up getProfitableChainIncrement().
The required checks are moved to ChainInstruction() itself and the
policy decisions are moved to IVChain::isProfitableInc().

Also cache the ExprBase in IVChain to avoid frequent recomputations.

No functional change intended.

llvm-svn: 155676
2012-04-26 23:33:11 +00:00
Jakob Stoklund Olesen
613b89fecd Turn IVChain into a struct.
No functional change intended.

llvm-svn: 155675
2012-04-26 23:33:09 +00:00
Chad Rosier
f3d4646377 Add instcombine patterns for the following transformations:
(x & y) | (x ^ y) -> x | y 
 (x & y) + (x ^ y) -> x | y 

Patch by Manman Ren.
rdar://10770603

llvm-svn: 155674
2012-04-26 23:29:14 +00:00
Andrew Trick
1aa00c0baa Fix the SD scheduler to avoid gluing the same node twice.
DAGCombine strangeness may result in multiple loads from the same
offset. They both may try to glue themselves to another load. We could
insist that the redundant loads glue themselves to each other, but the
beter fix is to bail out from bad gluing at the time we detect it.

Fixes rdar://11314175: BuildSchedUnits assert.

llvm-svn: 155668
2012-04-26 21:48:25 +00:00
Jim Grosbach
bf9adf2ab5 ARM: Thumb ldr(literal) base address alignment is 32-bits.
The base address for the PC-relative load is Align(PC,4), so it's the
address of the word containing the 16-bit instruction, not the address
of the instruction itself. Ugh.

rdar://11314619

llvm-svn: 155659
2012-04-26 20:48:12 +00:00
Preston Gurd
fb1760744d Trivial change to set UseLeaForSP flag in addition to toggling
the FeatureLeaForSP feature bit when llvm auto detects Intel Atom.

Patch by Andy Zhang

llvm-svn: 155655
2012-04-26 19:52:27 +00:00
Michael J. Spencer
0dd0f2c8f0 [Support/YAML] Properly fix unitialized variable warning by inserting a
'REPLACEMENT CHARACTER' (U+FFFD) when getAsInteger fails.

llvm-svn: 155653
2012-04-26 19:27:11 +00:00
Tim Northover
876c151146 Use VLD1 in NEON extenting-load patterns instead of VLDR.
On some cores it's a bad idea for performance to mix VFP and NEON instructions
and since these patterns are NEON anyway, the NEON load should be used.

llvm-svn: 155630
2012-04-26 08:46:29 +00:00
Tim Northover
b83dc53c3a Test commit.
llvm-svn: 155626
2012-04-26 08:24:07 +00:00
Craig Topper
f883096ff7 Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to corei7-avx, core-avx-i, and core-avx2 cpu names.
llvm-svn: 155618
2012-04-26 06:40:15 +00:00
Chandler Carruth
587c136c31 Teach the reassociate pass to fold chains of multiplies with repeated
elements to minimize the number of multiplies required to compute the
final result. This uses a heuristic to attempt to form near-optimal
binary exponentiation-style multiply chains. While there are some cases
it misses, it seems to at least a decent job on a very diverse range of
inputs.

Initial benchmarks show no interesting regressions, and an 8%
improvement on SPASS. Let me know if any other interesting results (in
either direction) crop up!

Credit to Richard Smith for the core algorithm, and helping code the
patch itself.

llvm-svn: 155616
2012-04-26 05:30:30 +00:00
Evan Cheng
4d570a3f0e If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume
the feature set of v7a. This comes about if the user specifies something like
-arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as
uxtab in this case.

rdar://11318438

llvm-svn: 155601
2012-04-26 01:13:36 +00:00
Bill Wendling
d9d9230b83 Don't forget to reset 'first operand' flag when we're setting the MDNodeOperand value.
llvm-svn: 155599
2012-04-26 00:38:42 +00:00
Jakob Stoklund Olesen
e2913e1ad5 Print IV chain numbers while collecting them.
llvm-svn: 155567
2012-04-25 18:01:32 +00:00
Jakob Stoklund Olesen
24c99d2966 Remove more dead code.
llvm-svn: 155566
2012-04-25 18:01:30 +00:00
Richard Barton
e9a972bbe3 Unify internal representation of ARM instructions with a register right-shifted by #32. These are stored as shifts by #0 in the MCInst and correctly marshalled when transforming from or to assembly representation.
llvm-svn: 155565
2012-04-25 18:00:18 +00:00
Jakob Stoklund Olesen
7f1be74a4a Remove the -disable-cross-class-join option.
Cross-class joins have been normal and fully supported for a while now.
With TableGen generating the getMatchingSuperRegClass() hook, they are
unlikely to cause problems again.

llvm-svn: 155552
2012-04-25 16:17:50 +00:00
Jakob Stoklund Olesen
b8d98c5060 Cross-class joining is winning.
Remove the heuristic for disabling cross-class joins. The greedy
register allocator can handle the narrow register classes, and when it
splits a live range, it can pick a larger register class.

Benchmarks were unaffected by this change.

<rdar://problem/11302212>

llvm-svn: 155551
2012-04-25 16:17:47 +00:00
Craig Topper
5828c654b9 Add ifdef around getSubtargetFeatureName in tablegen output file so that only targets that want the function get it. This prevents other targets from getting an unused function warning.
llvm-svn: 155538
2012-04-25 06:56:34 +00:00
Craig Topper
1a016fd95d Use vector_shuffles instead of target specific unpack nodes for AVX ZERO_EXTEND/ANY_EXTEND combine. These will be converted to target specific nodes during lowering. This is more consistent with other code.
llvm-svn: 155537
2012-04-25 06:39:39 +00:00
Lang Hames
7f69fbca29 Reverting r155468. Chris and Chandler have convinced me that it's dangerous and
in poor taste.

Talking through some alternate solutions with Chandler.

llvm-svn: 155530
2012-04-25 02:16:54 +00:00
Akira Hatanaka
b3ecf903f1 Do not use $gp as a dedicated global register if the target ABI is not O32.
llvm-svn: 155522
2012-04-25 01:24:52 +00:00
Dan Gohman
64171a0b3b Simplify the known retain count tracking; use a boolean state instead
of a precise count. Also, move RRInfo's Partial field into PtrState,
now that it won't increase the size.

llvm-svn: 155513
2012-04-25 00:50:46 +00:00
Dan Gohman
3a24d34041 Build custom predecessor and successor lists for each basic block.
These lists exclude invoke unwind edges and loop backedges which
are being ignored. This makes it easier to ignore them
consistently.

llvm-svn: 155500
2012-04-24 22:53:18 +00:00
Jim Grosbach
7ac2ac85a8 ARM: improved assembler diagnostics for missing CPU features.
When an instruction match is found, but the subtarget features it
requires are not available (missing floating point unit, or thumb vs arm
mode, for example), issue a diagnostic that identifies what the feature
mismatch is.

rdar://11257547

llvm-svn: 155499
2012-04-24 22:40:08 +00:00
Andrew Trick
47f01c373e Fix a naughty header include that breaks "installed" builds.
llvm-svn: 155486
2012-04-24 20:36:19 +00:00
Nadav Rotem
3c817bb807 ConstantFoldSelectInstruction swapped the operands of the select.
Fix 12592. Patch by Matt Pharr.

llvm-svn: 155480
2012-04-24 20:18:49 +00:00