1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

15913 Commits

Author SHA1 Message Date
Silviu Baranga
f376e00699 Added support for unpredictable ADC/SBC instructions on ARM, and also fixed some corner cases involving the PC register as an operand for these instructions.
llvm-svn: 154101
2012-04-05 16:19:29 +00:00
Silviu Baranga
1c2668f700 Added support for handling unpredictable arithmetic instructions on ARM.
llvm-svn: 154100
2012-04-05 16:13:15 +00:00
James Molloy
3604b95957 An oversight when applying the patches for r150956 and r150957 to a vanilla tree meant I forgot to svn add these testcases.
Noticed while investigating PR12274!

llvm-svn: 154090
2012-04-05 10:01:12 +00:00
Jim Grosbach
5d11d38750 ARM assembly aliases for two-operand V[R]SHR instructions.
rdar://11189467

llvm-svn: 154087
2012-04-05 07:23:53 +00:00
Jim Grosbach
64f4e8d5b3 ARM assembly parsing for 'msr' plain 'cpsr' operand.
Plain 'cpsr' is an alias for 'cpsr_fc'.

rdar://11153753

llvm-svn: 154080
2012-04-05 03:17:53 +00:00
Jakob Stoklund Olesen
e1ae4f161c Pass the right sign to TLI->isLegalICmpImmediate.
LSR can fold three addressing modes into its ICmpZero node:

  ICmpZero BaseReg + Offset      => ICmp BaseReg, -Offset
  ICmpZero -1*ScaleReg + Offset  => ICmp ScaleReg, Offset
  ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg

The first two cases are only used if TLI->isLegalICmpImmediate() likes
the offset.

Make sure the right Offset sign is passed to this method in the second
case. The ARM version is not symmetric.

<rdar://problem/11184260>

llvm-svn: 154079
2012-04-05 03:10:56 +00:00
Akira Hatanaka
e5ea70212f Reapply 154038 without the failing test.
llvm-svn: 154062
2012-04-04 22:16:36 +00:00
Owen Anderson
f6f930a990 Revert r154038. It was causing make check failures.
llvm-svn: 154054
2012-04-04 21:18:58 +00:00
Akira Hatanaka
4df2267566 Fix LowerGlobalAddress to produce instructions with the correct relocation
types for N32 ABI. Add new test case and update existing ones.

llvm-svn: 154038
2012-04-04 19:02:38 +00:00
Akira Hatanaka
c8028e2551 Fix LowerConstantPool to produce instructions with the correct relocation
types for N32 ABI and update test case.

llvm-svn: 154034
2012-04-04 18:26:12 +00:00
Jakob Stoklund Olesen
0419ed395c Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr.
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.

<rdar://problem/11182914>

llvm-svn: 154033
2012-04-04 18:23:42 +00:00
Akira Hatanaka
913d78a99c Fix LowerBlockAddress to produce instructions with the correct relocation
types for N32 ABI and update test case.

llvm-svn: 154031
2012-04-04 18:22:53 +00:00
Hongbin Zheng
2a8f0cf400 Add testcase for r154007, when a function has the optsize attribute,
the loop should be unrolled according the value of OptSizeUnrollThreshold.

llvm-svn: 154014
2012-04-04 13:24:40 +00:00
Rafael Espindola
88a1aeb123 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Michael J. Spencer
2f9beb2374 Add YAML parser to Support.
llvm-svn: 153977
2012-04-03 23:09:22 +00:00
Pete Cooper
4164f86b8a Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELET as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095>
llvm-svn: 153976
2012-04-03 22:57:55 +00:00
Eric Christopher
53ef0cf4a5 Fix thinko check for number of operands to be the one that actually
might have more than 19 operands. Add a testcase to make sure I
never screw that up again.

Part of rdar://11026482

llvm-svn: 153961
2012-04-03 17:55:42 +00:00
Nadav Rotem
d72bf636aa Add an additional testcase which checks ops with multiple users.
llvm-svn: 153939
2012-04-03 07:39:36 +00:00
Craig Topper
ce6c05e0df Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo.
llvm-svn: 153935
2012-04-03 05:20:24 +00:00
Akira Hatanaka
c5bbe0b434 Revert r153924. Delete test/MC/Disassembler/Mips and lib/Target/Mips/Disassembler.
llvm-svn: 153926
2012-04-03 03:01:13 +00:00
Akira Hatanaka
cecb440c11 Revert r153924. There were buildbot failures.
llvm-svn: 153925
2012-04-03 02:51:09 +00:00
Akira Hatanaka
058b0cfb55 MIPS disassembler support.
Patch by Vladimir Medic.

llvm-svn: 153924
2012-04-03 02:20:58 +00:00
Jakob Stoklund Olesen
97f47c37b6 Allocate virtual registers in ascending order.
This is just the fallback tie-breaker ordering, the main allocation
order is still descending size.

Patch by Shamil Kurmangaleev!

llvm-svn: 153904
2012-04-02 22:30:39 +00:00
Lang Hames
dbc3175c89 During two-address lowering, rescheduling an instruction does not untie
operands. Make TryInstructionTransform return false to reflect this.
Fixes PR11861.

llvm-svn: 153892
2012-04-02 19:58:43 +00:00
Rafael Espindola
40e34629cb No need to run llvm-as.
llvm-svn: 153890
2012-04-02 19:44:20 +00:00
Akira Hatanaka
f37a1c4323 Initial 64 bit direct object support.
This patch allows llvm to recognize that a 64 bit object file is being produced
and that the subsequently generated ELF header has the correct information.

The test case checks for both big and little endian flavors.

Patch by Jack Carter.

llvm-svn: 153889
2012-04-02 19:25:22 +00:00
Stepan Dyatkovskiy
0ddc03ebad Fast fix for PR12343:
http://llvm.org/bugs/show_bug.cgi?id=12343

We have not trivial way for splitting edges that are goes from indirect branch. We can do it with some tricks, but it should be additionally discussed. And it is still dangerous due to difficulty of indirect branches controlling.

Fix forbids this case for unswitching.

llvm-svn: 153879
2012-04-02 17:16:45 +00:00
Silviu Baranga
77d372b45e Added fix in TableGen instruction decoder generation. The decoder now breaks for every leaf node.
llvm-svn: 153874
2012-04-02 15:20:39 +00:00
Nadav Rotem
a9ec0e024f Optimizing swizzles of complex shuffles may generate additional complex shuffles.
Do not try to optimize swizzles of shuffles if the source shuffle has more than
a single user, except when the source shuffle is also a swizzle.

llvm-svn: 153864
2012-04-02 07:11:12 +00:00
Hal Finkel
1c045f6845 Enable prefetch generation on PPC64.
llvm-svn: 153851
2012-04-01 20:08:17 +00:00
Nadav Rotem
2729f54295 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret

llvm-svn: 153848
2012-04-01 19:31:22 +00:00
Hal Finkel
fd26145bc6 Add instruction itinerary for the PPC64 A2 core.
This adds a full itinerary for IBM's PPC64 A2 embedded core. These
cores form the basis for the CPUs in the new IBM BG/Q supercomputer.

llvm-svn: 153842
2012-04-01 19:22:40 +00:00
Chandler Carruth
bee52a9371 Add some more testing to cover the remaining two cases where
always-inlining is disabled: recursive functions and indirectbr.

llvm-svn: 153833
2012-04-01 10:36:17 +00:00
Chandler Carruth
1a2234d527 Fix a pretty scary bug I introduced into the always inliner with
a single missing character. Somehow, this had gone untested. I've added
tests for returns-twice logic specifically with the always-inliner that
would have caught this, and fixed the bug.

Thanks to Matt for the careful review and spotting this!!! =D

llvm-svn: 153832
2012-04-01 10:21:05 +00:00
Chandler Carruth
0d7e05e4f9 Replace four tiny tests with various uses of grep and not with a single
test and FileCheck.

llvm-svn: 153831
2012-04-01 10:11:17 +00:00
Rafael Espindola
194491d9b0 Add a triple to the test.
llvm-svn: 153818
2012-03-31 18:59:07 +00:00
Rafael Espindola
2da83dbcd4 Teach CodeGen's version of computeMaskedBits to understand the range metadata.
This is the CodeGen equivalent of r153747. I tested that there is not noticeable
performance difference with any combination of -O0/-O2 /-g when compiling
gcc as a single compilation unit.

llvm-svn: 153817
2012-03-31 18:14:00 +00:00
Chandler Carruth
8cacff57bf Initial commit for the rewrite of the inline cost analysis to operate
on a per-callsite walk of the called function's instructions, in
breadth-first order over the potentially reachable set of basic blocks.

This is a major shift in how inline cost analysis works to improve the
accuracy and rationality of inlining decisions. A brief outline of the
algorithm this moves to:

- Build a simplification mapping based on the callsite arguments to the
  function arguments.
- Push the entry block onto a worklist of potentially-live basic blocks.
- Pop the first block off of the *front* of the worklist (for
  breadth-first ordering) and walk its instructions using a custom
  InstVisitor.
- For each instruction's operands, re-map them based on the
  simplification mappings available for the given callsite.
- Compute any simplification possible of the instruction after
  re-mapping, and store that back int othe simplification mapping.
- Compute any bonuses, costs, or other impacts of the instruction on the
  cost metric.
- When the terminator is reached, replace any conditional value in the
  terminator with any simplifications from the mapping we have, and add
  any successors which are not proven to be dead from these
  simplifications to the worklist.
- Pop the next block off of the front of the worklist, and repeat.
- As soon as the cost of inlining exceeds the threshold for the
  callsite, stop analyzing the function in order to bound cost.

The primary goal of this algorithm is to perfectly handle dead code
paths. We do not want any code in trivially dead code paths to impact
inlining decisions. The previous metric was *extremely* flawed here, and
would always subtract the average cost of two successors of
a conditional branch when it was proven to become an unconditional
branch at the callsite. There was no handling of wildly different costs
between the two successors, which would cause inlining when the path
actually taken was too large, and no inlining when the path actually
taken was trivially simple. There was also no handling of the code
*path*, only the immediate successors. These problems vanish completely
now. See the added regression tests for the shiny new features -- we
skip recursive function calls, SROA-killing instructions, and high cost
complex CFG structures when dead at the callsite being analyzed.

Switching to this algorithm required refactoring the inline cost
interface to accept the actual threshold rather than simply returning
a single cost. The resulting interface is pretty bad, and I'm planning
to do lots of interface cleanup after this patch.

Several other refactorings fell out of this, but I've tried to minimize
them for this patch. =/ There is still more cleanup that can be done
here. Please point out anything that you see in review.

I've worked really hard to try to mirror at least the spirit of all of
the previous heuristics in the new model. It's not clear that they are
all correct any more, but I wanted to minimize the change in this single
patch, it's already a bit ridiculous. One heuristic that is *not* yet
mirrored is to allow inlining of functions with a dynamic alloca *if*
the caller has a dynamic alloca. I will add this back, but I think the
most reasonable way requires changes to the inliner itself rather than
just the cost metric, and so I've deferred this for a subsequent patch.
The test case is XFAIL-ed until then.

As mentioned in the review mail, this seems to make Clang run about 1%
to 2% faster in -O0, but makes its binary size grow by just under 4%.
I've looked into the 4% growth, and it can be fixed, but requires
changes to other parts of the inliner.

llvm-svn: 153812
2012-03-31 12:42:41 +00:00
Chandler Carruth
385a981fde Clean up the naming in this test. Someone pointed this out in review at
one point, and I forgot to go back and clean it up. Sorry about that. =/

llvm-svn: 153801
2012-03-31 10:38:48 +00:00
Chandler Carruth
15d7a6e00c FileCheck-ize this test, and generally tidy it up prior to changing
things around.

llvm-svn: 153799
2012-03-31 09:22:33 +00:00
Hal Finkel
45ad90afac Correctly vectorize powi.
The powi intrinsic requires special handling because it always takes a single
integer power regardless of the result type. As a result, we can vectorize
only if the powers are equal. Fixes PR12364.

llvm-svn: 153797
2012-03-31 03:38:40 +00:00
Jim Grosbach
37853d6216 ARM assembler should prefer non-aliases encoding of cmp.
When an immediate is both a value [t2_]so_imm and a [t2_]so_imm_neg,
we want to use the non-negated form to make sure we prefer the normal
encoding, not the aliased encoding via the negation of, e.g., 'cmp.w'.

llvm-svn: 153770
2012-03-30 19:59:02 +00:00
Jim Grosbach
92ee2a8454 ARM encoding for VSWP got the second operand incorrect.
Make the non-tied register operand names line up with what the base
class encoding handler expects.

rdar://11157236

llvm-svn: 153766
2012-03-30 18:53:01 +00:00
Jim Grosbach
2536615bab ARM integrated assembler should encoding choice for add/sub imm.
For 'adds r2, r2, #56' outside of an IT block, the 16-bit encoding T2
can be used for this syntax. Prefer the narrow encoding when possible.

rdar://11156277

llvm-svn: 153759
2012-03-30 17:20:40 +00:00
Jim Grosbach
9b185a753c ARM assembly parsing needs to be paranoid about negative immediates.
Make sure to treat immediates as unsigned when doing relative comparisons.

rdar://11153621

llvm-svn: 153753
2012-03-30 16:31:31 +00:00
James Molloy
70a6f5ebc7 Ensure conditional BL instructions for ARM are given the fixup fixup_arm_condbranch.
Patch by Tim Northover!

llvm-svn: 153737
2012-03-30 09:15:32 +00:00
Evan Cheng
f3c23907f5 ARM target should allow codegenprep to duplicate ret instructions to enable tailcall opt. rdar://11140249
llvm-svn: 153717
2012-03-30 01:24:39 +00:00
Bill Wendling
9746fa5b94 Testcase for r153710.
llvm-svn: 153711
2012-03-30 00:26:54 +00:00
Bill Wendling
aeff63a3d4 Add testcase for r153705
llvm-svn: 153706
2012-03-30 00:05:02 +00:00
Lang Hames
5685c98688 Change the constant in this testcase so that it results in a constant pool
load.

llvm-svn: 153704
2012-03-29 23:52:38 +00:00