1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 13:02:52 +02:00
Commit Graph

19743 Commits

Author SHA1 Message Date
Arnold Schwaighofer
d6aee045b3 LoopVectorize: Preserve debug location info
radar://14169017

llvm-svn: 185122
2013-06-28 00:38:54 +00:00
Arnold Schwaighofer
c0e3a07c99 LoopVectorize: Cache edge masks created during if-conversion
Otherwise, we end up with an exponential IR blowup.
Fixes PR16472.

llvm-svn: 185097
2013-06-27 20:31:06 +00:00
Chad Rosier
1ce13129c7 Improve the compression of the tablegen DiffLists by introducing a new sort
algorithm when assigning EnumValues to the synthesized registers.

The current algorithm, LessRecord, uses the StringRef compare_numeric
function.  This function compares strings, while handling embedded numbers.
For example, the R600 backend registers are sorted as follows:

  T1
  T1_W
  T1_X
  T1_XYZW
  T1_Y
  T1_Z
  T2
  T2_W
  T2_X
  T2_XYZW
  T2_Y
  T2_Z

In this example, the 'scaling factor' is dEnum/dN = 6 because T0, T1, T2
have an EnumValue offset of 6 from one another.  However, in other parts
of the register bank, the scaling factors are different:

dEnum/dN = 5:
  KC0_128_W
  KC0_128_X
  KC0_128_XYZW
  KC0_128_Y
  KC0_128_Z
  KC0_129_W
  KC0_129_X
  KC0_129_XYZW
  KC0_129_Y
  KC0_129_Z

The diff lists do not work correctly because different kinds of registers have
different 'scaling factors'.  This new algorithm, LessRecordRegister, tries to
enforce a scaling factor of 1.  For example, the registers are now sorted as
follows:

  T1
  T2
  T3
  ...
  T0_W
  T1_W
  T2_W
  ...
  T0_X
  T1_X
  T2_X
  ...
  KC0_128_W
  KC0_129_W
  KC0_130_W
  ...

For the Mips and R600 I see a 19% and 6% reduction in size, respectively.  I
did see a few small regressions, but the differences were on the order of a
few bytes (e.g., AArch64 was 16 bytes).  I suspect there will be even
greater wins for targets with larger register files.

Patch reviewed by Jakob.
rdar://14006013

llvm-svn: 185094
2013-06-27 19:38:13 +00:00
Nadav Rotem
311bda941c CostModel: improve the cost model for load/store of non power-of-two types such as <3 x float>, which are popular in graphics.
llvm-svn: 185085
2013-06-27 17:52:04 +00:00
Tom Stellard
e3160dde2c R600: Remove alu-split.ll test
The purpose of this test was to check boundary conditions for the size
of an ALU clause.  This test is very sensitive to changes to the
optimizer or scheduler, because it requires an exact number of ALU
instructions in order to remain valid.  It's not good to have a test
this sensitive, because it is confusing to developers who implement
optimizations and then 'break' the test.

I'm not sure if there is a good way to test these limits using lit, but
if I can come up with replacement test that isn't as sensitive I'll add
it back to the tree.

llvm-svn: 185084
2013-06-27 17:00:38 +00:00
Arnold Schwaighofer
ccd78deec7 LoopVectorize: Use vectorized loop invariant gep index anchored in loop
Use vectorized instruction instead of original instruction anchored in the
original loop.

Fixes PR16452 and t2075.c of PR16455.

llvm-svn: 185081
2013-06-27 15:11:55 +00:00
Joey Gouly
42f1898415 Add a Subtarget feature 'v8fp' to the ARM backend.
llvm-svn: 185073
2013-06-27 11:49:26 +00:00
Richard Sandiford
56466fde7f [SystemZ] Fix some embarrassing test typos
llvm-svn: 185070
2013-06-27 09:49:34 +00:00
Richard Sandiford
609a7eb0a1 [SystemZ] Allow LA and LARL to be rematerialized
llvm-svn: 185069
2013-06-27 09:42:10 +00:00
Richard Sandiford
a2d164d53e [SystemZ] Allow immediate moves to be rematerialized
llvm-svn: 185068
2013-06-27 09:38:48 +00:00
Richard Sandiford
964ffa104f [SystemZ] Add conditional store patterns
Add pseudo conditional store instructions, so that we use:

    branch foo:
    store
foo:

instead of:

    load
    branch foo:
    move
foo:
    store

z196 has real 32-bit and 64-bit conditional stores, but we don't use
any z196 instructions yet.

llvm-svn: 185065
2013-06-27 09:27:40 +00:00
Manman Ren
989245d300 Update testing case to make DI nodes have the correct format.
llvm-svn: 185061
2013-06-27 06:40:18 +00:00
Arnold Schwaighofer
5f879365c2 Fix spelling.
llvm-svn: 185052
2013-06-27 01:01:11 +00:00
Arnold Schwaighofer
18efca433e LoopVectorize: Don't store a reversed value in the vectorized value map
When we store values for reversed induction stores we must not store the
reversed value in the vectorized value map. Another instruction might use this
value.

This fixes 3 test cases of PR16455.

llvm-svn: 185051
2013-06-27 00:45:41 +00:00
Michael Gottesman
fe055b3806 Added support for the Builtin attribute.
The Builtin attribute is an attribute that can be placed on function call site that signal that even though a function is declared as being a builtin,

rdar://problem/13727199

llvm-svn: 185049
2013-06-27 00:25:01 +00:00
Chad Rosier
2b164134e8 [Mips Disassembler] Have the DecodeCCRRegisterClass function use the getReg
function to lookup the proper tablegen'ed register enumeration.  Previously,
it was using the encoded value directly.

llvm-svn: 185026
2013-06-26 22:23:32 +00:00
Akira Hatanaka
677f305c97 [mips] Do not emit ".option pic0" if target is mips64.
llvm-svn: 185012
2013-06-26 19:08:49 +00:00
Akira Hatanaka
1f2c22ad07 [mips] Improve code generation for constant multiplication using shifts, adds and
subs.

llvm-svn: 185011
2013-06-26 18:48:17 +00:00
Nadav Rotem
860aebf69a Erase all of the instructions that we RAUWed
llvm-svn: 184969
2013-06-26 17:16:09 +00:00
Joey Gouly
f4bea6681b Add a subtarget feature 'v8' to the ARM backend.
This allows for targeting the ARMv8 AArch32 variant.

llvm-svn: 184967
2013-06-26 16:58:26 +00:00
Nadav Rotem
e0a5a586b8 Do not add cse-ed instructions into the visited map because we dont want to consider them as a candidate for replacement of instructions to be visited.
llvm-svn: 184966
2013-06-26 16:54:53 +00:00
Tim Northover
2bf8ffa196 ARM: fix more cases where predication may or may not be allowed
Unfortunately this addresses two issues (by the time I'd disentangled the logic
it wasn't worth putting it back to half-broken):

+ Coprocessor instructions should all be predicable in Thumb mode.
+ BKPT should never be predicable.

llvm-svn: 184965
2013-06-26 16:52:40 +00:00
Tim Northover
817190b1e4 ARM: allow predicated barriers in Thumb mode
The barrier instructions are only "always-execute" in ARM mode, they can quite
happily sit inside an IT block in Thumb.

llvm-svn: 184964
2013-06-26 16:52:32 +00:00
Joey Gouly
a82562fd5b Remove the 'generic' CPU from the ARM eabi attributes printer.
Make v4 the default ARM architecture attribute, to match CodeGen.

llvm-svn: 184962
2013-06-26 16:39:06 +00:00
Ulrich Weigand
c2c8aeb508 [PowerPC] Accept 17-bit signed immediates for addis
The assembler currently strictly verifies that immediates for
s16imm operands are in range (-32768 ... 32767).  This matches
the behaviour of the GNU assembler, with one exception: gas
allows, as a special case, operands in an extended range
(-65536 .. 65535) for the addis instruction only (and its
extended mnemonic lis).

The main reason for this seems to be to allow using unsigned
16-bit operands for lis, e.g. like lis %r1, 0xfedc.

Since this has been supported by gas for a long time, and
assembler source code seen "in the wild" actually exploits
this feature, this patch adds equivalent support to LLVM
for compatibility reasons.

llvm-svn: 184946
2013-06-26 13:49:53 +00:00
Ulrich Weigand
66a94dc7aa [PowerPC] Support symbolic u16imm operands
Currently, all instructions taking s16imm operands support symbolic
operands.  However, for u16imm operands, we only support actual
immediate integers.  This causes the assembler to reject code like

  ori %r5, %r5, symbol@l

This patch changes the u16imm operand definition to likewise
accept symbolic operands.  In fact, s16imm and u16imm can
share the same encoding routine, now renamed to getImm16Encoding.

llvm-svn: 184944
2013-06-26 13:49:15 +00:00
Amaury de la Vieuville
8a7e3e2195 ARM: operands should be explicit when disassembled
llvm-svn: 184943
2013-06-26 13:39:07 +00:00
NAKAMURA Takumi
927f436b5d Suppress llvm/test/Other/can-execute.txt on msys bash.
llvm-svn: 184932
2013-06-26 10:56:44 +00:00
Elena Demikhovsky
ea4d3808e5 Optimized integer vector multiplication operation by replacing it with shift/xor/sub when it is possible. Fixed a bug in SDIV, where the const operand is not a splat constant vector.
llvm-svn: 184931
2013-06-26 10:55:03 +00:00
Kostya Serebryany
16c49ed479 [asan] workaround for PR16277: don't instrument AllocaInstr with alignment more than the redzone size
llvm-svn: 184928
2013-06-26 09:49:52 +00:00
Kostya Serebryany
07a94c6709 [asan] add option -asan-keep-uninstrumented-functions
llvm-svn: 184927
2013-06-26 09:18:17 +00:00
Nadav Rotem
a8fba65221 SLPVectorizer: support slp-vectorization of PHINodes between basic blocks
llvm-svn: 184888
2013-06-25 23:04:09 +00:00
Jakob Stoklund Olesen
a4ca837638 Print block frequencies in decimal form.
This is easier to read than the internal fixed-point representation.

If anybody knows the correct algorithm for converting fixed-point
numbers to base 10, feel free to fix it.

llvm-svn: 184881
2013-06-25 21:57:38 +00:00
Tom Stellard
3854f648a8 R600: Use new getNamedOperandIdx function generated by TableGen
llvm-svn: 184880
2013-06-25 21:22:18 +00:00
Arnold Schwaighofer
730386bc34 X86 cost model: Vectorizing integer division is a bad idea
radar://14057959

llvm-svn: 184872
2013-06-25 19:14:09 +00:00
Bob Wilson
f1bf7886b8 Fix SROA to avoid unnecessary scalar conversions for 1-element vectors.
When a 1-element vector alloca is promoted, a store instruction can often be
rewritten without converting the value to a scalar and using an insertelement
instruction to stuff it into the new alloca.  This patch just adds a check
to skip that conversion when it is unnecessary.  This turns out to be really
important for some ARM Neon operations where <1 x i64> is used to get around
the fact that i64 is not a legal type.

llvm-svn: 184870
2013-06-25 19:09:50 +00:00
Ulrich Weigand
3e23cfcde6 [PowerPC] Support @got modifier
Add VK_... values and relocation types necessary to support
the @got family of modifiers.  Used by the asm parser only.

llvm-svn: 184860
2013-06-25 16:49:50 +00:00
Aaron Watry
d619f91b82 R600: Add v2i32 test for vselect
Note: Only adding test for evergreen, not SI yet.

When I attempted to expand vselect for SI, I got the following:
llc: /home/awatry/src/llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp:522:
llvm::SDValue llvm::DAGTypeLegalizer::PromoteIntRes_SETCC(llvm::SDNode*):
Assertion `SVT.isVector() == N->getOperand(0).getValueType().isVector() &&
"Vector compare must return a vector result!"' failed.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184847
2013-06-25 13:55:54 +00:00
Aaron Watry
1ee98e598b R600/SI: Expand xor v2i32/v4i32
Add test cases for both vector sizes on SI and also add v2i32 test for EG.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184846
2013-06-25 13:55:52 +00:00
Aaron Watry
c7ec57ba3c R600: Add v2i32 test for setcc on evergreen
No test/expansion for SI has been added yet. Attempts to expand this
operation for SI resulted in a stacktrace in (IIRC) LegalizeIntegerTypes
which was complaining about vector comparisons being required to return
a vector type.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184845
2013-06-25 13:55:49 +00:00
Aaron Watry
73046ba281 R600/SI: Expand urem of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Note: I followed the guidance of the v4i32 EG check... UREM produces really
complex code, so let's just check that the instruction was lowered
successfully.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184844
2013-06-25 13:55:46 +00:00
Aaron Watry
c00dd00a32 R600/SI: Expand udiv v[24]i32 for SI and v2i32 for EG
Also add lit test for both cases on SI, and v2i32 for evergreen.

Note: I followed the guidance of the v4i32 EG check... UDIV produces really
complex code, so let's just check that the instruction was lowered
successfully.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184843
2013-06-25 13:55:43 +00:00
Aaron Watry
0b4bbc3714 R600/SI: Expand ashr of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184842
2013-06-25 13:55:40 +00:00
Aaron Watry
0bf6dc888a R600/SI: Expand srl of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184841
2013-06-25 13:55:37 +00:00
Aaron Watry
eafbde78e9 R600/SI: Expand shl of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184840
2013-06-25 13:55:32 +00:00
Aaron Watry
d9f602bd35 R600/SI: Expand or of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184839
2013-06-25 13:55:29 +00:00
Aaron Watry
688f496d43 R600/SI: Expand mul of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184838
2013-06-25 13:55:26 +00:00
Aaron Watry
35d817a307 R600/SI: Expand and of v2i32/v4i32 for SI
Also add lit test for both cases on SI, and v2i32 for evergreen.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 184837
2013-06-25 13:55:23 +00:00
Benjamin Kramer
3b56c8dd50 BlockFrequency: Bump up the entry frequency a bit.
This is a band-aid to fix the most severe regressions we're seeing from basing
spill decisions on block frequencies, until we have a better solution.

llvm-svn: 184835
2013-06-25 13:34:40 +00:00
Ulrich Weigand
e5affb6d66 [PowerPC] Add extended rotate/shift mnemonics
This adds all missing extended rotate/shift mnemonics to the asm parser.

llvm-svn: 184834
2013-06-25 13:17:41 +00:00