1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00
Commit Graph

12988 Commits

Author SHA1 Message Date
Sanjay Patel
29c427c639 [x86] generalize reassociation optimization in machine combiner to 2 instructions
Currently ( D10321, http://reviews.llvm.org/rL239486 ), we can use the machine combiner pass
to reassociate the following sequence to reduce the critical path:

A = ? op ?
B = A op X
C = B op Y
-->
A = ? op ?
B = X op Y
C = A op B

'op' is currently limited to x86 AVX scalar FP adds (with fast-math on), but in theory, it could
be any associative math/logic op (see TODO in code comment).

This patch generalizes the pattern match to ignore the instruction that defines 'A'. So instead of
a sequence of 3 adds, we now only need to find 2 dependent adds and decide if it's worth
reassociating them.

This generalization has a compile-time cost because we can now match more instruction sequences
and we rely more heavily on the machine combiner to discard sequences where reassociation doesn't
improve the critical path.

For example, in the new test case:

A = M div N
B = A add X
C = B add Y

We'll match 2 reassociation patterns, but this transform doesn't reduce the critical path:

A = M div N
B = A add Y
C = B add X

We need the combiner to reject that pattern but select this:

A = M div N
B = X add Y
C = B add A

Differential Revision: http://reviews.llvm.org/D10460

llvm-svn: 240361
2015-06-23 00:39:40 +00:00
Pawel Bylica
adda783aca Revert r240291: causes problems in self-hosted builds.
llvm-svn: 240343
2015-06-22 21:54:07 +00:00
Pawel Bylica
6619e83c03 Set missing x86 arch in a CodeGen regression test.
Fixes the regression test added in r240291.

llvm-svn: 240336
2015-06-22 21:18:10 +00:00
Simon Pilgrim
77fe8ee210 [X86][AVX2] Added missing stack folding tests for vpshufhw/vpshuflw
llvm-svn: 240332
2015-06-22 21:10:42 +00:00
Tom Stellard
cccd26327d R600/SI: Use ELF64 format instead of ELF32
Reviewers: arsenm, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10392

llvm-svn: 240331
2015-06-22 21:03:54 +00:00
Tom Stellard
5a698953ca R600: Use EM_AMDGPU for the ELF Machine type
Reviewers: arsenm, rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10390

llvm-svn: 240330
2015-06-22 21:03:52 +00:00
Ahmed Bougacha
95d8094810 [X86] Teach load folding to accept scalar _Int users of MOVSS/MOVSD.
The _Int instructions are special, in that they operate on the full
VR128 instead of FR32.  The load folding then looks at MOVSS, at the
user, and bails out when it sees a size mismatch.

What we really know is that the rm_Int instructions don't load the
higher lanes, so folding is fine.

This happens for the straightforward intrinsic code, e.g.:

    _mm_add_ss(a, _mm_load_ss(p));

Fixes PR23349.

Differential Revision: http://reviews.llvm.org/D10554

llvm-svn: 240326
2015-06-22 20:51:51 +00:00
Alex Lorenz
b5bed015c2 MIR Serialization: Introduce a lexer for machine instructions.
This commit adds a function that tokenizes the string containing
the machine instruction. This commit also adds a struct called 
'MIToken' which is used to represent the lexer's tokens.

Reviewers: Sean Silva

Differential Revision: http://reviews.llvm.org/D10521

llvm-svn: 240323
2015-06-22 20:37:46 +00:00
Sanjay Patel
164919e49c [x86] set default reciprocal (division and square root) codegen to match GCC
D8982 ( checked in at http://reviews.llvm.org/rL239001 ) added command-line 
options to allow reciprocal estimate instructions to be used in place of
divisions and square roots.

This patch changes the default settings for x86 targets to allow that recip
codegen (except for scalar division because that breaks too much code) when
using -ffast-math or its equivalent. 

This matches GCC behavior for this kind of codegen.

Differential Revision: http://reviews.llvm.org/D10396

llvm-svn: 240310
2015-06-22 18:29:44 +00:00
Sanjoy Das
bea61317e9 [FaultMaps] Add a parser for the __llvm__faultmaps section.
Summary:
The parser is exercised by llvm-objdump using -print-fault-maps.  As is
probably obvious, the code itself was "heavily inspired" by
http://reviews.llvm.org/D10434.

Reviewers: reames, atrick, JosephTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10491

llvm-svn: 240304
2015-06-22 18:03:02 +00:00
Rafael Espindola
1bd232f704 Bring r240130 back.
Now that pr23900 is fixed, we can bring it back with no changes.

Original message:

Make all temporary symbols unnamed.

What this does is make all symbols that would otherwise start with a .L
(or L on MachO) unnamed.

Some of these symbols still show up in the symbol table, but we can just
make them unnamed.

In order to make sure we produce identical results when going thought assembly,
all .L (not just the compiler produced ones), are now unnamed.

Running llc on llvm-as.opt.bc, the peak memory usage goes from 208.24MB to
205.57MB.

llvm-svn: 240302
2015-06-22 17:52:52 +00:00
Alex Lorenz
71956fb32c MIR Serialization: Serialize machine instruction names.
This commit implements initial machine instruction serialization. It
serializes machine instruction names. The instructions are represented
using a YAML sequence of string literals and are a part of machine
basic block YAML mapping.

This commit introduces a class called 'MIParser' which will be used to
parse the machine instructions and operands.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10481

llvm-svn: 240295
2015-06-22 17:02:30 +00:00
Pawel Bylica
c3956ce2ba Fix shl folding in DAG combiner.
Summary: The code responsible for shl folding in the DAGCombiner was assuming incorrectly that all constants are less than 64 bits. This patch simply changes the way values are compared.

Test Plan: A regression test included.

Reviewers: andreadb

Reviewed By: andreadb

Subscribers: andreadb, test, llvm-commits

Differential Revision: http://reviews.llvm.org/D10602

llvm-svn: 240291
2015-06-22 15:58:11 +00:00
Elena Demikhovsky
0d6489273b AVX-512: added VPSHUFB instruction - all SKX forms
Added intrinsics and encoding tests.

llvm-svn: 240277
2015-06-22 13:00:42 +00:00
Elena Demikhovsky
07d3690bdf Reverted AVX-512 vector shuffle
llvm-svn: 240258
2015-06-22 09:01:15 +00:00
Michael Kuperstein
dec359bcf7 [X86] Allow more call sequences to use push instructions for argument passing
This allows more call sequences to use pushes instead of movs when optimizing for size.
In particular, calling conventions that pass some parameters in registers (e.g. thiscall) are now supported.

Differential Revision: http://reviews.llvm.org/D10500

llvm-svn: 240257
2015-06-22 08:31:22 +00:00
Elena Demikhovsky
9ebb2c82d0 AVX-512: Added intrinsics for VPERMT2W/D/Q/PS/PD and
VPERMI2W/D/Q/PS/PD instructions.
Added tests.

llvm-svn: 240256
2015-06-22 06:45:48 +00:00
Rafael Espindola
97aff73f30 Add the testcase from pr23900.
llvm-svn: 240253
2015-06-22 01:29:24 +00:00
Simon Pilgrim
cb02250c85 [X86][SSE] Added missing stack folding test for CVTSD2SS instruction.
llvm-svn: 240241
2015-06-21 16:07:47 +00:00
Hans Wennborg
59ddcf57cd Switch lowering: add heuristic for filling leaf nodes in the weight-balanced binary search tree
Sparse switches with profile info are lowered as weight-balanced BSTs. For
example, if the node weights are {1,1,1,1,1,1000}, the right-most node would
end up in a tree by itself, bringing it closer to the top.

However, a leaf in this BST can contain up to 3 cases, and having a single
case in a leaf node as in the example means the tree might become
unnecessarily high.

This patch adds a heauristic to the pivot selection algorithm that moves more
cases into leaf nodes unless that would lower their rank. It still doesn't
yield the optimal tree in every case, but I believe it's conservatibely correct.

llvm-svn: 240224
2015-06-20 17:14:07 +00:00
Simon Pilgrim
7e530e8e8e [X86][SSE] Fix PerformSExtCombine bug that accessed the wrong return value of an aggregate type.
Fix to rL237885 to ensure that it accesses the correct return value of an aggregate type.

llvm-svn: 240223
2015-06-20 16:19:24 +00:00
Nico Weber
cea0f30bb1 Revert 240130, it caused crashes (repro in PR23900).
llvm-svn: 240193
2015-06-19 23:43:47 +00:00
Alex Lorenz
b907fa377d MIR Parser: report an error when a basic block isn't found.
This commit reports an error when the MIR parser can't find
a basic block with the machine basic block's name.

llvm-svn: 240174
2015-06-19 20:12:03 +00:00
Alex Lorenz
98562075f0 MIR Serialization: Serialize the list of machine basic blocks with simple attributes.
This commit implements the initial serialization of machine basic blocks in a
machine function. Only the simple, scalar MBB attributes are serialized. The 
reference to LLVM IR's basic block is preserved when that basic block has a name.

Reviewers: Duncan P. N. Exon Smith

Differential Revision: http://reviews.llvm.org/D10465

llvm-svn: 240145
2015-06-19 17:43:07 +00:00
Rafael Espindola
41c3ae17be Make all temporary symbols unnamed.
What this does is make all symbols that would otherwise start with a .L
(or L on MachO) unnamed.

Some of these symbols still show up in the symbol table, but we can just
make them unnamed.

In order to make sure we produce identical results when going thought assembly,
all .L (not just the compiler produced ones), are now unnamed.

Running llc on llvm-as.opt.bc, the peak memory usage goes from 208.24MB to
205.57MB.

llvm-svn: 240130
2015-06-19 12:16:55 +00:00
Ahmed Bougacha
57ebe7191c [ARM] Look through concat when lowering in-place shuffles (VZIP, ..)
Currently, we canonicalize shuffles that produce a result larger than
their operands with:
  shuffle(concat(v1, undef), concat(v2, undef))
->
  shuffle(concat(v1, v2), undef)

because we can access quad vectors (see PerformVECTOR_SHUFFLECombine).

This is useful in the general case, but there are special cases where
native shuffles produce larger results: the two-result ops.

We can look through the concat when lowering them:
  shuffle(concat(v1, v2), undef)
->
  concat(VZIP(v1, v2):0, :1)

This lets us generate the native shuffles instead of scalarizing to
dozens of VMOVs.

Differential Revision: http://reviews.llvm.org/D10424

llvm-svn: 240118
2015-06-19 02:32:35 +00:00
Ahmed Bougacha
2ac7aab834 [ARM] Add D-sized vtrn/vuzp/vzip tests, and cleanup. NFC.
llvm-svn: 240114
2015-06-19 02:15:34 +00:00
Eric Christopher
0b2dfae3ba Fix "the the" in comments.
llvm-svn: 240112
2015-06-19 01:53:21 +00:00
Alex Lorenz
30d23ac69f MIR Serialization: Reenable one of the MIRParser tests by reverting r239805.
The test 'llvm/test/CodeGen/MIR/machine-function.mir' was disabled on 
x86 msc18 in r239805 as it failed. My commit r240054 have fixed the
problem, so this commit reverts the commit that disabled the test as
it should pass now. 

llvm-svn: 240074
2015-06-18 22:46:27 +00:00
Rafael Espindola
d5d60f7168 Improve the --expand-relocs handling of MachO.
In a relocation target can take 3 basic forms

* A r_value in scattered relocations.
* A symbol in external relocations.
* A section is non-external relocations.

Have the dump reflect that. With this change we go from

CHECK-NEXT:       Extern: 0
CHECK-NEXT:       Type: X86_64_RELOC_SUBTRACTOR (5)
CHECK-NEXT:       Symbol: 0x2
CHECK-NEXT:       Scattered: 0

To just

// CHECK-NEXT:       Type: X86_64_RELOC_SUBTRACTOR (5)
// CHECK-NEXT:       Section: __data (2)

Since the relocation is with a section, we print the seciton name and don't
need to say that it is not scattered or external.

Someone motivated can add further special cases for things like
ARM64_RELOC_ADDEND and ARM_RELOC_PAIR.

llvm-svn: 240073
2015-06-18 22:38:20 +00:00
Yi Jiang
8a33caab31 Avoid redundant select node in early if-conversion pass
llvm-svn: 240072
2015-06-18 22:34:09 +00:00
Hans Wennborg
e8d3ea5daa Switch lowering: enable whole-switch jump tables at -O0.
To same compile time, the analysis to find dense case-clusters in switches is
not done at -O0. However, when the whole switch is dense enough, it is easy to
turn it into a jump table, resulting in much faster code with no extra effort.

llvm-svn: 240071
2015-06-18 22:22:30 +00:00
Sanjay Patel
aa238e95d1 add test to show suboptimal load merging behavior
llvm-svn: 240063
2015-06-18 21:34:26 +00:00
Sanjay Patel
d9d0b2df87 fixed to test attributes and use better checks
1. Used update_llc_test_checks.py to tighten checks
2. Fixed triple (nothing Darwin-specific here)
3. Replaced CPU specifiers with attributes
4. Fixed comments
5. Removed IvyBridge run because it did not add any coverage

llvm-svn: 240058
2015-06-18 21:12:24 +00:00
Rafael Espindola
4cb02a8fb5 Use --expand-relocs in a test. It will make the next change easier to read.
llvm-svn: 240053
2015-06-18 20:57:35 +00:00
Colin LeMahieu
1bd1f71341 [Hexagon] Printing packet brackets when asm printing and adding a number of tests that test packet brackets.
llvm-svn: 240051
2015-06-18 20:43:50 +00:00
David Majnemer
2fcb4d8322 [CodeGen] Don't emit a random reference to the personality function
This should fix issues we've been seeing with Darwin.

llvm-svn: 240036
2015-06-18 18:31:46 +00:00
James Y Knight
d704cd8059 [SPARC] Repair GOT references to internal symbols.
They had been getting emitted as a section + offset reference, which
is bogus since the value needs to be the offset within the GOT, not
the actual address of the symbol's object.

Differential Revision: http://reviews.llvm.org/D10441

llvm-svn: 240020
2015-06-18 15:05:15 +00:00
Simon Pilgrim
07cbd65c87 [X86][AVX2] Added AVX2 SINT_TO_FP/UINT_TO_FP tests
llvm-svn: 240013
2015-06-18 12:32:28 +00:00
Asaf Badouh
6e78caf9ff [AVX512]
add instructions: VPAVGB and VPAVGW


review
http://reviews.llvm.org/D10504

llvm-svn: 240012
2015-06-18 12:30:53 +00:00
Elena Demikhovsky
e3fe4bf53e AVX-512: (fixed) Added encoding of all forms of VPERMT2W/D/Q/PS/PD and VPERMI2W/D/Q/PS/PD.
Intrinsics and tests for them are comming in the next patch.

llvm-svn: 240003
2015-06-18 08:56:19 +00:00
Elena Demikhovsky
dc7dd8572b reverted 239999 due to test failures
llvm-svn: 240001
2015-06-18 08:06:49 +00:00
Elena Demikhovsky
f5554ec461 AVX-512: Added encoding of all forms of VPERMT2W/D/Q/PS/PD
and VPERMI2W/D/Q/PS/PD.
Intrinsics and tests for them are comming in the next patch.

llvm-svn: 239999
2015-06-18 07:29:40 +00:00
Benjamin Kramer
d3a7e1b8f1 [AsmPrinter] Make isRepeatedByteSequence smarter about odd integer types
- zext the value to alloc size first, then check if the value repeats
  with zero padding included. If so we can still emit a .space
- Do the checking with APInt.isSplat(8), which handles non-pow2 types
- Also handle large constants (bit width > 64)
- In a ConstantArray all elements have the same type, so it's sufficient
  to check the first constant recursively and then just compare if all
  following constants are the same by pointer compare

llvm-svn: 239977
2015-06-17 23:55:17 +00:00
Simon Pilgrim
cdfbeb425a [X86][SSE] Improved support for vector i16 to float conversions.
Added explicit sign extension for v4i16/v8i16 to v4i32/v8i32 before conversion to floats. Matches existing support for v4i8/v8i8.

Follow up to D10433

llvm-svn: 239966
2015-06-17 22:43:34 +00:00
Jingyue Wu
34412b317e Add NVPTXLowerAlloca pass to convert alloca'ed memory to local address
Summary:
This is done by first adding two additional instructions to convert the
alloca returned address to local and convert it back to generic. Then
replace all uses of alloca instruction with the converted generic
address. Then we can rely NVPTXFavorNonGenericAddrSpace pass to combine
the generic addresscast and the corresponding Load, Store, Bitcast, GEP
Instruction together.

Patched by Xuetian Weng (xweng@google.com). 

Test Plan: test/CodeGen/NVPTX/lower-alloca.ll

Reviewers: jholewinski, jingyue

Reviewed By: jingyue

Subscribers: meheff, broune, eliben, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10483

llvm-svn: 239964
2015-06-17 22:31:02 +00:00
David Majnemer
c8b1f095a3 Move the personality function from LandingPadInst to Function
The personality routine currently lives in the LandingPadInst.

This isn't desirable because:
- All LandingPadInsts in the same function must have the same
  personality routine.  This means that each LandingPadInst beyond the
  first has an operand which produces no additional information.

- There is ongoing work to introduce EH IR constructs other than
  LandingPadInst.  Moving the personality routine off of any one
  particular Instruction and onto the parent function seems a lot better
  than have N different places a personality function can sneak onto an
  exceptional function.

Differential Revision: http://reviews.llvm.org/D10429

llvm-svn: 239940
2015-06-17 20:52:32 +00:00
Ahmed Bougacha
a463777c40 [CodeGenPrepare] Generalize inserted set from truncs to any inst.
It's been used before to avoid infinite loops caused by separate CGP
optimizations undoing one another.  We found one more such issue
caused by r238054.  To avoid it, generalize the "InsertedTruncs"
set to any inst, and use it to avoid touching those again.

llvm-svn: 239938
2015-06-17 20:44:32 +00:00
Colin LeMahieu
39d5f60f66 [Hexagon] Adding a number of other tests for min/max instructions and loading i1s.
llvm-svn: 239935
2015-06-17 20:29:33 +00:00
Colin LeMahieu
714049f3c9 [Hexagon] Adding some compare tests, fixing existing XFAILed tests, and removing mcpu=hexagonv4 since that's the minimum version anyway.
llvm-svn: 239917
2015-06-17 17:19:05 +00:00