mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

28479 Commits

Author SHA1 Message Date
Michael Kuperstein
b344d5ac77 [X86] A heuristic to estimate the size impact for converting stack-relative parameter movs to pushes
This gives a rough estimate of whether using pushes instead of movs is profitable, in terms of size.
We go over all calls in the MachineFunction and compute:
a) For each callsite that cannot use pushes, the penalty of not having a reserved call frame.
b) For each callsite that can use pushes, the gain of actually replacing the movs with pushes (and the potential penalty of having to readjust the stack).
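
The accounting described in (a) and (b) can be sketched roughly as follows. This is a minimal Python model, not the code from the patch: the `CallSite` fields, the function name, and all byte counts are hypothetical, and the real heuristic derives its estimates from the MachineFunction.

```python
from dataclasses import dataclass


@dataclass
class CallSite:
    can_use_pushes: bool
    # Hypothetical byte estimates supplied by the caller; not LLVM's numbers.
    reserved_frame_penalty: int = 0   # (a) cost of losing the reserved call frame
    push_gain: int = 0                # (b) bytes saved by replacing movs with pushes
    stack_readjust_penalty: int = 0   # (b) bytes spent readjusting the stack


def pushes_are_profitable(call_sites):
    """Return True if the estimated net size change favors using pushes."""
    net = 0
    for cs in call_sites:
        if cs.can_use_pushes:
            net += cs.push_gain - cs.stack_readjust_penalty
        else:
            net -= cs.reserved_frame_penalty
    return net > 0


sites = [
    CallSite(can_use_pushes=True, push_gain=6, stack_readjust_penalty=3),
    CallSite(can_use_pushes=True, push_gain=4),
    CallSite(can_use_pushes=False, reserved_frame_penalty=5),
]
print(pushes_are_profitable(sites))
```

The decision is made once for the whole function because losing the reserved call frame affects every callsite, including those that cannot use pushes.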

Differential Revision: http://reviews.llvm.org/D7561

llvm-svn: 228915
2015-02-12 08:36:35 +00:00
Ahmed Bougacha
88db0c3c30 [CodeGen] Don't blindly combine (fp_round (fp_round x)) to (fp_round x).
We used to do this DAG combine, but it's not always correct:
If the first fp_round isn't a value preserving truncation, it might
introduce a tie in the second fp_round, that wouldn't occur in the
single-step fp_round we want to fold to.
In other words, double rounding isn't the same as rounding.
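
A small Python illustration of the effect, assuming IEEE 754 binary64, binary32 and binary16 as the three formats (the commit itself does not name specific types). The helper `round_to` is an illustrative name; it relies on the standard `struct` module, whose `f` and `e` codes round to nearest, ties to even.

```python
import struct


def round_to(fmt, x):
    """Round a Python float (binary64) to the format named by a struct code."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


# Just above the midpoint between the binary16 values 1.0 and 1.0 + 2**-10.
x = 1.0 + 2.0**-11 + 2.0**-30

direct = round_to("<e", x)                    # binary64 -> binary16 in one step
two_step = round_to("<e", round_to("<f", x))  # binary64 -> binary32 -> binary16

print(direct, two_step)
```

The first rounding to binary32 discards the `2**-30` term and leaves an exact tie, which the second rounding resolves to the even neighbour, so the two results differ.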

Differential Revision: http://reviews.llvm.org/D7571

llvm-svn: 228911
2015-02-12 06:15:29 +00:00
George Burgess IV
395bb904a1 Fixed a bug where CFLAA would crash the compiler.
We would crash if we couldn't locate a Function that either Location's
Value belonged to. Now we just print out a debug message and return 
conservatively.

llvm-svn: 228901
2015-02-12 03:07:07 +00:00
Chandler Carruth
2af75e99bb [slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out.
Apparently some code finally started to tickle this after my
canonicalization changes to instcombine.

The bug stems from trying to form a vector type out of scalars that
aren't compatible at all. In this example, from x86_mmx values. The code
in the vectorizer that checks for reasonable types was checking for
aggregates or vectors, but there are lots of other types that should
just never reach the vectorizer.

Debugging this was made more confusing by the lie in an assert in
VectorType::get() -- it isn't that the types are *primitive*. The types
must be integer, pointer, or floating point types. No other types are
allowed.

I've improved the assert and added a helper to the vectorizer to handle
the element type validity checks. It now re-uses the VectorType static
function and then further excludes weird target-specific types that we
probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of
these are really reachable anyways (neither 80-bit nor 128-bit things
will get vectorized) but it seems better to just eagerly exclude such
nonsense.
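
The two-stage check described above might look roughly like this. It is a Python model only: the `Ty` tuple, the kind strings, and both function names are hypothetical stand-ins, not the helper actually added to the vectorizer.

```python
from typing import NamedTuple


class Ty(NamedTuple):
    kind: str   # e.g. "integer", "pointer", "floating_point", "x86_mmx", "struct"
    name: str   # e.g. "i32", "float", "x86_fp80"


def is_valid_vector_element(ty):
    """Stage 1: only integer, pointer, or floating point element types."""
    return ty.kind in {"integer", "pointer", "floating_point"}


# Stage 2: target-specific floating point types the vectorizer also rejects.
EXCLUDED_FP = {"x86_fp80", "ppc_fp128"}


def is_vectorizable_element(ty):
    return is_valid_vector_element(ty) and ty.name not in EXCLUDED_FP


print(is_vectorizable_element(Ty("integer", "i32")))
print(is_vectorizable_element(Ty("x86_mmx", "x86_mmx")))
print(is_vectorizable_element(Ty("floating_point", "x86_fp80")))
```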

I've added a test case, but while it definitely covers two of the paths
through this code there may be more paths that would benefit from test
coverage. I'm not familiar enough with the SLP vectorizer to synthesize
test cases for all of these, but was able to update the code itself by
inspection.

llvm-svn: 228899
2015-02-12 02:30:56 +00:00
Hal Finkel
b95a028674 [PowerPC] Mark jumps as expensive (using CR bits)
On PowerPC, which has a full set of logical operations on (its multiple sets
of) condition-register bits, it is not profitable to break up complex
conditions feeding a jump into multiple jumps. We can turn off this feature of
CGP/SDAGBuilder by marking jumps as "expensive".

P7 test-suite speedups (no regressions):
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2
	-0.626647% +/- 0.323583%
MultiSource/Benchmarks/Olden/power/power
	-18.2821% +/- 8.06481%

llvm-svn: 228895
2015-02-12 01:02:52 +00:00
Tim Northover
f976f969cd DeadArgElim: aggregate Return assessment properly.
I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It
actually depends on the index too, which means we need to be careful about how
the results are combined before return. In particular if a single Use returns
Live, that counts for the entire object, at the granularity we're considering.
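
The combining rule in the last sentence can be modeled as below. This is an illustrative Python sketch, assuming a two-state result (Live or MaybeLive); the enum and function names are made up for the example and do not mirror the pass's actual code.

```python
from enum import Enum


class Liveness(Enum):
    LIVE = "live"
    MAYBE_LIVE = "maybe_live"


def combine(per_use_results):
    """Any Live use makes the whole object Live; otherwise it stays MaybeLive."""
    if any(r is Liveness.LIVE for r in per_use_results):
        return Liveness.LIVE
    return Liveness.MAYBE_LIVE


print(combine([Liveness.MAYBE_LIVE, Liveness.LIVE, Liveness.MAYBE_LIVE]))
print(combine([Liveness.MAYBE_LIVE, Liveness.MAYBE_LIVE]))
```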

llvm-svn: 228885
2015-02-11 23:13:11 +00:00
David Majnemer
733c762449 MC, COFF: Align section contents to a four byte boundary
llvm-svn: 228879
2015-02-11 22:22:30 +00:00
Mehdi Amini
3c8f7ac243 Reassociate: cannot negate an INT_MIN value
Summary:
When trying to canonicalize negative constants out of
multiplication expressions, we need to check that the
constant is not INT_MIN which cannot be negated.
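
Why INT_MIN is special can be seen with a small two's-complement model in Python, assuming a 32-bit width for illustration. The helpers `wrap` and `can_negate` are hypothetical names, not functions from the Reassociate pass.

```python
def wrap(value, bits=32):
    """Interpret value modulo 2**bits as a signed two's-complement integer."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value


INT_MIN = -(1 << 31)


def can_negate(c, bits=32):
    """A constant is safely negatable unless it is the minimum signed value."""
    return c != -(1 << (bits - 1))


print(wrap(-INT_MIN) == INT_MIN)   # negating INT_MIN wraps back to INT_MIN
print(wrap(-(-5)))                 # any other constant negates as expected
print(can_negate(INT_MIN), can_negate(-5))
```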

Reviewers: mcrosier

Reviewed By: mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7286

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 228872
2015-02-11 19:54:44 +00:00
Tom Stellard
20d19bec1e R600/SI: Disable subreg liveness
This is temporary while we try to fix a crash in the register coalescer.

llvm-svn: 228861
2015-02-11 18:24:53 +00:00
Simon Pilgrim
cda2cb97d7 [X86][SSE] Added dual vector truncation tests.
llvm-svn: 228857
2015-02-11 18:14:35 +00:00
Tom Stellard
dc0b2cd2e6 R600/SI: Fix -march in test
llvm-svn: 228848
2015-02-11 17:11:48 +00:00
Jan Wen Voung
ea1b991ecf Gold-plugin: Broaden scope of get/release_input_file to scope of Module.
Summary:
Move calls to get_input_file and release_input_file out of
getModuleForFile(). Otherwise release_input_file may end up
unmapping a view of the file while the view is still being
used by the Module (on 32-bit hosts).

Fix for PR22482.

Test Plan: Add test using --no-map-whole-files.

Reviewers: rafael, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7539

llvm-svn: 228842
2015-02-11 16:12:50 +00:00
Sanjay Patel
247ca77ae9 fixed to test features, not CPUs
llvm-svn: 228836
2015-02-11 15:00:41 +00:00
Sanjay Patel
14ceb7a2fc fixed to test features, not CPUs
llvm-svn: 228835
2015-02-11 15:00:19 +00:00
Sanjay Patel
95a977af34 fixed to test features, not CPUs
llvm-svn: 228834
2015-02-11 14:58:25 +00:00
Marek Olsak
63a18b6d87 R600/SI: Enable a lot of existing tests for VI (squashed commits)
This is a union of these commits:

* R600/SI: Enable more tests for VI which need no changes

* R600/SI: Enable V_BCNT tests for VI
    Differences:
    - v_bcnt_..._e32 -> _e64
    - s_load_dword* inline offset is in bytes instead of dwords

* R600/SI: Enable all tests for VI which use S_LOAD_DWORD
    The inline offset is changed from dwords to bytes.

* R600/SI: Enable LDS tests for VI
    Differences:
    - the s_load_dword inline offset changed from dwords to bytes
    - the tests checked very little on CI, so they have been fixed to check all
      instructions that "SI" checked

* R600/SI: Enable lshr tests for VI

* R600/SI: Fix divrem64 tests
    - "v_lshl_64" was missing "b" before "64"
    - added VI-NOT checks

* R600/SI: Enable the SI.tid test for VI

* R600/SI: Enable the frem test for VI
    Also, the frem_f64 checking is added for CI-VI.

* R600/SI: Add VI tests for rsq.clamped

llvm-svn: 228830
2015-02-11 14:26:46 +00:00
Andrea Di Biagio
70c7608263 [TTI] Improved cost heuristic for cttz/ctlz calls.
This patch is a follow-up of r228826 (see code-review: D7506).

Now that SimplifyCFG uses TargetTransformInfo for cost analysis, we 
have to fix the cost heuristic for intrinsic calls to cttz/ctlz.

This patch defines method 'getIntrinsicCost' in BasicTTIImpl: now, BasicTTIImpl
queries TLI to check if a call to cttz/ctlz is cheap for the target.

Added test cases in Transforms/SimplifyCFG/X86 to verify that on x86,
SimplifyCFG only speculates a call to cttz/ctlz if it is cheap.

Differential Revision: http://reviews.llvm.org/D7554

llvm-svn: 228829
2015-02-11 14:22:18 +00:00
James Molloy
e97d8824e7 Make buildbots better.
This testcase change was associated incorrectly to a followup commit in my git tree, not the base commit. Sorry!

llvm-svn: 228827
2015-02-11 12:24:09 +00:00
James Molloy
ba8cd33738 [SimplifyCFG] Swap to using TargetTransformInfo for cost
analysis.

We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
different heuristic now so there is a slight test churn.

Test changes:
  * combine-comparisons-by-cse.ll: Removed unneeded branch check.
  * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
  * coalesce-subregs.ll: Superfluous block check.
  * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
  * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
  * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.

llvm-svn: 228826
2015-02-11 12:15:41 +00:00
Daniel Sanders
e04bb135ee [mips] Merge disassemblers into a single implementation.
Summary:
Currently we have Mips32 and Mips64 disassemblers and this causes the target
triple to affect the disassembly despite all the relevant information being in
the ELF header. These implementations do not need to be separate.

This patch merges them together such that the appropriate tables are checked
for the subtarget (e.g. Mips64 is checked when GP64 is enabled).

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7498

llvm-svn: 228825
2015-02-11 11:28:56 +00:00
James Molloy
c9ce650708 [LoopReroll] Introduce the concept of DAGRootSets.
A DAGRootSet models an induction variable being used in a rerollable
loop. For example:

   x[i*3+0] = y1
   x[i*3+1] = y2
   x[i*3+2] = y3

   Base instruction -> i*3
                    +---+----+
                   /    |     \
               ST[y1]  +1     +2  <-- Roots
                        |      |
                      ST[y2] ST[y3]

There may be multiple DAGRootSets, for example:

   x[i*2+0] = ...   (1)
   x[i*2+1] = ...   (1)
   x[i*2+4] = ...   (2)
   x[i*2+5] = ...   (2)
   x[(i+1234)*2+5678] = ... (3)
   x[(i+1234)*2+5679] = ... (3)

This concept is similar to the "Scale" member used previously, but allows
multiple independent sets of roots based off the same induction variable.
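
As a rough illustration of what rerolling the first example means, the Python sketch below compares an unrolled loop over `i*3+0`, `i*3+1`, `i*3+2` with its rerolled form. It assumes each stored value is the same function `f` of the index, which is an assumption made for the example; the function names are hypothetical and nothing here reflects the pass's implementation.

```python
def unrolled(n, f):
    x = [None] * (3 * n)
    for i in range(n):
        base = i * 3               # base instruction
        x[base + 0] = f(base + 0)
        x[base + 1] = f(base + 1)  # root +1
        x[base + 2] = f(base + 2)  # root +2
    return x


def rerolled(n, f):
    x = [None] * (3 * n)
    for j in range(3 * n):         # one store per iteration, three times as many iterations
        x[j] = f(j)
    return x


square = lambda k: k * k
print(unrolled(4, square) == rerolled(4, square))
```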

llvm-svn: 228821
2015-02-11 09:19:47 +00:00
David Majnemer
fbc347f596 AsmParser: Validate alloca's type
An alloca's type should not be weird things like metadata.

llvm-svn: 228820
2015-02-11 09:13:11 +00:00
David Majnemer
61ddeb2d4b DataLayout: Report when the preferred alignment is less than the ABI
llvm-svn: 228819
2015-02-11 09:13:09 +00:00
David Majnemer
87fa2583a6 Verifier: Check for null operands in !llvm.module.flags
llvm-svn: 228818
2015-02-11 09:13:06 +00:00
David Majnemer
5434ded6b2 Verifier: Make sure !llvm.ident's operand isn't null
llvm-svn: 228815
2015-02-11 08:23:20 +00:00
David Majnemer
b2167a7a64 AsmParser: Don't crash when insertvalue has bad operands
llvm-svn: 228813
2015-02-11 07:43:58 +00:00
Reid Kleckner
f2f6e76b2c Fix invalid LLVM IR in PruneEH tests
llvm-svn: 228786
2015-02-11 02:06:47 +00:00
Reid Kleckner
86643b627c Don't promote asynch EH invokes of nounwind functions to calls
If the landingpad of the invoke is using a personality function that
catches asynch exceptions, then it can catch a trap.

Also add some landingpads to invalid LLVM IR test cases that lack them.

Over-the-shoulder reviewed by David Majnemer.

llvm-svn: 228782
2015-02-11 01:23:16 +00:00
Tom Stellard
01068df0db R600/SI: Store immediate offsets > 12-bits in soffset
This will save us from having to extend these offsets to 64-bits
and storing them in a pair of vgprs.

llvm-svn: 228776
2015-02-11 00:34:35 +00:00
Adrian Prantl
a9fa7d488c Add the missing testcase for r228764.
llvm-svn: 228766
2015-02-10 23:32:56 +00:00
Petar Jovanovic
2b0935a094 Fix makeLibCall argument (signed) in SoftenFloatRes_XINT_TO_FP function
The isSigned argument of makeLibCall function was hard-coded to false
(unsigned). This caused zero extension on MIPS64 soft float.
As a result, the SingleSource/Benchmarks/Stanford/FloatMM and
SingleSource/UnitTests/2005-07-17-INT-To-FP tests failed.
The solution was to use the proper argument.
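
The effect of choosing the wrong extension can be shown with a small Python model of a 32-bit integer widened before an int-to-float conversion. The helper names `zext` and `sext` are illustrative; this is not the code from the patch.

```python
def zext(value, from_bits):
    """Zero-extend: keep the low from_bits bits and treat them as unsigned."""
    return value & ((1 << from_bits) - 1)


def sext(value, from_bits):
    """Sign-extend: keep the low from_bits bits and treat them as signed."""
    v = value & ((1 << from_bits) - 1)
    return v - (1 << from_bits) if v >> (from_bits - 1) else v


x = -3
print(float(zext(x, 32)))  # 4294967293.0, the wrong result for a signed input
print(float(sext(x, 32)))  # -3.0
```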

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D7292

llvm-svn: 228765
2015-02-10 23:30:14 +00:00
David Majnemer
15b422ab6d EarlyCSE: Add check lines for test added in r228760
llvm-svn: 228761
2015-02-10 23:11:02 +00:00
David Majnemer
530afeca43 EarlyCSE: It isn't safe to CSE across synchronization boundaries
This fixes PR22514.

llvm-svn: 228760
2015-02-10 23:09:43 +00:00
David Majnemer
e15f9edb53 X86: @llvm.frameaddress should defer to SelectionDAG for Win CFI
llvm-svn: 228754
2015-02-10 22:00:34 +00:00
David Majnemer
9809999374 X86: Make @llvm.frameaddress work correctly with Windows unwind codes
Simply loading or storing the frame pointer is not sufficient for
Windows targets.  Instead, create a synthetic frame object that we will
lower later.  References to this synthetic object will be replaced with
the correct reference to the frame address.

llvm-svn: 228748
2015-02-10 21:22:05 +00:00
Daniel Jasper
609bd5e8bc Fix overly prescriptive test that broke on Mac after r228725.
llvm-svn: 228742
2015-02-10 20:49:05 +00:00
Andrew Kaylor
fff974fc6d Adding support for llvm.eh.begincatch and llvm.eh.endcatch intrinsics and beginning the documentation of native Windows exception handling.
Differential Revision: http://reviews.llvm.org/D7398

llvm-svn: 228733
2015-02-10 19:52:43 +00:00
Tim Northover
0eba2eb04d DeadArgElim: arguments affect all returned sub-values by default.
Unless we meet an insertvalue on a path from some value to a return, that value
will be live if *any* of the return's components are live, so all of those
components must be added to the MaybeLiveUses.

Previously we were deleting arguments if sub-value 0 turned out to be dead.

llvm-svn: 228731
2015-02-10 19:49:18 +00:00
Bill Schmidt
37700f757e [PowerPC] Fix reverted patch r227976 to avoid register assignment issues
See full discussion in http://reviews.llvm.org/D7491.

We now hide the add-immediate and call instructions together in a
separate pseudo-op, which is tagged to define GPR3 and clobber the
call-killed registers.  The PPCTLSDynamicCall pass prior to RA now
expands this op into the two separate addi and call ops, with explicit
definitions of GPR3 on both instructions, and explicit clobbers on the
call instruction.  The pass is now marked as requiring and preserving
the LiveIntervals and SlotIndexes analyses, and fixes these up after
the replacement sequences are introduced.

Self-hosting has been verified on LE P8 and BE P7 with various
optimization levels, etc.  It has also been verified with the
--no-tls-optimize flag workaround removed.

llvm-svn: 228725
2015-02-10 19:09:05 +00:00
David Majnemer
65abff8895 X86: Emit Win64 SaveXMM opcodes at the right offset in the right order
Walk the instructions marked FrameSetup and consider any stores of XMM
registers to the stack as needing a SaveXMM opcode.

This fixes PR22521.

Differential Revision: http://reviews.llvm.org/D7527

llvm-svn: 228724
2015-02-10 19:01:47 +00:00
Hal Finkel
1ec493a7ab [PowerPC] Support the (old) cntlz instruction alias
Some old assembly code uses the cntlz alias for cntlzw, binutils supports this,
and we should too. Fixes PR22519.

llvm-svn: 228719
2015-02-10 18:45:02 +00:00
Michael Zolotukhin
5d9638624f Add a test case for new unrolling heuristics.
The heuristics were added in r228265 and r228434.

llvm-svn: 228713
2015-02-10 17:54:54 +00:00
Zoran Jovanovic
7f0e9478f6 [mips][microMIPS] Implement movep instruction
Differential Revision: http://reviews.llvm.org/D7465

llvm-svn: 228703
2015-02-10 16:36:20 +00:00
Paul Robinson
b0fca412c4 Explicitly initialize a flag in a default constructor.
Works around a Visual C++ issue.

Patch by Douglas Yung!

llvm-svn: 228699
2015-02-10 15:30:02 +00:00
Bradley Smith
78565d178b [ARM] Add armv6s[-]m as an alias to armv6[-]m
llvm-svn: 228696
2015-02-10 15:15:08 +00:00
Simon Pilgrim
6fc796d731 [X86][AVX2] Missing AVX2 memory folding instructions
Added most of the missing vector folding patterns for AVX2 (as well as fixing the vpermpd and vpermq patterns)

Differential Revision: http://reviews.llvm.org/D7492

llvm-svn: 228688
2015-02-10 13:22:57 +00:00
Jozef Kolek
b9a2be0332 [mips][microMIPS] Add disassembler tests for 16-bit instructions BREAK16 and SDBBP16
Differential Revision: http://reviews.llvm.org/D7443

llvm-svn: 228687
2015-02-10 13:20:51 +00:00
Simon Pilgrim
5892178473 [X86][XOP] Added XOP memory folding patterns + tests
This patch adds the complete AMD Bulldozer XOP instruction set to the memory folding pattern tables for stack folding, etc.

Note: Many of the XOP instructions have multiple table entries as it can fold loads from different sources.

Differential Revision: http://reviews.llvm.org/D7484

llvm-svn: 228685
2015-02-10 12:57:17 +00:00
Jozef Kolek
3c2b5264df [mips][microMIPS] Fix disassembling of 16-bit microMIPS instructions LWM16 and SWM16
Differential Revision: http://reviews.llvm.org/D7436

llvm-svn: 228683
2015-02-10 12:41:13 +00:00
Andrea Di Biagio
3063960b2e [X86][FastIsel] Avoid introducing legacy SSE instructions if the target has AVX.
This patch teaches X86FastISel how to select AVX instructions for scalar
float/double convert operations.

Before this patch, X86FastISel always selected legacy SSE instructions
for FPExt (from float to double) and FPTrunc (from double to float).

For example:
\code
  define double @foo(float %f) {
    %conv = fpext float %f to double
    ret double %conv
  }
\end code

Before (with -mattr=+avx -fast-isel) X86FastIsel selected a CVTSS2SDrr which is
legacy SSE:
  cvtss2sd %xmm0, %xmm0

With this patch, X86FastIsel selects a VCVTSS2SDrr instead:
  vcvtss2sd %xmm0, %xmm0, %xmm0

Added test fast-isel-fptrunc-fpext.ll to check both the register-register and
the register-memory float/double conversion variants.

Differential Revision: http://reviews.llvm.org/D7438

llvm-svn: 228682
2015-02-10 12:04:41 +00:00