1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

113139 Commits

Author SHA1 Message Date
Benjamin Kramer
6b4b45fd19 Fix accidental bit flip.
llvm-svn: 228936
2015-02-12 16:30:00 +00:00
Benjamin Kramer
9758ef8680 CoverageMapping: Bitvectorize code. No functionality change.
llvm-svn: 228934
2015-02-12 16:18:07 +00:00
James Molloy
a6fbcd3b92 [LoopRerolling] Be more forgiving with instruction order.
We can't solve the full subgraph isomorphism problem. But we can
allow obvious cases, where for example two instructions of different
types are out of order. Due to them having different types/opcodes,
there is no ambiguity.

llvm-svn: 228931
2015-02-12 15:54:14 +00:00
Benjamin Kramer
4b76aa3d46 MathExtras: Bring Count(Trailing|Leading)Ones and CountPopulation in line with countTrailingZeros
Update all callers.

llvm-svn: 228930
2015-02-12 15:35:40 +00:00
Tim Northover
764ad6f42f Triple: refactor redundant code.
Should be no functional change, since most of the logic removed was
completely pointless (after some previous refactoring) and the rest
duplicated elsewhere.

Patch by Kamil Rytarowski.

llvm-svn: 228926
2015-02-12 15:12:13 +00:00
Michael Kuperstein
cd8c495c25 [X86] Call frame optimization - allow stack-relative movs to be folded into a push
Since we track esp precisely, there's no reason not to allow this.

llvm-svn: 228924
2015-02-12 14:17:35 +00:00
Andrea Di Biagio
7ca0db442c [TTI] Teach the cost heuristic how to query TLI to check if a zext/trunc is 'free' for the target.
Now that SimplifyCFG uses TTI for the cost heuristic, we can teach BasicTTIImpl
how to query TLI in order to get a more accurate cost for truncates and
zero-extends.

Before this patch, the basic cost heuristic in TargetTransformInfoImplCRTPBase
would have conservatively returned a 'default' TCC_Basic for all zero-extends,
and TCC_Free for truncates on native types.

This patch improves the heuristic so that we query TLI (if available) to get
more accurate answers. If TLI is available, then methods 'isZExtFree' and
'isTruncateFree' can be used to check if a zext/trunc is free for the target.

Added more test cases to SimplifyCFG/X86/speculate-cttz-ctlz.ll.
With this change, SimplifyCFG is now able to speculate a 'cheap' cttz/ctlz
immediately followed by a free zext/trunc.

Differential Revision: http://reviews.llvm.org/D7585

llvm-svn: 228923
2015-02-12 14:17:24 +00:00
Benjamin Kramer
c7a7636094 BitVector: Remove manual bit width dispatch, this is handled by templates
NFC.

llvm-svn: 228922
2015-02-12 14:02:58 +00:00
Benjamin Kramer
d08a40831d MathExtras: Parametrize count(Trailing|Leading)Zeros on the type size.
Otherwise we will always select the generic version for e.g. unsigned
long if uint64_t is typedef'd to 'unsigned long long'. Also remove
enable_if hacks in favor of static_assert.

llvm-svn: 228921
2015-02-12 13:47:29 +00:00
Asiri Rathnayake
fae9e6a869 ARM: Fix another regression introduced in r223113
The changes in r223113 (ARM modified-immediate syntax) have broken
instructions like:
  mov r0, #~0xffffff00
The problem is that I've added a spurious range check on the immediate
operand to ensure that it lies between INT32_MIN and UINT32_MAX. While
this range check is correct in theory, it causes problems because the
operand is stored in an int64_t (by MC). So valid 32-bit constants like
\#~0xffffff00 become out of range. The solution is to simply remove this
range check. It is not possible to validate the range of the immediate
operand with the current setup because: 1) The operand is stored in an
int64_t by MC, 2) The immediate can be of the forms #imm, #-imm, #~imm
or even #((~imm)) etc. So we just chop the value to 32 bits and use it.

Also noted that the original range check was note tested by any of the
unit tests. I've added a new test to cover #~imm kind of operands.

Change-Id: I411e90d84312a2eff01b732bb238af536c4a7599
llvm-svn: 228920
2015-02-12 13:37:28 +00:00
Dmitry Vyukov
149a56946f tsan: do not instrument not captured values
I've built some tests in WebRTC with and without this change. With this change number of __tsan_read/write calls is reduced by 20-40%, binary size decreases by 5-10% and execution time drops by ~5%. For example:

$ ls -l old/modules_unittests new/modules_unittests
-rwxr-x--- 1 dvyukov 41708976 Jan 20 18:35 old/modules_unittests
-rwxr-x--- 1 dvyukov 38294008 Jan 20 18:29 new/modules_unittests
$ objdump -d old/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l
239871
$ objdump -d new/modules_unittests | egrep "callq.*__tsan_(read|write|unaligned)" | wc -l
148365

http://reviews.llvm.org/D7069

llvm-svn: 228917
2015-02-12 09:55:28 +00:00
Elena Demikhovsky
d8fd06be73 AVX-512: Fixed the "test" operation for i1 type
Using KORTESTW for comparison i1 value with zero was wrong since the instruction tests 16 bits.
KORTESTW may be used with KSHIFTL+KSHIFTR that clean the 15 upper bits.
I removed (X86cmp i1, 0) pattern and zero-extend i1 to i8 and then use TESTB.

There are some cases where i1 is in the mask register and the upper bits are already zeroed.
Then KORTESTW is the better solution, but it is subject for optimization.
Meanwhile, I'm fixing the correctness issue.

llvm-svn: 228916
2015-02-12 08:40:34 +00:00
Michael Kuperstein
b344d5ac77 [X86] A heuristic to estimate the size impact for converting stack-relative parameter movs to pushes
This gives a rough estimate of whether using pushes instead of movs is profitable, in terms of size.
We go over all calls in the MachineFunction and compute:
a) For each callsite that can not use pushes, the penalty of not having a reserved call frame.
b) For each callsite that can use pushes, the gain of actually replacing the movs with pushes (and the potential penalty of having to readjust the stack).

Differential Revision: http://reviews.llvm.org/D7561

llvm-svn: 228915
2015-02-12 08:36:35 +00:00
Ahmed Bougacha
88db0c3c30 [CodeGen] Don't blindly combine (fp_round (fp_round x)) to (fp_round x).
We used to do this DAG combine, but it's not always correct:
If the first fp_round isn't a value preserving truncation, it might
introduce a tie in the second fp_round, that wouldn't occur in the
single-step fp_round we want to fold to.
In other words, double rounding isn't the same as rounding.

Differential Revision: http://reviews.llvm.org/D7571

llvm-svn: 228911
2015-02-12 06:15:29 +00:00
George Burgess IV
395bb904a1 Fixed a bug where CFLAA would crash the compiler.
We would crash if we couldn't locate a Function that either Location's
Value belonged to. Now we just print out a debug message and return 
conservatively.

llvm-svn: 228901
2015-02-12 03:07:07 +00:00
Chandler Carruth
2af75e99bb [slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out.
Apparently some code finally started to tickle this after my
canonicalization changes to instcombine.

The bug stems from trying to form a vector type out of scalars that
aren't compatible at all. In this example, from x86_mmx values. The code
in the vectorizer that checks for reasonable types whas checking for
aggregates or vectors, but there are lots of other types that should
just never reach the vectorizer.

Debugging this was made more confusing by the lie in an assert in
VectorType::get() -- it isn't that the types are *primitive*. The types
must be integer, pointer, or floating point types. No other types are
allowed.

I've improved the assert and added a helper to the vectorizer to handle
the element type validity checks. It now re-uses the VectorType static
function and then further excludes weird target-specific types that we
probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of
these are really reachable anyways (neither 80-bit nor 128-bit things
will get vectorized) but it seems better to just eagerly exclude such
nonesense.

I've added a test case, but while it definitely covers two of the paths
through this code there may be more paths that would benefit from test
coverage. I'm not familiar enough with the SLP vectorizer to synthesize
test cases for all of these, but was able to update the code itself by
inspection.

llvm-svn: 228899
2015-02-12 02:30:56 +00:00
Hal Finkel
b95a028674 [PowerPC] Mark jumps as expensive (using using CR bits)
On PowerPC, which has a full set of logical operations on (its multiple sets
of) condition-register bits, it is not profitable to break of complex
conditions feeding a jump into multiple jumps. We can turn off this feature of
CGP/SDAGBuilder by marking jumps as "expensive".

P7 test-suite speedups (no regressions):
MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2
	-0.626647% +/- 0.323583%
MultiSource/Benchmarks/Olden/power/power
	-18.2821% +/- 8.06481%

llvm-svn: 228895
2015-02-12 01:02:52 +00:00
Zachary Turner
f9549cf3ad Revert "Change Path::filename_pos() to skip the drive letter."
This reverts commit 228874.  For some reason users reported
seeing Clang taking up 25+GB of memory and bringing down
machines with this change.  Reverting until we figure it out.

llvm-svn: 228890
2015-02-12 00:05:49 +00:00
Rafael Espindola
ee900dc11e Invert the section relocation map.
It now points from rel section to section. Use it to set sh_info, avoiding
a brittle name lookup.

llvm-svn: 228889
2015-02-11 23:38:33 +00:00
Rafael Espindola
ae37ff6b40 Use the existing SymbolTableIndex instead of doing a lookup. NFC.
llvm-svn: 228888
2015-02-11 23:33:46 +00:00
Rafael Espindola
7701edc40e Create the Seciton -> Rel Section map when it is first needed. NFC.
Saves a walk over every section.

llvm-svn: 228886
2015-02-11 23:17:48 +00:00
Tim Northover
f976f969cd DeadArgElim: aggregate Return assessment properly.
I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It
actually depends on the index too, which means we need to be careful about how
the results are combined before return. In particular if a single Use returns
Live, that counts for the entire object, at the granularity we're considering.

llvm-svn: 228885
2015-02-11 23:13:11 +00:00
Rafael Espindola
3e50fc9182 Remove unused argument. NFC.
llvm-svn: 228884
2015-02-11 23:11:18 +00:00
David Majnemer
c11a237c18 Unbreak buildbots
The next offset should be updated as well.

llvm-svn: 228883
2015-02-11 22:51:55 +00:00
Rafael Espindola
e14aaeb615 Don't recompute the entire section map just to add 3 entries. NFC.
llvm-svn: 228881
2015-02-11 22:41:26 +00:00
David Majnemer
733c762449 MC, COFF: Align section contents to a four byte boundary
llvm-svn: 228879
2015-02-11 22:22:30 +00:00
Zachary Turner
a384d75e68 Change Path::filename_pos() to skip the drive letter.
For Windows, filename_pos() tries to find the filename by
searching for separators after the last :.  Instead, it should
really check for the only location that a : is valid, which is
in the second character, and search for separators after that.

llvm-svn: 228874
2015-02-11 21:16:35 +00:00
Rafael Espindola
654724294a Remove unused argument. NFC.
llvm-svn: 228873
2015-02-11 21:08:00 +00:00
Mehdi Amini
3c8f7ac243 Reassociate: cannot negate a INT_MIN value
Summary:
When trying to canonicalize negative constants out of
multiplication expressions, we need to check that the
constant is not INT_MIN which cannot be negated.

Reviewers: mcrosier

Reviewed By: mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7286

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 228872
2015-02-11 19:54:44 +00:00
Tom Stellard
20d19bec1e R600/SI: Disable subreg liveness
This is temporary while we try to fix a crash in the register coalescer.

llvm-svn: 228861
2015-02-11 18:24:53 +00:00
Simon Pilgrim
cda2cb97d7 [X86][SSE] Added dual vector truncation tests.
llvm-svn: 228857
2015-02-11 18:14:35 +00:00
Adrian Prantl
6d2f18726f Allow DIBuilder::replaceVTableHolder() to work with temporary nodes,
tested via the clang test CodeGenCXX/vtable-holder-self-reference.cpp .

llvm-svn: 228854
2015-02-11 17:45:10 +00:00
Adrian Prantl
d6091b2d23 Add a trackIfUnresolved to DIBuilder::createInheritance(),
tested via the clang test CodeGenCXX/vtable-holder-self-reference.cpp .

llvm-svn: 228853
2015-02-11 17:45:08 +00:00
Adrian Prantl
9ec54ab53b Generalize DIBuilder's createReplaceableForwardDecl() to a more flexible
createReplaceableCompositeType() that allows to create non-forward-declared
temporary nodes.

Paired commit with CFE.

llvm-svn: 228852
2015-02-11 17:45:05 +00:00
Tom Stellard
ba7a67583a R600: Split AMDGPUPassConfig into R600PassConfig and GCNPassConfig
llvm-svn: 228850
2015-02-11 17:11:51 +00:00
Tom Stellard
a8ce1e2a7e R600: Create an R600TargetMachine for pre-gcn GPUs
No functinality change. R600TargetMachine inherits from
AMDGPUTargetMachine.

llvm-svn: 228849
2015-02-11 17:11:50 +00:00
Tom Stellard
dc0b2cd2e6 R600/SI: Fix -march in test
llvm-svn: 228848
2015-02-11 17:11:48 +00:00
Jan Wen Voung
ea1b991ecf Gold-plugin: Broaden scope of get/release_input_file to scope of Module.
Summary:
Move calls to get_input_file and release_input_file out of
getModuleForFile(). Otherwise release_input_file may end up
unmapping a view of the file while the view is still being
used by the Module (on 32-bit hosts).

Fix for PR22482.

Test Plan: Add test using --no-map-whole-files.

Reviewers: rafael, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7539

llvm-svn: 228842
2015-02-11 16:12:50 +00:00
Jonas Paulsson
763ee9e6b6 Fix SelectionDAG compile time issue with alias analysis.
Add new token factor node and its users to worklist if alias analysis is
turned on, in DAGCombiner::visitTokenFactor(). Alias analysis may cause
a lot of new token factors to be inserted into the DAG, and they need to
be optimized to avoid significant slow-downs.

Reviewed by Hal Finkel.

llvm-svn: 228841
2015-02-11 16:10:31 +00:00
Sanjay Patel
247ca77ae9 fixed to test features, not CPUs
llvm-svn: 228836
2015-02-11 15:00:41 +00:00
Sanjay Patel
14ceb7a2fc fixed to test features, not CPUs
llvm-svn: 228835
2015-02-11 15:00:19 +00:00
Sanjay Patel
95a977af34 fixed to test features, not CPUs
llvm-svn: 228834
2015-02-11 14:58:25 +00:00
Rafael Espindola
a27a12b323 Don't repeat name in comment and clang-format a function.
llvm-svn: 228831
2015-02-11 14:44:17 +00:00
Marek Olsak
63a18b6d87 R600/SI: Enable a lot of existing tests for VI (squashed commits)
This is a union of these commits:

* R600/SI: Enable more tests for VI which need no changes

* R600/SI: Enable V_BCNT tests for VI
    Differences:
    - v_bcnt_..._e32 -> _e64
    - s_load_dword* inline offset is in bytes instead of dwords

* R600/SI: Enable all tests for VI which use S_LOAD_DWORD
    The inline offset is changed from dwords to bytes.

* R600/SI: Enable LDS tests for VI
    Differences:
    - the s_load_dword inline offset changed from dwords to bytes
    - the tests checked very little on CI, so they have been fixed to check all
      instructions that "SI" checked

* R600/SI: Enable lshr tests for VI

* R600/SI: Fix divrem64 tests
    - "v_lshl_64" was missing "b" before "64"
    - added VI-NOT checks

* R600/SI: Enable the SI.tid test for VI

* R600/SI: Enable the frem test for VI
    Also, the frem_f64 checking is added for CI-VI.

* R600/SI: Add VI tests for rsq.clamped

llvm-svn: 228830
2015-02-11 14:26:46 +00:00
Andrea Di Biagio
70c7608263 [TTI] Improved cost heuristic for cttz/ctlz calls.
This patch is a follow-up of r228826 (see code-review: D7506).

Now that SimplifyCFG uses TargetTransformInfo for cost analysis, we 
have to fix the cost heuristic for intrinsic calls to cttz/ctlz.

This patch defines method 'getIntrinsicCost' in BasicTTIImpl: now, BasicTTIImpl
queries TLI to check if a call to cttz/ctlz is cheap for the target.

Added test cases in Transforms/SimplifyCFG/X86 to verify that on x86,
SimplifyCFG only speculates a call to cttz/ctlz if it is cheap.

Differential Revision: http://reviews.llvm.org/D7554

llvm-svn: 228829
2015-02-11 14:22:18 +00:00
James Molloy
e97d8824e7 Make buildbots better.
This testcase change was associated incorrectly to a followup commit in my git tree, not the base commit. Sorry!

llvm-svn: 228827
2015-02-11 12:24:09 +00:00
James Molloy
ba8cd33738 [SimplifyCFG] Swap to using TargetTransformInfo for cost
analysis.

We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness"
heuristic and use TTI directly. Generally NFC intended, but we're using a slightly
different heuristic now so there is a slight test churn.

Test changes:
  * combine-comparisons-by-cse.ll: Removed unneeded branch check.
  * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq.
  * coalesce-subregs.ll: Superfluous block check.
  * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv.
  * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present.
  * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI.

llvm-svn: 228826
2015-02-11 12:15:41 +00:00
Daniel Sanders
e04bb135ee [mips] Merge disassemblers into a single implementation.
Summary:
Currently we have Mips32 and Mips64 disassemblers and this causes the target
triple to affect the disassembly despite all the relevant information being in
the ELF header. These implementations do not need to be separate.

This patch merges them together such that the appropriate tables are checked
for the subtarget (e.g. Mips64 is checked when GP64 is enabled).

Reviewers: vmedic

Reviewed By: vmedic

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7498

llvm-svn: 228825
2015-02-11 11:28:56 +00:00
James Molloy
c9ce650708 [LoopReroll] Introduce the concept of DAGRootSets.
A DAGRootSet models an induction variable being used in a rerollable
loop. For example:

   x[i*3+0] = y1
   x[i*3+1] = y2
   x[i*3+2] = y3

   Base instruction -> i*3
                    +---+----+
                   /    |     \
               ST[y1]  +1     +2  <-- Roots
                        |      |
                      ST[y2] ST[y3]

There may be multiple DAGRootSets, for example:

   x[i*2+0] = ...   (1)
   x[i*2+1] = ...   (1)
   x[i*2+4] = ...   (2)
   x[i*2+5] = ...   (2)
   x[(i+1234)*2+5678] = ... (3)
   x[(i+1234)*2+5679] = ... (3)

This concept is similar to the "Scale" member used previously, but allows
multiple independent sets of roots based off the same induction variable.

llvm-svn: 228821
2015-02-11 09:19:47 +00:00
David Majnemer
fbc347f596 AsmParser: Validate alloca's type
An alloca's type should be weird things like metadata.

llvm-svn: 228820
2015-02-11 09:13:11 +00:00