1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 05:23:45 +02:00
Commit Graph

139307 Commits

Author SHA1 Message Date
Tom Stellard
d2d7b48ba5 AMDGPU/SI: Handle div_fmas hazard in GCNHazardRecognizer
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25250

llvm-svn: 283622
2016-10-07 23:42:48 +00:00
Kyle Butt
71644ee1e5 Codegen: Tail-duplicate during placement.
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.

In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.

This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.

Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.

Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.

Differential revision: https://reviews.llvm.org/D18226

llvm-svn: 283619
2016-10-07 22:33:20 +00:00
Arnold Schwaighofer
6fad756c74 swifterror: Don't compute swifterror vregs during instruction selection
The code used llvm basic block predecessors to decided where to insert phi
nodes. Instruction selection can and will liberally insert new machine basic
block predecessors. There is not a guaranteed one-to-one mapping from pred.
llvm basic blocks and machine basic blocks.

Therefore the current approach does not work as it assumes we can mark
predecessor machine basic block as needing a copy, and needs to know the set of
all predecessor machine basic blocks to decide when to insert phis.

Instead of computing the swifterror vregs as we select instructions, propagate
them at the end of instruction selection when the MBB CFG is complete.

When an instruction needs a swifterror vreg and we don't know the value yet,
generate a new vreg and remember this "upward exposed" use, and reconcile this
at the end of instruction selection.

This will only happen if the target supports promoting swifterror parameters to
registers and the swifterror attribute is used.

rdar://28300923

llvm-svn: 283617
2016-10-07 22:06:55 +00:00
Sanjay Patel
1d88c15c1d [DAG] clean up foldSelectOfConstants(); NFCI
Rename variables, simplify logic. 
Not clear yet why we don't handle a target with ZeroOrNegativeOneBooleanContent too.

llvm-svn: 283613
2016-10-07 21:55:42 +00:00
Davide Italiano
99e8d201a0 [InstCombine] Don't unpack arrays that are too large (part 2).
This is similar to r283599, but for store instructions.
Thanks to David for pointing out!

llvm-svn: 283612
2016-10-07 21:53:09 +00:00
Zachary Turner
fba0249962 Add missing include.
llvm-svn: 283610
2016-10-07 21:40:06 +00:00
Zachary Turner
4b8a9c1349 Refactor Symbol visitor code.
Type visitor code had already been refactored previously to
decouple the visitor and the visitor callback interface.  This
was necessary for having the flexibility to visit in different
ways (for example, dumping to yaml, reading from yaml, dumping
to ScopedPrinter, etc).

This patch merely implements the same visitation pattern for
symbol records that has already been implemented for type records.

llvm-svn: 283609
2016-10-07 21:34:46 +00:00
Hongbin Zheng
57e4e747e3 [cmake] Treat polly as "in tree" if LLVM_EXTERNAL_POLLY_SOURCE_DIR is provided
Differential Revision: https://reviews.llvm.org/D25354

llvm-svn: 283608
2016-10-07 21:32:47 +00:00
Davide Italiano
c6c98e0346 [InstCombine] Don't unpack arrays that are too large
Differential Revision:  https://reviews.llvm.org/D25376

llvm-svn: 283599
2016-10-07 20:57:42 +00:00
Sanjay Patel
b124036144 [DAG] move fold (select C, 0, 1 -> xor C, 1) to a helper function; NFC
We're missing at least 3 other similar folds based on what we have in InstCombine. 

llvm-svn: 283596
2016-10-07 20:47:51 +00:00
Tom Stellard
7128408133 AMDGPU/SI: Add support for 8-byte relocations
Reviewers: arsenm, kzhuravl

Subscribers: wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25375

llvm-svn: 283593
2016-10-07 20:36:58 +00:00
Anna Thomas
99e41a98d7 [RS4GC] Strengthen coverage: add more tests
Summary: Add tests for cases where we have zero coverage in RS4GC.

Reviewers: sanjoy, reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25341

llvm-svn: 283591
2016-10-07 20:34:00 +00:00
Colin LeMahieu
c46eeafb38 [Hexagon][NFC] Using documented instruction type name V4LDST instead of MEMOP.
llvm-svn: 283582
2016-10-07 19:11:28 +00:00
Mehdi Amini
c7f513e186 Recommit "Use StringRef in LTOModule implementation (NFC)""
This reverts commit r283456 and reapply r282997, with explicitly
zeroing the struct member to workaround a bug in MSVC2013 with
zero-initialization: https://connect.microsoft.com/VisualStudio/feedback/details/802160

llvm-svn: 283581
2016-10-07 19:05:14 +00:00
Davide Italiano
f1729bd1f0 [LoopIdiomRecognize] Merge two if conditions into one. NFCI.
llvm-svn: 283579
2016-10-07 18:39:43 +00:00
Sanjay Patel
4b9d19b1ec [InstCombine] fold select X, (ext X), C
If we're going to canonicalize IR towards select of constants, try harder to create those.
Also, don't lose the metadata.

This is actually 4 related transforms in one patch:
      // select X, (sext X), C --> select X, -1, C
      // select X, (zext X), C --> select X,  1, C
      // select X, C, (sext X) --> select X, C, 0
      // select X, C, (zext X) --> select X, C, 0

Differential Revision: https://reviews.llvm.org/D25126

llvm-svn: 283575
2016-10-07 17:53:07 +00:00
Adam Nemet
1496f9df72 New utility to visualize optimization records
This is a new tool built on top of the new YAML ouput generated from
optimization remarks.  It produces HTML for easy navigation and
visualization.

The tool assumes that hotness information for the remarks is available
(the YAML file was produced with PGO).  It uses hotness to list the
remarks prioritized by the hotness on the index page.  Clicking the
source location of the remark in the list takes you the source where the
remarks are rendedered inline in the source.

For now, the tool is meant as prototype.

It's written in Python.  It uses PyYAML to parse the input.

Differential Revision: https://reviews.llvm.org/D25348

llvm-svn: 283571
2016-10-07 17:06:34 +00:00
Tom Stellard
0f9040e19e AMDGPU/SI: Emit fixups for long branches
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25366

llvm-svn: 283570
2016-10-07 16:01:18 +00:00
Simon Pilgrim
4fab02e083 [X86][SSE] Reapplied: Add vector fcopysign combine tests
Now with better lowering and fix for PR30443

llvm-svn: 283569
2016-10-07 16:00:59 +00:00
Artem Tamazov
d87d4169f2 [AMDGPU][mc] Add support for buffer_load_dwordx3, buffer_store_dwordx3.
Partially fixes Bug 28232.
Lit tests added.

Differential Revision: https://reviews.llvm.org/D25367

llvm-svn: 283567
2016-10-07 15:53:16 +00:00
Dehao Chen
00cca02991 Invoke add-discriminator at -g0 -fsample-profile
Summary: -fsample-profile needs discriminator, which will not be added if built with -g0. This patch makes sure the discriminator is added for sample-profile at -g0. A followup patch will be send out to update clang tests.

Reviewers: davidxl, dblaikie, echristo, dnovillo

Subscribers: mehdi_amini, probinson, llvm-commits

Differential Revision: https://reviews.llvm.org/D25132

llvm-svn: 283565
2016-10-07 15:21:31 +00:00
Matthew Simpson
c4d75790e7 [LV] Don't mark multi-use branch conditions uniform
Previously, we marked the branch conditions of latch blocks uniform after
vectorization if they were instructions contained in the loop. However, if a
condition instruction has users other than the branch, it may not remain
uniform. This patch ensures the conditions we mark uniform are only used by the
branch. This should fix PR30627.

Reference: https://llvm.org/bugs/show_bug.cgi?id=30627
llvm-svn: 283563
2016-10-07 15:20:13 +00:00
Krzysztof Parzyszek
daa0d1c208 Only track physical registers in LivePhysRegs
llvm-svn: 283561
2016-10-07 14:50:49 +00:00
Sam Kolton
cbf772ce03 [AMDGPU] Assembler: support v_mac_f32 DPP and SDWA. Move getNamedOperandIdx to AMDGPUBaseInfo.h
Reviewers: artem.tamazov, tstellarAMD

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D25084

llvm-svn: 283560
2016-10-07 14:46:06 +00:00
Simon Pilgrim
24e66816ec [X86][SSE] Tidied up tests - use standard check prefixes
llvm-svn: 283559
2016-10-07 14:42:22 +00:00
Konstantin Zhuravlyov
12f780d0fd [AMDGPU] AMDGPUCodeGenPrepare: remove extra ';'
llvm-svn: 283558
2016-10-07 14:39:53 +00:00
Tom Stellard
6fa6d1d712 [ValueTracking] Fix crash in GetPointerBaseWithConstantOffset()
Summary:
While walking defs of pointer operands we were assuming that the pointer
size would remain constant.  This is not true, because addresspacecast
instructions may cast the pointer to an address space with a different
pointer width.

This partial reverts r282612, which was a more conservative solution
to this problem.

Reviewers: reames, sanjoy, apilipenko

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D24772

llvm-svn: 283557
2016-10-07 14:23:29 +00:00
Konstantin Zhuravlyov
602c1c7441 [AMDGPU] Promote uniform (i1, i16] operations to i32
Differential Revision: https://reviews.llvm.org/D25302

llvm-svn: 283555
2016-10-07 14:22:58 +00:00
Benjamin Kramer
257e127eaa Remove spurious non-printable character from source file.
NFC.

llvm-svn: 283552
2016-10-07 13:46:38 +00:00
Javed Absar
a39b9200fb [ARM]: add missing switch case for cortex-r52
Adds a missing switch case for handling cortex-r52
in init-subtarget-features.

llvm-svn: 283551
2016-10-07 13:41:55 +00:00
Martin Storsjo
59ad31e4f1 [ARM] Reapply: Use __rt_div functions for divrem on Windows
Reapplying r283383 after revert in r283442. The additional fix
is a getting rid of a stray space in a function name, in the
refactoring part of the commit.

This avoids falling back to calling out to the GCC rem functions
(__moddi3, __umoddi3) when targeting Windows.

The __rt_div functions have flipped the two arguments compared
to the __aeabi_divmod functions. To match MSVC, we emit a
check for division by zero before actually calling the library
function (even if the library function itself also might do
the same check).

Not all calls to __rt_div functions for division are currently
merged with calls to the same function with the same parameters
for the remainder. This is more wasteful than a div + mls as before,
but avoids calls to __moddi3.

Differential Revision: https://reviews.llvm.org/D25332

llvm-svn: 283550
2016-10-07 13:28:53 +00:00
Javed Absar
636ccd32c2 [ARM]: Add Cortex-R52 target to LLVM
This patch adds Cortex-R52, the new ARM real-time processor, to LLVM. 
Cortex-R52 implements the ARMv8-R architecture.

llvm-svn: 283542
2016-10-07 12:06:40 +00:00
Simon Pilgrim
1b51489947 [X86][SSE] Update register class during MOVSD/MOVSS - BLENDPD/BLENDPS commutation
MOVSD/MOVSS take a 128-bit register and a FR32/FR64 register input, the commutation code wasn't taking this into account leading to verification errors.

This patch inserts a vreg copy mi to ensure that the registers are correct.

Fix for PR30607

Differential Revision: https://reviews.llvm.org/D25280

llvm-svn: 283539
2016-10-07 11:18:38 +00:00
Alexey Bataev
0a23402e2a [SLPVectorizer] Fix for PR25748: reduction vectorization after loop
unrolling.

The next code is not vectorized by the SLPVectorizer:
```
 int test(unsigned int *p) {
  int sum = 0;
  for (int i = 0; i < 8; i++)
    sum += p[i];
  return sum;
 }
```
During optimization this loop is fully unrolled and SLPVectorizer is
unable to vectorize it. Patch tries to fix this problem.

Differential Revision: https://reviews.llvm.org/D24796

llvm-svn: 283535
2016-10-07 09:39:22 +00:00
Oliver Stannard
7be9f2d236 [ARM] Don't convert switches to lookup tables of pointers with ROPI/RWPI
With the ROPI and RWPI relocation models we can't always have pointers
to global data or functions in constant data, so don't try to convert switches
into lookup tables if any value in the lookup table would require a relocation.
We can still safely emit lookup tables of other values, such as simple
constants.

Differential Revision: https://reviews.llvm.org/D24462

llvm-svn: 283530
2016-10-07 08:48:24 +00:00
Mehdi Amini
ddf3596f96 Use StringRef in ARMELFStreamer (NFC)
llvm-svn: 283529
2016-10-07 08:48:07 +00:00
Nicolai Haehnle
65a5ad0126 AMDGPU: Fix use-after-free in SIOptimizeExecMasking
Summary:
There was a bug with sequences like

   s_mov_b64 s[0:1], exec
   s_and_b64 s[2:3]<def>, s[0:1], s[2:3]<kill>
   ...
   s_mov_b64_term exec, s[2:3]

because s[2:3] was defined and used in the same instruction, ending up with
SaveExecInst inside OtherUseInsts.

Note that the test case also exposes an unrelated bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98028

Reviewers: tstellarAMD, arsenm

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25306

llvm-svn: 283528
2016-10-07 08:40:14 +00:00
Mehdi Amini
eb017ddb32 Use StringReg in TargetParser APIs (NFC)
llvm-svn: 283527
2016-10-07 08:37:29 +00:00
Mehdi Amini
2052587793 Revert "Revert "Add a static_assert to enforce that parameters to llvm::format() are not totally unsafe""
This reverts commit r283510 and reapply r283509, with updates to
clang-tools-extra as well.

llvm-svn: 283525
2016-10-07 08:25:42 +00:00
Craig Topper
00daad258e [X86] Fix patterns for VPMULLD and VPCMPEQQ to not require aligned loads.
llvm-svn: 283524
2016-10-07 06:54:43 +00:00
Craig Topper
4341a1c13b [X86] Remove unused PatFrags. NFC
llvm-svn: 283523
2016-10-07 06:54:39 +00:00
Dylan McKay
da0a66f223 [AVR] Add the AVRMCInstLower class
Summary:
This class deals with the lowering of CodeGen `MachineInstr` objects to
MC `MCInst` objects.

Reviewers: kparzysz, arsenm

Subscribers: wdng, beanz, japaric, mgorny

Differential Revision: https://reviews.llvm.org/D25269

llvm-svn: 283522
2016-10-07 06:13:09 +00:00
Matt Arsenault
c295321dee AMDGPU: Change check prefix in test
llvm-svn: 283521
2016-10-07 03:55:04 +00:00
Hal Finkel
57e95aa8b8 [llvm-opt-report] Left justify unrolling counts, etc.
In the left part of the reports, we have things like U<number>; if some of
these numbers use more digits than others, we don't want a space in between the
U and the start of the number. Instead, the space should come afterward. This
way it is clear that the number goes with the U and not any other optimization
indicator that might come later on the line.

Tests committed in r283518.

llvm-svn: 283519
2016-10-07 02:01:03 +00:00
Hal Finkel
553b67c457 [llvm-opt-report] Left justify unrolling counts, etc.
In the left part of the reports, we have things like U<number>; if some of
these numbers use more digits than others, we don't want a space in between the
U and the start of the number. Instead, the space should come afterward. This
way it is clear that the number goes with the U and not any other optimization
indicator that might come later on the line.

llvm-svn: 283518
2016-10-07 01:57:06 +00:00
David Majnemer
7890f416e2 [SimplifyCFG] Correctly test for unconditional branches in GetCaseResults
GetCaseResults assumed that a terminator with one successor was an
unconditional branch.  This is not necessarily the case, it could be a
cleanupret.

Strengthen the check by querying whether or not the terminator is
exceptional.

llvm-svn: 283517
2016-10-07 01:38:35 +00:00
Hal Finkel
4521e1e1b7 [llvm-opt-report] Use -no-demangle to disable demangling
As this is intended to be a user-facing option, -no-demangle seems much better
than -demangle=0. Add testing for the option.

llvm-svn: 283516
2016-10-07 01:30:59 +00:00
Peter Collingbourne
a067ba4392 Target: Remove unused patterns and transforms. NFC.
llvm-svn: 283515
2016-10-07 00:30:49 +00:00
Colin LeMahieu
7f4bcd51cf [Hexagon] NFC Removing 'V4_' prefix from duplex instruction names.
llvm-svn: 283514
2016-10-07 00:15:07 +00:00
Michael Kuperstein
e1ab454f4d [LV] Remove triples from target-independent vectorizer tests. NFC.
Vectorizer tests in the target-independent directory should not have a target
triple. If a test really needs to query a specific backend, it belongs in the
right target subdirectory (which "REQUIRES" the right backend). Otherwise, it
should not specify a triple.

llvm-svn: 283512
2016-10-06 23:57:25 +00:00