1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

135006 Commits

Author SHA1 Message Date
Junmo Park
8f6e9b15bb Minor code cleanups. NFC.
llvm-svn: 275637
2016-07-15 22:42:52 +00:00
Jacques Pienaar
c2fe8f630d [lanai] Small cleanup: remove/comment out unused args
llvm-svn: 275636
2016-07-15 22:38:32 +00:00
Matt Arsenault
7bd4f763f7 AMDGPU: Fix verifier error from partially undef copy
In this situation:

%VGPR2<def> = BUFFER_LOAD_DWORD_OFFSET %SGPR8_SGPR9_SGPR10_SGPR11,
%VGPR7<def,tied3> = V_MAC_F32_e32 %VGPR0<undef>, %VGPR1<kill>, %VGPR7<kill,tied0>, %EXEC<imp-use>
%VGPR3_VGPR4_VGPR5_VGPR6<def> = COPY %VGPR0_VGPR1_VGPR2_VGPR3
%VGPR4<def> = COPY %VGPR2

The copy for VGPR1 -> VGPR4 was an error from reading undefined VGPR1,
but VGPR4 is defined immediately after this copy.

llvm-svn: 275635
2016-07-15 22:32:02 +00:00
Michael Kuperstein
0c6bc1e223 ExpandPostRAPseudos should transfer implicit uses, not only implicit defs
Previously, we would expand:
%BL<def> = COPY %DL<kill>, %EBX<imp-use,kill>, %EBX<imp-def>
Into:
%BL<def> = MOV8rr %DL<kill>, %EBX<imp-def>
Dropping the imp-use on the floor.

That confused CriticalAntiDepBreaker, which (correctly) assumes that if an
instruction defs but doesn't use a register, that register is dead immediately
before the instruction - while in this case, the high lanes of EBX can be very
much alive.

This fixes PR28560.

Differential Revision: https://reviews.llvm.org/D22425

llvm-svn: 275634
2016-07-15 22:31:14 +00:00
Alexei Starovoitov
cd643a03d0 BPF: Use official ELF e_machine value
The same value for EM_BPF is being propagated to glibc,
elfutils, and binutils.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 275633
2016-07-15 22:27:55 +00:00
Jacques Pienaar
0ef2903abf [lanai] Fix build by updating calls to getLoad & getStore.
rL275592 removed the boolean parameters of SelectionDAG::getLoad and getStore, updating Lanai backend's calls to these functions.

llvm-svn: 275631
2016-07-15 22:18:33 +00:00
Zachary Turner
ae0563fb6f [pdb] Teach MsfBuilder and other classes about the Free Page Map.
Block 1 and 2 of an MSF file are bit vectors that represent the
list of blocks allocated and free in the file.  We had been using
these blocks to write stream data and other data, so we mark them
as the free page map now.  We don't yet serialize these pages to
the disk, but at least we make a note of what it is, and avoid
writing random data to them.

Doing this also necessitated cleaning up some of the tests to be
more general and hardcode fewer values, which is nice.

llvm-svn: 275629
2016-07-15 22:17:19 +00:00
Zachary Turner
dcc0901002 [pdb] Round trip the NameMap data structure to YAML.
llvm-svn: 275628
2016-07-15 22:17:08 +00:00
Zachary Turner
88e1ef47a6 [pdb] Use MsfBuilder to handle the writing PDBs.
Previously we would read a PDB, then write some of it back out,
but write the directory, super block, and other pertinent metadata
back out unchanged.  This generates incorrect PDBs since the amount
of data written was not always the same as the amount of data read.

This patch changes things to use the newly introduced `MsfBuilder`
class to write out a correct and accurate set of Msf metadata for
the data *actually* written, which opens up the door for adding and
removing type records, symbol records, and other types of data to
an existing PDB.

llvm-svn: 275627
2016-07-15 22:16:56 +00:00
Matt Arsenault
1da8b00d20 StructurizeCFG: Fix inverting constantexpr conditions
llvm-svn: 275626
2016-07-15 22:13:16 +00:00
Krzysztof Parzyszek
a91fcd9951 [Hexagon] Handle instruction latency for 0 or 2 cycles
The Hexagon schedulers need to handle instructions with a latency
of 0 or 2 more accurately. The problem, in v60, is that a dependence
between two instructions with a 2 cycle latency can use a .cur version
of the source to achieve a 0 cycle latency when the use is in the
same packet. Any othe use, must be at least 2 packets later, or a
stall occurs. In other words, the compiler does not want to schedule
the dependent instructions 1 cycle later.

To achieve this, the latency adjustment code allows only a single
dependence to have a zero latency. All other instructions have the
other value, which is typically 2 cycles. We use a heuristic to
determine which instruction gets the 0 latency.

The Hexagon machine scheduler was also changed to increase the cost
associated with 0 latency dependences than can be scheduled in the
same packet.

Patch by Brendon Cahoon.

llvm-svn: 275625
2016-07-15 21:34:02 +00:00
Matt Arsenault
e7de44dc23 AMDGPU: Remove brev intrinsic
llvm-svn: 275620
2016-07-15 21:27:13 +00:00
Matt Arsenault
6e4504e0db AMDGPU: Fix TargetPrefix for remaining r600 intrinsics
llvm-svn: 275619
2016-07-15 21:27:08 +00:00
Matt Arsenault
6779483bea AMDGPU: Remove AMDGPU.ldexp
llvm-svn: 275618
2016-07-15 21:26:56 +00:00
Matt Arsenault
3bfc10ac74 AMDGPU: Remove legacy rsq.clamped intrinsic
Mesa still has a use of llvm.AMDGPU.rsq.f64 remaining.

Also fix mismatch with non-IEEE rsq selecting to IEEE rsq.

llvm-svn: 275617
2016-07-15 21:26:52 +00:00
Matt Arsenault
c5aaf6dacb AMDGPU/R600: Delete dead code.
Dead or the same as the base implementation.

llvm-svn: 275616
2016-07-15 21:26:46 +00:00
Saleem Abdulrasool
af36d32b29 DebugInfo: reorder some initializers
Fix a few initialization ordering warnings from gcc from `-Wreorder`.  NFC.

llvm-svn: 275615
2016-07-15 21:10:31 +00:00
Saleem Abdulrasool
b56f241280 CodeGen: avoid emitting unnecessary CFI
Remove unnecessary clutter in assembly output.  When using SjLj EH, the CFI is
not actually used for anything.  Do not emit the CFI needlessly.  The minor test
adjustments are interesting.  The prologue test was just overzealous matcching.
The interesting case is the LSDA change.  It was originally added to ensure that
various compilations did not mangle the name (it explicitly checked the name!).
However, subsequent cleanups made it more reliant on the CFI to find the name.
Parse the generated code flow to generically find the label still.

llvm-svn: 275614
2016-07-15 21:10:29 +00:00
Michael Zolotukhin
52e234528e Make processInstruction from LCSSA.cpp externally available.
Summary:
When a pass tries to keep LCSSA form it's often convenient to be able to update
LCSSA for a set of instructions rather than for the entire loop. This patch makes the
processInstruction from LCSSA externally available under a name
formLCSSAForInstruction.

Reviewers: chandlerc, sanjoy, hfinkel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D22378

llvm-svn: 275613
2016-07-15 21:08:41 +00:00
Zachary Turner
e62ff3e5c3 [pdb] Introduce MsfBuilder for laying out PDB files.
Reviewed by: ruiu
Differential Revision: https://reviews.llvm.org/D22308

llvm-svn: 275611
2016-07-15 20:43:38 +00:00
Nico Weber
37fadfd6e6 Teach fast isel about the win64 calling convention.
This mostly just works.

Vectorcall rets are still not supported.

The win64_eh test change is because fast isel doesn't use rsi for temporary
computations, so it doesn't need to be pushed. The test case I'm changing was
originally added to test pushes, but by now there are other test cases in that
file exercising that code path.

https://reviews.llvm.org/D22422

llvm-svn: 275607
2016-07-15 20:18:37 +00:00
Krzysztof Parzyszek
951bb7f3b7 [Hexagon] Make MI scheduler check for stalls in previous packet on v60
Patch by Ikhlas Ajbar.

llvm-svn: 275606
2016-07-15 20:16:03 +00:00
George Burgess IV
1820150537 [CFLAA] Add attributes handling for CFLAnders.
This patch adds proper handling of stratified attributes into our
anders-style CFLAA implementation. It also comes bundled with more
CFLAnders tests. :)

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22325

llvm-svn: 275604
2016-07-15 20:02:49 +00:00
Nemanja Ivanovic
647df86d12 [PowerPC] Set kill flag for scratch register when spilling the link register
This fixes PR 28526.

llvm-svn: 275603
2016-07-15 19:56:32 +00:00
George Burgess IV
8b6295a5d8 [CFLAA] Add an initial CFLAnders implementation.
This adds an incomplete anders-style implementation for CFLAA. It's
incomplete in that it's missing interprocedural analysis, attrs
handling, etc. and that it needs more tests. More tests and features
will be added in future commits.

Patch by Jia Chen.

Differential Revision: https://reviews.llvm.org/D22291

llvm-svn: 275602
2016-07-15 19:53:25 +00:00
Derek Schuff
f7bf415c34 Fix calls to SelectionDAG::getStore
It was refactored in r275592. NFC

llvm-svn: 275601
2016-07-15 19:35:43 +00:00
Vitaly Buka
e0764366e0 Revert "[AMDGPU] Add metadata for runtime"
This reverts commit r275566.

llvm-svn: 275599
2016-07-15 19:14:57 +00:00
Krzysztof Parzyszek
5033b424f2 [Hexagon] Replace postprocessDAG with a more elaborate DAG mutation
llvm-svn: 275598
2016-07-15 19:09:37 +00:00
Jingyue Wu
b866f8d6da [ReassociateGEP] Update tests to allow missing "inbounds" on certain GEPs.
With r275532 fixing miscompilation of GVN, "inbounds" on certain GEPs in these
tests cannot be preserved any more. Left a TODO in the tests for future
reference.

llvm-svn: 275596
2016-07-15 18:47:17 +00:00
Sjoerd Meijer
5953b60be7 [MBP] Clean up of the comments, and a first attempt to better describe a part
of the algorithm.

Differential Revision: https://reviews.llvm.org/D22364

llvm-svn: 275595
2016-07-15 18:41:56 +00:00
Sanjay Patel
5deaf38cd5 add tests for associative ops blocked by a cast
These are more generalized versions of the cases added in
r275302 and r275297.

llvm-svn: 275594
2016-07-15 18:39:02 +00:00
Davide Italiano
36f66a1847 [SCCP] Merge two conditions into one. NFCI.
llvm-svn: 275593
2016-07-15 18:33:16 +00:00
Justin Lebar
4964e23787 [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends.
Summary:
Instead, we take a single flags arg (a bitset).

Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.

This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted.  It also greatly simplifies the process of adding another flag
to getLoad.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits

Differential Revision: http://reviews.llvm.org/D22249

llvm-svn: 275592
2016-07-15 18:27:10 +00:00
Justin Lebar
700af803a3 [CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand.
Summary:
Previously we took an unsigned.

Hooray for type-safety.

Reviewers: chandlerc

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D22282

llvm-svn: 275591
2016-07-15 18:26:59 +00:00
Rong Xu
2775b2cdfa [PGO] IRPGO pre-cleanup pass changes
This patch adds a selected set of cleanup passes including a pre-inline pass
before LLVM IR PGO instrumentation. The inline is only intended to apply those
obvious/trivial ones before instrumentation so that much less instrumentation
is needed to get better profiling information. This will drastically improve
the instrumented code performance for large C++ applications. Another benefit
is the context sensitive counts that can potentially improve the PGO
optimization.

Differential Revision: http://reviews.llvm.org/D21405

llvm-svn: 275588
2016-07-15 18:10:49 +00:00
Sanjay Patel
ff20508a2c fix documentation comments; NFC
llvm-svn: 275587
2016-07-15 18:03:59 +00:00
Krzysztof Parzyszek
c68e008d2c [Hexagon] Add a scheduling DAG mutation
- Remove output dependencies on USR_OVF register.
- Update chain edge latencies between v60 vector loads/stores.

llvm-svn: 275586
2016-07-15 17:48:09 +00:00
Adam Nemet
cb89dd6834 [OptRemark,LDist] RFC: Add hotness attribute
Summary:
This is the first set of changes implementing the RFC from
http://thread.gmane.org/gmane.comp.compilers.llvm.devel/98334

This is a cross-sectional patch; rather than implementing the hotness
attribute for all optimization remarks and all passes in a patch set, it
implements it for the 'missed-optimization' remark for Loop
Distribution.  My goal is to shake out the design issues before scaling
it up to other types and passes.

Hotness is computed as an integer as the multiplication of the block
frequency with the function entry count.  It's only printed in opt
currently since clang prints the diagnostic fields directly.  E.g.:

  remark: /tmp/t.c:3:3: loop not distributed: use -Rpass-analysis=loop-distribute for more info (hotness: 300)

A new API added is similar to emitOptimizationRemarkMissed.  The
difference is that it additionally takes a code region that the
diagnostic corresponds to.  From this, hotness is computed using BFI.
The new API is exposed via an analysis pass so that it can be made
dependent on LazyBFI.  (Thanks to Hal for the analysis pass idea.)

This feature can all be enabled by setDiagnosticHotnessRequested in the
LLVM context.  If this is off, LazyBFI is not calculated (D22141) so
there should be no overhead.

A new command-line option is added to turn this on in opt.

My plan is to switch all user of emitOptimizationRemark* to use this
module instead.

Reviewers: hfinkel

Subscribers: rcox2, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21771

llvm-svn: 275583
2016-07-15 17:23:20 +00:00
Kostya Serebryany
58e21cf3cd [libFuzzer] add ThreadedLeakTest
llvm-svn: 275582
2016-07-15 17:19:43 +00:00
David Majnemer
80d5684bf3 [AliasAnalysis] Give back AA results for fence instructions
Calling getModRefInfo with a fence resulted in crashes because fences
don't have a memory location.  Add a new predicate to Instruction
called isFenceLike which indicates that the instruction mutates memory
but not any single memory location in particular. In practice, it is a
proxy for the set of instructions which "mayWriteToMemory" but cannot be
used with MemoryLocation::get.

This fixes PR28570.

llvm-svn: 275581
2016-07-15 17:19:24 +00:00
Krzysztof Parzyszek
c179479969 [Hexagon] Update instruction itineraries
llvm-svn: 275578
2016-07-15 16:58:34 +00:00
Dehao Chen
7ad979406b [PM] Convert LoopInstSimplify Pass to new PM
Summary: Convert LoopInstSimplify to new PM. Unfortunately there is no exisiting unittest for this pass.

Reviewers: davidxl, silvas

Subscribers: silvas, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D22280

llvm-svn: 275576
2016-07-15 16:42:11 +00:00
Justin Bogner
5a8c8a3672 IR: Sort generic intrinsics before target specific ones
This splits out the intrinsic table such that generic intrinsics come
first and target specific intrinsics are grouped by target. From here
we can find out which target an intrinsic is for or differentiate
between generic and target intrinsics.

The motivation here is to make it easier to move target specific
intrinsic handling out of generic code.

llvm-svn: 275575
2016-07-15 16:31:37 +00:00
Krzysztof Parzyszek
af2caac28b [Hexagon] Fixes/changes to instruction selection
- Add patterns for rr/abs addressing modes.
- Set addrMode to PostInc where necessary.
- Misc fixes.

llvm-svn: 275574
2016-07-15 16:29:02 +00:00
Jun Bum Lim
f9a637b0bf [DSE]Enhance shorthening MemIntrinsic based on OverlapIntervals
Summary:
This change use the overlap interval map built from partial overwrite tracking to perform shortening MemIntrinsics.
Add test cases which was missing opportunities before.

Reviewers: hfinkel, eeckstein, mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D21909

llvm-svn: 275571
2016-07-15 16:14:34 +00:00
Krzysztof Parzyszek
6119f0e34d [Hexagon] Improve patterns with stack-based addressing
- Treat bitwise OR with a frame index as an ADD wherever possible, fold it
  into addressing mode.
- Extend patterns for memops to allow memops with frame indexes as address
  operands.

llvm-svn: 275569
2016-07-15 15:35:52 +00:00
Nico Weber
651963bf5d In dag-optnone.ll, use varargs instead of win64 to fast SDIsel.
The test used to rely on targeting win64 to disable fast isel,
but I'd like to teach fast isel about win64 rets.  Change the
test to use varargs to disable fast isel.

llvm-svn: 275568
2016-07-15 15:30:18 +00:00
Matthew Simpson
39038377d5 [LV] Swap A and B in interleaved access analysis (NFC)
This patch swaps A and B in the interleaved access analysis and clarifies
related comments. The algorithm is more intuitive if we let access A precede
access B in program order rather than the reverse. This change was requested in
the review of D19984.

llvm-svn: 275567
2016-07-15 15:22:43 +00:00
Yaxun Liu
f670ee0ea9 [AMDGPU] Add metadata for runtime
Added emitting metadata to elf for runtime.

Runtime requires certain information (metadata) about kernels to be able to execute and query them. Such information is emitted to an elf section as a key-value pair stream.

Differential Revision: https://reviews.llvm.org/D21849

llvm-svn: 275566
2016-07-15 14:58:21 +00:00
Jacques Pienaar
4ab4ea3179 Rename AnalyzeBranch* to analyzeBranch*.
Summary: NFC. Rename AnalyzeBranch/AnalyzeBranchPredicate to analyzeBranch/analyzeBranchPredicate to follow LLVM coding style and be consistent with TargetInstrInfo's analyzeCompare and analyzeSelect.

Reviewers: tstellarAMD, mcrosier

Subscribers: mcrosier, jholewinski, jfb, arsenm, dschuff, jyknight, dsanders, nemanjai

Differential Revision: https://reviews.llvm.org/D22409

llvm-svn: 275564
2016-07-15 14:41:04 +00:00