1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 19:52:54 +01:00
Commit Graph

162757 Commits

Author SHA1 Message Date
Pavel Labath
2baec4289f [CodeGen] Generate DWARF v5 Accelerator Tables
Summary:
This patch adds a DwarfAccelTableEmitter class, which generates an
accelerator table, as specified in DWARF v5 standard. At the moment it
only generates a DIE offset column and (if we are indexing more than one
compile unit) a CU column.

Indexing type units is not currently supported, as we don't even have
the ability to generate DWARF v5-compatible compile units.

The implementation is not data-source agnostic like the one generating
apple tables. This was not necessary as we currently only have one user
of this code, and without a second user it was not obvious to me how to
best abstract this. (The difference between these tables and the apple
ones is that they need a lot more metadata about the debug info they are
indexing).

The generation is triggered by the --accel-tables argument, which
supersedes the --dwarf-accel-tables arg -- the latter was a simple
on-off switch, but not we can choose between two kinds of accelerator
tables we can generate.

This is tested by parsing the generated tables with llvm-dwarfdump and
the DWARFVerifier, and I've also checked that GNU readelf is able to
make sense of the tables.

Differential Revision: https://reviews.llvm.org/D43286

llvm-svn: 329179
2018-04-04 12:28:20 +00:00
Nico Weber
e86c876871 Remove duplicate tablegen lines from AVR target.
They were added in r285274, in what looks like a merge mishap.
AVRGenMCCodeEmitter.inc is the only non-dupe tablegen invocation added in that
revision.

Also sort the tablegen lines to make this easier to spot in the future.

llvm-svn: 329178
2018-04-04 12:27:43 +00:00
Clement Courbet
961c076aed [llvm-exegesis] Do not initialize FileDescriptor when libpfm is not
available.

llvm-svn: 329177
2018-04-04 12:12:38 +00:00
Clement Courbet
529cdf45b4 [llvm-exegesis] Fix compilation on lld-x86_64-darwin13
YAMLTraits does not know how to serialize `size_t` portably. Use `int`
instead.

llvm-svn: 329176
2018-04-04 12:01:46 +00:00
Clement Courbet
b1515f942f [llvm-exegesis][NFC] Fix compilation warning.
llvm-svn: 329175
2018-04-04 12:01:43 +00:00
Clement Courbet
b4fbdc1e33 [llvm-exegesis][NFC] Fix a few warnings.
llvm-svn: 329174
2018-04-04 12:01:38 +00:00
Andrea Di Biagio
2c2840f2eb [Tablegen] Slightly refactor method SubtargetEmitter::EmitExtraProcessorInfo.
This patch moves most of the logic from EmitExtraProcessorInfo to a couple of
helper functions. No functional change intended.

llvm-svn: 329173
2018-04-04 11:53:13 +00:00
Clement Courbet
661ae9ae77 [llvm-exegesis] Fix build when libpfm is not available.
llvm-svn: 329172
2018-04-04 11:48:15 +00:00
Clement Courbet
fac61870af [llvm-exegesis] Fix compilation on some clang versions.
default initialization of an object of const type 'const llvm::DebugLoc' requires a user-provided default constructor.

llvm-svn: 329171
2018-04-04 11:45:53 +00:00
Benjamin Kramer
d4a8a6c97f Make helpers static. NFC.
llvm-svn: 329170
2018-04-04 11:45:11 +00:00
Clement Courbet
c7b418ae52 Re-land r329156 "Add llvm-exegesis tool."
Fixed to depend on and initialize the native target instead of X86.

llvm-svn: 329169
2018-04-04 11:37:06 +00:00
Simon Pilgrim
177fcba959 [X86][CostModel] Use generic SSE levels instead of particular CPUs for shuffle costs
llvm-svn: 329168
2018-04-04 11:14:12 +00:00
Nicolai Haehnle
fad4cbdb84 AMDGPU: Dimension-aware image intrinsics
Summary:
These new image intrinsics contain the texture type as part of
their name and have each component of the address/coordinate as
individual parameters.

This is a preparatory step for implementing the A16 feature, where
coordinates are passed as half-floats or -ints, but the Z compare
value and texel offsets are still full dwords, making it difficult
or impossible to distinguish between A16 on or off in the old-style
intrinsics.

Additionally, these intrinsics pass the 'texfailpolicy' and
'cachectrl' as i32 bit fields to reduce operand clutter and allow
for future extensibility.

v2:
- gather4 supports 2darray images
- fix a bug with 1D images on SI

Change-Id: I099f309e0a394082a5901ea196c3967afb867f04

Reviewers: arsenm, rampitec, b-sumner

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D44939

llvm-svn: 329166
2018-04-04 10:58:54 +00:00
Nicolai Haehnle
74b2a804dd StructurizeCFG: Test for branch divergence correctly
Fixes cases like the new test @nonuniform. In that test, %cc itself
is a uniform value; however, when reading it after the end of the loop in
basic block %if, its value is effectively non-uniform, so the branch is
non-uniform.

This problem was encountered in
https://bugs.freedesktop.org/show_bug.cgi?id=103743; however, this change
in itself is not sufficient to fix that bug, as there is another issue
in the AMDGPU backend.

As discovered after committing an earlier version of this change, this
exposes a subtle interaction between this pass and DivergenceAnalysis:
since we remove and re-create branch instructions, we can no longer rely
on DivergenceAnalysis for branches in subregions that were already
processed by the pass.

Explicitly remove branch instructions from DivergenceAnalysis to
avoid dangling pointers as a matter of defensive programming, and
change how we detect non-uniform subregions.

Change-Id: I32bbffece4a32f686fab54964dae1a5dd72949d4

Differential Revision: https://reviews.llvm.org/D43743

llvm-svn: 329165
2018-04-04 10:58:15 +00:00
Nicolai Haehnle
c7ac2e0996 AMDGPU: Fix copying i1 value out of loop with non-uniform exit
Summary:
When an i1-value is defined inside of a loop and used outside of it, we
cannot simply use the SGPR bitmask from the loop's last iteration.

There are also useful and correct cases of an i1-value being copied between
basic blocks, e.g. when a condition is computed outside of a loop and used
inside it. The concept of dominators is not sufficient to capture what is
going on, so I propose the notion of "lane-dominators".

Fixes a bug encountered in Nier: Automata.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103743
Change-Id: If37b969ddc71d823ab3004aeafb9ea050e45bd9a

Reviewers: arsenm, rampitec

Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D40547

llvm-svn: 329164
2018-04-04 10:57:58 +00:00
John Brawn
b6e03097ce [AArch64] Add patterns matching (fabs (fsub x y)) to (fabd x y)
Differential Revision: https://reviews.llvm.org/D44573

llvm-svn: 329163
2018-04-04 10:12:53 +00:00
Sam Parker
09b5caa8ce [DAGCombine] Improve ReduceLoadWidth for SRL
Recommitting rL321259. Previosuly this caused an issue with PPCBE but
I didn't receieve a reproducer and didn't have the time to follow up.
If the issue appears again, please provide a reproducer so I can fix
it.

Original commit message:

If the SRL node is only used by an AND, we may be able to set the
ExtVT to the width of the mask, making the AND redundant. To support
this, another check has been added in isLegalNarrowLoad which queries
whether the load is valid.

Differential Revision: https://reviews.llvm.org/D41350

llvm-svn: 329160
2018-04-04 09:26:56 +00:00
Mikhail Maltsev
304b2da27b [ARM] Do not convert some vmov instructions
Summary:
Patch https://reviews.llvm.org/D44467 implements conversion of invalid
vmov instructions into valid ones. It turned out that some valid
instructions also get converted, for example

  vmov.i64 d2, #0xff00ff00ff00ff00 ->
  vmov.i16 d2, #0xff00

Such behavior is incorrect because according to the ARM ARM section
F2.7.7 Modified immediate constants in T32 and A32 Advanced SIMD
instructions, "On assembly, the data type must be matched in the table
if possible."

This patch fixes the isNEONmovReplicate check so that the above
instruction is not modified any more.

Reviewers: rengolin, olista01

Reviewed By: rengolin

Subscribers: javed.absar, kristof.beyls, rogfer01, llvm-commits

Differential Revision: https://reviews.llvm.org/D44678

llvm-svn: 329158
2018-04-04 08:54:19 +00:00
Clement Courbet
606b211cef Revert r329156 "Add llvm-exegesis tool."
Breaks a bunch of bots.

llvm-svn: 329157
2018-04-04 08:22:54 +00:00
Clement Courbet
0f5c40aa77 Add llvm-exegesis tool.
Summary:
[llvm-exegesis][RFC] Automatic Measurement of Instruction Latency/Uops

This is the code corresponding to the RFC "llvm-exegesis Automatic Measurement of Instruction Latency/Uops".

The RFC is available on the LLVM mailing lists as well as the following document
for easier reading:
https://docs.google.com/document/d/1QidaJMJUyQdRrFKD66vE1_N55whe0coQ3h1GpFzz27M/edit?usp=sharing

Subscribers: mgorny, gchatelet, orwant, llvm-commits

Differential Revision: https://reviews.llvm.org/D44519

llvm-svn: 329156
2018-04-04 08:13:32 +00:00
Craig Topper
5d43a6828d [X86] Use the same predicate for the load for PMOVSXBQ and PMOVZXBQ.
These both use a 16-bit load, but one used loadi16_anyext and the other used extloadi32i16. The only difference between them is that loadi16_anyext checked that the load was at least 2 byte aligned and non-volatile. But the alignment doesn't matter here. Just use extloadi32i16 for both.

llvm-svn: 329154
2018-04-04 07:00:24 +00:00
Craig Topper
37cf027772 [X86] Use loadi16/loadi32 predicates in multiply patterns
llvm-svn: 329153
2018-04-04 07:00:19 +00:00
Craig Topper
c40f03b37b [X86] Remove more dead code left over from the handling of i8/i16 UMUL_LOHI/SMUL_LOHI that is no longer needed. NFC
llvm-svn: 329152
2018-04-04 07:00:16 +00:00
Max Kazantsev
c5b43d7a72 [SCEV] Prove implications for SCEVUnknown Phis
This patch teaches SCEV how to prove implications for SCEVUnknown nodes that are Phis.
If we need to prove `Pred` for `LHS, RHS`, and `LHS` is a Phi with possible incoming values
`L1, L2, ..., LN`, then if we prove `Pred` for `(L1, RHS), (L2, RHS), ..., (LN, RHS)` then we can also
prove it for `(LHS, RHS)`. If both `LHS` and `RHS` are Phis from the same block, it is sufficient
to prove the predicate for values that come from the same predecessor block.

The typical case that it handles is that we sometimes need to prove that `Phi(Len, Len - 1) >= 0`
given that `Len > 0`. The new logic was added to `isImpliedViaOperations` and only uses it and
non-recursive reasoning to prove the facts we need, so it should not hurt compile time a lot.

Differential Revision: https://reviews.llvm.org/D44001
Reviewed By: anna

llvm-svn: 329150
2018-04-04 05:46:47 +00:00
Craig Topper
f66214fc7b [X86] Remove dead code for handling i8/i16 UMUL_LOHI/SMUL_LOHI from X86ISelDAGToDAG.cpp. NFC
These are promoted to i16/i32 multiplies by a DAG combine.

llvm-svn: 329147
2018-04-04 04:38:55 +00:00
Craig Topper
00bb04b7e2 [X86] Remove some code that was only needed when i1 was a legal type. NFC
llvm-svn: 329146
2018-04-04 04:38:54 +00:00
Craig Topper
7fbc7e4375 [SimplifyCFG] Teach merge conditional stores to handle cases where the PostBB has more than 2 predecessors by inserting a new block for the store.
Summary:
Currently merge conditional stores can't handle cases where PostBB (the block we need to move the store to) has more than 2 predecessors.

This patch removes that restriction by creating a new block with only the 2 predecessors we care about and an unconditional branch to the original block. This provides a place to put the store.

Reviewers: efriedma, jmolloy, ABataev

Reviewed By: efriedma

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39760

llvm-svn: 329142
2018-04-04 03:47:17 +00:00
Vlad Tsyrklevich
2ea2431d0c Fix bad #include path in r329139
llvm-svn: 329140
2018-04-04 01:34:42 +00:00
Vlad Tsyrklevich
8f493879cd Add the ShadowCallStack pass
Summary:
The ShadowCallStack pass instruments functions marked with the
shadowcallstack attribute. The instrumented prolog saves the return
address to [gs:offset] where offset is stored and updated in [gs:0].
The instrumented epilog loads/updates the return address from [gs:0]
and checks that it matches the return address on the stack before
returning.

Reviewers: pcc, vitalybuka

Reviewed By: pcc

Subscribers: cryptoad, eugenis, craig.topper, mgorny, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D44802

llvm-svn: 329139
2018-04-04 01:21:16 +00:00
Nico Weber
70441858e3 Minor no-op cmake file style fix.
llvm-svn: 329137
2018-04-04 00:50:22 +00:00
Lang Hames
ac9f12c52a Reapply r329133 with fix.
llvm-svn: 329136
2018-04-04 00:34:54 +00:00
Lang Hames
4fae26e87a Revert r329133 "[RuntimeDyld][AArch64] Add some error pluming / generation..."
This broke a number of buildbots. Looking in to it now...

llvm-svn: 329135
2018-04-04 00:12:12 +00:00
Jessica Paquette
c7da7a2100 [MachineOutliner] Test for X86FI->getUsesRedZone() as well as Attribute::NoRedZone
This commit is similar to r329120, but uses the existing getUsesRedZone() function
in X86MachineFunctionInfo. This teaches the outliner to look at whether or not a
function *truly* uses a redzone instead of just the noredzone attribute on a
function.

Thus, after this commit, it's possible to outline from x86 without using
-mno-red-zone and still get outlining results.

This also adds a new test for the new redzone behaviour.

llvm-svn: 329134
2018-04-03 23:32:41 +00:00
Lang Hames
4c8e51f14a [RuntimeDyld][AArch64] Add some error pluming / generation to catch unhandled
relocation types on AArch64.

llvm-svn: 329133
2018-04-03 23:19:20 +00:00
Farhana Aleen
414b5b32f1 [AMDGPU] performMinMaxCombine should not optimize patterns of vectors to min3/max3.
Summary: There are no packed instructions for min3 or max3. So, performMinMaxCombine should not optimize vectors of f16 to min3/max3.

Author: FarhanaAleen

Reviewed By: arsenm

Subscribers: llvm-commits, AMDGPU

Differential Revision: https://reviews.llvm.org/D45219

llvm-svn: 329131
2018-04-03 23:00:30 +00:00
Evandro Menezes
f1c1fba1e2 [AArch64] Adjust the cost model for Exynos M3
Fix typo and simplify matching expression.

llvm-svn: 329130
2018-04-03 22:57:17 +00:00
Ikhlas Ajbar
c01943052e [Hexagon] peel loops with runtime small trip counts
Move the check canPeel() to Hexagon Target before setting PeelCount.

Differential Revision: https://reviews.llvm.org/D44880

llvm-svn: 329129
2018-04-03 22:55:09 +00:00
Reid Kleckner
d2dd25f14b 'cat' command for internal shell - Support Python 3
LLVM Bug Id : 36449

Revision 328563 caused tests to fail under python 3.

This patch modified cat.py file to support both python 2 and 3.
This patch also fixes CRLF issues on Windows.

Patch by Chamal de Silva

Differential Revision: https://reviews.llvm.org/D45077

llvm-svn: 329123
2018-04-03 22:38:25 +00:00
Sanjay Patel
166f1f336d [InstCombine] allow more fmul folds with 'reassoc'
The tests marked with 'FIXME' require loosening the check
in SimplifyAssociativeOrCommutative() to optimize completely;
that's still checking isFast() in Instruction::isAssociative().

llvm-svn: 329121
2018-04-03 22:19:19 +00:00
Jessica Paquette
9aae4e4ccc [MachineOutliner] Keep track of fns that use a redzone in AArch64FunctionInfo
This patch adds a hasRedZone() function to AArch64MachineFunctionInfo. It
returns true if the function is known to use a redzone, false if it is known
to not use a redzone, and no value otherwise.

This removes the requirement to pass -mno-red-zone when outlining for AArch64.

https://reviews.llvm.org/D45189

llvm-svn: 329120
2018-04-03 21:56:10 +00:00
Farhana Aleen
68f650fb3e Revert "MSG"
This reverts commit 9a0ce889d1c39c74d69ecad5ce9c875155ae55de.

This was committed by mistake.

llvm-svn: 329119
2018-04-03 21:51:45 +00:00
Vlad Tsyrklevich
522c7ade27 Fix bad copy-and-paste in r329108
llvm-svn: 329118
2018-04-03 21:40:27 +00:00
Jessica Paquette
a2fb3c5cf5 [MachineOutliner][NFC] Make outlined functions have internal linkage
The linkage type on outlined functions was private before. This meant that if
you set a breakpoint in an outlined function, the debugger wouldn't be able to
give a sane name to the outlined function.

This commit changes the linkage type to internal and updates any tests that
relied on the prefixes on the names of outlined functions.
 

llvm-svn: 329116
2018-04-03 21:36:00 +00:00
Farhana Aleen
6d0d2e4c5a MSG
llvm-svn: 329114
2018-04-03 21:20:39 +00:00
Gor Nishanov
e88fabad02 [coroutines] Respect alloca alignment requirements when building coroutine frame
Summary:
If an alloca need to be stored in the coroutine frame and it has an alignment specified and the alignment does not match the natural alignment of the alloca type. Insert appropriate padding into the coroutine frame to make sure that it gets requested alignment.

For example for a packet type (which natural alignment is 1), but alloca alignment is 8, we may need to insert a padding field with required number of bytes to make sure it is properly aligned.

```
%PackedStruct = type <{ i64 }>
...
  %data = alloca %PackedStruct, align 8
```

If the previous field in the coroutine frame had alignment 2, we would have [6 x i8] inserted before %PackedStruct in the coroutine frame:

```
%f.Frame = type { ..., i16, [6 x i8], %PackedStruct }
```

Reviewers: rnk, lewissbaker, modocache

Reviewed By: modocache

Subscribers: EricWF, llvm-commits

Differential Revision: https://reviews.llvm.org/D45221

llvm-svn: 329112
2018-04-03 20:54:20 +00:00
Florian Hahn
f4e7517ea8 [LoopInterchange] Add remark for calls preventing interchanging.
It also updates test/Transforms/LoopInterchange/call-instructions.ll
to use accesses where we can prove dependence after D35430.


Reviewers: sebpop, karthikthecool, blitz.opensource

Reviewed By: sebpop

Differential Revision: https://reviews.llvm.org/D45206

llvm-svn: 329111
2018-04-03 20:54:04 +00:00
Vlad Tsyrklevich
505d68d60f Add the ShadowCallStack attribute
Summary:
Introduce the ShadowCallStack function attribute. It's added to
functions compiled with -fsanitize=shadow-call-stack in order to mark
functions to be instrumented by a ShadowCallStack pass to be submitted
in a separate change.

Reviewers: pcc, kcc, kubamracek

Reviewed By: pcc, kcc

Subscribers: cryptoad, mehdi_amini, javed.absar, llvm-commits, kcc

Differential Revision: https://reviews.llvm.org/D44800

llvm-svn: 329108
2018-04-03 20:10:40 +00:00
Aaron Smith
6d33e6f11c [DebugInfoPDB] Add methods used to read function flags
The specific function flags are listed in CodeView::FunctionOption.

llvm-svn: 329105
2018-04-03 19:43:40 +00:00
Aaron Smith
45aa5ed610 [DebugInfoPDB] Add a few missing definitions to PDBTypes.h
The missing definitions are from cvconst.h shipped with DIA SDK.

Correct the url to MSDN for MemoryTypeEnum and set the underlying 
type of PDB_StackFrameType and PDB_MemoryType to uint16_t.

llvm-svn: 329104
2018-04-03 19:41:27 +00:00
Sanjay Patel
2bcb662a6b [x86] add tests for convert-FP-to-integer with constants; NFC
We don't constant fold any of these, but we could...but if we
do, we must produce the right answer.

Unlike the IR fptosi instruction or its DAG node counterpart 
ISD::FP_TO_SINT, these are not undef for an out-of-range input.

llvm-svn: 329100
2018-04-03 18:34:56 +00:00