1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 03:23:01 +02:00
Commit Graph

166021 Commits

Author SHA1 Message Date
Sean Fertile
544e57cfe8 Revert "Extend CFGPrinter and CallPrinter with Heat Colors"
This reverts r335996 which broke graph printing in Polly.

llvm-svn: 336000
2018-06-29 17:48:58 +00:00
Matt Arsenault
334df8b0bc AMDGPU: Don't use struct type for argument layout
This was introducing unnecessary padding after the explicit
arguments, depending on the alignment of the total struct type.
Also has the side effect of avoiding creating an extra GEP for
the offset from the base kernel argument to the explicit kernel
argument offset.

llvm-svn: 335999
2018-06-29 17:31:42 +00:00
Craig Topper
95f0cd3953 [X86] Limit the number of target specific nodes emitted in LowerShiftParts
The important part is the creation of the SHLD/SHRD nodes. The compare and the conditional move can use target independent nodes that can be legalized on their own. This gives some opportunities to trigger the optimizations present in the lowering for those things. And its just better to limit the number of places we emit target specific nodes.

The changed test cases still aren't optimal.

Differential Revision: https://reviews.llvm.org/D48619

llvm-svn: 335998
2018-06-29 17:24:07 +00:00
Sean Fertile
152672e597 Extend CFGPrinter and CallPrinter with Heat Colors
Extends the CFGPrinter and CallPrinter with heat colors based on heuristics or
profiling information. The colors are enabled by default and can be toggled
on/off for CFGPrinter by using the option -cfg-heat-colors for both
-dot-cfg[-only] and -view-cfg[-only].  Similarly, the colors can be toggled
on/off for CallPrinter by using the option -callgraph-heat-colors for both
-dot-callgraph and -view-callgraph.

Patch by Rodrigo Caetano Rocha!

Differential Revision: https://reviews.llvm.org/D40425

llvm-svn: 335996
2018-06-29 17:13:58 +00:00
Jonas Devlieghere
548fff58d8 [dsymutil] Rename conflicting declaration
Using MemoryBuffer as member name clashed with the llvm::MemoryBuffer
class.

llvm-svn: 335995
2018-06-29 17:11:34 +00:00
Craig Topper
d96db9ff80 [X86] Use a std::vector for the memory unfolding table.
Previously we used a DenseMap which is costly to set up due to multiple full table rehashes as the size increases and causes the table to be reallocated.

This patch changes the table to a vector of structs. We now walk the reg->mem tables and push new entries in the mem->reg table for each row not marked TB_NO_REVERSE. Once all the table entries have been created, we sort the vector. Then we can use a binary search for lookups.

Differential Revision: https://reviews.llvm.org/D48585

llvm-svn: 335994
2018-06-29 17:11:26 +00:00
Jonas Devlieghere
59f7e40dac [dsymutil] Make the CachedBinaryHolder the default
Replaces all uses of the old binary holder with its cached variant.

Differential revision: https://reviews.llvm.org/D48770

llvm-svn: 335991
2018-06-29 16:51:52 +00:00
Jonas Devlieghere
077fd2e850 [dsymutil] Introduce a new CachedBinaryHolder
The original binary holder has an optimization where it caches a static
library (archive) between consecutive calls to GetObjects. However, the
actual memory buffer wasn't cached between calls.

This made sense when dsymutil was processing objects one after each
other, but when processing them in parallel, several binaries have to be
in memory at the same time. For this reason, every link context
contained a binary holder.

Having one binary holder per context is problematic, because the same
static archive was cached for every object file. Luckily, when the file
is mmap'ed, this was only costing us virtual memory.

This patch introduces a new BinaryHolder variant that is fully cached,
for all the object files it load, as well as the static archives. This
way, we don't have to give up on this optimization of bypassing the
file system.

Differential revision: https://reviews.llvm.org/D48501

llvm-svn: 335990
2018-06-29 16:50:41 +00:00
Petar Jovanovic
8df7020618 [mips] Support shrink-wrapping
Except for -O0, it's enabled by default.

Patch by Vladimir Stefanovic.

Differential Revision: https://reviews.llvm.org/D47947

llvm-svn: 335989
2018-06-29 16:37:16 +00:00
Stanislav Mekhanoshin
e22d52e9e0 [AMDGPU] Enable LICM in the BE pipeline
This allows to hoist code portion to compute reciprocal of loop
invariant denominator in integer division after codegen prepare
expansion.

Differential Revision: https://reviews.llvm.org/D48604

llvm-svn: 335988
2018-06-29 16:26:53 +00:00
Jessica Paquette
428ec4f245 [MachineOutliner] Add always and never options to -enable-machine-outliner
This is a recommit of r335887, which was erroneously committed earlier.

To enable the MachineOutliner by default on AArch64, we need to be able to
disable the MachineOutliner and also provide an option to "always" enable the
outliner.

This adds that capability. It allows the user to still use the old
-enable-machine-outliner option, which defaults to "always". This is building
up to allowing the user to specify "always" versus the target default
outlining behaviour.

https://reviews.llvm.org/D48682

llvm-svn: 335986
2018-06-29 16:12:45 +00:00
Sanjay Patel
2c11536c37 [InstCombine] add more tests for shuffle-binop folds; NFC
The mul+shl tests add coverage for the fold enabled with D48678.
The and+or tests are not handled yet; that's D48662.

llvm-svn: 335984
2018-06-29 15:28:11 +00:00
Andrea Di Biagio
a5b30c5c97 [llvm-mca] Remove field HasReadAdvanceEntries from class ReadDescriptor.
This simplifies the logic that updates RAW dependencies in the DispatchStage.
There is no advantage in storing that flag in the ReadDescriptor; we should
simply rely on the call to `STI.getReadAdvanceCycles()` to obtain the
ReadAdvance cycles. If there are no read-advance entries, then method
`getReadAdvanceCycles()` quickly returns 0.

No functional change intended.

llvm-svn: 335977
2018-06-29 14:24:46 +00:00
Alexey Bataev
cb52c478ed [DEBUG_INFO, NVPTX] Do not emit .debug_loc section.
Summary:
.debug_loc section is not supported for NVPTX target. If there is an
object whose location can change during its lifetime, we do not generate
debug location info for this variable.

Reviewers: echristo

Subscribers: jholewinski, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D48730

llvm-svn: 335976
2018-06-29 14:23:28 +00:00
Krzysztof Parzyszek
20a63bcec7 [Hexagon] Remove unused instruction itineraties, NFC
llvm-svn: 335975
2018-06-29 13:55:28 +00:00
Sanjay Patel
d0745cae2f [InstCombine] enhance shuffle-of-binops to allow different variable ops (PR37806)
This was discussed in D48401 as another improvement for:
https://bugs.llvm.org/show_bug.cgi?id=37806

If we have 2 different variable values, then we shuffle (select) those lanes, 
shuffle (select) the constants, and then perform the binop. This eliminates a binop.

The new shuffle uses the same shuffle mask as the existing shuffle, so there's no 
danger of creating a difficult shuffle.

All of the earlier constraints still apply, but we also check for extra uses to 
avoid creating more instructions than we'll remove.

Additionally, we're disallowing the fold for div/rem because that could expose a
UB hole.

Differential Revision: https://reviews.llvm.org/D48678

llvm-svn: 335974
2018-06-29 13:44:06 +00:00
Roman Shirokiy
9298d3a7d3 Fix overconfident assert in ScalarEvolution::isImpliedViaMerge
We can have AddRec with loops having many predecessors.
This changes an assert to an early return.

Differential Revision: https://reviews.llvm.org/D48766

llvm-svn: 335965
2018-06-29 11:46:30 +00:00
Sjoerd Meijer
d2e7279f78 [AArch64] Armv8.4-A: Virtualization system registers
This adds the Secure EL2 extension.

Differential Revision: https://reviews.llvm.org/D48711

llvm-svn: 335962
2018-06-29 11:03:15 +00:00
Filipe Cabecinhas
549730082d [cmake] Change WIN32 test to CMAKE_HOST_WIN32
The test is about what can be run on the host, not the cmake target.
When cross-compiling (compiler-rt at least) on Windows, we end up with
lit being unable to run llvm-lit because it can't find the llvm-lit
module.

llvm-svn: 335961
2018-06-29 10:34:37 +00:00
Simon Pilgrim
acd3489995 [X86][SSE] Support v16i8/v32i8 vector rotations
This uses the same technique as for shifts - split the rotation into 4/2/1-bit partial rotations and select those partials based on the amount bit, making use of PBLENDVB if available. This halves the use of PBLENDVB compared to expanding to shifts, which can be a slow op.

Unfortunately I haven't found a decent way to share much of this code with the shift equivalent.

Differential Revision: https://reviews.llvm.org/D48655

llvm-svn: 335957
2018-06-29 09:36:39 +00:00
Sjoerd Meijer
e768b5aa55 [ARM][AArch64] Armv8.4-A Enablement
Initial patch adding assembly support for Armv8.4-A.

Besides adding v8.4 as a supported architecture to the usual places, this also
adds target features for the different crypto algorithms. Armv8.4-A introduced
new crypto algorithms, made them optional, and allows different combinations:

- none of the v8.4 crypto functions are supported, which is independent of the
  implementation of the Armv8.0 SHA1 and SHA2 instructions.
- the v8.4 SHA512 and SHA3 support is implemented, in this case the Armv8.0
  SHA1 and SHA2 instructions must also be implemented.
- the v8.4 SM3 and SM4 support is implemented, which is independent of the
  implementation of the Armv8.0 SHA1 and SHA2 instructions.
- all of the v8.4 crypto functions are supported, in this case the Armv8.0 SHA1
  and SHA2 instructions must also be implemented.

The v8.4 crypto instructions are added to AArch64 only, and not AArch32,
and are made optional extensions to Armv8.2-A.

The user-facing Clang options will map on these new target features, their
naming will be compatible with GCC and added in follow-up patches.

The Armv8.4-A instruction sets can be downloaded here:
https://developer.arm.com/products/architecture/a-profile/exploration-tools

Differential Revision: https://reviews.llvm.org/D48625

llvm-svn: 335953
2018-06-29 08:43:19 +00:00
Roman Lebedev
a142d0e600 SCEVExpander::expandAddRecExprLiterally(): check before casting as Instruction
Summary:
An alternative to D48597.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=37936 | PR37936 ]].

The problem is as follows:
1. `indvars` marks `%dec` as `NUW`.
2. `loop-instsimplify` runs `instsimplify`, which constant-folds `%dec` to -1 (D47908)
3. `loop-reduce` tries to do some further modification, but crashes
    with an type assertion in cast, because `%dec` is no longer an `Instruction`,

If the runline is split into two, i.e. you first run `-indvars -loop-instsimplify`,
store that into a file, and then run `-loop-reduce`, there is no crash.

So it looks like the problem is due to `-loop-instsimplify` not discarding SCEV.
But in this case we can just not crash if it's not an `Instruction`.
This is just a local fix, unlike D48597, so there may very well be other problems.

Reviewers: mkazantsev, uabelho, sanjoy, silviu.baranga, wmi

Reviewed By: mkazantsev

Subscribers: evstupac, javed.absar, spatel, llvm-commits

Differential Revision: https://reviews.llvm.org/D48599

llvm-svn: 335950
2018-06-29 07:44:20 +00:00
Kristof Beyls
95c680fb11 Make email options of find_interesting_reviews more flexible.
This enables a few requested improvements on the original review of this
script at https://reviews.llvm.org/D46192.

This introduces 2 new command line options:

* --email-report: This option enables specifying who to email the generated
  report to. This also enables not sending any email and only printing out
  the report on stdout by not specifying this option on the command line.
* --sender: this allows specifying the email address that will be used in
  the "From" email header.

I believe that with these options the script starts having the basic
features needed to run it well on a regular basis for a group of
developers.

Differential Revision: https://reviews.llvm.org/D47930

llvm-svn: 335948
2018-06-29 07:16:27 +00:00
Craig Topper
3c896ff07f [X86] Remove masking from the avx512 packed sqrt intrinsics. Use select in IR instead.
While there improve the coverage of the intrinsic testing and add fast-isel tests.

llvm-svn: 335944
2018-06-29 05:43:26 +00:00
Tom Stellard
4bed5eda0d AMDGPU: Separate R600 and GCN TableGen files
Summary:
We now have two sets of generated TableGen files, one for R600 and one
for GCN, so each sub-target now has its own tables of instructions,
registers, ISel patterns, etc.  This should help reduce compile time
since each sub-target now only has to consider information that
is specific to itself.  This will also help prevent the R600
sub-target from slowing down new features for GCN, like disassembler
support, GlobalISel, etc.

Reviewers: arsenm, nhaehnle, jvesely

Reviewed By: arsenm

Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46365

llvm-svn: 335942
2018-06-28 23:47:12 +00:00
Sterling Augustine
4759abd6b1 Require x86 for this test.
llvm-svn: 335939
2018-06-28 23:22:14 +00:00
Eli Friedman
2655e6dc4e [ARM] Assert that ARMDAGToDAGISel creates valid UBFX/SBFX nodes.
We don't ever check these again (unless you're using
-fno-integrated-as), so make sure the extracted bits are well-defined.

I don't think it's possible to trigger any of the assertions on trunk,
but it's difficult to prove.  (The first one depends on DAGCombine to
minimize the number of set bits in AND masks; I think the others are
mathematically impossible to hit.)

llvm-svn: 335931
2018-06-28 21:49:41 +00:00
Jessica Paquette
d91bbf243e [MachineOutliner] Never add the outliner in -O0
This is a recommit of r335879.

We shouldn't add the outliner when compiling at -O0 even if
-enable-machine-outliner is passed in. This makes sure that we
don't add it in this case.

This also removes -O0 from the outliner DWARF test.

llvm-svn: 335930
2018-06-28 21:49:24 +00:00
Sanjay Patel
7c543dc194 [InstCombine] adjust shuffle tests; NFC
Use xor for the extra uses test because div/rem have other problems.

llvm-svn: 335924
2018-06-28 21:14:02 +00:00
Jake Ehrlich
5a896e10ed [llvm-readobj] Add experimental support for SHT_RELR sections
This change adds experimental support for SHT_RELR sections, proposed
here: https://groups.google.com/forum/#!topic/generic-abi/bX460iggiKg

Definitions for the new ELF section type and dynamic array tags, as well
as the encoding used in the new section are all under discussion and are
subject to change. Use with caution!

Author: rahulchaudhry

Differential Revision: https://reviews.llvm.org/D47919

llvm-svn: 335922
2018-06-28 21:07:34 +00:00
Benjamin Kramer
9d10dcaaf9 [SupportTests] Silence -Wsign-compare warnings
llvm-svn: 335921
2018-06-28 21:03:24 +00:00
Sanjay Patel
1d6143a750 [InstCombine] fix opcode check in shuffle fold
There's no way to expose this difference currently, 
but we should use the updated variable because the
original opcodes can go stale if we transform into
something new.

llvm-svn: 335920
2018-06-28 20:52:43 +00:00
Martin Storsjo
c5e98613a7 [COFF] Fix constant sharing regression for MinGW
This fixes a regression since SVN r334523, where the object files
built targeting MinGW were rejected by GNU binutils tools. Prior to
that commit, we only put constants in comdat for MSVC configurations.

Differential Revision: https://reviews.llvm.org/D48567

llvm-svn: 335918
2018-06-28 20:28:29 +00:00
Zachary Turner
be77e2a7e3 Fix padding with custom character in formatv.
The format string for formatv allows to specify a custom padding
character instead of the default space.  This custom character was
parsed correctly, but not passed on to the formatter.

Patch by Marcel Köppe
Differential Revision: https://reviews.llvm.org/D48140

llvm-svn: 335915
2018-06-28 20:09:37 +00:00
Teresa Johnson
54c43692b3 [ThinLTO] Port InlinerFunctionImportStats handling to new PM
Summary:
The InlinerFunctionImportStats will collect and dump stats regarding how
many function inlined into the module were imported by ThinLTO.

Reviewers: wmi, dexonsmith

Subscribers: mehdi_amini, inglorion, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D48729

llvm-svn: 335914
2018-06-28 20:07:47 +00:00
Benjamin Kramer
74ec48e91c [NVPTX] Delete dead code
No functionality change.

llvm-svn: 335913
2018-06-28 20:05:35 +00:00
Eli Friedman
951f2d2f6e [ARM] Add missing Thumb2 assembler diagnostics.
Mostly just adding checks for Thumb2 instructions which correspond to
ARM instructions which already had diagnostics. While I'm here, also fix
ARM-mode strd to check the input registers correctly.

Differential Revision: https://reviews.llvm.org/D48610

llvm-svn: 335909
2018-06-28 19:53:12 +00:00
Sterling Augustine
b38cbbaefd Some targets don't have lld built, so just use a binary copy
of the input file.

llvm-svn: 335908
2018-06-28 19:47:23 +00:00
Anastasis Grammenos
1de54b8e96 [SROA] Preserve DebugLoc when rewriting alloca partitions
When rewriting an alloca partition copy the DL from the
old alloca over the the new one.

Differential Revision: https://reviews.llvm.org/D48640

llvm-svn: 335904
2018-06-28 18:58:30 +00:00
Sterling Augustine
825ac91b87 Handle absolute symbols as branch targets in disassembly.
https://reviews.llvm.org/D48554

llvm-svn: 335903
2018-06-28 18:57:13 +00:00
Zachary Turner
7db87530e1 Add a flag to FileOutputBuffer that allows modification.
FileOutputBuffer creates a temp file and on commit atomically
renames the temp file to the destination file.  Sometimes we
want to modify an existing file in place, but still have the
atomicity guarantee.  To do this we can initialize the contents
of the temp file from the destination file (if it exists), that
way the resulting FileOutputBuffer can have only selective
bytes modified.  Committing will then atomically replace the
destination file as desired.

llvm-svn: 335902
2018-06-28 18:49:09 +00:00
Simon Pilgrim
48920466ab Remove unnecessary semicolon. NFCI.
Fixes -Wpedantic warning.

llvm-svn: 335901
2018-06-28 18:37:16 +00:00
Justin Bogner
fa021a2fac [CMake] Respect CMAKE_STRIP and CMAKE_DSYMUTIL on apple platforms
This allows overriding the strip and dsymutil tools, and updates
iOS.cmake to do so. I've also added libtool to iOS.cmake, but it was
already respecting CMAKE_LIBTOOL if set.

llvm-svn: 335900
2018-06-28 18:36:52 +00:00
Vedant Kumar
812e30f156 [Debugify] Do not report line 0 locations as errors
The checking logic should not treat artificial locations as being
somehow problematic. Producing these locations can be the desired
behavior of some passes.

See llvm.org/PR37961.

llvm-svn: 335897
2018-06-28 18:21:11 +00:00
Craig Topper
ed8e39a2a3 [X86] Suppress load folding into and/or/xor if it will prevent matching btr/bts/btc.
This is a follow up to r335753. At the time I forgot about isProfitableToFold which makes this pretty easy.

Differential Revision: https://reviews.llvm.org/D48706

llvm-svn: 335895
2018-06-28 17:58:01 +00:00
Jonas Devlieghere
c40fc8755a Revert "Re-land r335297 "[X86] Implement more of x86-64 large and medium PIC code models""
Reverting because this is causing failures in the LLDB test suite on
GreenDragon.

  LLVM ERROR: unsupported relocation with subtraction expression, symbol
  '__GLOBAL_OFFSET_TABLE_' can not be undefined in a subtraction
  expression

llvm-svn: 335894
2018-06-28 17:56:43 +00:00
Jonas Devlieghere
b0b97cbec8 Revert "[OrcMCJIT] Fix test after r335508 causing it to fail on green dragon"
This reverts commit a6b904daa1d55e31187c85e5b54ef2ddc37fa713.

llvm-svn: 335893
2018-06-28 17:56:27 +00:00
Zachary Turner
829455b0ec 2 VS natvis improvements.
Optional<T> was broken due to a change in the class's internals.
That is fixed, and additionally a visualizer is added for
Expected<T>.

llvm-svn: 335892
2018-06-28 17:55:54 +00:00
Sanjay Patel
22b403910a [InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806)
This is an enhancement to D48401 that was discussed in:
https://bugs.llvm.org/show_bug.cgi?id=37806

We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other 
direction because that's generally better of course). This allows us to remove the shuffle 
as we do in the regular opcodes-are-the-same cases.

This requires a small hack to make sure we don't introduce any extra poison:
https://rise4fun.com/Alive/ZGv

Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already 
canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There 
are planned enhancements for opcode transforms such or -> add.

Note that there's a different fold needed if we've already managed to simplify away a binop 
as seen in the test based on PR37806, but we manage to get that one case here because this 
fold is positioned above the demanded elements fold currently.

Differential Revision: https://reviews.llvm.org/D48485

llvm-svn: 335888
2018-06-28 17:48:04 +00:00
Jessica Paquette
848840c089 [MachineOutliner] Define MachineOutliner support in TargetOptions
Targets should be able to define whether or not they support the outliner
without the outliner being added to the pass pipeline. Before this, the
outliner pass would be added, and ask the target whether or not it supports the
outliner.

After this, it's possible to query the target in TargetPassConfig, before the
outliner pass is created. This ensures that passing -enable-machine-outliner
will not modify the pass pipeline of any target that does not support it.

https://reviews.llvm.org/D48683

llvm-svn: 335887
2018-06-28 17:45:43 +00:00