1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 12:12:47 +01:00
Commit Graph

156661 Commits

Author SHA1 Message Date
Craig Topper
4b336f3aac [X86] Redefine the 128-bit version of VPGATHERQD and VGATHERQPS to use a VK2 mask instead of a VK4 mask.
This allows us to remove extra extend creation during lowering and more accurately reflects the semantics of the instruction.

While there add an extra output VT to X86 masked gather node to better match the isel pattern predicate. Currently we're exploiting the fact that the isel table doesn't count how many output results a node actually has if the result type of any can be inferred from the first result and the type constraints defined in tablegen. I think we might ultimately want to lower all MGATHER/MSCATTER to an X86ISD node with the extra mask result and stop relying on this hole in the isel checking.

llvm-svn: 318278
2017-11-15 07:46:43 +00:00
NAKAMURA Takumi
74b15a31b3 GISelWorkList.h: Fix -fmodules build in rL318210.
llvm-svn: 318275
2017-11-15 07:34:35 +00:00
NAKAMURA Takumi
b7162520e4 Fix llvm/test/Transforms/LoopRotate/pr35210.ll in rL318237, it uses debug options.
llvm-svn: 318273
2017-11-15 06:46:58 +00:00
Fangrui Song
6567b563fc NFC Remove default argument of DataLayout::getPointerABIAlignment
Differential Revision: https://reviews.llvm.org/D40005

llvm-svn: 318272
2017-11-15 06:17:32 +00:00
Craig Topper
368d3224be [X86] Add getHostCPUName support for the Gemini Lake model number which also uses Goldmont.
llvm-svn: 318271
2017-11-15 06:02:43 +00:00
Craig Topper
cd296f22b1 [X86] Add getHostCPUName support for cannonlake.
This adds an explicit model number check and fallback path to the unknown family 6 detection.

llvm-svn: 318270
2017-11-15 06:02:42 +00:00
Craig Topper
dcd7058011 [InstCombine] Simplify binops that are only used by a select and are fed by a select with the same condition.
Summary:
This patch optimizes a binop sandwiched between 2 selects with the same condition. Since we know its only used by the select we can propagate the appropriate input value from the earlier select.

As I'm writing this I realize I may need to avoid doing this for division in case the select was protecting a divide by zero?

Reviewers: spatel, majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39999

llvm-svn: 318267
2017-11-15 05:23:02 +00:00
Hiroshi Inoue
4ee809496b [PowerPC] fix up in redundant compare elimination
This patch fixes a potential problem in my previous commit (https://reviews.llvm.org/rL312514) by introducing an additional check.

llvm-svn: 318266
2017-11-15 04:23:26 +00:00
Vedant Kumar
2aae3ca7a2 [docs] Document a way to simplify names in bugpoint output
llvm-svn: 318257
2017-11-15 02:58:45 +00:00
Matt Arsenault
141501039d AMDGPU: Add separate definitions for DS insts without m0 use
llvm-svn: 318246
2017-11-15 01:34:06 +00:00
Craig Topper
c7a40b93be [X86] Correct the spelling of pentiumpro in X86TargetParser.def
Thanks to Erich Keane for spotting this.

llvm-svn: 318243
2017-11-15 01:01:50 +00:00
Matt Arsenault
f225ab4cc5 AMDGPU: Don't use MUBUF vaddr if address may overflow
Effectively revert r263964. Before we would not
allow this if vaddr was not known to be positive.

llvm-svn: 318240
2017-11-15 00:45:43 +00:00
Hans Wennborg
4937b695da Revert r318193 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."
It crashes building sqlite; see reply on the llvm-commits thread.

> [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
>
>         Patch tries to improve vectorization of the following code:
>
>         void add1(int * __restrict dst, const int * __restrict src) {
>           *dst++ = *src++;
>           *dst++ = *src++ + 1;
>           *dst++ = *src++ + 2;
>           *dst++ = *src++ + 3;
>         }
>         Allows to vectorize even if the very first operation is not a binary add, but just a load.
>
>         Fixed issues related to previous commit.
>
>         Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
>
>         Reviewed By: ABataev, RKSimon
>
>         Subscribers: llvm-commits, RKSimon
>
>         Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 318239
2017-11-15 00:38:13 +00:00
Mitch Phillips
1bb0b7dfdf [cfi-verify] Validate there are no register clobbers between CFI-check and instruction execution.
Summary:
This patch adds another failure mode for `validateCFIProtection(..)`, wherein any register that affects the indirect control flow instruction is clobbered to between the CFI-check and the instruction's execution.

Also includes a modification to make MCInstrDesc::hasDefOfPhysReg public.

Reviewers: vlad.tsyrklevich

Reviewed By: vlad.tsyrklevich

Subscribers: llvm-commits, pcc, kcc

Differential Revision: https://reviews.llvm.org/D39820

llvm-svn: 318238
2017-11-15 00:35:26 +00:00
Craig Topper
4e0cd496e5 [LoopRotate] processLoop should return true even if it just simplified the loop latch without making any other changes
Simplifying a loop latch changes the IR and we need to make sure the pass manager knows to invalidate analysis passes if that happened.

PR35210 discovered a case where we failed to invalidate the post dominator tree after this simplification because we no changes other than simplifying the loop latch.

Fixes PR35210.

Differential Revision: https://reviews.llvm.org/D40035

llvm-svn: 318237
2017-11-15 00:22:42 +00:00
Evgeniy Stepanov
0f4fe8b8dd [asan] Prevent rematerialization of &__asan_shadow.
Summary:
In the mode when ASan shadow base is computed as the address of an
external global (__asan_shadow, currently on android/arm32 only),
regalloc prefers to rematerialize this value to save register spills.
Even in -Os. On arm32 it is rather expensive (2 loads + 1 constant
pool entry).

This changes adds an inline asm in the function prologue to suppress
this behavior. It reduces AsanTest binary size by 7%.

Reviewers: pcc, vitalybuka

Subscribers: aemerson, kristof.beyls, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40048

llvm-svn: 318235
2017-11-15 00:11:51 +00:00
Vedant Kumar
4d07842a77 [PGO] Bump the indexed profile format version
Differential Revision: https://reviews.llvm.org/D39447

llvm-svn: 318228
2017-11-14 23:56:48 +00:00
Petr Hosek
39a6141afc [CMake][runtimes] Don't process common options in runtimes build
This is no longer needed for any of the runtimes build and it breaks
in case we don't have the working compiler yet, e.g. when building
a compiler that uses compiler-rt and libc++ as a default runtime,
because these common options check whether these are available.

Differential Revision: https://reviews.llvm.org/D39932

llvm-svn: 318227
2017-11-14 23:56:05 +00:00
Craig Topper
6f595ad139 [X86] Fix the parameter order in the default implementation of X86_VENDOR macro in X86TargetParser.def
The default implementation doesn't do anything so the order doesn't matter, but good for cleanliness.

llvm-svn: 318226
2017-11-14 23:54:28 +00:00
Petr Hosek
c5c523800e [CMake][runtimes] Set compiler as working even for default target
Even when building builtins and runtimes for the default target
we shouldn't assume that the just built compiler is already useable.
When the compiler uses compiler-rt and libc++ as the default runtime
and C++ library, it won't be usable until we finish building runtimes.

Differential Revision: https://reviews.llvm.org/D39715

llvm-svn: 318224
2017-11-14 23:47:20 +00:00
Matt Arsenault
66ee956d8d AMDGPU: Handle or in multi-use shl ptr combine
llvm-svn: 318223
2017-11-14 23:46:42 +00:00
Hans Wennborg
3cc4cef33b Fix switch-lower-peel-top-case.ll isel pass is not registered error
The test was doing -stop-after=isel, but that pass is actually the
AMDGPUDAGToDAGISel pass, which might not be built when targeting x86_64.
This changes the test to -stop-after=expand-isel-pseudos instead.

Follow-up to r318202.

llvm-svn: 318220
2017-11-14 23:30:28 +00:00
Davide Italiano
7a04e78909 [EntryExitInstrumenter] Placate GCC, the semicolon is redundant. NFCI.
llvm-svn: 318217
2017-11-14 23:13:38 +00:00
Tim Renouf
3ac8960e91 [AMDGPU] updated PAL metadata record keys
Summary: The ABI changed before specification was finalized.

Reviewers: kzhuravl, dstuttard

Subscribers: wdng, nhaehnle, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D39807

llvm-svn: 318213
2017-11-14 23:05:36 +00:00
Sanjay Patel
5cbaa253c7 [Reassociate] use dyn_cast instead of isa+cast; NFCI
llvm-svn: 318212
2017-11-14 23:03:56 +00:00
Mitch Phillips
a0cefd153d [cfi-verify] Add DOT graph printing for GraphResult objects.
Allows users to view GraphResult objects in a DOT directed-graph format. This feature can be turned on through the --print-graphs flag.

Also enabled pretty-printing of instructions in output. Together these features make analysis of unprotected CF instructions much easier by providing a visual control flow graph.

Reviewers: pcc

Subscribers: llvm-commits, kcc, vlad.tsyrklevich

Differential Revision: https://reviews.llvm.org/D39819

llvm-svn: 318211
2017-11-14 22:43:13 +00:00
Aditya Nandakumar
8e51f27003 [GISel]: Rework legalization algorithm for better elimination of
artifacts along with DCE

Legalization Artifacts are all those insts that are there to make the
type system happy. Currently, the target needs to say all combinations
of extends and truncs are legal and there's no way of verifying that
post legalization, we only have *truly* legal instructions. This patch
changes roughly the legalization algorithm to process all illegal insts
at one go, and then process all truncs/extends that were added to
satisfy the type constraints separately trying to combine trivial cases
until they converge. This has the added benefit that, the target
legalizerinfo can only say which truncs and extends are okay and the
artifact combiner would combine away other exts and truncs.

Updated legalization algorithm to roughly the following pseudo code.

WorkList Insts, Artifacts;
collect_all_insts_and_artifacts(Insts, Artifacts);

do {
  for (Inst in Insts)
         legalizeInstrStep(Inst, Insts, Artifacts);
  for (Artifact in Artifacts)
         tryCombineArtifact(Artifact, Insts, Artifacts);
} while(!Insts.empty());

Also, wrote a simple wrapper equivalent to SetVector, except for
erasing, it avoids moving all elements over by one and instead just
nulls them out.

llvm-svn: 318210
2017-11-14 22:42:19 +00:00
Hans Wennborg
da29bc5094 CMake: Turn LLVM_ENABLE_LIBXML2 into a tri-state option
In addition to the current ON and OFF options, this adds the FORCE_ON
option, which causes a configuration error if libxml2 cannot be used.

Differential revision: https://reviews.llvm.org/D40050

llvm-svn: 318209
2017-11-14 22:32:49 +00:00
Simon Dardis
61a86474b9 Reland "[mips][mt][6/7] Add support for mftr, mttr instructions."
This adjusts the tests to hopfully pacify the
llvm-clang-x86_64-expensive-checks-win buildbot.

Unlike many other instructions, these instructions have aliases which
take coprocessor registers, gpr register, accumulator (and dsp accumulator)
registers, floating point registers, floating point control registers and
coprocessor 2 data and control operands.

For the moment, these aliases are treated as pseudo instructions which are
expanded into the underlying instruction. As a result, disassembling these
instructions shows the underlying instruction and not the alias.

Reviewers: slthakur, atanasyan

Differential Revision: https://reviews.llvm.org/D35253

llvm-svn: 318207
2017-11-14 22:26:42 +00:00
Rong Xu
3000478a98 [CodeGen] Fix the test case added in r318202
Add the -mtriple option to filter some platforms.

llvm-svn: 318206
2017-11-14 22:08:37 +00:00
Reid Kleckner
2e4f96b6e5 Make salvageDebugInfo of casts work for dbg.declare and dbg.addr
Summary:
Instcombine (and probably other passes) sometimes want to change the
type of an alloca. To do this, they generally create a new alloca with
the desired type, create a bitcast to make the new pointer type match
the old pointer type, replace all uses with the cast, and then simplify
the casts. We already knew how to salvage dbg.value instructions when
removing casts, but we can extend it to cover dbg.addr and dbg.declare.

Fixes a debug info quality issue uncovered in Chromium in
http://crbug.com/784609

Reviewers: aprantl, vsk

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D40042

llvm-svn: 318203
2017-11-14 21:49:06 +00:00
Rong Xu
fb310035c3 [CodeGen] Peel off the dominant case in switch statement in lowering
This patch peels off the top case in switch statement into a branch if the
probability exceeds a threshold. This will help the branch prediction and
avoids the extra compares when lowering into chain of branches.

Differential Revision: http://reviews.llvm.org/D39262

llvm-svn: 318202
2017-11-14 21:44:09 +00:00
Richard Smith
1d9eb5ed2b Fix unused variable warning.
llvm-svn: 318201
2017-11-14 21:26:46 +00:00
Hans Wennborg
7bd42058c3 Rename CountingFunctionInserter and use for both mcount and cygprofile calls, before and after inlining
Clang implements the -finstrument-functions flag inherited from GCC, which
inserts calls to __cyg_profile_func_{enter,exit} on function entry and exit.

This is useful for getting a trace of how the functions in a program are
executed. Normally, the calls remain even if a function is inlined into another
function, but it is useful to be able to turn this off for users who are
interested in a lower-level trace, i.e. one that reflects what functions are
called post-inlining. (We use this to generate link order files for Chromium.)

LLVM already has a pass for inserting similar instrumentation calls to
mcount(), which it does after inlining. This patch renames and extends that
pass to handle calls both to mcount and the cygprofile functions, before and/or
after inlining as controlled by function attributes.

Differential Revision: https://reviews.llvm.org/D39287

llvm-svn: 318195
2017-11-14 21:09:45 +00:00
Dinar Temirbulatov
7ad2acdfcd [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:
    
        void add1(int * __restrict dst, const int * __restrict src) {
          *dst++ = *src++;
          *dst++ = *src++ + 1;
          *dst++ = *src++ + 2;
          *dst++ = *src++ + 3;
        }
        Allows to vectorize even if the very first operation is not a binary add, but just a load.
    
        Fixed issues related to previous commit.
    
        Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
    
        Reviewed By: ABataev, RKSimon
    
        Subscribers: llvm-commits, RKSimon
    
        Differential Revision: https://reviews.llvm.org/D28907

llvm-svn: 318193
2017-11-14 20:55:08 +00:00
Jake Ehrlich
c6f06ce836 [llvm-objcopy] Improve command line option help messages
I was being inconsistent with the way I was capitalizing help messages
for command line options. Additionally --remove-section wasn't using
value_desc even though it benefited from it.

Differential Revision: https://reviews.llvm.org/D39978

llvm-svn: 318190
2017-11-14 20:36:04 +00:00
Matt Arsenault
d3b08960c4 AMDGPU: Error on stack size overflow
llvm-svn: 318189
2017-11-14 20:33:14 +00:00
Ulrich Weigand
91c3170ffd [SystemZ] Do not crash when selecting an OR of two constants
In rare cases, common code will attempt to select an OR of two
constants.  This confuses the logic in splitLargeImmediate,
causing an internal error during isel.  Fixed by simply leaving
this case to common code to handle.

This fixes PR34859.

llvm-svn: 318187
2017-11-14 20:00:34 +00:00
Evandro Menezes
b38c312ed6 [AArch64] Adjust the cost model for Exynos M1 and M2
Fix the modeling of loads and stores of registers pairs.

llvm-svn: 318186
2017-11-14 19:59:43 +00:00
Martin Storsjo
43eba3b5b5 [llvm-strings] Add support for the -a/--all options
They don't actually change nay behaviour, as llvm-strings currently
checks the whole object without looking at individual sections anyway.

This allows using llvm-strings in a context that explicitly passes
the -a option.

Differential Revision: https://reviews.llvm.org/D40020

llvm-svn: 318185
2017-11-14 19:58:36 +00:00
Martin Storsjo
8a04e59b24 [ARM, AArch64] Fix an assert message, Darwin isn't the only target supporting TLS. NFC.
llvm-svn: 318184
2017-11-14 19:57:59 +00:00
Hiroshi Yamauchi
cd071125c2 Simplify irreducible loop metadata test code.
Summary:
Shorten the irreducible loop metadata test code by removing insignificant
instructions.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40043

llvm-svn: 318182
2017-11-14 19:48:59 +00:00
Easwaran Raman
c00abf7e8d [CodeGenPrepare] Disable div bypass when working set size is huge.
Summary:
Bypass of slow divs based on operand values is currently disabled for
-Os. Do the same when profile summary is available and the working set
size of the application is huge. This is similar to how loop peeling is
guarded by hasHugeWorkingSetSize. In the div bypass case, the generated
extra code (and the extra branch) tendss to outweigh the benefits of the
bypass. This results in noticeable performance improvement on an
internal application.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39992

llvm-svn: 318179
2017-11-14 19:31:51 +00:00
Ulrich Weigand
d7f751f27a [SystemZ] Fix invalid codegen using RISBMux on out-of-range bits
Before using the 32-bit RISBMux set of instructions we need to
verify that the input bits are actually within range of the 32-bit
instruction.  This fixer PR35289.

llvm-svn: 318177
2017-11-14 19:20:46 +00:00
Alex Bradbury
8f39d57648 Set hasSideEffects=0 for TargetOpcode::{CFI_INSTRUCTION,EH_LABEL,GC_LABEL,ANNOTATION_LABEL}
D37065 (committed as rL317674) explicitly set hasSideEffects for all 
TargetOpcode::* instructions where it was inferred previously. This is a 
follow-up to that patch, setting hasSideEffects=0 for CFI_INSTRUCTION, 
EH_LABEL, GC_LABEL and ANNOTATION_LABEL. All LLVM tests pass after this 
change.

This patch also modifies MachineInstr::isLabel returns true for a 
TargetOpcode::ANNOTATION_LABEL, which ensures that an annotation label won't 
be incorrectly considered safe to move.

Differential Revision: https://reviews.llvm.org/D39941

llvm-svn: 318174
2017-11-14 19:16:08 +00:00
Artem Belevich
4c09ecb079 Mark intrinsics operating on the whole warp as IntrInaccessibleMemOnly
It's needed to model the fact that they do access data from other threads in a
warp and thus can't be CSE'd.

llvm-svn: 318173
2017-11-14 19:14:00 +00:00
Simon Dardis
ce629c5471 [mips] Simplify test for 5.0.1 (NFC)
Simplify testing that an emergency spill slot is used when MSA
is used so that it can be included in the 5.0.1 release.

llvm-svn: 318172
2017-11-14 19:11:45 +00:00
Jake Ehrlich
3273135094 [llvm-objcopy] Add -strip-non-alloc option to remove all non-allocated sections
This change adds a new flag not present in GNU objcopy that we call
--strip-non-alloc.

Differential Revision: https://reviews.llvm.org/D39926

llvm-svn: 318168
2017-11-14 18:50:24 +00:00
Yaxun Liu
ca6f013ec5 CodeGen: Fix TargetLowering::LowerCallTo for sret value type
TargetLowering::LowerCallTo assumes that sret value type corresponds to a
pointer in default address space, which is incorrect, since sret value type
should correspond to a pointer in alloca address space, which may not
be the default address space. This causes assertion for amdgcn target
in amdgiz environment.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D39996

llvm-svn: 318167
2017-11-14 18:46:52 +00:00
Jake Ehrlich
0c7906a7ed [llvm-objcopy] Support the rest of the ELF formats
We haven't been supporting anything but ELF64LE since the start. Luckily
this was always accounted for and the change is pretty trivial. B35281
requests this change for ELF32LE. This change adds support for ELF32LE,
ELF64BE, and ELF32BE with all supported features that already existed
for ELF64LE.

Differential Revision: https://reviews.llvm.org/D39977

llvm-svn: 318166
2017-11-14 18:41:47 +00:00