1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00
Commit Graph

182481 Commits

Author SHA1 Message Date
Petr Hosek
5fea52a38d [SafeStack] Insert the deref after the offset
While debugging code that uses SafeStack, we've noticed that LLVM
produces an invalid DWARF. Concretely, in the following example:

  int main(int argc, char* argv[]) {
    std::string value = "";
    printf("%s\n", value.c_str());
    return 0;
  }

DWARF would describe the value variable as being located at:

  DW_OP_breg14 R14+0, DW_OP_deref, DW_OP_constu 0x20, DW_OP_minus

The assembly to get this variable is:

  leaq    -32(%r14), %rbx

The order of operations in the DWARF symbols is incorrect in this case.
Specifically, the deref is incorrect; this appears to be incorrectly
re-inserted in repalceOneDbgValueForAlloca.

With this change which inserts the deref after the offset instead of
before it, LLVM produces correct DWARF:

  DW_OP_breg14 R14-32

Differential Revision: https://reviews.llvm.org/D64971

llvm-svn: 366726
2019-07-22 18:52:42 +00:00
Peter Collingbourne
25569f62e9 WholeProgramDevirt: Teach the pass to respect the global's alignment.
The bytes inserted before an overaligned global need to be padded according
to the alignment set on the original global in order for the initializer
to meet the global's alignment requirements. The previous implementation
that padded to the pointer width happened to be correct for vtables on most
platforms but may do the wrong thing if the vtable has a larger alignment.

This issue is visible with a prototype implementation of HWASAN for globals,
which will overalign all globals including vtables to 16 bytes.

There is also no padding requirement for the bytes inserted after the global
because they are never read from nor are they significant for alignment
purposes, so stop inserting padding there.

Differential Revision: https://reviews.llvm.org/D65031

llvm-svn: 366725
2019-07-22 18:50:45 +00:00
Sean Fertile
ef51fda361 [PowerPC] Fix comment on MO_PLT Target Operand Flag. [NFC]
Patch by Xiangling Liao.

llvm-svn: 366724
2019-07-22 18:47:59 +00:00
Sean Fertile
3853b078f5 [Object][XCOFF] Remove extra includes from XCOFF related files. [NFC]
Differential Revision: https://reviews.llvm.org/D60885

llvm-svn: 366723
2019-07-22 18:47:55 +00:00
Peter Collingbourne
8710cb27db LowerTypeTests: Teach the pass to respect global alignments.
We were previously ignoring alignment entirely when combining globals
together in this pass. There are two main things that we need to do here:
add additional padding before each global to meet the alignment requirements,
and set the combined global's alignment to the maximum of all of the original
globals' alignments.

Since we now need to calculate layout as we go anyway, use the calculated
layout to produce GlobalLayout instead of using StructLayout.

Differential Revision: https://reviews.llvm.org/D65033

llvm-svn: 366722
2019-07-22 18:47:03 +00:00
Nilanjana Basu
fd1c0c5b07 Changes to emit CodeView debug info nested type records properly using MCStreamer directives
llvm-svn: 366720
2019-07-22 18:22:55 +00:00
Stanislav Mekhanoshin
ba8945482d [AMDGPU] Test update. NFC.
llvm-svn: 366715
2019-07-22 18:08:53 +00:00
Simon Pilgrim
e080c4be5f [SLPVectorizer] Fix some MSVC/cppcheck uninitialized variable warnings. NFCI.
llvm-svn: 366712
2019-07-22 17:57:36 +00:00
Vlad Tsyrklevich
e5e6a64128 Revert "Reland [ELF] Loose a condition for relocation with a symbol"
This reverts commit r366686 as it appears to be causing buildbot
failures on sanitizer-x86_64-linux-android and sanitizer-x86_64-linux.

llvm-svn: 366708
2019-07-22 17:48:53 +00:00
Matt Arsenault
bcf27a40b4 TableGen: Support physical register inputs > 255
This was truncating register value that didn't fit in unsigned char.
Switch AMDGPU sendmsg intrinsics to using a tablegen pattern.

llvm-svn: 366695
2019-07-22 15:02:34 +00:00
Sam Parker
8aea07e50f [ARM][LowOverheadLoops] Revert remaining pseudos
ARMLowOverheadLoops would assert a failure if it did not find all the
pseudo instructions that comprise the hardware loop. Instead of doing
this, iterate through all the instructions of the function and revert
any remaining pseudo instructions that haven't been converted.

Differential Revision: https://reviews.llvm.org/D65080

llvm-svn: 366691
2019-07-22 14:16:40 +00:00
Matt Arsenault
65dfb81343 AMDGPU/GlobalISel: Fix broken tests
llvm-svn: 366688
2019-07-22 13:33:11 +00:00
Nikola Prica
c3d21d6ff5 Reland [ELF] Loose a condition for relocation with a symbol
This patch was not the reason of the buildbot failure.

Deleted code was introduced as a work around for a bug in the gold linker
(http://sourceware.org/PR16794). Test case that was given as a reason for
this part of code, the one on previous link, now works for the gold.
This condition is too strict and when a code is compiled with debug info
it forces generation of numerous relocations with symbol for architectures
that do not have relocation addend.

Reviewers: arsenm, espindola

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D64327

llvm-svn: 366686
2019-07-22 13:07:01 +00:00
Matt Arsenault
c750f1d145 AMDGPU/GlobalISel: Remove unnecessary code
The minnum/maxnum case are dead, and the cvt is handled by the
default.

llvm-svn: 366685
2019-07-22 13:05:25 +00:00
David Green
dacf89afe9 [ARM] Fix for MVE VPT block pass
We need to ensure that the number of T's is correct when adding multiple
instructions into the same VPT block.

Differential revision: https://reviews.llvm.org/D65049

llvm-svn: 366684
2019-07-22 12:51:38 +00:00
Simon Pilgrim
743b3f3dc6 [X86] EltsFromConsecutiveLoads - support common source loads (REAPPLIED)
This patch enables us to find the source loads for each element, splitting them into a Load and ByteOffset, and attempts to recognise consecutive loads that are in fact from the same source load.

A helper function, findEltLoadSrc, recurses to find a LoadSDNode and determines the element's byte offset within it. When attempting to match consecutive loads, byte offsetted loads then attempt to matched against a previous load that has already been confirmed to be a consecutive match.

Next step towards PR16739 - after this we just need to account for shuffling/repeated elements to create a vector load + shuffle.

Fixed out of bounds load assert identified in rL366501

Differential Revision: https://reviews.llvm.org/D64551

llvm-svn: 366681
2019-07-22 12:44:10 +00:00
Matt Arsenault
bf66729ef2 AMDGPU/GlobalISel: Fix tests without asserts
The legality check is only done under NDEBUG, so the failure cases are
different in a release build.

llvm-svn: 366680
2019-07-22 12:43:41 +00:00
Christudasan Devadasan
34435ad5ca Added address-space mangling for stack related intrinsics
Modified the following 3 intrinsics:
int_addressofreturnaddress,
int_frameaddress & int_sponentry.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D64561

llvm-svn: 366679
2019-07-22 12:42:48 +00:00
Simon Pilgrim
6e13723d10 [X86][SSE] Add EltsFromConsecutiveLoads test case identified in rL366501
Test case that led to rL366441 being reverted at rL366501

llvm-svn: 366678
2019-07-22 12:17:56 +00:00
George Rimar
67617455b4 [yaml2obj] - Change how we handle implicit sections.
Instead of having the special list of implicit sections,
that are mixed with the sections read from YAML on late
stages, I just create the placeholders and add them to
the main sections list early.

That allows to significantly simplify the code.

Differential revision: https://reviews.llvm.org/D64999

llvm-svn: 366677
2019-07-22 12:01:52 +00:00
Stefan Granitz
d98301a534 Add location of SVN staging dir to git-llvm error output
Summary:
In pre-monorepo times the svn staging directory was `.git/svn`. The below error message wasn't mentioning the new name yet.

Example before:
```
Can't push git rev 104cfa289d9 because svn status is not empty:
!     llvm/trunk/include/llvm
```

Example after:
```
Can't push git rev 104cfa289d9 because status in svn staging dir (.git/llvm-upstream-svn) is not empty:
!     llvm/trunk/include/llvm
```

Reviewers: mehdi_amini, jlebar, teemperor

Reviewed By: mehdi_amini

Subscribers: llvm-commits, #llvm

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65038

llvm-svn: 366671
2019-07-22 09:47:40 +00:00
Oliver Stannard
ba7d76cd82 [IPRA][ARM] Make use of the "returned" parameter attribute
ARM has code to recognise uses of the "returned" function parameter
attribute which guarantee that the value passed to the function in r0
will be returned in r0 unmodified. IPRA replaces the regmask on call
instructions, so needs to be told about this to avoid reverting the
optimisation.

Differential revision: https://reviews.llvm.org/D64986

llvm-svn: 366669
2019-07-22 08:44:36 +00:00
George Rimar
59cf4e3d17 [llvm-readobj] - Stop using precompiled objects in file-headers.test
This converts all sub-tests except one to YAML instead of precompiled inputs.

Differential revision: https://reviews.llvm.org/D64800

llvm-svn: 366668
2019-07-22 08:10:02 +00:00
Jay Foad
c68a43fab9 [AMDGPU] Save some work when an atomic op has no uses
Summary:
In the atomic optimizer, save doing a bunch of work and generating a
bunch of dead IR in the fairly common case where the result of an
atomic op (i.e. the value that was in memory before the atomic op was
performed) is not used. NFC.

Reviewers: arsenm, dstuttard, tpr

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, t-tye, hiraditya, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64981

llvm-svn: 366667
2019-07-22 07:19:44 +00:00
Kai Luo
e648d95f6e [PowerPC][NFC] Precommit a test case where ppc-mi-peepholes miscompiles extswsli
Added a test case to show codegen differences.

llvm-svn: 366666
2019-07-22 05:32:20 +00:00
Serguei Katkov
78bfbc8ebf [Loop Peeling] Fix the handling of branch weights of peeled off branches.
Current algorithm to update branch weights of latch block and its copies is
based on the assumption that number of peeling iterations is approximately equal
to trip count.

However it is not correct. According to profitability check in one case we can decide to peel
in case it helps to reduce the number of phi nodes. In this case the number of peeled iteration
can be less then estimated trip count.

This patch introduces another way to set the branch weights to peeled of branches.
Let F is a weight of the edge from latch to header.
Let E is a weight of the edge from latch to exit.
F/(F+E) is a probability to go to loop and E/(F+E) is a probability to go to exit.
Then, Estimated TripCount = F / E.
For I-th (counting from 0) peeled off iteration we set the the weights for
the peeled latch as (TC - I, 1). It gives us reasonable distribution,
The probability to go to exit 1/(TC-I) increases. At the same time
the estimated trip count of remaining loop reduces by I.

As a result after peeling off N iteration the weights will be
(F - N * E, E) and trip count of loop becomes
F / E - N or TC - N.

The idea is taken from the review of the patch D63918 proposed by Philip.

Reviewers: reames, mkuper, iajbar, fhahn
Reviewed By: reames
Subscribers: hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D64235

llvm-svn: 366665
2019-07-22 05:15:34 +00:00
Fangrui Song
e85621ae56 [utils] Clean up UpdateTestChecks/common.py
llvm-svn: 366664
2019-07-22 04:59:01 +00:00
Craig Topper
6d5f070772 [InstCombine] Add foldAndOfICmps test cases inspired by PR42691.
icmp ne %x, INT_MIN can be treated similarly to icmp sgt %x, INT_MIN.
icmp ne %x, INT_MAX can be treated similarly to icmp slt %x, INT_MAX.
icmp ne %x, UINT_MAX can be treated similarly to icmp ult %x, UINT_MAX.

We already treat icmp ne %x, 0 similarly to icmp ugt %x, 0

llvm-svn: 366662
2019-07-22 02:43:43 +00:00
Nemanja Ivanovic
8c7dfc09d5 [PowerPC][NFC] Precomit test case for upcoming patch
Just committing a test case for an upcoming patch so that the review can show
only the codegen differences.

llvm-svn: 366661
2019-07-21 21:03:45 +00:00
Simon Pilgrim
ffdb7cfe3b [X86] SimplifyDemandedVectorEltsForTargetNode - Move SUBV_BROADCAST narrowing handling. NFCI.
Move the narrowing of SUBV_BROADCAST to where we handle all the other opcodes.

llvm-svn: 366660
2019-07-21 19:04:44 +00:00
Nemanja Ivanovic
f160dc21c1 [PowerPC][NFC] Regenerate test using script
This test case ended up as a hybrid of generated checks and manually inserted
checks. Regenerate using script to make it consistent.

llvm-svn: 366659
2019-07-21 18:42:29 +00:00
Craig Topper
4e73afb68e [InstCombine] Update comment I missed in r366649. NFC
llvm-svn: 366658
2019-07-21 16:15:03 +00:00
Simon Pilgrim
d36aa2fa3c [SmallBitVector] Fix bug in find_next_unset for small types with indices >=32
We were creating a bitmask from a shift of unsigned instead of uintptr_t, meaning we couldn't create masks for indices above 31.

Noticed due to a MSVC analyzer warning.

llvm-svn: 366657
2019-07-21 16:06:26 +00:00
Aditya Nandakumar
1d6b95ca2a [GISel]: Attach missing range metadata while translating G_LOADs
https://reviews.llvm.org/D65048

Attach range information to G_LOAD when only defining one register.

reviewed by: arsenm

llvm-svn: 366656
2019-07-21 14:07:54 +00:00
David Green
59e56be139 [ARM] Move MVE VPT block tests into the Thumb2 directory. NFC
llvm-svn: 366655
2019-07-21 13:09:19 +00:00
Roman Lebedev
6a6c707ef7 [NFC][InstCombine] Add a few extra srem-by-power-of-two tests - extra uses
llvm-svn: 366652
2019-07-21 09:05:49 +00:00
Craig Topper
1985c47dd6 [InstCombine] Remove insertRangeTest code that handles the equality case.
For equality, the function called getTrue/getFalse with the VT
of the comparison input. But getTrue/getFalse need the boolean VT.
So if this code ever executed, it would assert.

I believe these cases are removed by InstSimplify so we don't get here.

So this patch just fixes up an assert to exclude the equality
possibility and removes the broken code.

llvm-svn: 366649
2019-07-21 06:43:38 +00:00
Craig Topper
9b03e30e86 [InstCombine] Don't use AddOne/SubOne to see if two APInts are 1 apart. Use APInt operations instead. NFCI
AddOne/SubOne create new Constant objects. That seems heavy for
comparing ConstantInts which wrap APInts. Just do the math on
on the APInts and compare them.

llvm-svn: 366648
2019-07-21 05:26:05 +00:00
Nico Weber
f25a994803 gn build: Merge r366622
llvm-svn: 366646
2019-07-21 00:03:55 +00:00
Roman Lebedev
f6ef7f16ab [NFC][InstCombine] Autogenerate a few tests
llvm-svn: 366643
2019-07-20 21:34:00 +00:00
Roman Lebedev
a874990efe [NFC][InstCombine] Add srem-by-signbit tests - still can fold to bittest
https://rise4fun.com/Alive/IIeS

llvm-svn: 366642
2019-07-20 21:33:50 +00:00
Roman Lebedev
ed828c1134 [NFC][Codegen][X86][AArch64] Add "(x s% C) == 0" tests
Much like with `urem`, the same optimization (albeit with slightly
different algorithm) applies for the signed case, too.

I'm simply copying the test coverage from `urem` case for now,
i believe it should be (close to?) sufficient.

llvm-svn: 366640
2019-07-20 19:25:44 +00:00
Roman Lebedev
609e3ba7ea [Codegen][SelectionDAG] X u% C == 0 fold: non-splat vector improvements
Summary:
Four things here:
1. Generalize the fold to handle non-splat divisors. Reasonably trivial.
2. Unban power-of-two divisors. I don't see any reason why they should
   be illegal.
   * There is no ban in Hacker's Delight
   * I think the ban came from the same bug that caused the miscompile
      in the base patch - in `floor((2^W - 1) / D)` we were dividing by
      `D0` instead of `D`, and we **were** ensuring that `D0` is not `1`,
      which made sense.
3. Unban `1` divisors. I no longer believe Hacker's Delight actually says
   that the fold is invalid for `D = 0`. Further considerations:
   * We know that
     * `(X u% 1) == 0`  can be constant-folded to `1`,
     * `(X u% 1) != 0`  can be constant-folded to `0`,
   *  Also, we know that
     * `X u<= -1` can be constant-folded to `1`,
     * `X u>  -1` can be constant-folded to `0`,
   * https://godbolt.org/z/7jnZJX https://rise4fun.com/Alive/oF6p
   * We know will end up with the following:
       `(setule/setugt (rotr (mul N, P), K), Q)`
   * Therefore, for given new DAG nodes and comparison predicates
     (`ule`/`ugt`), we will still produce the correct answer if:
     `Q` is a all-ones constant; and both `P` and `K` are *anything*
     other than `undef`.
   * The fold will indeed produce `Q = all-ones`.
4. Try to re-splat the `P` and `K` vectors - we don't care about
   their values for the lanes where divisor was `1`.

Reviewers: RKSimon, hermord, craig.topper, spatel, xbolva00

Reviewed By: RKSimon

Subscribers: hiraditya, javed.absar, dexonsmith, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63963

llvm-svn: 366637
2019-07-20 16:33:15 +00:00
Simon Pilgrim
5c2a6a3c00 [X86][SSE] Use PSADBW to improve vXi8 sum reduction (PR42674)
As detailed on PR42674, we can reduce a vXi8 down until we have the final <8 x i8>, and then use PSADBW with zero, to sum those values. We then extract the bottom i8, discarding any overflow from the upper bits of the i16 result.

llvm-svn: 366636
2019-07-20 15:20:11 +00:00
Florian Hahn
455b8d429f [Local] Zap blockaddress without users in ConstantFoldTerminator.
If the blockaddress is not destoryed, the destination block will still
be marked as having its address taken, limiting further transformations.

I think there are other places where the dead blockaddress constants are kept
around, I'll look into that as follow up.

Reviewers: craig.topper, brzycki, davide

Reviewed By: brzycki, davide

Differential Revision: https://reviews.llvm.org/D64936

llvm-svn: 366633
2019-07-20 12:25:47 +00:00
Jessica Paquette
a789c33e9a [GlobalISel][AArch64] Contract trivial same-size cross-bank copies into G_STOREs
Sometimes, you can end up with cross-bank copies between same-sized GPRs and
FPRs, which feed into G_STOREs. When these copies feed only into stores, they
aren't necessary; we can just store using the original register bank.

This provides some minor code size savings for some floating point SPEC
benchmarks. (Around 0.2% for 453.povray and 450.soplex)

This issue doesn't seem to show up due to regbankselect or anything similar. So,
this patch introduces an early select function, `contractCrossBankCopyIntoStore`
which performs the contraction when possible. The selector then continues
normally and selects the correct store opcode, eliminating needless copies
along the way.

Differential Revision: https://reviews.llvm.org/D65024

llvm-svn: 366625
2019-07-20 01:55:35 +00:00
Guanzhong Chen
5ecf5af2b4 [WebAssembly] Compute and export TLS block alignment
Summary:
Add immutable WASM global `__tls_align` which stores the alignment
requirements of the TLS segment.

Add `__builtin_wasm_tls_align()` intrinsic to get this alignment in Clang.

The expected usage has now changed to:

    __wasm_init_tls(memalign(__builtin_wasm_tls_align(),
                             __builtin_wasm_tls_size()));

Reviewers: tlively, aheejin, sbc100, sunfish, alexcrichton

Reviewed By: tlively

Subscribers: dschuff, jgravelle-google, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D65028

llvm-svn: 366624
2019-07-19 23:34:16 +00:00
Daniel Sanders
60c2b0f7af Re-commit: r366610 and r366612: Expand pseudo-components before embedding in llvm-config
There were two main problems:
* The 'nativecodegen' pseudo-component was unconditionally adding
  ${native_tgt}CodeGen even though it conditionally added ${native_tgt}Info and
  ${native_tgt}Desc. This has been fixed by making ${native_tgt}CodeGen
  conditional too
* The 'all' pseudo-component was causing library names like LLVMLLVMDemangle as
  the expansion was to a library name and not a component. There doesn't seem to
  be a list of available components anywhere so this has been fixed by moving the
  expansion of 'all' back where it was before. This manifested in different ways
  on different builders but it was the same root cause

llvm-svn: 366622
2019-07-19 22:46:47 +00:00
Matt Arsenault
0e9abc2fbd AMDGPU/GlobalISel: Legalize GEP for other 32-bit address spaces
llvm-svn: 366621
2019-07-19 22:28:44 +00:00
Stanislav Mekhanoshin
1ace6d5780 [AMDGPU] Autogenerate register sequences in tuples
Differential Revision: https://reviews.llvm.org/D65007

llvm-svn: 366619
2019-07-19 21:43:42 +00:00