1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 11:42:57 +01:00
Commit Graph

182729 Commits

Author SHA1 Message Date
Matt Arsenault
90731e897e AMDGPU/GlobalISel: Handle G_ATOMICRMW_FADD
llvm-svn: 367509
2019-08-01 03:33:15 +00:00
JF Bastien
f6df754670 [NFC] Remove obsolete LLVM_GNUC_PREREQ
The current minimum GCC version is 4.8 (soon to be 5.1), we there don't need to check for older versions. While I'm around Compiler.h, also update some of the doxygen comment.

llvm-svn: 367508
2019-08-01 03:30:45 +00:00
Matt Arsenault
1a171c4ef3 AMDGPU/GlobalISel: Allow selection of DS atomicrmw
llvm-svn: 367507
2019-08-01 03:29:01 +00:00
Matt Arsenault
cfa4ad28b6 AMDGPU: Start redefining atomic PatFrags
Start migrating to a form that will be compatible with the global isel
emitter. Also should fix some overly lax checks on the memory type,
which allowed mis-selecting some illegal atomics.

llvm-svn: 367506
2019-08-01 03:25:52 +00:00
Matt Arsenault
d58b83879e AMDGPU: Correct FP atomic patterns
These need to use an fadd, not an add. Also make the noret part clear
in the name.

llvm-svn: 367505
2019-08-01 03:22:40 +00:00
Matt Arsenault
9156809a6f AMDGPU/GlobalISel: Select simple local stores
llvm-svn: 367504
2019-08-01 03:09:15 +00:00
Matt Arsenault
c6739e6709 GlobalISel: moreElementsVector for G_LOAD/G_STORE
AMDGPU change and test is a placeholder until a future patch with
complete handling.

llvm-svn: 367503
2019-08-01 01:44:22 +00:00
Peter Collingbourne
6a9e39fd00 Create unique, but identically-named ELF sections for explicitly-sectioned functions and globals when using -function-sections and -data-sections.
This allows functions and globals to to be reordered later in the linking phase
(using the -symbol-ordering-file) even though reordering will be limited to
the scope of the explicit section.

Patch by Rahman Lavaee!

Differential Revision: https://reviews.llvm.org/D65478

llvm-svn: 367501
2019-08-01 01:38:53 +00:00
Matt Arsenault
2aa9ce0a05 Reapply "AMDGPU: Split block for si_end_cf"
This reverts commit r359363, reapplying r357634

llvm-svn: 367500
2019-08-01 01:25:27 +00:00
Philip Reames
27bc6ee5d4 Fix a release-only build warning triggered by rL367485
llvm-svn: 367499
2019-08-01 01:16:08 +00:00
Matt Arsenault
4775479b39 AMDGPU/GlobalISel: Select local loads
llvm-svn: 367498
2019-08-01 00:53:38 +00:00
Amy Huang
4ab9f86e00 Revert "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG" and
and partial fix.
Causes windows buildbot errors.

This reverts commit 6e65c34523963094acd0d6c94a5f5c64b32fe6aa and
53da7ca94343166ac68aef81db0398932fc258bb.

llvm-svn: 367496
2019-07-31 23:59:31 +00:00
Richard Smith
e5249e008a Fix build when both gtest death tests and LLVM_NODISCARD are available.
llvm-svn: 367495
2019-07-31 23:37:24 +00:00
Eli Friedman
910badf6bd [ARM] Lower "(x<<c) > 0x80000000U" to "lsls" on Thumb1.
This is extremely specific, but saves three instructions when it's
legal.  I don't think the code can be usefully generalized.

Differential Revision: https://reviews.llvm.org/D65351

llvm-svn: 367492
2019-07-31 23:19:21 +00:00
Eli Friedman
c986dd14c4 [ARM] Transform compare of masked value to shift on Thumb1.
Thumb1 has very limited immediate modes, so turning an "and" into a
shift can save multiple instructions.

It's possible to simplify the generated code for test2 and test3 in
cmp-and-fold.ll a little more, but I'll implement that as a followup.

Differential Revision: https://reviews.llvm.org/D65175

llvm-svn: 367491
2019-07-31 23:17:34 +00:00
JF Bastien
a026b577c5 [ConstExprPreter] Overflow-detecting methods use GCC or clang builtins
Differential Revision: https://reviews.llvm.org/D65536

llvm-svn: 367490
2019-07-31 23:09:18 +00:00
Craig Topper
4b86c5ad65 [ScalarizeMaskedMemIntrin] Bitcast the mask to the scalar domain and use scalar bit tests for the branches.
X86 at least is able to use movmsk or kmov to move the mask to the scalar
domain. Then we can just use test instructions to test individual bits.

This is more efficient than extracting each mask element
individually.

I special cased v1i1 to use the previous behavior. This avoids
poor type legalization of bitcast of v1i1 to i1.

I've skipped expandload/compressstore as I think we need to
handle constant masks for those better first.

Many tests end up with duplicate test instructions due to tail
duplication in the branch folding pass. But the same thing
happens when constructing similar code in C. So its not unique
to the scalarization.

Not sure if this lowering code will also be good for other targets,
but we're only testing X86 today.

Differential Revision: https://reviews.llvm.org/D65319

llvm-svn: 367489
2019-07-31 22:58:15 +00:00
Craig Topper
f234df8098 [X86] Add DAG combine to fold any_extend_vector_inreg+truncstore to an extractelement+store
We have custom code that ignores the normal promoting type legalization on less than 128-bit vector types like v4i8 to emit pavgb, paddusb, psubusb since we don't have the equivalent instruction on a larger element type like v4i32. If this operation appears before a store, we can be left with an any_extend_vector_inreg followed by a truncstore after type legalization. When truncstore isn't legal, this will normally be decomposed into shuffles and a non-truncating store. This will then combine away the any_extend_vector_inreg and shuffle leaving just the store. On avx512, truncstore is legal so we don't decompose it and we had no combines to fix it.

This patch adds a new DAG combine to detect this case and emit either an extract_store for 64-bit stoers or a extractelement+store for 32 and 16 bit stores. This makes the avx512 codegen match the avx2 codegen for these situations. I'm restricting to only when -x86-experimental-vector-widening-legalization is false. When we're widening we're not likely to create this any_extend_inreg+truncstore combination. This means we should be able to remove this code when we flip the default. I would like to flip the default soon, but I need to investigate some performance regressions its causing in our branch that I wasn't seeing on trunk.

Differential Revision: https://reviews.llvm.org/D65538

llvm-svn: 367488
2019-07-31 22:43:08 +00:00
Philip Reames
a13aa37873 Attempt to unbreak sphinx build bot by inserting a link.
llvm-svn: 367487
2019-07-31 22:14:26 +00:00
Michael Berg
526ea0f419 Migrate some more fadd and fsub cases away from UnsafeFPMath control to utilize NoSignedZerosFPMath options control
Summary: Honoring no signed zeroes is also available as a user control through clang separately regardless of fastmath or UnsafeFPMath context, DAG guards should reflect this context.

Reviewers: spatel, arsenm, hfinkel, wristow, craig.topper

Reviewed By: spatel

Subscribers: rampitec, foad, nhaehnle, wuzish, nemanjai, jvesely, wdng, javed.absar, MaskRay, jsji

Differential Revision: https://reviews.llvm.org/D65170

llvm-svn: 367486
2019-07-31 21:57:28 +00:00
Philip Reames
2a66014143 [IndVars, RLEV] Support rewriting exit values in loops without known exits (prep work)
This is a prepatory patch for future work on support exit value rewriting in loops with a mixture of computable and non-computable exit counts.  The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes.

The test differences are caused by cases wherewhere getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop.

llvm-svn: 367485
2019-07-31 21:15:21 +00:00
JF Bastien
8ee6f338a5 [NFC] allow creating error strings from a Twine
It's useful when no format needs to happen, only the Twine needs to be put together.

llvm-svn: 367484
2019-07-31 21:09:53 +00:00
Amy Huang
750484ab58 Fix to r367374 "[MS] Emit S_HEAPALLOCSITE debug info in Selection DAG"
after windows buildbot failure.

Added a check that the MachineInstr exists and is a call before trying
to add symbols around it.

llvm-svn: 367483
2019-07-31 21:03:38 +00:00
Eric Christopher
e3ccb36767 Fix unused variable warning for non-assert builds.
llvm-svn: 367482
2019-07-31 21:02:03 +00:00
Mark Lacey
e174ded2a4 [GISel] Address review feedback on passing MD_callees to lowerCall.
Preserve the nullptr default for KnownCallees that appears in
the base class.

llvm-svn: 367477
2019-07-31 20:34:05 +00:00
Mark Lacey
9c08fd02aa [GISel] Pass MD_callees metadata down in call lowering.
Summary:
This will make it possible to improve IPRA by taking into account
register usage in indirect calls.

NFC yet; this is just laying the groundwork to start building
up patches to take advantage of the information for improved register
allocation.

Reviewers: aditya_nandakumar, volkan, qcolombet, arsenm, rovka, aemerson, paquette

Subscribers: sdardis, wdng, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65488

llvm-svn: 367476
2019-07-31 20:34:02 +00:00
Peter Collingbourne
4385b79373 AArch64: Add a tagged-globals backend feature.
This feature instructs the backend to allow locally defined global variable
addresses to contain a pointer tag in bits 56-63 that will be ignored by
the hardware (i.e. TBI), but may be used by an instrumentation pass such
as HWASAN. It works by adding a MOVK instruction to the regular ADRP/ADD
sequence that sets bits 48-63 to the corresponding bits of the global, with
the linker bounds check disabled on the ADRP instruction to prevent the tag
from causing a link failure.

This implementation of the feature omits the MOVK when loading from or storing
to a global, which is sufficient for TBI. If the same approach is extended
to MTE, assuming that 0 is not configured as a catch-all tag, we will most
likely also need the MOVK in this case in order to avoid a tag mismatch.

Differential Revision: https://reviews.llvm.org/D65364

llvm-svn: 367475
2019-07-31 20:14:19 +00:00
Peter Collingbourne
3e0e02e98e SelectionDAG, MI, AArch64: Widen target flags fields/arguments from unsigned char to unsigned.
This makes the field wider than MachineOperand::SubReg_TargetFlags so that
we don't end up silently truncating any higher bits. We should still catch
any bits truncated from the MachineOperand field as a consequence of the
assertion in MachineOperand::setTargetFlags().

Differential Revision: https://reviews.llvm.org/D65465

llvm-svn: 367474
2019-07-31 20:14:09 +00:00
Wei Mi
58201367d2 [DAGCombine] Limit the number of times for the same store and root nodes
to bail out in store merging dependence check.

We run into a case where dependence check in store merging bail out many times
for the same store and root nodes in a huge basicblock. That increases compile
time by almost 100x. The patch add a map to track how many times the bailing
out happen for the same store and root, and if it is over a limit, stop
considering the store with the same root as a merging candidate.

Differential Revision: https://reviews.llvm.org/D65174

llvm-svn: 367472
2019-07-31 19:59:24 +00:00
JF Bastien
5334f6b691 [Support] Added overflow checking add, sub and mul.
Added AddOverflow, SubOverflow and MulOverflow to compute truncated results and return a flag indicating whether overflow occured.

 Differential Revision: https://reviews.llvm.org/D65494

llvm-svn: 367470
2019-07-31 19:40:07 +00:00
Craig Topper
b5eb56d77e [X86] Add test cases to show premature decomposition of vector multiplies into shift+add/sub for types that aren't legal and need to be split. NFC
llvm-svn: 367466
2019-07-31 19:05:11 +00:00
Craig Topper
90aff71f34 [X86] Add AVX512DQ command lines to vector-mul.ll to show that we use vpmullq instead of shift+add/sub for some cases. NFC
llvm-svn: 367465
2019-07-31 19:05:03 +00:00
Nico Weber
984f7cb446 gn build: Merge r367463
llvm-svn: 367464
2019-07-31 18:56:49 +00:00
Alina Sbirlea
73449602de [SCCP] Update condition to avoid overflow.
Summary:
Update condition to remove addition that may cause an overflow.
Resolves PR42814.

Reviewers: sanjoy, RKSimon

Subscribers: jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65417

llvm-svn: 367461
2019-07-31 18:22:22 +00:00
Nico Weber
d2c9e74ca9 compiler-rt: Rename .cc file in lib/profile to .cpp
See https://reviews.llvm.org/D58620 for discussion.

Note how the comment in the file already said ".cpp" :)

llvm-svn: 367460
2019-07-31 18:21:08 +00:00
Lang Hames
b024a57c29 [docs] Add references to unreferenced footnotes.
Thanks to Stefan Granitz for catching the issue.

llvm-svn: 367458
2019-07-31 18:07:37 +00:00
Nico Weber
66d406df1a gn build: Merge r367456
llvm-svn: 367457
2019-07-31 18:04:03 +00:00
Nico Weber
ebd970a43f gn build: Merge r367452 and add standalone sources
llvm-svn: 367454
2019-07-31 17:56:45 +00:00
Alina Sbirlea
4d359fab94 [MemorySSA] Add additional verification for phis.
Summary:
Verify that the incoming defs into phis are the last defs from the
respective incoming blocks.
When moving blocks, insertDef must RenameUses. Adding this verification
makes GVNHoist tests fail that uncovered this issue.

Reviewers: george.burgess.iv

Subscribers: jlebar, Prazek, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63147

llvm-svn: 367451
2019-07-31 17:41:04 +00:00
Nico Weber
f7f4b8ef51 gn build: Add build files for compiler-rt/lib/profile
Differential Revision: https://reviews.llvm.org/D65518

llvm-svn: 367450
2019-07-31 17:15:32 +00:00
Nico Weber
4e2af2516a gn build: Make builtin library build on macOS
For now, it only builds the x86_64 slice.

Differential Revision: https://reviews.llvm.org/D65513

llvm-svn: 367449
2019-07-31 17:12:33 +00:00
Nico Weber
1734130bb2 gn build: Fix redundant object files in builtin lib.
compiler-rt's builtin library has generic implementations of many
functions, and then per-arch optimized implementations of some.

In the CMake build, both filter_builtin_sources() and an explicit loop
at the end of the build file (see D37166) filter out the generic
versions if a per-arch file is present.

The GN build wasn't doing this filtering. Just do the filtering manually
and explicitly, instead of being clever.

While here, also remove files from the mingw/arm build that are
redundantly listed after D39938 / r318139 (both from the CMake and the
GN build).

While here, also fix a target_os -> target_cpu typo.

Differential Revision: https://reviews.llvm.org/D65512

llvm-svn: 367448
2019-07-31 17:08:34 +00:00
Sanjay Patel
1c95190966 [InstCombine] canonicalize fneg before fmul/fdiv
Reverse the canonicalization of fneg relative to fmul/fdiv. That makes it
easier to implement the transforms (and possibly other fneg transforms) in
1 place because we can always start the pattern match from fneg (either the
legacy binop or the new unop).

There's a secondary practical benefit seen in PR21914 and PR42681:
https://bugs.llvm.org/show_bug.cgi?id=21914
https://bugs.llvm.org/show_bug.cgi?id=42681
...hoisting fneg rather than sinking seems to play nicer with LICM in IR
(although this change may expose analysis holes in the other direction).

1. The instcombine test changes show the expected neutral IR diffs from
   reversing the order.

2. The reassociation tests show that we were missing an optimization
   opportunity to fold away fneg-of-fneg. My reading of IEEE-754 says
   that all of these transforms are allowed (regardless of binop/unop
   fneg version) because:

   "For all other operations [besides copy/abs/negate/copysign], this
   standard does not specify the sign bit of a NaN result."
   In all of these transforms, we always have some other binop
   (fadd/fsub/fmul/fdiv), so we are free to flip the sign bit of a
   potential intermediate NaN operand.
   (If that interpretation is wrong, then we must already have a bug in
   the existing transforms?)

3. The clang tests shouldn't exist as-is, but that's effectively a
   revert of rL367149 (the test broke with an extension of the
   pre-existing fneg canonicalization in rL367146).

Differential Revision: https://reviews.llvm.org/D65399

llvm-svn: 367447
2019-07-31 16:53:22 +00:00
Djordje Todorovic
30604a8ec2 Reland "[DwarfDebug] Dump call site debug info"
The build failure found after the rL365467 has been
resolved.

Differential Revision: https://reviews.llvm.org/D60716

llvm-svn: 367446
2019-07-31 16:51:28 +00:00
Johannes Doerfert
552f8fc59c [docs][FIX] Add missing word to documentation in terms of SCCs
In the approval of D65299, commited as rL367440, I mentioned that my
proposed wording was lacking the word "maximal". It is added now for
correctness.

llvm-svn: 367445
2019-07-31 16:48:42 +00:00
Anusha Basana
cb472f88b8 [build] Add the ability to create a symlink for lipo
Add user enabled option to create lipo with symlink to llvm-lipo
Used rL326381 for reference.

Differential Revision: https://reviews.llvm.org/D65477

llvm-svn: 367444
2019-07-31 16:46:57 +00:00
Stanislav Mekhanoshin
1435f5cd13 [AMDGPU] Fix for vectorizer crash with pointers of different size
When vectorizer strips pointers it can eventually end up with
pointers of two different sizes, then SCEV will crash.

Differential Revision: https://reviews.llvm.org/D65480

llvm-svn: 367443
2019-07-31 16:33:11 +00:00
Philip Reames
19f3d78573 [docs] Reword documentation in terms of SCCs not cycles
Given the example:
header:
  br i1 %c, label %next, label %header
next:
  br i1 %c2, label %exit, label %header

We end up with a loop containing both header and next.  Given that, the describing the loop in terms of cycles is confusing since we have multiple distinct cycles within a single Loop.  Standardize on the SCC to clarify.

Differential Revision: https://reviews.llvm.org/D65299

llvm-svn: 367440
2019-07-31 16:24:20 +00:00
Roman Lebedev
fc4b0c74c8 [NFC][InstCombine] Add xor-or-icmp tests with icmp having extra uses
Currently InstCombiner::foldXorOfICmps() bailouts if the
ICMP it wants to invert has extra uses. As it can be seen
in the tests in previous commit, this is super unfortunate,
this is the single pattern that is left non-canonicalized.

We could analyze if we can also invert all the uses if said ICMP
at the same time, thus not bailing out there.
I'm not seeing any nicer alternative.

llvm-svn: 367439
2019-07-31 15:20:33 +00:00
Roman Lebedev
ecb5b6cd35 [NFC][InstCombine] Add baseline tests with non-canonical CLAMP pattern
As disscussed in https://reviews.llvm.org/D65148#1603922
these would all need to be canonicalized to traditional clamp pattern.

llvm-svn: 367438
2019-07-31 15:20:21 +00:00