1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 12:43:36 +01:00
Commit Graph

207594 Commits

Author SHA1 Message Date
Tony
545440c6ae [NFC][AMDGPU] Fix broken link to ClangOffloadBundler in AMDGPUUsage 2020-12-02 03:04:28 +00:00
Chen Zheng
c19797f647 [NFC][PowerPC] code refactor: split IsReassociable to fma and add.
Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D92070
2020-12-01 21:18:57 -05:00
Kazushi (Jam) Marukawa
7a3eb287f6 [VE] Add vcmp, vmax, and vmin intrinsic instructions
Add vcmp, vmax, and vmin intrinsic instructions and regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92387
2020-12-02 11:16:52 +09:00
Jianzhou Zhao
063bca0cca [msan] Replace 8 by kShadowTLSAlignment
Reviewed-by: eugenis

Differential Revision: https://reviews.llvm.org/D92275
2020-12-02 01:09:49 +00:00
Jessica Paquette
246910a595 Fix typo in testcase runline that got there because I have very bad hands
llvm/test/CodeGen/AArch64/GlobalISel/speculative-hardening-brcond.mir had a
slash in its runline.
2020-12-01 16:57:46 -08:00
Jessica Paquette
ab6716fff2 [AArch64][GlobalISel] Don't write to WZR in non-flag-setting G_BRCOND case
We are avoiding writing to WZR just about everywhere else.

Also update the code to use MachineIRBuilder for the sake of consistency.

We also didn't have a GlobalISel testcase for this path, so add a simple one
now.

Differential Revision: https://reviews.llvm.org/D90626
2020-12-01 16:45:37 -08:00
Fangrui Song
267a40fe17 [RISCVAsmParser] Allow a SymbolRef operand to be a complex expression
So that instructions like `lla a5, (0xFF + end) - 4` (supported by GNU as) can
be parsed.

Add a missing test that an operand like `foo + foo` is not allowed.

Reviewed By: jrtc27

Differential Revision: https://reviews.llvm.org/D92293
2020-12-01 16:08:09 -08:00
Leonard Chan
6d2c502fb1 [llvm] Fix for failing test from fdbd84c6c819d4462546961f6086c1524d5d5ae8
When handling a DSOLocalEquivalent operand change:

- Remove assertion checking that the `To` type and current type are the
  same type. This is not always a requirement.
- Add a missing bitcast from an old DSOLocalEquivalent to the type of
  the new one.
2020-12-01 15:47:55 -08:00
Jessica Paquette
26f7314cfa [AArch64][GlobalISel] Select Bcc when it's better than TB(N)Z
Instead of falling back to selecting TB(N)Z when we fail to select an
optimized compare against 0, select Bcc instead.

Also simplify selectCompareBranch a little while we're here, because the logic
was kind of hard to follow.

At -O0, this is a 0.1% geomean code size improvement for CTMark.

A simple example of where this can kick in is here:
https://godbolt.org/z/4rra6P

In the example above, GlobalISel currently produces a subs, cset, and tbnz.
SelectionDAG, on the other hand, just emits a compare and b.le.

Differential Revision: https://reviews.llvm.org/D92358
2020-12-01 15:45:14 -08:00
Tony
ebb5d91fda [NFC][AMDGPU] AMDGPU code object V4 ABI documentation
- Documantation for AMDGPU code object V4.
- Documentation clarification for code object V2 and V3.
- Documentation for the clang-offload-bundler.
- Numerous other documentation clarifications.

Change-Id: I338b327cc9e75da6c987b7e081b496402a5a020e

Differential Revision: https://reviews.llvm.org/D92434
2020-12-01 23:31:04 +00:00
LLVM GN Syncbot
f9e0510e96 [gn build] Port 3fcb0eeb152 2020-12-01 23:11:06 +00:00
Arthur Eubanks
7d8ec791df [gn build] Format all gn files
$ git ls-files '*.gn' '*.gni' | xargs llvm/utils/gn/gn.py format
2020-12-01 15:07:16 -08:00
Arthur Eubanks
611e4ee2e4 [gn build] Manually port 8fee2ee9 2020-12-01 15:06:49 -08:00
Eric Astor
42660dccde [ms] [llvm-ml] Support command-line defines
Enable command-line defines as textmacros

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D90059
2020-12-01 18:06:05 -05:00
Nico Weber
1ffcb150e6 [gn build] (manually) port 8fee2ee9a68 2020-12-01 18:02:27 -05:00
Eric Astor
4082cce59d [ms] [llvm-ml] Introduce command-line compatibility for ml.exe and ml64.exe
Switch to OptParser for command-line handling

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D90058
2020-12-01 17:43:44 -05:00
James Park
ddcdac7e76 Avoid redundant inline with LLVM_ATTRIBUTE_ALWAYS_INLINE
Fix MSVC warning when __forceinline is paired with inline.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D85264
2020-12-01 14:43:16 -08:00
David Blaikie
268ad7e16e Revert "[FastISel] Flush local value map on ever instruction" and dependent patches
This reverts commit cf1c774d6ace59c5adc9ab71b31e762c1be695b1.

This change caused several regressions in the gdb test suite - at least
a sample of which was due to line zero instructions making breakpoints
un-lined. I think they're worth investigating/understanding more (&
possibly addressing) before moving forward with this change.

Revert "[FastISel] NFC: Clean up unnecessary bookkeeping"
This reverts commit 3fd39d3694d32efa44242c099e923a7f4d982095.

Revert "[FastISel] NFC: Remove obsolete -fast-isel-sink-local-values option"
This reverts commit a474657e30edccd9e175d92bddeefcfa544751b2.

Revert "Remove static function unused after cf1c774."
This reverts commit dc35368ccf17a7dca0874ace7490cc3836fb063f.

Revert "[lldb] Fix TestThreadStepOut.py after "Flush local value map on every instruction""
This reverts commit 53a14a47ee89dadb8798ca8ed19848f33f4551d5.
2020-12-01 14:26:23 -08:00
Arthur Eubanks
655a695bbb Reland [CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/
This allows us to use its value everywhere, rather than just clang. Some
other places, like opt and lld, will use its value soon.

Rename it internally to LLVM_ENABLE_NEW_PASS_MANAGER.

The #define for it is now in llvm-config.h.

The initial land accidentally set the value of
LLVM_ENABLE_NEW_PASS_MANAGER to the string
ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER instead of its value.

Reviewed By: rnk, hans

Differential Revision: https://reviews.llvm.org/D92072
2020-12-01 14:00:32 -08:00
Arthur Eubanks
3b1abe8f29 Revert "[CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/"
The new pass manager was accidentally enabled by default with this change.

This reverts commit a36bd4c90dcca82be9b64f65dbd22e921b6485ef.
2020-12-01 13:12:12 -08:00
Arthur Eubanks
3592bb94b5 [CMake][NewPM] Move ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER into llvm/
This allows us to use its value everywhere, rather than just clang. Some
other places, like opt and lld, will use its value soon.

The #define for it is now in llvm-config.h.

Reviewed By: rnk, hans

Differential Revision: https://reviews.llvm.org/D92072
2020-12-01 11:42:17 -08:00
Nico Weber
749cad3974 [gn build] sync script: try to make sync script even clearer
Turns out startswith() takes an optional start parameter :)

No behavior change.
2020-12-01 14:35:27 -05:00
Layton Kifer
fb6a609027 [DAGCombiner][NFC] Replace duplicate implementation flipBoolean with DAG.getLogicalNOT
Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D92246
2020-12-01 22:23:04 +03:00
Fangrui Song
3b69235500 static const char *const foo => const char foo[]
By default, a non-template variable of non-volatile const-qualified type
having namespace-scope has internal linkage, so no need for `static`.
2020-12-01 10:33:18 -08:00
Arthur Eubanks
3801be51e1 [LTO][NewPM] Run verifier when doing LTO
This matches the legacy PM.

Reviewed By: ychen

Differential Revision: https://reviews.llvm.org/D92138
2020-12-01 10:14:53 -08:00
Bardia Mahjour
b7eee47753 Revert "[LV] Epilogue Vectorization with Optimal Control Flow"
This reverts commit 9c5504adceb544d9954ddb8ff3035a414f4b1423.
Reverting to investigate build failure in http://lab.llvm.org:8011/#/builders/98/builds/1461/steps/9
2020-12-01 12:50:36 -05:00
Rahman Lavaee
1916d7b3c4 Let .llvm_bb_addr_map section use the same unique id as its associated .text section.
Currently, `llvm_bb_addr_map` sections are generated per section names because we use
the `LinkedToSymbol` argument of getELFSection. This will cause the address map tables of functions
grouped into the same section when `-function-sections=true -unique-section-names=false` which is not
the intended behaviour. This patch lets the unique id of every `.text` section propagate to the associated
`.llvm_bb_addr_map` section.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D92113
2020-12-01 09:21:00 -08:00
Nikita Popov
f105ee91d5 [BasicAA] Add test for suboptimal result with unknown sizes (NFC) 2020-12-01 18:20:34 +01:00
Bardia Mahjour
63b138b338 [LV] Epilogue Vectorization with Optimal Control Flow
This is yet another attempt at providing support for epilogue
vectorization following discussions raised in RFC http://llvm.1065342.n5.nabble.com/llvm-dev-Proposal-RFC-Epilog-loop-vectorization-tt106322.html#none
and reviews D30247 and D88819.

Similar to D88819, this patch achieve epilogue vectorization by
executing a single vplan twice: once on the main loop and a second
time on the epilogue loop (using a different VF). However it's able
to handle more loops, and generates more optimal control flow for
cases where the trip count is too small to execute any code in vector
form.

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D89566
2020-12-01 12:04:29 -05:00
Nikita Popov
4955e186d9 [MemCpyOpt] Port to MemorySSA
This is a straightforward port of MemCpyOpt to MemorySSA following
the approach of D26739. MemDep queries are replaced with MSSA queries
without changing the overall structure of the pass. Some care has
to be taken to account for differences between these APIs
(MemDep also returns reads, MSSA doesn't).

Differential Revision: https://reviews.llvm.org/D89207
2020-12-01 17:57:41 +01:00
Fangrui Song
214c071c8d [X86] Support modifier @PLTOFF for R_X86_64_PLTOFF64
`gcc -mcmodel=large` can emit @PLTOFF.

Reviewed By: grimar

Differential Revision: https://reviews.llvm.org/D92294
2020-12-01 08:39:01 -08:00
Juneyoung Lee
37d6671aa8 [InstSimplify] Add tests that fold instructions with poison operands (NFC) 2020-12-02 01:01:59 +09:00
Clement Courbet
c96423a30c [MergeICmps] Fix missing split.
We were not correctly splitting a blocks for chains of length 1.

Before that change, additional instructions for blocks in chains of
length 1 were not split off from the block before removing (this was
done correctly for chains of longer size).
If this first block contained an instruction referenced elsewhere,
deleting the block, would result in invalidation of the produced value.

This caused a miscompile which motivated D92297 (before D17993,
nonnull and dereferenceable attributed were not added so MergeICmps were
not triggered.) The new test gep-references-bb.ll demonstrate the issue.

The regression was introduced in
rG0efadbbcdeb82f5c14f38fbc2826107063ca48b2.

This supersedes D92364.

Test case by MaskRay (Fangrui Song).

Differential Revision: https://reviews.llvm.org/D92375
2020-12-01 16:50:55 +01:00
Sanjay Patel
47c90fc42e [x86] adjust cost model values for minnum/maxnum with fast-math-flags
Without FMF, we lower these intrinsics into something like this:

vmaxsd	%xmm0, %xmm1, %xmm2
vcmpunordsd	%xmm0, %xmm0, %xmm0
vblendvpd	%xmm0, %xmm1, %xmm2, %xmm0

But if we can ignore NANs, the single min/max instruction is enough
because there is no need to fix up the x86 logic that corresponds to
X > Y ? X : Y.

We probably want to make other adjustments for FP intrinsics with FMF
to account for specialized codegen (for example, FSQRT).

Differential Revision: https://reviews.llvm.org/D92337
2020-12-01 10:45:53 -05:00
Benjamin Kramer
465a434844 [DAG] Remove unused variable. NFC. 2020-12-01 16:29:02 +01:00
David Green
072f29e65f [ARM] Mark select and selectcc of MVE vector operations as expand.
We already expand select and select_cc in codegenprepare, but they can
still be generated under some situations. Explicitly mark them as expand
to ensure they are not produced, leading to a failure to select the
nodes.

Differential Revision: https://reviews.llvm.org/D92373
2020-12-01 15:05:55 +00:00
Sanjay Patel
b8c83b9595 [InstCombine] canonicalize sign-bit-shift of difference to ext(icmp)
icmp is the preferred spelling in IR because icmp analysis is
expected to be better than any other analysis. This should
lead to more follow-on folding potential.

It's difficult to say exactly what we should do in codegen to
compensate. For example on AArch64, which of these is preferred:
	sub	w8, w0, w1
	lsr	w0, w8, #31

vs:
	cmp	w0, w1
	cset	w0, lt

If there are perf regressions, then we should deal with those in
codegen on a case-by-case basis.

A possible motivating example for better optimization is shown in:
https://llvm.org/PR43198 but that will require other transforms
before anything changes there.

Alive proof:
https://rise4fun.com/Alive/o4E

  Name: sign-bit splat
  Pre: C1 == (width(%x) - 1)
  %s = sub nsw %x, %y
  %r = ashr %s, C1
  =>
  %c = icmp slt %x, %y
  %r = sext %c

  Name: sign-bit LSB
  Pre: C1 == (width(%x) - 1)
  %s = sub nsw %x, %y
  %r = lshr %s, C1
  =>
  %c = icmp slt %x, %y
  %r = zext %c
2020-12-01 09:58:11 -05:00
Simon Pilgrim
285c17bd3f [DAG] Move vselect(icmp_ult, 0, sub(x,y)) -> usubsat(x,y) to DAGCombine (PR40111)
Move the X86 VSELECT->USUBSAT fold to DAGCombiner - there's nothing target specific about these folds.
2020-12-01 14:25:29 +00:00
Florian Hahn
758cff690d [ConstraintElimination] Decompose GEP %ptr, ZEXT(SHL()).
Add support to decompose a GEP with a ZEXT(SHL()) operand.
2020-12-01 14:23:21 +00:00
Kazushi (Jam) Marukawa
bfd3c64d85 [VE] Add vmul and vdiv intrinsic instructions
Add vmul and vdiv intrinsic instructions and regression tests.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D92377
2020-12-01 23:03:49 +09:00
Simon Pilgrim
307dd79679 [X86] Add PR48223 usubsat test case 2020-12-01 13:57:08 +00:00
Bhramar Vatsa
2a0264f0ea [InstCombine] Optimize away the unnecessary multi-use sign-extend
C.f. https://bugs.llvm.org/show_bug.cgi?id=47765

Added a case for handling the sign-extend (Shl+AShr) for multiple uses,
to optimize it away for an individual use,
when the demanded bits aren't affected by sign-extend.

https://rise4fun.com/Alive/lgf

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D91343
2020-12-01 16:54:00 +03:00
Roman Lebedev
b2bda36752 [InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold, 2
If the shift amount was undef for some lane, the shift amount in opposite
shift is irrelevant for that lane, and the new shift amount for that lane
can be undef.
2020-12-01 16:54:00 +03:00
Sanjay Patel
e356a6cdf3 [InstCombine] add tests for sign-bit-shift-of-sub; NFC 2020-12-01 08:01:00 -05:00
Hans Wennborg
678f3b2a4a Remove rm -f cortex-a57-misched-mla.s; hopefully the bots have all cycled past it now 2020-12-01 13:50:49 +01:00
Roman Lebedev
bd4ad50d2c Revert "[InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold"
It seems i have missed checklines, temporairly reverting,
will reland momentairly..

This reverts commit aa1aa135097ecfab6d9917a435142030eff0a226.
2020-12-01 15:47:04 +03:00
Roman Lebedev
11a8c6ce51 [NFC][InstCombine] sext.ll: @test9: avoid only differently-cased names for values and block names 2020-12-01 15:33:12 +03:00
Roman Lebedev
0e49bd7b55 [InstCombine] Improve vector undef handling for sext(ashr(shl(trunc()))) fold
If the shift amount was undef for some lane, the shift amount in opposite
shift is irrelevant for that lane, and the new shift amount for that lane
can be undef.
2020-12-01 15:13:08 +03:00
Roman Lebedev
c8f6a30ae8 [NFC][InstCombine] Improve vector undef test coverage for sext(ashr(shl(trunc()))) fold 2020-12-01 15:13:07 +03:00
Roman Lebedev
28c20628bd [InstCombine] Evaluate new shift amount for sext(ashr(shl(trunc()))) fold in wide type (PR48343)
It is not correct to compute that new shift amount in it's narrow type
and only then extend it into the wide type:

----------------------------------------
Optimization: PR48343 good
Precondition: (width(%X) == width(%r))
  %o0 = trunc %X
  %o1 = shl %o0, %Y
  %o2 = ashr %o1, %Y
  %r = sext %o2
=>
  %n0 = sext %Y
  %n1 = sub width(%o0), %n0
  %n2 = sub width(%X), %n1
  %n3 = shl %X, %n2
  %r = ashr %n3, %n2

Done: 2016
Optimization is correct!

----------------------------------------
Optimization: PR48343 bad
Precondition: (width(%X) == width(%r))
  %o0 = trunc %X
  %o1 = shl %o0, %Y
  %o2 = ashr %o1, %Y
  %r = sext %o2
=>
  %n0 = sub width(%o0), %Y
  %n1 = sub width(%X), %n0
  %n2 = sext %n1
  %n3 = shl %X, %n2
  %r = ashr %n3, %n2

Done: 1
ERROR: Domain of definedness of Target is smaller than Source's for i9 %r

Example:
%X i9 = 0x000 (0)
%Y i4 = 0x3 (3)
%o0 i4 = 0x0 (0)
%o1 i4 = 0x0 (0)
%o2 i4 = 0x0 (0)
%n0 i4 = 0x1 (1)
%n1 i4 = 0x8 (8, -8)
%n2 i9 = 0x1F8 (504, -8)
%n3 i9 = 0x000 (0)
Source value: 0x000 (0)
Target value: undef


I.e. we should be computing it in the wide type from the beginning.

Fixes https://bugs.llvm.org/show_bug.cgi?id=48343
2020-12-01 15:13:07 +03:00