1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

217546 Commits

Author SHA1 Message Date
Roman Lebedev
4e033a7fbd [NFC][ARM] Fix update_llc_test_checks for thumbv7-apple-ios, autogenerate switch-minsize.ll 2021-06-23 16:31:19 +03:00
Roman Lebedev
06e1305748 [NFC][ARM] Fix update_llc_test_checks for armv7-apple-ios, autogenerate ifcvt5.ll/ifcvt6.ll 2021-06-23 16:31:19 +03:00
Nikita Popov
bc995d8257 [ARMParallelDSP] Remove unnecessary wrapper function (NFC)
AreSequentialAccesses() forwards directly to isConsecutiveAccess()
and has an unnecessary template parameter to boot.
2021-06-23 15:27:54 +02:00
Rosie Sumpter
cc55f4d4ca [AArch64] Add CodeGen tests for vector reduction intrinsics. NFC
Tests are added for vector reduce OR, AND and XOR.

Differential Revision: https://reviews.llvm.org/D104771
2021-06-23 13:46:16 +01:00
Roman Lebedev
a7fa6b8e79 [NFCI-ish][SimplifyCFGPass] Rework and generalize ret block tail-merging
This changes the approach taken to tail-merge the blocks
to always create a new block instead of trying to reuse some block,
and generalizes it to support dealing not with just the `ret` in the future.

This effectively lifts the CallBr restriction, although this isn't really intentional.
That is the only non-NFC change here, i'm not sure if it's reasonable/feasible to temporarily retain it.

Other restrictions of the transform remain.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D104598
2021-06-23 14:33:18 +03:00
Juneyoung Lee
bc3eadbb4a [InstSimplify] Add more poison folding optimizations
This adds more poison folding optimizations to InstSimplify.

Since all binary operators propagate poison, these are fine.

Also, the precondition of `select cond, undef, x` -> `x` is relaxed to allow the case when `x` is undef.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D104661
2021-06-23 20:25:24 +09:00
Joe Ellis
f07815e348 [Verifier] Fail on overrunning and invalid indices for {insert,extract} vector intrinsics
With regards to overrunning, the langref (llvm/docs/LangRef.rst)
specifies:

   (llvm.experimental.vector.insert)
   Elements ``idx`` through (``idx`` + num_elements(``subvec``) - 1)
   must be valid ``vec`` indices. If this condition cannot be determined
   statically but is false at runtime, then the result vector is
   undefined.

   (llvm.experimental.vector.extract)
   Elements ``idx`` through (``idx`` + num_elements(result_type) - 1)
   must be valid vector indices. If this condition cannot be determined
   statically but is false at runtime, then the result vector is
   undefined.

For the non-mixed cases (e.g. inserting/extracting a scalable into/from
another scalable, or inserting/extracting a fixed into/from another
fixed), it is possible to statically check whether or not the above
conditions are met. This was previously missing from the verifier, and
if the conditions were found to be false, the result of the
insertion/extraction would be replaced with an undef.

With regards to invalid indices, the langref (llvm/docs/LangRef.rst)
specifies:

    (llvm.experimental.vector.insert)
    ``idx`` represents the starting element number at which ``subvec``
    will be inserted. ``idx`` must be a constant multiple of
    ``subvec``'s known minimum vector length.

    (llvm.experimental.vector.extract)
    The ``idx`` specifies the starting element number within ``vec``
    from which a subvector is extracted. ``idx`` must be a constant
    multiple of the known-minimum vector length of the result type.

Similarly, these conditions were not previously enforced in the
verifier. In some circumstances, invalid indices were permitted
silently, and in other circumstances, an undef was spawned where a
verifier error would have been preferred.

This commit adds verifier checks to enforce the constraints above.

Differential Revision: https://reviews.llvm.org/D104468
2021-06-23 10:33:22 +00:00
Nikita Popov
3d23594a81 [TTI] Make assertion compatible with opaque pointers
Dropping the TODO here because it applies to all uses of this method.
2021-06-23 12:21:54 +02:00
Nikita Popov
5c5d90fb09 [LLParser] Remove special handling for call address space
Spin-off from D104740: I don't think this special handling is needed
anymore. Calls in textual IR are annotated with addrspace(N) (which
defaults to the program address space from data layout) and specifies
the expected pointer address space of the callee. There is no need
to special-case the program address space on top of that, as it
already is the default expected address space, and we shouldn't
allow use of the program address space if the call was explicitly
annotated with some other address space.

The IsCall parameter is retained because it will be used again soon.

Differential Revision: https://reviews.llvm.org/D104752
2021-06-23 12:07:44 +02:00
Jay Foad
4509098899 [AMDGPU] Stop using LegacyLegalizerInfo. NFCI.
Differential Revision: https://reviews.llvm.org/D103684
2021-06-23 10:50:32 +01:00
Jay Foad
7fc0f10d11 [IR] Simplify createReplacementInstr
NFCI, although the test change shows that ConstantExpr::getAsInstruction
is better than the old implementation of createReplacementInstr because
it propagates things like the sdiv "exact" flag.

Differential Revision: https://reviews.llvm.org/D104124
2021-06-23 10:47:43 +01:00
Florian Hahn
5b47031c00 [llvm] Update tests that got missed in adee485adf84ae8a. 2021-06-23 10:29:58 +01:00
Florian Hahn
9acf2cef1e [SCEV] Support signed predicates in applyLoopGuards.
This adds handling for signed predicates, similar to how unsigned
predicates are already handled.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D104732
2021-06-23 10:21:05 +01:00
Florian Hahn
71133f8ae4 [SCEV] Add tests with single-cond range check generated by InstComb. 2021-06-23 10:16:57 +01:00
Jay Foad
72abd6c7b3 [AMDGPU] Simplify collectReachableCallees. NFCI.
Don't use SCC iterators when we're only interested in reachability.
Use df_begin/df_end inline to find reachable nodes.

Differential Revision: https://reviews.llvm.org/D104704
2021-06-23 09:11:29 +01:00
Stanislav Mekhanoshin
5459f63eb4 [AMDGPU] Propagate LDS align into to instructions
Differential Revision: https://reviews.llvm.org/D104316
2021-06-23 00:57:16 -07:00
Fangrui Song
66df23eddc [llvm-objcopy][MachO] Fix namespace style issues 2021-06-23 00:31:52 -07:00
Martin Storsjö
1bc7c6b964 Revert "[AArch64LoadStoreOptimizer] Recommit: Generate more STPs by renaming registers earlier"
This reverts commit ea011ec5ed53599305de62ca5fcfd31f4b3448c3.

This still causes some miscompiles, I'll follow up in the phabricator
review with a sample of that issue (which is part of the sample of
the previous issue).
2021-06-23 09:54:16 +03:00
Igor Kudrin
dc6a85256b [TableGen] Fix printing second PC-relative operand
If an instruction has several operands and a PC-relative one is not the
first of them, the generator may produce the code that does not pass the
'Address' parameter to the printout method. For example, for an Arm
instruction 'LE LR, $imm', it reuses the same code as for other
instructions where the second operand is not PC-relative:

void ARMInstPrinter::printInstruction(...) {
...
  case 11:
    // BF16VDOTI_VDOTD, BF16VDOTI_VDOTQ, BF16VDOTS_VDOTD, ...
    printOperand(MI, 1, STI, O);
    O << ", ";
    printOperand(MI, 2, STI, O);
    break;
...

The patch fixes that by considering 'PCRel' when comparing
'AsmWriterOperand' values.

Differential Revision: https://reviews.llvm.org/D104698
2021-06-23 13:27:37 +07:00
Min-Yih Hsu
47606c47b1 [M68k] Fix incorrect #include-ed file in M68kSubtarget
In https://reviews.llvm.org/rG2193347e72fa , a cpp file is accidentally
included instead of its header file counterpart. This patch fixes this
error.
2021-06-22 23:02:21 -07:00
Jim Lin
a9ee1397f3 [M68k] Add testcases for shift and rotate instructions
Add codegen testcases for lsl, lsr, asr, rol and ror instructions.

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D104685
2021-06-23 13:26:58 +08:00
Jim Lin
155f1c4b36 [M68k] Refactor codegen patterns for logic operations and add tests for it
Refactor pat for and, or and xor operation and add missing tests for it

Reviewed By: myhsu

Differential Revision: https://reviews.llvm.org/D104626
2021-06-23 13:25:24 +08:00
Max Kazantsev
1073014a44 [LoopDeletion] Exploit undef Phi inputs when symbolically executing 1st iteration
Follow-up on Roman's idea expressed in D103959.
- If a Phi has undefined inputs from live blocks:
   - and no other inputs, assume it is undef itself;
   - and exactly one non-undef input, we can assume that all undefs are equal to this input.

Differential Revision: https://reviews.llvm.org/D104618
Reviewed By: lebedev.ri, nikic
2021-06-23 11:53:48 +07:00
Max Kazantsev
c87b0786bc [Test] Clear out br i1 undef from tests to avoid UB
We don't want to test possible unexpected impact of such
branches. Replacing them with regular conditions. Idea by
Nikita Popov.
2021-06-23 11:33:57 +07:00
Max Kazantsev
567fc5f75a [LSR] Filter out zero factors. PR50765
Zero factor leads to division by zero and failure of corresponding
assert as shown in PR50765. We should filter out such factors.

Differential Revision: https://reviews.llvm.org/D104702
Reviewed By: huihuiz, reames
2021-06-23 10:43:06 +07:00
Nico Weber
d6577233fa [gn build] don't build ubsan_minimal on mac
It doesn't build there, see http://45.33.8.238/macm1/12180/step_4.txt
2021-06-22 22:21:20 -04:00
River Riddle
deae9b5f50 [mlir] Add a ThreadPool to MLIRContext and refactor MLIR threading usage
This revision refactors the usage of multithreaded utilities in MLIR to use a common
thread pool within the MLIR context, in addition to a new utility that makes writing
multi-threaded code in MLIR less error prone. Using a unified thread pool brings about
several advantages:

* Better thread usage and more control
We currently use the static llvm threading utilities, which do not allow multiple
levels of asynchronous scheduling (even if there are open threads). This is due to
how the current TaskGroup structure works, which only allows one truly multithreaded
instance at a time. By having our own ThreadPool we gain more control and flexibility
over our job/thread scheduling, and in a followup can enable threading more parts of
the compiler.

* The static nature of TaskGroup causes issues in certain configurations
Due to the static nature of TaskGroup, there have been quite a few problems related to
destruction that have caused several downstream projects to disable threading. See
D104207 for discussion on some related fallout. By having a ThreadPool scoped to
the context, we don't have to worry about destruction and can ensure that any
additional MLIR thread usage ends when the context is destroyed.

Differential Revision: https://reviews.llvm.org/D104516
2021-06-23 01:29:24 +00:00
Jon Roelofs
f2b70884ff [Remarks] Make memsize remarks report as an analysis, not a missed opportunity.
Differential revision: https://reviews.llvm.org/D104078
2021-06-22 18:22:47 -07:00
Liqiang Tao
ad4dd08538 [llvm][Inliner] Make PriorityInlineOrder lazily updated
This patch makes PriorityInlineOrder lazily updated.
The PriorityInlineOrder would lazily update the desirability of a call site if it's decreasing.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D104654
2021-06-23 08:59:53 +08:00
Peter Collingbourne
48ca0247f9 gn build: Only build the TSan runtime on 64-bit platforms.
TSan only supports 64-bit platforms.

Differential Revision: https://reviews.llvm.org/D104755
2021-06-22 17:51:00 -07:00
Peter Collingbourne
5b5b35d51b gn build: Add support for building ubsan_minimal.
Differential Revision: https://reviews.llvm.org/D104754
2021-06-22 17:51:00 -07:00
Hongtao Yu
c6ff0d26c8 [CSSPGO][llvm-profgen] Handle return to external transition.
In a callback case, a return from internal code, say A, to external runtime can happen. The external runtime can then call back to another internal routine, say B. Making an artificial branch that looks like a return from A to B can confuse the unwinder to treat the instruction before B as the call instruction.

Reviewed By: wenlei, wmi

Differential Revision: https://reviews.llvm.org/D104546
2021-06-22 16:24:59 -07:00
Philip Reames
ae4432a64d precommit test for D104665 2021-06-22 15:52:00 -07:00
Joseph Huber
1aa9483992 [Attributor] Fix AAExecutionDomain returning true on invalid states
This patch fixes a problem with the AAExecutionDomain attributor not
checking if it is in a valid state. This can cause it to incorrectly
return that a block is executed in a single threaded context after the
attributor failed for any reason.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D103186
2021-06-22 18:12:43 -04:00
Peter Collingbourne
458d10ff3a gn build: Rebase clang-tblgen include path against root_build_dir instead of root_out_dir.
Fixes clang cross-compilation.

Also remove some redundant include path arguments.
2021-06-22 14:32:24 -07:00
Joseph Huber
9e84441896 [OpenMP] Change remaining globalization from an analysis remark to missed
After landing the globalization optimizations, the precense of globalization on
the device that was not put in shared or stack memory is a failed optimization
with performance consequences so it should indicate a missed remark.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D104735
2021-06-22 16:52:06 -04:00
Nico Weber
fe6edd20fe [gn build] manually port c747b7d1d9a2 more (config.osx_sysroot) 2021-06-22 15:33:52 -04:00
Nico Weber
d986a221ea Make lit configs relocatable again after c747b7d1d9a
See https://reviews.llvm.org/D77184 for background.
2021-06-22 15:27:32 -04:00
Bill Wendling
c084e028a6 [llvm-diff] Explicitly check ConstantArrays
Global initializers may be ConstantArrays. They need to be checked
explicitly, because different-yet-still-equivalent type names may be
used for each, and/or a GEP instruction may appear in one.
2021-06-22 12:23:38 -07:00
Bill Wendling
31cd676e0f [llvm-diff] Add support for diffing the callbr instruction
The only wrinkle is that we can't process the "blockaddress" arguments
of the callbr until the blocks have been equated. So we force them to be
"unified" before checking.

This was left out when the callbr instruction was added.

Differential Revision: https://reviews.llvm.org/D104606
2021-06-22 12:23:37 -07:00
Nikita Popov
09e246902d [OpaquePtr] Support changing load type in InstCombine
When the load type is changed to ptr, we need the load pointer type
to also be ptr, because it's not allowed to create a pointer to an
opaque pointer. This is achieved by adjusting the getPointerTo() API
to return an opaque pointer for an opaque pointer base type.

Differential Revision: https://reviews.llvm.org/D104718
2021-06-22 21:16:15 +02:00
Sami Tolvanen
d90334b511 Revert "ThinLTO: Fix inline assembly references to static functions with CFI"
This reverts commit 4474958d3a97dede2caa0920f7c4a4dc7aac57d3.

Breaks check-llvm on Mac.
2021-06-22 12:10:58 -07:00
Joseph Huber
e337b160dd [OpenMP][NFC] Add new optimizations to OpenMPOpt comment header
Summary:
Adds mentions to the new globalization optimizations added to the OpenMPOpt comment header.
2021-06-22 14:40:31 -04:00
Joseph Huber
141815765c [Attributor] Add an option to increase the max number of iterations
Right now the Attributor defaults to 32 fixed point iterations unless it is set
explicitly by a command line flag. This patch allows this to be configured when
the attributor instance is created. The maximum is then increased in OpenMPOpt
if the target is a kernel. This is because the globalization analysis can result
in larger iteration counts due to many dependent instances running at once.

Depends on D102444

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D104416
2021-06-22 14:38:25 -04:00
Sanjay Patel
296b173595 [InstCombine] reduce code duplication for FP min/max with casts fold; NFC 2021-06-22 14:15:04 -04:00
Sanjay Patel
d78c67e231 [InstCombine][test] add tests for FP min/max with negated op; NFC 2021-06-22 14:15:04 -04:00
Sanjay Patel
e5a2d28d0d [InstCombine][test] add tests for FP min/max with negated op; NFC 2021-06-22 14:15:04 -04:00
Joseph Huber
0fbe411307 [Attributor] Add interface to emit remarks in Attributor
Summary:
This patch adds support for the Attributor to emit remarks on behalf of some
other pass. The attributor can now optionally take a callback function that
returns an OptimizationRemarkEmitter object when given a Function pointer. If
this is availible then a remark will be emitted for the corresponding pass
name.

Depends on D102197

Reviewed By: sstefan1 thegameg

Differential Revision: https://reviews.llvm.org/D102444
2021-06-22 14:12:46 -04:00
David Green
bb402997c3 [ARM] Change some Gather/Scatter interface types to Instructions. NFC
These returned Values are cast to an Instruction already, this just
cleans up the interface a little to match the expected types.
2021-06-22 19:11:39 +01:00
Matt Arsenault
248de533d0 AMDGPU: Try to eliminate clearing of high bits of 16-bit instructions
These used to consistently be zeroed pre-gfx9, but gfx9 made the
situation complicated since now some still do and some don't. This
also manages to pick up a few cases that the pattern fails to optimize
away.

We handle some cases with instruction patterns, but some get
through. In particular this improves the integer cases.
2021-06-22 13:42:49 -04:00