mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

202232 Commits

Sanjay Patel
0d9f1ba952 [InstCombine] avoid 'tmp' names in tests; NFC
They may conflict with update_test_checks.py regexes.
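For example (illustrative IR, not taken from one of the updated tests):

  define i32 @example(i32 %x) {
    %tmp = add i32 %x, 1
    ret i32 %tmp
  }

update_test_checks.py captures values with regexes and generated names
like [[TMP:%.*]], so literal 'tmp' names in the input can collide with
the script's own substitutions.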
2020-08-19 12:08:31 -04:00
Sanjay Patel
3ade39cec8 [InstCombine] reduce code duplication; NFC 2020-08-19 12:05:12 -04:00
Matt Arsenault
92c99a3fbc AMDGPU/GlobalISel: Add some bitcast tests 2020-08-19 10:38:39 -04:00
madhur13490
cc99379745 [NFC] Fix typo in AMDGPU doc
Reviewed By: t-tye, arsenm

Differential Revision: https://reviews.llvm.org/D86206
2020-08-19 14:33:26 +00:00
Matt Arsenault
a85deae728 AMDGPU/GlobalISel: Add selection tests for pointer constants 2020-08-19 10:23:56 -04:00
Benjamin Kramer
5f3856fae5 Make helpers static. NFC. 2020-08-19 16:00:03 +02:00
Roman Lebedev
b58c7be2d0 Revert "[InstCombine] Lower infinite combine loop detection thresholds"
As reported by Florian Hahn, there is a hit in
MultiSource/Benchmarks/mafft from the test-suite on X86 with -O3 -flto,
so reverting until it is addressed.

This reverts commit 71e0b82c9f5039cb3987c91075e78733ef044c07.
2020-08-19 16:53:30 +03:00
Simon Pilgrim
b4fc1fb85b Fix MSVC implicit truncation narrowing conversion warning. 2020-08-19 14:41:40 +01:00
Simon Pilgrim
0776343ee5 [X86][AVX] lowerShuffleWithVPMOV - minor refactor to more closely match lowerShuffleAsVTRUNC
Replace isBuildVectorAllZeros check by using the Zeroable bitmask instead.
2020-08-19 14:34:32 +01:00
Xing GUO
6ada6c96e0 [obj2yaml] Refactor the .debug_pub* sections dumper.
It's good to reuse the DWARF parser in lib/DebugInfo so that we don't
need to maintain a separate parser on the client side (obj2yaml).
Besides, a test case is added whose length field is a huge value, which
previously made obj2yaml get stuck when parsing the section.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D86192
2020-08-19 21:13:52 +08:00
Simon Pilgrim
682842361d [X86] lowerShuffleWithVPMOV - remove unnecessary shuffle commutation. NFCI.
canonicalizeShuffleMaskWithCommute should have already ensured the lower elements are from V1; we already have test coverage for this.
2020-08-19 13:28:59 +01:00
Simon Pilgrim
55538e289f [X86][AVX] getAVX512TruncNode - don't truncate from illegal vector widths.
Thanks to @fhahn for the test case.
2020-08-19 13:00:26 +01:00
Sanjay Patel
419aac80e4 [InstCombine] update stale comments in test files; NFC
I missed updating these with:
rG23bd33c6acc4
2020-08-19 07:42:06 -04:00
Roman Lebedev
17dc9c23ed [InstCombine] Lower infinite combine loop detection thresholds
It's been a month since 2f3862eb9f21e8a0d48505637fefe6e5e295c18c,
and no new bug reports about the threshold were filed,
so let's bump it again and wait again.
2020-08-19 14:37:57 +03:00
David Green
e32403463f [ARM] Change target triple to arm-none-none-eabi. NFC 2020-08-19 11:58:50 +01:00
Simon Pilgrim
fe8e9d75c1 [X86][AVX] computeKnownBitsForTargetNode - add VTRUNC/VTRUNCS/VTRUNCUS known zero upper elements handling.
Like many of the AVX512 conversion ops, the VTRUNC ops guarantee the upper destination elements are zero.
2020-08-19 11:39:27 +01:00
Paul Walker
e3dc616e5a [SVE] Add tests for fixed length vector integer operations with immediate operands. 2020-08-19 11:12:03 +01:00
Simon Pilgrim
b05b7fd391 [X86][AVX] Fold store(extract_element(vtrunc)) to truncated store
Add handling for storing the extracted lower (truncated bits) element from a X86ISD::VTRUNC node - this can be lowered to a generic truncated store directly.
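
Roughly, in IR terms (illustrative; the actual fold operates on
X86ISD::VTRUNC DAG nodes):

  ; before: truncate the whole vector, then extract and store
  %t = trunc <16 x i16> %v to <16 x i8>
  %e = extractelement <16 x i8> %t, i32 0
  store i8 %e, i8* %p

  ; after: extract, then do a single truncated scalar store
  %s = extractelement <16 x i16> %v, i32 0
  %t2 = trunc i16 %s to i8
  store i8 %t2, i8* %p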

Differential Revision: https://reviews.llvm.org/D86158
2020-08-19 11:10:20 +01:00
Bjorn Pettersson
1813b6efab [GlobalISel] Untabify InstructionSelectorImpl.h. NFC 2020-08-19 12:00:00 +02:00
sstefan1
d4b642dcda [OpenMPOpt] ICV tracking for calls
Introduce two new AAs. AAICVTrackerFunctionReturned which checks if a
function can have a unique ICV value after it is finished, and
AAICVCallSiteReturned which checks AAICVTrackerFunctionReturned for a
call site. This enables us to check the value of a call and whether it
changes the ICV. This also changes the approach in
`getReplacementValues()` to a worklist-based approach so we can explore
all relevant BBs.
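
A rough IR sketch of what this enables (hypothetical functions;
num_threads is the ICV being tracked):

  declare void @omp_set_num_threads(i32)

  define void @set4() {
    call void @omp_set_num_threads(i32 4)
    ret void
  }

  define void @caller() {
    ; AAICVTrackerFunctionReturned concludes @set4 always returns
    ; with the ICV set to 4, so the value is known here as well.
    call void @set4()
    ret void
  }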

Differential Revision: https://reviews.llvm.org/D85544
2020-08-19 11:43:12 +02:00
sstefan1
de2379255b [IR] Intrinsics default attributes and opt-out flag
Intrinsic properties can now be set to default and applied to all
intrinsics. If the attributes are not needed, the user can opt out by
setting the DisableDefaultAttributes flag to true.

Differential Revision: https://reviews.llvm.org/D70365
2020-08-19 10:50:46 +02:00
Meera Nakrani
9a87b42f74 [ARM] Enabled VMLAV and Add instructions to use VMLAVA
Used InstCombine to enable VMLAV and Add instruction pairs to generate VMLAVA instead, with tests. For example:
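Illustrative IR (the reduction intrinsics were still in the
experimental namespace at the time):

  %m = mul <4 x i32> %x, %y
  %r = call i32 @llvm.experimental.vector.reduce.add.v4i32(<4 x i32> %m)
  %a = add i32 %r, %acc   ; reduce-then-add can select VMLAVA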
2020-08-19 08:36:49 +00:00
luxufan
6bf87e1ad1 [RISCV] Add assembly and disassembly support for Zvlsseg instructions
This implements assembly and disassembly support for the RISC-V Vector
extension Zvlsseg instructions, based on version 0.9 of the spec.

Reviewed By: HsiangKai

Differential Revision: https://reviews.llvm.org/D84416
2020-08-19 16:22:25 +08:00
Mauri Mustonen
3d63cfe214 [utils] Fix regexp in llvm/utils/extract_vplan.py to extract VPlans.
Regarding this bug in Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46451

I went ahead and fixed the regexp pattern, and now the Python script is
able to extract VPlan graphs from the log files. Additionally, some
tests for this would be nice to have, but I'm not sure whether Python
scripts are tested in LLVM and, if so, where those tests live.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D86068
2020-08-19 08:57:12 +01:00
madhur13490
2779613cc9 [GlobalISel] Don't skip adding predicate matcher
This patch fixes a bug that skipped adding the predicate matcher for a
pattern in many cases. For example, if the predicate is a Load and its
memoryVT is non-null, then the loop continues and never reaches the
end, where the predicate matcher is added. This patch moves the matcher
addition to the top of the loop so that it gets added regardless of the
contextual checks later in the loop.
Another way to fix this issue would be to remove all "continue"
statements in the checks and let the loop run to the end.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D83034
2020-08-19 07:54:14 +00:00
Florian Hahn
1da65b5ff0 [DSE,MemorySSA] Use NumRedundantStores instead of NumNoopStores.
Legacy DSE uses NumRedundantStores, while MemorySSA DSE uses
NumNoopStores. We should just use the same counter.
2020-08-19 08:50:33 +01:00
Ronak Chauhan
4697f34ed6 Revert "[AMDGPU] Support disassembly for AMDGPU kernel descriptors"
This reverts commit cacfb02d28a3cabd4e45d2535cb0686cef48a2c9.

Reverting due to buildbot failures.
2020-08-19 13:12:29 +05:30
David Sherwood
f7a1832d69 [SVE][CodeGen] Fix scalable vector issues in DAGTypeLegalizer::GenWidenVectorLoads
In DAGTypeLegalizer::GenWidenVectorLoads the algorithm assumes it only
ever deals with fixed width types, hence the offsets for each individual
store never take 'vscale' into account. I've changed the code in that
function to use TypeSize instead of unsigned for tracking the remaining
load amount. In addition, I've changed the load loop to use the new
IncrementPointer helper function for updating the addresses in each
iteration, since this handles scalable vector types.

Also, I've added report_fatal_errors in GenWidenVectorExtLoads,
TargetLowering::scalarizeVectorLoad and TargetLowering::scalarizeVectorStores,
since these functions currently use a sequence of element-by-element
scalar loads/stores. In a similar vein, I've also added a fatal error
report in FindMemType for the case when we decide to return the element
type for a scalable vector type.

I've added new tests in

  CodeGen/AArch64/sve-split-load.ll
  CodeGen/AArch64/sve-ld-addressing-mode-reg-imm.ll

for the changes in GenWidenVectorLoads.
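
For instance (illustrative, in the spirit of the new tests), widening a
load of an illegal scalable type such as:

  define <vscale x 3 x i32> @load_nxv3i32(<vscale x 3 x i32>* %p) {
    %v = load <vscale x 3 x i32>, <vscale x 3 x i32>* %p, align 4
    ret <vscale x 3 x i32> %v
  }

must treat the loaded size as a multiple of vscale rather than a fixed
byte count.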

Differential Revision: https://reviews.llvm.org/D85909
2020-08-19 07:54:32 +01:00
Shinji Okumura
540752542a [Attributor][NFC] Add tests to range.ll
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D86128
2020-08-19 15:01:14 +09:00
LLVM GN Syncbot
4e0cb4ff28 [gn build] Port 7546b29e761 2020-08-19 03:44:19 +00:00
Yaxun (Sam) Liu
6660be7005 [HIP] Support target id by --offload-arch
This patch introduces support for target IDs via --offload-arch.

Differential Revision: https://reviews.llvm.org/D60620
2020-08-18 23:43:53 -04:00
Ronak Chauhan
142f4dd209 [AMDGPU] Support disassembly for AMDGPU kernel descriptors
Decode AMDGPU Kernel descriptors as assembler directives.

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D80713
2020-08-19 08:49:07 +05:30
Changpeng Fang
c3904f6ffc AMDGPU: Implement waterfall loop for MIMG instructions with 256-bit SRsrc
Summary:
  When the resource descriptor is in VGPRs, we need a waterfall loop
to read it into an SGPR. In this patch we generalize the implementation
to work for any register class size, and extend the work to MIMG
instructions.

Fixes: SWDEV-223405

Reviewers:
  arsenm, nhaehnle

Differential Revision:
  https://reviews.llvm.org/D82603
2020-08-18 16:27:36 -07:00
Chuanqi Xu
3188b05ed0 [NFC][StackSafety] Test that StackLifetime looks through stripPointerCasts
The StackLifetime class collects the lifetime markers of an `alloca` by
collecting the users of the `BitCast` that is the user of the `alloca`.
However, either the `alloca` itself could be used directly with the
lifetime marker, or the `BitCast` of the `alloca` could be transformed
into other instructions (e.g., it may be rewritten by the `InstCombine`
pass). This patch fixes this handling in the `collectMarkers` function.
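
For example, the markers may apply to the `alloca` directly, with no
`BitCast` in between (illustrative IR, typed-pointer era):

  %buf = alloca i8, i64 32
  ; the alloca is already i8*, so scanning only BitCast users
  ; would miss this marker
  call void @llvm.lifetime.start.p0i8(i64 32, i8* %buf)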

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D85399
2020-08-18 16:21:00 -07:00
Elliott Hughes
8e3a33cacc ld128 demangle: allow space for 'L' suffix.
Summary:
Caught by HWASAN on arm64 Android (which uses ld128 for long double).
This was found by running the existing fuzzer.

The specific minimized fuzz input to reproduce this is:

  __cxa_demangle("1\006ILeeeEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE", 0, 0, 0);

Reviewers: eugenis, srhines, #libc_abi!

Subscribers: kristof.beyls, danielkiss, libcxx-commits

Tags: #libc_abi

Differential Revision: https://reviews.llvm.org/D77924
2020-08-18 16:14:05 -07:00
Roman Lebedev
3f4579ac3a [NFC][InstCombine] Aggregate reconstruction: use plain map
Now that we no longer require for this map to have stable iteration order,
we no longer need to pay for keeping the iteration order stable,
so switch from `SmallMapVector` to `SmallDenseMap`.
2020-08-19 01:09:25 +03:00
Roman Lebedev
2083389218 [InstCombine] PHI-aware aggregate reconstruction: properly handle duplicate predecessors
While it may seem like we can just "deduplicate" the case where some
basic block happens to be a predecessor more than once, which happens
e.g. for switches, that is not the correct thing to do:
we must actually add a PHI operand for each predecessor.

This was initially reported to me by David Major
as a clang crash during gecko build for android.
2020-08-19 01:00:42 +03:00
Amara Emerson
2018ade545 Use std::make_tuple instead of initializer lists to make a bot happy:
http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux
2020-08-18 14:55:52 -07:00
Craig Topper
5408cc13ce [X86] Fix the Predicates on MMX_PSHUFWri/PSHUFWmi to include SSE1 in addition to MMX.
These instructions weren't in the initial version of MMX, but
were added when SSE1 was introduced. We already have the intrinsic
named correctly to include sse and the frontened header enforces
sse. We have one place in the backend where we DAG combine to
this intrinsic, but that's also qualified. So don't know of anything
currently broken unless someone writes their own IR and doesn't
set the sse feature.
2020-08-18 14:28:26 -07:00
David Blaikie
c9c4b5ec13 Recommit "PR44685: DebugInfo: Handle address-use-invalid type units referencing non-type units"
Originally committed as be3ef93bf58aa5546c7baadfb21d43b75fbb4e24.
Reverted by b4bffdbadfcceb3959aaf231c1542301944e5812 due to bot
failures:
http://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/17380/testReport/junit/LLVM/DebugInfo_X86/addr_tu_to_non_tu_ll/
http://45.33.8.238/win/22216/step_11.txt

MacOS failure due to testing Split DWARF which isn't compatible with
MachO.
Windows failure due to testing type units which aren't enabled on
Windows.

Fix both of these by applying an explicit x86 linux triple to the test.
2020-08-18 13:43:28 -07:00
Sanjay Patel
30bd733531 [VectorCombine] add tests for vector loads; NFC 2020-08-18 16:23:33 -04:00
Eli Friedman
610e3a8c86 [AArch64][SVE] Add patterns for integer mla/mls.
We probably want to introduce pseudo-instructions at some point, like
we have for binary operations, but this seems okay for now.

One thing I'm not sure about is whether we should be doing this as a
DAGCombine instead of directly pattern-matching it. I don't see any big
downside to doing it this way, though.
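
Schematically, the kind of pattern matched (illustrative IR):

  ; add(a, mul(b, c)) can select to a single MLA
  %m = mul <vscale x 4 x i32> %b, %c
  %r = add <vscale x 4 x i32> %a, %m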

Differential Revision: https://reviews.llvm.org/D85681
2020-08-18 12:51:16 -07:00
Eli Friedman
2be753c211 [AArch64][SVE] Allow llvm.aarch64.sve.st2/3/4 with vectors of pointers.
This isn't necessary for ACLE, but could be useful in other situations.
And the change is simple.
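
Illustrative use (the intrinsic mangling for pointer elements here is
an assumption, not taken from the actual tests):

  call void @llvm.aarch64.sve.st2.nxv2p0i8(<vscale x 2 x i8*> %v0,
                                           <vscale x 2 x i8*> %v1,
                                           <vscale x 2 x i1> %pg,
                                           i8** %addr)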

Differential Revision: https://reviews.llvm.org/D85251
2020-08-18 12:51:16 -07:00
Jessica Paquette
67ae683e5b [GlobalISel][CallLowering] NFC: Unify flag-setting from CallBase + AttributeList
It's annoying to have to maintain multiple, nearly identical chains of if
statements which all set the same attributes.

Add a helper function, `addFlagsUsingAttrFn` which performs the attribute
setting.

Then, use wrappers for that function in `lowerCall` and `setArgFlags`.

(Note that the flag-setting code in `setArgFlags` was missing the returned
attribute. There's no selection for this yet, so no test. It's an example of
the kind of thing this lets us avoid, though.)

Differential Revision: https://reviews.llvm.org/D86159
2020-08-18 11:07:33 -07:00
Jessica Paquette
761fea8dc0 [GlobalISel][CallLowering] Don't tail call with non-forwarded explicit sret
Similar to this commit:

faf8065a99817bcb10e6f09b558fe3e0972c35ce

The testcase is pretty much the same as
test/CodeGen/AArch64/tailcall-explicit-sret.ll, except it uses i64
(since we don't handle the i1024 return values yet) and doesn't have
indirect tail call testcases (because we can't translate those yet).
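
A minimal sketch of the shape being rejected (illustrative IR):

  declare void @callee(i64* sret)

  define void @caller() {
    %l = alloca i64
    ; the sret pointer is a local, not the caller's own forwarded
    ; incoming sret argument, so this cannot be tail called
    tail call void @callee(i64* sret %l)
    ret void
  }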

Differential Revision: https://reviews.llvm.org/D86148
2020-08-18 11:06:57 -07:00
Matt Arsenault
418515b7d0 GlobalISel: Implement fewerElementsVector for G_INSERT_VECTOR_ELT
Add unit tests since AMDGPU will only trigger this for gigantic
vectors, and won't use the annoying odd sized breakdown case.
2020-08-18 13:51:19 -04:00
David Blaikie
01ab206194 [WIP][DebugInfo] Lazily parse debug_loclist offsets
Parsing DWARFv5 debug_loclist offsets when a CU is parsed is weighing
down memory usage of symbolizers that don't need to parse this data at
all. There's not much benefit to caching these anyway - since they are
O(1) lookup and reading once you know where the offset list starts (and
can do bounds checking with the offset list size too).

In general, I think it might be time to start paying down some of the
technical debt of loc/loclist/range/rnglist parsing to try to unify it a
bit more.

eg:

* Currently DWARFUnit has: RangeSection, RangeSectionBase, LocSection,
  LocSectionBase, LocTable, RngListTable, LoclistTableHeader (it'd be
  nice if these were all wrapped up in two variables - one for
  loclists, one for rnglists)

* rnglists and loclists are handled differently (see:
  LoclistTableHeader, but no RnglistTableHeader)

* maybe all these types could be less stateful - lazily parse what they
  need to, even reparsing rather than caching because it doesn't seem
  too expensive, for instance (though admittedly, so long as it's
  constant cost/overhead per compilation, that's probably adequate)

* Maybe implementing and using a DWARFDataExtractor that can be
  sub-ranged (so we could slice it up to just the single contribution) -
  though maybe that's not so useful because loc/ranges need to refer to
  it by absolute, not contribution-relative mechanisms

Differential Revision: https://reviews.llvm.org/D86110
2020-08-18 10:49:39 -07:00
Amara Emerson
f6bce1ffcd [GlobalISel] Add a combine for sext_inreg(load x), c --> sextload x
This is restricted to single use loads, which if we fold to sextloads we can
find more optimal addressing modes on AArch64.
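
Schematically, in generic MIR (illustrative):

  %l:_(s32) = G_LOAD %p(p0) :: (load 1)
  %v:_(s32) = G_SEXT_INREG %l, 8
  ; combines to:
  %v:_(s32) = G_SEXTLOAD %p(p0) :: (load 1)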

This also fixes an overload of the
MachineFunction::getMachineMemOperand() method, which was incorrectly
using the MF alignment instead of the MMO alignment.

Differential Revision: https://reviews.llvm.org/D85966
2020-08-18 10:42:15 -07:00
Amara Emerson
d1d273ff1c [GlobalISel] Add a combine for ashr(shl x, c), c --> sext_inreg x, c'
By detecting this sign extend pattern early, we can uncover opportunities for
more optimizations.
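
Schematically, in generic MIR (illustrative, for c = 24 on s32):

  %t:_(s32) = G_SHL %x, %c
  %r:_(s32) = G_ASHR %t, %c
  ; combines to:
  %r:_(s32) = G_SEXT_INREG %x, 8   ; 8 = 32 - 24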

Differential Revision: https://reviews.llvm.org/D85965
2020-08-18 10:42:15 -07:00
Arthur Eubanks
77e79ccb03 [gn build] Add support for expensive checks
Reviewed By: hans, MaskRay

Differential Revision: https://reviews.llvm.org/D86007
2020-08-18 09:53:39 -07:00