1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-26 04:32:44 +01:00
Commit Graph

8464 Commits

Author SHA1 Message Date
Tim Northover
0fd5aa6df8 UBSAN: emit distinctive traps
Sometimes people get minimal crash reports after a UBSAN incident. This change
tags each trap with an integer representing the kind of failure encountered,
which can aid in tracking down the root cause of the problem.
2020-12-08 10:28:26 +00:00
wlei
db7fa377e4 [CSSPGO][llvm-profgen] Context-sensitive profile data generation
This stack of changes introduces `llvm-profgen` utility which generates a profile data file from given perf script data files for sample-based PGO. It’s part of(not only) the CSSPGO work. Specifically to support context-sensitive with/without pseudo probe profile, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. Also high throughput is achieved by multiple levels of sample aggregation and compatible format with one stop is generated at the end. Please refer to: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC.

This change supports context-sensitive profile data generation into llvm-profgen. With simultaneous sampling for LBR and call stack, we can identify leaf of LBR sample with calling context from stack sample . During the process of deriving fall through path from LBR entries, we unwind LBR by replaying all the calls and returns (including implicit calls/returns due to inlining) backwards on top of the sampled call stack. Then the state of call stack as we unwind through LBR always represents the calling context of current fall through path.

we have two types of virtual unwinding 1) LBR unwinding and 2) linear range unwinding.
Specifically, for each LBR entry which can be classified into call, return, regular branch, LBR unwinding will replay the operation by pushing, popping or switching leaf frame towards the call stack and since the initial call stack is most recently sampled, the replay should be in anti-execution order, i.e. for the regular case, pop the call stack when LBR is call, push frame on call stack when LBR is return. After each LBR processed, it also needs to align with the next LBR by going through instructions from previous LBR's target to current LBR's source, which we named linear unwinding. As instruction from linear range can come from different function by inlining, linear unwinding will do the range splitting and record counters through the range with same inline context.

With each fall through path from LBR unwinding, we aggregate each sample into counters by the calling context and eventually generate full context sensitive profile (without relying on inlining) to driver compiler's PGO/FDO.

A breakdown of noteworthy changes:
- Added `HybridSample` class as the abstraction perf sample including LBR stack and call stack
* Extended `PerfReader` to implement auto-detect whether input perf script output contains CS profile, then do the parsing. Multiple `HybridSample` are extracted
* Speed up by aggregating  `HybridSample` into `AggregatedSamples`
* Added VirtualUnwinder that consumes aggregated  `HybridSample` and implements unwinding of calls, returns, and linear path that contains implicit call/return from inlining. Ranges and branches counters are aggregated by the calling context.
 Here calling context is string type, each context is a pair of function name and callsite location info, the whole context is like `main:1 @ foo:2 @ bar`.
* Added PorfileGenerater that accumulates counters by ranges unfolding or branch target mapping, then generates context-sensitive function profile including function body, inferring callee's head sample, callsite target samples, eventually records into ProfileMap.

* Leveraged LLVM build-in(`SampleProfWriter`) writer to support different serialization format with no stop
- `getCanonicalFnName` for callee name and name from ELF section
- Added regression test for both unwinding and profile generation

Test Plan:
ninja & ninja check-llvm

Reviewed By: hoy, wenlei, wmi

Differential Revision: https://reviews.llvm.org/D89723
2020-12-07 13:48:58 -08:00
Nico Weber
2e1a73029c docs: Add pointer to cmake caches for PGO
Also add a link to end-user PGO documentation.

Differential Revision: https://reviews.llvm.org/D92768
2020-12-07 15:55:26 -05:00
Hans Wennborg
0bc6a17a7e Test commit 2020-12-07 17:27:03 +01:00
Tony
0a2da89f49 [NFC][AMDGPU] AMDGPUUsage updates
- Document code object V2 gfx800.
- Document amdpal is supported by Linux Pro.

Differential Revision: https://reviews.llvm.org/D92708
2020-12-05 02:13:17 +00:00
Sean Silva
bc224c8fad [SmallVector] Allow SmallVector<T>
This patch adds a capability to SmallVector to decide a number of
inlined elements automatically. The policy is:

- A minimum of 1 inlined elements, with more as long as
sizeof(SmallVector<T>) <= 64.
- If sizeof(T) is "too big", then trigger a static_assert: this dodges
the more pathological cases

This is expected to systematically improve SmallVector use in the
LLVM codebase, which has historically been plagued by semi-arbitrary /
cargo culted N parameters, often leading to bad outcomes due to
excessive sizeof(SmallVector<T, N>). This default also makes
programming more convenient by avoiding edit/rebuild cycles due to
forgetting to type the N parameter.

Differential Revision: https://reviews.llvm.org/D92522
2020-12-03 17:21:44 -08:00
Paul C. Anagnostopoulos
15379f2764 [TableGen] Eliminate the 'code' type
Update the documentation.

Rework various backends that relied on the code type.

Differential Revision: https://reviews.llvm.org/D92269
2020-12-03 10:19:11 -05:00
Fangrui Song
649f05aa24 Switch from llvm::is_trivially_copyable to std::is_trivially_copyable
GCC<5 did not support std::is_trivially_copyable. Now LLVM builds require 5.1
we can migrate to std::is_trivially_copyable.

The Optional.h change made MSVC choke
(https://buildkite.com/llvm-project/premerge-checks/builds/18587#cd1bb616-ffdc-4581-9795-b42c284196de)
so I leave it out for now.

Differential Revision: https://reviews.llvm.org/D92514
2020-12-02 22:02:48 -08:00
Reid Kleckner
7c87aeebfe Revert "Use std::is_trivially_copyable", breaks MSVC build
Revert "Delete llvm::is_trivially_copyable and CMake variable HAVE_STD_IS_TRIVIALLY_COPYABLE"

This reverts commit 4d4bd40b578d77b8c5bc349ded405fb58c333c78.

This reverts commit 557b00e0afb2dc1776f50948094ca8cc62d97be4.
2020-12-02 14:30:46 -08:00
Fangrui Song
dffdc25f75 Use std::is_trivially_copyable
GCC<5 did not support std::is_trivially_copyable. Now LLVM builds require 5.1
we can migrate to std::is_trivially_copyable.
2020-12-02 09:58:07 -08:00
Bardia Mahjour
fbc2c5ae27 [LV] Epilogue Vectorization with Optimal Control Flow (Recommit)
This is yet another attempt at providing support for epilogue
vectorization following discussions raised in RFC http://llvm.1065342.n5.nabble.com/llvm-dev-Proposal-RFC-Epilog-loop-vectorization-tt106322.html#none
and reviews D30247 and D88819.

Similar to D88819, this patch achieve epilogue vectorization by
executing a single vplan twice: once on the main loop and a second
time on the epilogue loop (using a different VF). However it's able
to handle more loops, and generates more optimal control flow for
cases where the trip count is too small to execute any code in vector
form.

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D89566
2020-12-02 10:09:56 -05:00
David Sherwood
6d7c7dcc2b [SVE] Add support for scalable vectors with vectorize.scalable.enable loop attribute
In this patch I have added support for a new loop hint called
vectorize.scalable.enable that says whether we should enable scalable
vectorization or not. If a user wants to instruct the compiler to
vectorize a loop with scalable vectors they can now do this as
follows:

  br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !2
  ...
  !2 = !{!2, !3, !4}
  !3 = !{!"llvm.loop.vectorize.width", i32 8}
  !4 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}

Setting the hint to false simply reverts the behaviour back to the
default, using fixed width vectors.

Differential Revision: https://reviews.llvm.org/D88962
2020-12-02 13:23:43 +00:00
Tony
545440c6ae [NFC][AMDGPU] Fix broken link to ClangOffloadBundler in AMDGPUUsage 2020-12-02 03:04:28 +00:00
Tony
ebb5d91fda [NFC][AMDGPU] AMDGPU code object V4 ABI documentation
- Documantation for AMDGPU code object V4.
- Documentation clarification for code object V2 and V3.
- Documentation for the clang-offload-bundler.
- Numerous other documentation clarifications.

Change-Id: I338b327cc9e75da6c987b7e081b496402a5a020e

Differential Revision: https://reviews.llvm.org/D92434
2020-12-01 23:31:04 +00:00
Bardia Mahjour
b7eee47753 Revert "[LV] Epilogue Vectorization with Optimal Control Flow"
This reverts commit 9c5504adceb544d9954ddb8ff3035a414f4b1423.
Reverting to investigate build failure in http://lab.llvm.org:8011/#/builders/98/builds/1461/steps/9
2020-12-01 12:50:36 -05:00
Bardia Mahjour
63b138b338 [LV] Epilogue Vectorization with Optimal Control Flow
This is yet another attempt at providing support for epilogue
vectorization following discussions raised in RFC http://llvm.1065342.n5.nabble.com/llvm-dev-Proposal-RFC-Epilog-loop-vectorization-tt106322.html#none
and reviews D30247 and D88819.

Similar to D88819, this patch achieve epilogue vectorization by
executing a single vplan twice: once on the main loop and a second
time on the epilogue loop (using a different VF). However it's able
to handle more loops, and generates more optimal control flow for
cases where the trip count is too small to execute any code in vector
form.

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D89566
2020-12-01 12:04:29 -05:00
Amy Huang
b375f49096 Recommit "[llvm-symbolizer] Switch to using native symbolizer by default on Windows"
This reverts commit 1b63177a56e8cd6196778d2b90295f03e96b5800.
2020-11-30 17:36:12 -08:00
Juneyoung Lee
b00d8684c3 [LangRef] missing link, minor fix 2020-11-30 23:09:36 +09:00
David Spickett
5c2736b6c2 [llvm-objdump] Document --mattr=help in --help output
This does the same as `--mcpu=help` but was only
documented in the user guide.

* Added a test for both options.
* Corrected the single dash in `-mcpu=help` text.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D92305
2020-11-30 12:52:54 +00:00
Juneyoung Lee
e5f43fdeb3 [LangRef] minor fixes to poison examples and well-defined values section (NFC) 2020-11-29 20:51:25 +09:00
Juneyoung Lee
7546d005c8 [LangRef] Add poison constant
This patch adds a description about the newly added poison constant to LangRef.

Differential Revision: https://reviews.llvm.org/D92162
2020-11-27 10:29:52 +09:00
Marek Kurdej
542b1725df [llvm-profgen] [docs] Fix invalid header. Add to ToC. NFC. 2020-11-26 10:45:05 +01:00
Amy Huang
2504c5bf49 Revert "[llvm-symbolizer] Switch to using native symbolizer by default on Windows"
Breaks some asan tests on the buildbot.

This reverts commit c74b427cb2a90309ee0c29df21ad1ca26390263c.
2020-11-23 16:29:45 -08:00
Amy Huang
f6737ef448 [llvm-symbolizer] Switch to using native symbolizer by default on Windows
llvm-symbolizer used to use the DIA SDK for symbolization on
Windows; this patch switches to using native symbolization, which was
implemented recently.

Users can still make the symbolizer use DIA by adding the `-dia` flag
in the LLVM_SYMBOLIZER_OPTS environment variable.

Differential Revision: https://reviews.llvm.org/D91814
2020-11-23 15:57:08 -08:00
Paul C. Anagnostopoulos
58226c6585 [TableGen] Eliminte source location from CodeInit
Step 1 in eliminating the 'code' type.

Differential Revision: https://reviews.llvm.org/D91932
2020-11-23 11:30:13 -05:00
Tony
7cfcd72ff5 [NFC][AMDGPU] Document kernel descriptor
- Document that the kernel descriptor defined is for code object V3.
  Document that it also applies to earlier code object formats for CP.

- Document the deprecated bits in kernel descriptor.

Differential Revision: https://reviews.llvm.org/D91458
2020-11-21 04:54:17 +00:00
wlei
dd799fc98b [llvm-profgen][NFC]Fix build failure on different platform
see titile
Test Plan:
ninja & ninja check-llvm

Reviewed By: hoy

Differential Revision: https://reviews.llvm.org/D91897
2020-11-20 16:36:04 -08:00
wlei
563627eb7b [CSSPGO][llvm-profgen] Disassemble text sections
This stack of changes introduces `llvm-profgen` utility which generates a profile data file from given perf script data files for sample-based PGO. It’s part of(not only) the CSSPGO work. Specifically to support context-sensitive with/without pseudo probe profile, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. Also high throughput is achieved by multiple levels of sample aggregation and compatible format with one stop is generated at the end. Please refer to: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC.

This change enables disassembling the text sections to build various address maps that are potentially used by the virtual unwinder.  A switch `--show-disassembly` is being added to print the disassembly code.

Like the llvm-objdump tool, this change leverages existing LLVM components to parse and disassemble ELF binary files. So far X86 is supported.

Test Plan:

ninja check-llvm

Reviewed By: wmi, wenlei

Differential Revision: https://reviews.llvm.org/D89712
2020-11-20 14:26:26 -08:00
wlei
cc95d46ee1 [CSSPGO][llvm-profgen] Parse mmap events from perf script
This stack of changes introduces `llvm-profgen` utility which generates a profile data file from given perf script data files for sample-based PGO. It’s part of(not only) the CSSPGO work. Specifically to support context-sensitive with/without pseudo probe profile, it implements a series of functionalities including perf trace parsing, instruction symbolization, LBR stack/call frame stack unwinding, pseudo probe decoding, etc. Also high throughput is achieved by multiple levels of sample aggregation and compatible format with one stop is generated at the end. Please refer to: https://groups.google.com/g/llvm-dev/c/1p1rdYbL93s for the CSSPGO RFC.

As a starter, this change sets up an entry point by introducing PerfReader to load profiled binaries and perf traces(including perf events and perf samples). For the event, here it parses the mmap2 events from perf script to build the loader snaps, which is used to retrieve the image load address in the subsequent perf tracing parsing.

As described in llvm-profgen.rst, the tool being built aims to support multiple input perf data (preprocessed by perf script) as well as multiple input binary images. It should also support dynamic reload/unload shared objects by leveraging the loader snaps being built by this change

Reviewed By: wenlei, wmi

Differential Revision: https://reviews.llvm.org/D89707
2020-11-20 14:26:26 -08:00
Alex Richardson
9c96f39f77 Add a default address space for globals to DataLayout
This is similar to the existing alloca and program address spaces (D37052)
and should be used when creating/accessing global variables.
We need this in our CHERI fork of LLVM to place all globals in address space 200.
This ensures that values are accessed using CHERI load/store instructions
instead of the normal MIPS/RISC-V ones.

The problem this is trying to fix is that most of the time the type of
globals is created using a simple PointerType::getUnqual() (or ::get() with
the default address-space value of 0). This does not work for us and we get
assertion/compilation/instruction selection failures whenever a new call
is added that uses the default value of zero.

In our fork we have removed the default parameter value of zero for most
address space arguments and use DL.getProgramAddressSpace() or
DL.getGlobalsAddressSpace() whenever possible. If this change is accepted,
I will upstream follow-up patches to use DL.getGlobalsAddressSpace() instead
of relying on the default value of 0 for PointerType::get(), etc.

This patch and the follow-up changes will not have any functional changes
for existing backends with the default globals address space of zero.
A follow-up commit will change the default globals address space for
AMDGPU to 1.

Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D70947
2020-11-20 15:46:52 +00:00
Pavel Iliin
2529cb73ff [AArch64] Out-of-line atomics (-moutline-atomics) implementation.
This patch implements out of line atomics for LSE deployment
mechanism. Details how it works can be found in llvm/docs/Atomics.rst
Options -moutline-atomics and -mno-outline-atomics to enable and disable it
were added to clang driver. This is clang and llvm part of out-of-line atomics
interface, library part is already supported by libgcc. Compiler-rt
support is provided in separate patch.

Differential Revision: https://reviews.llvm.org/D91157
2020-11-20 13:30:12 +00:00
Leonard Chan
c24d9d2b01 [llvm][IR] Add dso_local_equivalent Constant
The `dso_local_equivalent` constant is a wrapper for functions that represents a
value which is functionally equivalent to the global passed to this. That is, if
this accepts a function, calling this constant should have the same effects as
calling the function directly. This could be a direct reference to the function,
the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the
resolved function as a call target.

When lowered, the returned address must have a constant offset at link time from
some other symbol defined within the same binary. The address of this value is
also insignificant. The name is leveraged from `dso_local` where use of a function
or variable is resolved to a symbol in the same linkage unit.

In this patch:
- Addition of `dso_local_equivalent` and handling it
- Update Constant::needsRelocation() to strip constant inbound GEPs and take
  advantage of `dso_local_equivalent` for relative references

This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959)
which makes vtables readonly. This works by replacing the dynamic relocations for
function pointers in them with static relocations that represent the offset between
the vtable and virtual functions. If a function is externally defined,
`dso_local_equivalent` can be used as a generic wrapper for the function to still
allow for this static offset calculation to be done.

See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details.

Differential Revision: https://reviews.llvm.org/D77248
2020-11-19 10:26:17 -08:00
Nick Desaulniers
b2b1b97849 Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch"
This reverts commit b7926ce6d7a83cdf70c68d82bc3389c04009b841.

Going with a simpler approach.
2020-11-17 17:27:14 -08:00
Florian Hahn
4864887dc5 [VPlan] Add VPDef class.
This patch introduces a new VPDef class, which can be used to
manage VPValues defined by recipes/VPInstructions.

The idea here is to mirror VPUser for values defined by a recipe. A
VPDef can produce either zero (e.g. a store recipe), one (most recipes)
or multiple (VPInterleaveRecipe) result VPValues.

To traverse the def-use chain from a VPDef to its users, one has to
traverse the users of all values defined by a VPDef.

VPValues now contain a pointer to their corresponding VPDef, if one
exists. To traverse the def-use chain upwards from a VPValue, we first
need to check if the VPValue is defined by a VPDef. If it does not have
a VPDef, this means we have a VPValue that is not directly defined
iniside the plan and we are done.

If we have a VPDef, it is defined inside the region by a recipe, which
is a VPUser, and the upwards def-use chain traversal continues by
traversing all its operands.

Note that we need to add an additional field to to VPVAlue to link them
to their defs. The space increase is going to be offset by being able to
remove the SubclassID field in future patches.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D90558
2020-11-17 16:18:11 +00:00
Michael Liao
f1ef8ad5ff [InferAddrSpace] Teach to handle assumed address space.
- In certain cases, a generic pointer could be assumed as a pointer to
  the global memory space or other spaces. With a dedicated target hook
  to query that address space from a given value, infer-address-space
  pass could infer and propagate that to all its users.

Differential Revision: https://reviews.llvm.org/D91121
2020-11-16 17:06:33 -05:00
Paul C. Anagnostopoulos
af5277fb13 [TableGen] Improve a couple of descriptions in the command guide
Differential Revision: https://reviews.llvm.org/D91484
2020-11-15 09:59:59 -05:00
Paul C. Anagnostopoulos
17cd31c30e [TableGen] Add frontend/backend phase timing capability.
Describe in the BackEnd Developer's Guide. Instrument a few backends.

Remove an old unused timing facility. Add a null backend for timing
the parser.

Differential Revision: https://reviews.llvm.org/D91388
2020-11-14 10:10:29 -05:00
Nikita Popov
effaed5675 [LangRef] Clarify GEP inbounds wrapping semantics
Clarify the semantics of GEP inbounds, in particular with regard
to what it means for wrapping. This cleans up some confusion on
when it is legal to apply nuw/nsw flags to various parts of the
GEP calculation.

Differential Revision: https://reviews.llvm.org/D90708
2020-11-13 17:49:41 +01:00
Paul C. Anagnostopoulos
82f80b36cd [TableGen] Enhance the six comparison bang operators.
Update the Programmer's Reference.

Differential Revision: https://reviews.llvm.org/D91036
2020-11-13 09:57:27 -05:00
serge-sans-paille
ae7304bdea llvmbuildectomy - compatibility with ocaml bindings
Use exact component name in add_ocaml_library.
Make expand_topologically compatible with new architecture.
Fix quoting in is_llvm_target_library.
Fix LLVMipo component name.
Write release note.
2020-11-13 14:35:52 +01:00
Florian Hahn
041da6277f Add !annotation metadata and remarks pass.
This patch adds a new !annotation metadata kind which can be used to
attach annotation strings to instructions.

It also adds a new pass that emits summary remarks per function with the
counts for each annotation kind.

The intended uses cases for this new metadata is annotating
'interesting' instructions and the remarks should provide additional
insight into transformations applied to a program.

To motivate this, consider these specific questions we would like to get answered:

* How many stores added for automatic variable initialization remain after optimizations? Where are they?
* How many runtime checks inserted by a frontend could be eliminated? Where are the ones that did not get eliminated?

Discussed on llvm-dev as part of 'RFC: Combining Annotation Metadata and Remarks'
(http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html)

Reviewed By: thegameg, jdoerfert

Differential Revision: https://reviews.llvm.org/D91188
2020-11-13 13:24:10 +00:00
Florian Hahn
744ac7e74e [docs] Fix undefined reference in ORCv2 design doc.
This fixes a typo introduced in 984e87923f1096c815cef900cda0926c68286ddf
which caused the docs build to fail.
2020-11-13 09:44:48 +00:00
serge-sans-paille
82b6e6053d llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These function store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Lang Hames
19bcc02c62 [docs] Fix formatting, clarify comment in ORCv2 doc 2020-11-12 13:11:01 +11:00
Lang Hames
56fbd9511b [docs] Fix formatting in ORCv2.rst.
Bold and fixed-width do not appear to mix well.
2020-11-12 11:08:58 +11:00
Lang Hames
f3031ea7aa [docs] Update ORCv2 design doc.
Fixes some formatting and wording, and adds a roadmap section.
2020-11-12 10:33:29 +11:00
Renato Golin
b0b9198036 [docs] link new support policy from developer policy
Adding new paragraphs under "Introducing New Components" section to
check the different levels of support we have, to help introduction of
smaller set of changes without overwhelming new collaborators and
potentially losing the contribution.

Differential Revision: D91013
2020-11-10 19:40:57 +00:00
David Green
f1fd013f54 [Sphinx] Fix langref formatting. NFC 2020-11-10 16:47:43 +00:00
David Green
0773b05cfa [ARM] Alter t2DoLoopStart to define lr
This changes the definition of t2DoLoopStart from
t2DoLoopStart rGPR
to
GPRlr = t2DoLoopStart rGPR

This will hopefully mean that low overhead loops are more tied together,
and we can more reliably generate loops without reverting or being at
the whims of the register allocator.

This is a fairly simple change in itself, but leads to a number of other
required alterations.

 - The hardware loop pass, if UsePhi is set, now generates loops of the
   form:
       %start = llvm.start.loop.iterations(%N)
     loop:
       %p = phi [%start], [%dec]
       %dec = llvm.loop.decrement.reg(%p, 1)
       %c = icmp ne %dec, 0
       br %c, loop, exit
 - For this a new llvm.start.loop.iterations intrinsic was added, identical
   to llvm.set.loop.iterations but produces a value as seen above, gluing
   the loop together more through def-use chains.
 - This new instrinsic conceptually produces the same output as input,
   which is taught to SCEV so that the checks in MVETailPredication are not
   affected.
 - Some minor changes are needed to the ARMLowOverheadLoop pass, but it has
   been left mostly as before. We should now more reliably be able to tell
   that the t2DoLoopStart is correct without having to prove it, but
   t2WhileLoopStart and tail-predicated loops will remain the same.
 - And all the tests have been updated. There are a lot of them!

This patch on it's own might cause more trouble that it helps, with more
tail-predicated loops being reverted, but some additional patches can
hopefully improve upon that to get to something that is better overall.

Differential Revision: https://reviews.llvm.org/D89881
2020-11-10 15:57:58 +00:00
Paul C. Anagnostopoulos
25ecf898fc [TableGen] Add the !filter bang operator.
Add a test. Update the Programmer's Reference.

Use it in some TableGen files.

Differential Revision: https://reviews.llvm.org/D91008
2020-11-09 10:56:55 -05:00