1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-22 10:42:39 +01:00
Commit Graph

212178 Commits

Author SHA1 Message Date
LemonBoy
f00c577e2d [LegalizeDAG] Implement promotion rules for SELECT_CC
Implement the promotion rule for SELECT_CC nodes by upcasting all the parameters and downcasting the result.
The AArch64 target makes use of this rule and, since it was not implemented, in some cases the instruction selector would hit an assertion upon encountering the illegal node.

This patch requires D97840, the included test cases hit both problems.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D97859
2021-03-05 18:22:55 +01:00
RamNalamothu
b6b6e47136 [AMDGPU] Do not attempt sgpr spills to vgpr, when it is disabled
This covers a path missed in https://reviews.llvm.org/D95768.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D98013
2021-03-05 22:47:21 +05:30
Nico Weber
cef0f85d70 [gn build] allow setting clang_base_path to a source-absolute path
With this, you can set `clang_base_path = "//out/gn1"` in `out/gn2/args.gn` and
the build in out/gn2 will use clang and lld from out/gn1.

Setting `clang_base_path` to an absolute path (with e.g.
`clang_base_path = getenv("HOME") + "/src/..."`) should behave as before.

Differential Revision: https://reviews.llvm.org/D97989
2021-03-05 12:13:51 -05:00
Jinsong Ji
0026d38644 [PowerPC] Update Copy/Paste encodings according to ISA3.1
Copy-paste P9 insns were added back in 2016,
however, looks like the opcodes has changed in ISA3.1.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D97416
2021-03-05 17:05:50 +00:00
gbtozers
c52cf11f42 [DebugInfo] Add DIArgList MD to store multple values in DbgVariableIntrinsics
This patch adds a new metadata node, DIArgList, which contains a list of SSA
values. This node is in many ways similar in function to the existing
ValueAsMetadata node, with the difference being that it tracks a list instead of
a single value. Internally, it uses ValueAsMetadata to track the individual
values, but there is also a reasonable amount of DIArgList-specific
value-tracking logic on top of that. Similar to ValueAsMetadata, it is a special
case in parsing and printing due to the fact that it requires a function state
(as it may reference function-local values).

This patch should not result in any immediate functional change; it allows for
DIArgLists to be parsed and printed, but debug variable intrinsics do not yet
recognize them as a valid argument (outside of parsing).

Differential Revision: https://reviews.llvm.org/D88175
2021-03-05 17:02:24 +00:00
Simon Pilgrim
5ac93b57f2 [X86] X86ISelDAGToDAG.cpp - include cstdint instead of stdint.h NFCI.
Fixes clang-tidy warning
2021-03-05 15:58:20 +00:00
Simon Pilgrim
f3b4ff0bf8 [X86] X86DAGToDAGISel::Select - merge X86::TEST load bitsize checks. NFCI. 2021-03-05 15:58:20 +00:00
LemonBoy
dd61803fe7 [AArch64] Legalize horizontal fmax/fmin reductions on f16 vectors
Expand the horizontal reduction during the instruction selection phase, but only if the target doesn't support the full fp16 instruction set.

Fixes https://bugs.llvm.org/show_bug.cgi?id=49401

Reviewed By: aemerson

Differential Revision: https://reviews.llvm.org/D97840
2021-03-05 16:09:37 +01:00
Ilya Leoshkevich
13dd94d4c3 [BPF] Add support for floats and doubles
Some BPF programs compiled on s390 fail to load, because s390
arch-specific linux headers contain float and double types. At the
moment there is no BTF_KIND for floats and doubles, so the release
version of LLVM ends up emitting type id 0 for them, which the
in-kernel verifier does not accept.

Introduce support for such types to libbpf by representing them using
the new BTF_KIND_FLOAT.

Reviewed By: yonghong-song

Differential Revision: https://reviews.llvm.org/D83289
2021-03-05 15:10:11 +01:00
Stephen Tozer
e0cb677eb6 Reapply "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values"
Rewrites test to use correct architecture triple; fixes incorrect
reference in SourceLevelDebugging doc; simplifies `spillReg` behaviour
so as to not be dependent on changes elsewhere in the patch stack.

This reverts commit d2000b45d033c06dc7973f59909a0ad12887ff51.
2021-03-05 12:32:05 +00:00
Abhina Sreeskantharajan
4a1dabde23 [test] Use host platform specific error message substitution in lit tests
This patch uses the errno python library to print out the correct error messages instead of hardcoding the error message per platform.

Reviewed By: jhenderson, ASDenysPetrov

Differential Revision: https://reviews.llvm.org/D97472
2021-03-05 07:21:53 -05:00
Sebastian Neubauer
21a926deb5 [AMDGPU] Keep skip branch for ds instructions
Same as other memory instructions, ds instructions add latency even if
exec is zero. Jumping over them if exec=0 is cheaper than executing
them.
With this change, the branch instruction that skips over a basic block
if exec=0 is not removed when the block contains a ds instruction.

Differential Revision: https://reviews.llvm.org/D97922
2021-03-05 12:34:09 +01:00
Jingu Kang
d39908fe43 [AArch64] Add missing intrinsics for vrnd 2021-03-05 11:26:12 +00:00
Simon Pilgrim
6ecc32e8eb Fix Wdocumentation unknown parameter warning. NFCI. 2021-03-05 11:24:44 +00:00
LLVM GN Syncbot
19b1fd4d0e [gn build] Port a60d06d8b757 2021-03-05 11:09:38 +00:00
Simon Pilgrim
ed6eb96f4a Revert rG8198d83965ba4b9db6922b44ef3041030b2bac39: "[X86] Pass to transform amx intrinsics to scalar operation."
This reverts commit 8198d83965ba4b9db6922b44ef3041030b2bac39.due to buildbot breakages
2021-03-05 11:09:14 +00:00
Simon Pilgrim
890083a2dc [X86] X86ISelLowering.cpp - try to use for-range loops. NFCI. 2021-03-05 11:09:14 +00:00
Jann Horn
f1adf2f3c1 [test] Fix new CodeGenPrepare test for non-X86 systems
The new test llvm/test/Transforms/CodeGenPrepare/remove-assume-block.ll
breaks on non-X86 machines. Change it to look like the existing test
llvm/test/Transforms/CodeGenPrepare/X86/delete-assume-dead-code.ll
to fix it.

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D97952
2021-03-05 11:48:38 +01:00
Andy Wingo
df20558925 [WebAssembly][yaml2obj][obj2yaml] Elem sections for nonzero tables
With reference types, tables can have non-zero table numbers.  This
commit adds support for element sections against these tables.

Differential Revision: https://reviews.llvm.org/D97923
2021-03-05 11:45:15 +01:00
Petar Avramovic
14d56a77f5 Reland AMDGPU/GlobalISel: Combine zext(trunc x) to x after RegBankSelect
Recommit bf5a5826504754788a8f1e3fec7a7dc95cda5782. Depends on
4c8fb7ddd6fa49258e0e9427e7345fb56ba522d4 which was reverted.

RegBankSelect creates zext and trunc when it selects banks for uniform i1.
Add zext_trunc_fold from generic combiner to post RegBankSelect combiner.

Differential Revision: https://reviews.llvm.org/D95432
2021-03-05 11:05:37 +01:00
Petar Avramovic
ead4216ca3 Reland [GlobalISel] Combine zext(trunc x) to x
Recommit 4112299ee761a9b6a309c8ff4a7e75f8c8d8851b. Depends on
4c8fb7ddd6fa49258e0e9427e7345fb56ba522d4 which was reverted.

Combine zext(trunc x) to x when truncated bits are known to be zero.

Differential Revision: https://reviews.llvm.org/D96031
2021-03-05 11:05:37 +01:00
David Sherwood
dcd9c105f6 [SVE][LoopVectorize] Add support for extracting the last lane of a scalable vector
There are certain loops like this below:

  for (int i = 0; i < n; i++) {
    a[i] = b[i] + 1;
    *inv = a[i];
  }

that can only be vectorised if we are able to extract the last lane of the
vectorised form of 'a[i]'. For fixed width vectors this already works since
we know at compile time what the final lane is, however for scalable vectors
this is a different story. This patch adds support for extracting the last
lane from a scalable vector using a runtime determined lane value. I have
added support to VPIteration for runtime-determined lanes that still permit
the caching of values. I did this by introducing a new class called VPLane,
which describes the lane we're dealing with and provides interfaces to get
both the compile-time known lane and the runtime determined value. Whilst
doing this work I couldn't find any explicit tests for extracting the last
lane values of fixed width vectors so I added tests for both scalable and
fixed width vectors.

Differential Revision: https://reviews.llvm.org/D95139
2021-03-05 09:57:56 +00:00
James Henderson
6d55f9cfe4 [llvm-objcopy] Fix crash for binary input files with non-ascii names
The code was using the standard isalnum function which doesn't handle
values outside the non-ascii range. Switching to using llvm::isAlnum
instead ensures we don't provoke undefined behaviour, which can in some
cases result in crashes.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D97663
2021-03-05 08:57:40 +00:00
James Henderson
7843c09e83 [llvm-objcopy][test] Fix test that could have passed spuriously
The test was showing that when --strip-unneeded is specified for an
executable, all the symbols are stripped. However, the set of symbols
used in the test would be stripped by --strip-unneeded for an ET_REL
object too. Fix this by adding additional symbols that aren't normally
stripped by --strip-unneeded.

Reviewed by: MaskRay

Differential Revision: https://reviews.llvm.org/D97664
2021-03-05 08:57:39 +00:00
Luo, Yuanke
44ff98c17b [X86] Pass to transform amx intrinsics to scalar operation.
This pass runs in any situations but we skip it when it is not O0 and the
function doesn't have optnone attribute. With -O0, the def of shape to amx
intrinsics is near the amx intrinsics code. We are not able to find a
point which post-dominate all the shape and dominate all amx intrinsics.
To decouple the dependency of the shape, we transform amx intrinsics
to scalar operation, so that compiling doesn't fail. In long term, we
 should improve fast register allocation to allocate amx register.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D93594
2021-03-05 16:02:02 +08:00
Yang Fan
95fd24739b [JITLink] Fix Wtype-limits gcc warning (NFC)
GCC warning:
```
In file included from /usr/include/c++/9/cassert:44,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/ADT/BitVector.h:21,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/Support/Program.h:17,
from /home/vsts/work/1/llvm-project/llvm/include/llvm/Support/Process.h:32,
from /home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp:11:
/home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp: In member function ‘virtual llvm::Expected<std::unique_ptr<llvm::jitlink::JITLinkMemoryManager::Allocation> > llvm::jitlink::InProcessMemoryManager::allocate(const llvm::jitlink::JITLinkDylib*, const SegmentsRequestMap&)’:
/home/vsts/work/1/llvm-project/llvm/lib/ExecutionEngine/JITLink/JITLinkMemoryManager.cpp:129:40: warning: comparison of unsigned expression >= 0 is always true [-Wtype-limits]
129 |   assert(SlabRemaining.allocatedSize() >= 0 && "Mapping exceeds allocation");
    |          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
```

The return type of `allocatedSize()` is `size_t`, thus the expression
`SlabRemaining.allocatedSize() >= 0` always evaluate to `true`.
2021-03-05 15:28:01 +08:00
Craig Topper
462bb91be3 [SelectionDAG] Assert that operands to SelectionDAG::getNode are not DELETED_NODE to catch issues like PR49393 earlier.
I'm not sure this would catch all such issues, but it would catch some.

The problem for PR49393 was that we were holding a reference to a node that
wasn't connect edto the DAG across a function that could delete unused nodes. In
this particular case we managed to try to use the deleted node while it was in
the deleted state before its memory got recycled.

It could also happen that we delete the node, something allocates a new node
which recycles the memory. Then  we try to use the reference we were holding and
it is now a completely different node with different valid opcode. This patch
would not catch that.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D97969
2021-03-04 23:05:32 -08:00
Craig Topper
787f70958a [TargetLowering] Use HandleSDNodes to prevent nodes from being deleted by recursive calls in getNegatedExpression.
For binary or ternary ops we call getNegatedExpression multiple
times and then compare costs. While we're doing this we need to
hold a node from the first call across the second call, but its
not yet attached to the DAG. Its possible the second call creates
an identical node and then decides it didn't need it so will try
to delete it if it has no uses. This can cause a reference to the
node we're holding further up the call stack to become invalidated.

To prevent this, we can use a HandleSDNode to artifically give
the node a use without connecting it to the DAG.

I've used a std::list of HandleSDNodes so we can create handles
only when we have a node to hold. HandleSDNode does not have
default constructor and cannot be copied or moved.

Fixes PR49393.

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D97914
2021-03-04 22:48:25 -08:00
Fangrui Song
faffc2a6ea [DebugInfo] Delete unused DIVariable::getSource 2021-03-04 21:53:13 -08:00
Fangrui Song
99ae7ea6d6 [DebugInfo] Delete deleted getLine/getColumn
r250405 deleted the functions from DILexicalBlockBase.
2021-03-04 21:49:21 -08:00
Michael Kruse
9bf95cf501 [clang][OpenMP] Use OpenMPIRBuilder for workshare loops.
Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to:
 * Only the worksharing-loop directive.
 * Recognizes only the nowait clause.
 * No loop nests with more than one loop.
 * Untested with templates, exceptions.
 * Semantic checking left to the existing infrastructure.

This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics:
 * The distance function: How many loop iterations there will be before entering the loop nest.
 * The loop variable function: Conversion from a logical iteration number to the loop variable.

These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang.

The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does.

For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting.

Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D94973
2021-03-04 22:52:59 -06:00
Juneyoung Lee
5f3a69dfff [LangRef] lifetime intrinsics: don't use word 'offset'
from Philip's comments
2021-03-05 12:53:13 +09:00
Luke
796838761e [RISCV] Enable fixed-length vectorization of LoopVectorizer for RISC-V Vector
By implementing the method "unsigned RISCVTTIImpl::getRegisterBitWidth(bool Vector)",
fixed-length vectorization is enabled when possible. Without this method, the
"#pragma clang loop" directive is needed to enable vectorization(or the cost model
may inform LLVM that "Vectorization is possible but not beneficial").

Reviewed By: frasercrmck

Differential Revision: https://reviews.llvm.org/D97549
2021-03-05 10:54:51 +08:00
Wei Mi
ad2a6f2861 [SampleFDO] Another fix to prevent repeated indirect call promotion in
sample loader pass.

In https://reviews.llvm.org/rG5fb65c02ca5e91e7e1a00e0efdb8edc899f3e4b9,
to prevent repeated indirect call promotion for the same indirect call
and the same target, we used zero-count value profile to indicate an
indirect call has been promoted for a certain target. We removed
PromotedInsns cache in the same patch. However, there was a problem in
that patch described below, and that problem led me to add PromotedInsns
back as a mitigation in
https://reviews.llvm.org/rG4ffad1fb489f691825d6c7d78e1626de142f26cf.

When we get value profile from metadata by calling getValueProfDataFromInst,
we need to specify the maximum possible number of values we expect to read.
We uses MaxNumPromotions in the last patch so the maximum number of value
information extracted from metadata is MaxNumPromotions. If we have many
values including zero-count values when we write the metadata, some of them
will be dropped when we read them because we only read MaxNumPromotions
values. It will allow repeated indirect call promotion again. We need to
make sure if there are values indicating promoted targets, those values need
to be saved in metadata with higher priority than other values.

The patch fixed that problem. We change to use -1 to represent the count
of a promoted target instead of 0 so it is easier to sort the values.
When we prepare to update the metadata in updateIDTMetaData, we will sort
the values in the descending count order and extract only MaxNumPromotions
values to write into metadata. Since -1 is the max uint64_t number, if we
have equal to or less than MaxNumPromotions of -1 count values, they will
all be kept in metadata. If we have more than MaxNumPromotions of -1 count
values, we will only save MaxNumPromotions such values maximally. In such
case, we have logic in place in doesHistoryAllowICP to guarantee no more
promotion in sample loader pass will happen for the indirect call, because
it has been promoted enough.

With this change, now we can remove PromotedInsns without problem.

Differential Revision: https://reviews.llvm.org/D97350
2021-03-04 18:44:12 -08:00
Chen Zheng
38673999a0 [XCOFF][DebugInfo] support DWARF for XCOFF for assembly output.
Reviewed By: jasonliu

Differential Revision: https://reviews.llvm.org/D95518
2021-03-04 21:07:52 -05:00
George Balatsouras
5d0845c0e0 [dfsan] Remove hardcoded shadow width in array.ll
As a preparation step for fast8 support, we need to update the tests
to pass in both modes. That requires generalizing the shadow width
and remove any hard coded references that assume it's always 2 bytes.

Reviewed By: stephan.yichao.zhao

Differential Revision: https://reviews.llvm.org/D97988
2021-03-04 17:12:16 -08:00
Yonghong Song
6efa716e20 BPF: permit type modifiers for __builtin_btf_type_id() relocation
Lorenz Bauer from Cloudflare tried to use "const struct <name>"
as the type for __builtin_btf_type_id(*(const struct <name>)0, 1)
relocation and hit a llvm BPF fatal error.
   https://lore.kernel.org/bpf/a3782f71-3f6b-1e75-17a9-1827822c2030@fb.com/

   ...
   fatal error: error in backend: Empty type name for BTF_TYPE_ID_REMOTE reloc

Currently, we require the debuginfo type itself must have a name.
In this case, the debuginfo type is "const" which points to "struct <name>".
The "const" type does not have a name, hence the above fatal error
will be triggered.

Let us permit "const" and "volatile" type modifiers. We skip modifiers
in some other cases as well like structure member type tracing.
This can aviod the above fatal error.

Differential Revision: https://reviews.llvm.org/D97986
2021-03-04 16:27:23 -08:00
David Blaikie
2677513543 Move llvm/Analysis/ObjCARCUtil.h to IR to fix layering.
This is included from IR files, and IR doesn't/can't depend on Analysis
(because Analysis depends on IR).

Also fix the implementation - don't use non-member static in headers, as
it leads to ODR violations, inaccurate "unused function" warnings, etc.
And fix the header protection macro name (we don't generally include
"LIB" in the names, so far as I can tell).
2021-03-04 16:14:53 -08:00
Nico Weber
9bd7831b75 [gn build] port b973e2e2f27e 2021-03-04 18:41:04 -05:00
Jianzhou Zhao
037f84c024 [dfsan] Propagate origin tracking at store
This is a part of https://reviews.llvm.org/D95835.

Reviewed By: morehouse, gbalats

Differential Revision: https://reviews.llvm.org/D97789
2021-03-04 23:34:44 +00:00
Philip Reames
5a592bf8e6 [docs] Remove some stale wording from gc.relocate description
We dropped support for the non-bundle form a while back, but I apparently missed updating one place in the docs.
2021-03-04 15:18:11 -08:00
Philip Reames
42428235cc [docs] Move statepoint related intrinsics into main LangRef 2021-03-04 15:13:27 -08:00
Amara Emerson
7b8da20247 [AArch64][GlobalISel][RegBankSelect] Improve rbs of G_BUILD_VECTOR when fed by fp values.
This is actually two changes. One is to avoid copies when fp values are fed into
a build_vector, without being able to tell from the opcode.

The other is that build_vectors are also marked as only defining FP, since they
produce vector results.

Differential Revision: https://reviews.llvm.org/D97968
2021-03-04 15:09:05 -08:00
Heejin Ahn
f1f186931b [WebAssembly] Fix ExceptionInfo grouping again
This is a case D97677 missed. When taking out remaining BBs that are
reachable from already-taken-out exceptions (because they are not
subexcptions but unwind destinations), I assumed the remaining BBs are
not EH pads, but they can be. For example,
```
try {
  try {
    throw 0;
  } catch (int) { // (a)
  }
} catch (int) {   // (b)
}
try {
  foo();
} catch (int) {   // (c)
}
```
In this code, (b) is the unwind destination of (a) so its exception is
taken out of (a)'s exception, But even though the next try-catch is not
inside the first two-level try-catches, because the first try always
throws, its continuation BB is unreachable and the whole rest of the
function is dominated by EH pad (a), including EH pad (c). So after we
take out of (b)'s exception out of (a)'s, we also need to take out (c)'s
exception out of (a)'s, because (c) is reachable from (b).

This adds one more step before what we did for remaining BBs in D97677;
it traverses EH pads first to take subexceptions out of their incorrect
parent exception. It's the same thing as D97677, but because we can do
this before we add BBs to exceptions' sets, we don't need to fix sets
and only need to fix parent exception pointers.

Other changes are variable name changes (I changed `WE` -> `SrcWE`,
`UnwindWE` -> `DstWE` for clarity), some comment changes, and a drive-by
fix in a bug in a `LLVM_DEBUG` print statement.

Fixes https://github.com/emscripten-core/emscripten/issues/13588.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D97929
2021-03-04 15:05:13 -08:00
LLVM GN Syncbot
e39ae82bd1 [gn build] Port 561abd83ffec 2021-03-04 22:58:35 +00:00
Heejin Ahn
58ecf46111 [WebAssembly] Disable uses of __clang_call_terminate
Background:

Wasm EH, while using Windows EH (catchpad/cleanuppad based) IR, uses
Itanium-based libraries and ABIs with some modifications.

`__clang_call_terminate` is a wrapper generated in Clang's Itanium C++
ABI implementation. It contains this code, in C-style pseudocode:
```
void __clang_call_terminate(void *exn) {
  __cxa_begin_catch(exn);
  std::terminate();
}
```
So this function is a wrapper to call `__cxa_begin_catch` on the
exception pointer before termination.

In Itanium ABI, this function is called when another exception is thrown
while processing an exception. The pointer for this second, violating
exception is passed as the argument of this `__clang_call_terminate`,
which calls `__cxa_begin_catch` with that pointer and calls
`std::terminate` to terminate the program.

The spec (https://libcxxabi.llvm.org/spec.html) for `__cxa_begin_catch`
says,
```
When the personality routine encounters a termination condition, it
will call __cxa_begin_catch() to mark the exception as handled and then
call terminate(), which shall not return to its caller.
```

In wasm EH's Clang implementation, this function is called from
cleanuppads that terminates the program, which we also call terminate
pads. Cleanuppads normally don't access the thrown exception and the
wasm backend converts them to `catch_all` blocks. But because we need
the exception pointer in this cleanuppad, we generate
`wasm.get.exception` intrinsic (which will eventually be lowered to
`catch` instruction) as we do in the catchpads. But because terminate
pads are cleanup pads and should run even when a foreign exception is
thrown, so what we have been doing is:
1. In `WebAssemblyLateEHPrepare::ensureSingleBBTermPads()`, we make sure
terminate pads are in this simple shape:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
```
2. In `WebAssemblyHandleEHTerminatePads` pass at the end of the
pipeline, we attach a `catch_all` to terminate pads, so they will be in
this form:
```
%exn = catch
call @__clang_call_terminate(%exn)
unreachable
catch_all
call @std::terminate()
unreachable
```
In `catch_all` part, we don't have the exception pointer, so we call
`std::terminate()` directly. The reason we ran HandleEHTerminatePads at
the end of the pipeline, separate from LateEHPrepare, was it was
convenient to assume there was only a single `catch` part per `try`
during CFGSort and CFGStackify.

---

Problem:

While it thinks terminate pads could have been possibly split or calls
to `__clang_call_terminate` could have been duplicated,
`WebAssemblyLateEHPrepare::ensureSingleBBTermPads()` assumes terminate
pads contain no more than calls to `__clang_call_terminate` and
`unreachable` instruction. I assumed that because in LLVM very limited
forms of transformations are done to catchpads and cleanuppads to
maintain the scoping structure. But it turned out to be incorrect;
passes can merge cleanuppads into one, including terminate pads, as long
as the new code has a correct scoping structure. One pass that does this
I observed was `SimplifyCFG`, but there can be more. After this
transformation, a single cleanuppad can contain any number of other
instructions with the call to `__clang_call_terminate` and can span many
BBs. It wouldn't be practical to duplicate all these BBs within the
cleanuppad to generate the equivalent `catch_all` blocks, only with
calls to `__clang_call_terminate` replaced by calls to `std::terminate`.

Unless we do more complicated transformation to split those calls to
`__clang_call_terminate` into a separate cleanuppad, it is tricky to
solve.

---

Solution (?):

This CL just disables the generation and use of `__clang_call_terminate`
and calls `std::terminate()` directly in its place.

The possible downside of this approach can be, because the Itanium ABI
intended to "mark" the violating exception handled, we don't do that
anymore. What `__cxa_begin_catch` actually does is increment the
exception's handler count and decrement the uncaught exception count,
which in my opinion do not matter much given that we are about to
terminate the program anyway. Also it does not affect info like stack
traces that can be possibly shown to developers.

And while we use a variant of Itanium EH ABI, we can make some
deviations if we choose to; we are already different in that in the
current version of the EH spec we don't support two-phase unwinding. We
can possibly consider a more complicated transformation later to
reenable this, but I don't think that has high priority.

Changes in this CL contains:
- In Clang, we don't generate a call to `wasm.get.exception()` intrinsic
  and `__clang_call_terminate` function in terminate pads anymore; we
  simply generate calls to `std::terminate()`, which is the default
  implementation of `CGCXXABI::emitTerminateForUnexpectedException`.
- Remove `WebAssembly::ensureSingleBBTermPads() function and
  `WebAssemblyHandleEHTerminatePads` pass, because terminate pads are
  already `catch_all` now (because they don't need the exception
  pointer) and we don't need these transformations anymore.
- Change tests to use `std::terminate` directly. Also removes tests that
  tested `LateEHPrepare::ensureSingleBBTermPads` and
  `HandleEHTerminatePads` pass.
- Drive-by fix: Add some function attributes to EH intrinsic
  declarations

Fixes https://github.com/emscripten-core/emscripten/issues/13582.

Reviewed By: dschuff, tlively

Differential Revision: https://reviews.llvm.org/D97834
2021-03-04 14:26:35 -08:00
William S. Moses
7d598a83ed Revert "[Attributor] Enable heap-to-stack of any size"
This reverts commit 51bd42ef9b870787afbeeffcd33adce765f70f23.
2021-03-04 17:24:56 -05:00
Sanjay Patel
2a43c2e32f [LoopVectorize] propagate fast-math-flags from induction instructions
This code assumed that FP math was only permissable if it was
fully "fast", so it hard-coded "fast" when creating new instructions.

The underlying code already allows matching recurrences/reductions
that are only "reassoc", so this change should prevent the potential
miscompile seen in the test diffs (we created "fast" ops even though
none existed in the original code).

I don't know if we need to create the temporary IRBuilder objects
used here, so that could be follow-up clean-up.

There's an open question about whether we should require "nsz" in
addition to "reassoc" here. InstCombine uses that combo for its
reassociative folds, but I think codegen is not as strict.
2021-03-04 17:21:32 -05:00
William S. Moses
6f4e6fcc04 [Attributor] Enable heap-to-stack of any size
Enable Attributor's heap-to-stack to lower unbounded allocations given a max size of -1

Differential Revision: https://reviews.llvm.org/D97873
2021-03-04 17:17:23 -05:00
dfukalov
7524927003 [NFC][AliasSetTracker] Remove implicit conversion AliasResult to integer.
Preparation to make AliasResult scoped enumeration.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D97973
2021-03-05 00:53:27 +03:00