SPMDization D102307 detects incompatible OpenMP runtime calls to abort converting a target region to SPMD mode. Calls to memory allocation/de-allocation routines kmpc_alloc_shared, kmpc_free_shared are incompatible unless they are removed by AAHeapToStack/AAHeapToShared analysis. This patch extends SPMDization detection to include AAHeapToStack/AAHeapToShared analysis results for enlarging the scope of possible SPMDized regions detected.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D105634
Fixing a typo in SampleContextTracker to use debug name when debug linkage name is no present. This should only affect C programs.
Saw 0.6% perf win on Cinder which is mostly C code.
Reviewed By: wenlei, wmi
Differential Revision: https://reviews.llvm.org/D106599
This patch adds llvm-readobj and the binutils symlink for readelf to
LLVM_TOOLCHAIN_TOOLS.
Tvoid *thread, void *attr,hey are required by some (most?)
autoconf-built libraries, adding these allows me to build newlib with
the toolchain generated this way.
Also opened an issue for that some days ago, see
https://bugs.llvm.org/show_bug.cgi?id=50698
Reviewed By: sbc100
Differential Revision: https://reviews.llvm.org/D104957
This should fix build breaks for 'development' mode. The other modes
were unaffected - 'release' because it doesn't use TFUtils.cpp, and the
mixed mode because the AOT compiled code brings in the necessary include
dirs anyway.
This function is called when some predecessor of an empty return block
ends with a conditional branch, with both successors being empty ret blocks.
Now, because of the way SimplifyCFG works, it might happen to simplify
one of the blocks in a way that makes a conditional branch
into an unconditional one, since it's destinations are now identical,
but it might not have actually simplified said conditional branch
into an unconditional one yet.
So, we have to check that ourselves first,
especially now that SimplifyCFG aggressively tail-merges
all ret and resume blocks.
Even if it was an unconditional branch already,
`SimplifyCFGOpt::simplifyReturn()` doesn't call `FoldReturnIntoUncondBranch()`
by default.
Reland of 31859f896.
This change implements new DAG notes GLOBAL_GET/GLOBAL_SET, and
lowering methods for load and stores of reference types from IR
globals. Once the lowering creates the new nodes, tablegen pattern
matches those and converts them to Wasm global.get/set.
Reviewed By: tlively
Differential Revision: https://reviews.llvm.org/D104797
We previously had issues identifying macros not registered with a lowercase name.
Reviewed By: mstorsjo, thakis
Differential Revision: https://reviews.llvm.org/D106453
The logical (select) form of and/or will now be a source of problems.
We don't really account for it's inverted form, yet it exists,
and presumably we should treat it just like non-inverted form:
https://alive2.llvm.org/ce/z/BU9AXkhttps://bugs.llvm.org/show_bug.cgi?id=51149 reports a reportedly-serious
perf regression that will hopefully be mitigated by this.
Update shl/lshr/ashr costs based on the worst case costs from the script in D103695 - many of the 128-bit shifts (usually where integer multiplies aren't used) have similar behaviour to AVX1 so we can merge them.
Noticed while trying to clean up the shift costs model for SSE4 targets using the script in D10369 - SLM double-pumps all the 128-bit vector conversion ops and only use FP0 pipe - numbers taken from Intel AOM + Agner.
We should only add the fake lowering entry for the matrix remark if the
transpose is not lowered on its own. `MapVector::insert` is used to insert
the entry during proper lowering which does not overwrite the fake entry in
the map.
We actually had test coverage for this but the reference output code was
wrong; it was storing undef rather than the transposed column.
Also add an assert that would have caught this.
Differential Revision: https://reviews.llvm.org/D106457
This changes the cost to (LT.first-1) * cost(add) + 2, where the cost of
an add is assumed to be 1. This brings it inline with the other
reductions.
Differential Revision: https://reviews.llvm.org/D106240
D101977 added `BooleanStateWithPtrSetVector` to store pointers to a set meanwhile
tracking boolean state. One of the limitation is that it can only store pointer.
We might want it to store other types of values, such as integer for parallel
level. This patch generalizes the idea and create `BooleanStateWithSetVector`.
`BooleanStateWithPtrSetVector` therefore becomes a type alias of `BooleanStateWithSetVector`.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D106149
This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtin and intrinsic for "__stbcx".
Reviewed By: nemanjai, #powerpc
Differential revision: https://reviews.llvm.org/D106484
This patch avoids computing discounts for predicated instructions when the
VF is scalable.
There is no support for vectorization of loops with division because the
vectorizer cannot guarantee that zero divisions will not happen.
This loop now does not use VF scalable
```
for (long long i = 0; i < n; i++)
if (cond[i])
a[i] /= b[i];
```
Differential Revision: https://reviews.llvm.org/D101916
Opaque values (of zero size) can be stored in memory with the
implemention of reference types in the WebAssembly backend. Since
MachineMemOperand uses LLTs we need to be able to support
zero-sized scalars types in LLTs.
Differential Revision: https://reviews.llvm.org/D105423
The purpose of patch is to learn Loop idiom recognition pass how to recognize simple memmove patterns
in similar way like GCC: https://godbolt.org/z/fh95e83od
LoopIdiomRecognize already has machinery for memset and memcpy recognition, patch tries to extend exisiting capabilities with minimal effort.
Differential Revision: https://reviews.llvm.org/D104464