Summary:
There is no need to write out gcdas when forking because we can just reset the counters in the parent process.
Let say a counter is N before the fork, then fork and this counter is set to 0 in the child process.
In the parent process, the counter is incremented by P and in the child process it's incremented by C.
When dump is ran at exit, parent process will dump N+P for the given counter and the child process will dump 0+C, so when the gcdas are merged the resulting counter will be N+P+C.
About exec** functions, since the current process is replaced by an another one there is no need to reset the counters but just write out the gcdas since the counters are definitely lost.
To avoid to have lists in a bad state, we just lock them during the fork and the flush (if called explicitely) and lock them when an element is added.
Reviewers: marco-c
Reviewed By: marco-c
Subscribers: hiraditya, cfe-commits, #sanitizers, llvm-commits, sylvestre.ledru
Tags: #clang, #sanitizers, #llvm
Differential Revision: https://reviews.llvm.org/D74953
Summary:
This should produce slightly better error messages in case of failures.
Only slightly, because this code was pretty careful about that to begin
with -- I've seen code which does much worse.
Reviewers: jhenderson, dblaikie
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74899
Summary:
The type used to represent functional units in MC is
'unsigned', which is 32 bits wide. This is currently
not a problem in any upstream target as no one seems
to have hit the limit on this yet, but in our
downstream one, we need to define more than 32
functional units.
Increasing the size does not seem to cause a huge
size increase in the binary (an llc debug build went
from 1366497672 to 1366523984, a difference of 26k),
so perhaps it would be acceptable to have this patch
applied upstream as well.
Subscribers: hiraditya, jsji, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71210
This optimization bypasses GOT loads and calls/branches through stubs when the
ultimate target of the access/branch is found to be within range of the
reference.
Extra debugging output is also added to the generic JITLink algorithm and
basic GOT and Stubs builder utility to aid debugging.
The gather intrinsics use a floating point mask when the result
type is FP. But we call DemandedBits on the mask assuming its an
integer type. We also use integer types when we create it from
generic IR. So add a bitcast to the intrinsic path to guarantee
the integer type.
The type profile we use for the isel patterns lied about how
many operands the gather/scatter node has to skip the index
and scale operands. This allowed us to expand the baseptr
operand into base, displacement, and segment and then merge
the index and scale with them in the final instruction during
isel. This is kind of a hack that relies on isel not checking the
number of operands at all.
This commit switches to custom isel where we can manage this
directly without relying on holes in the isel checking.
Summary:
The IR printing always prints out all functions in a module with the new pass manager, even with -filter-print-funcs specified. This is being fixed in this change. However, there are two exceptions, i.e, with user-specified wildcast switch -filter-print-funcs=* or -print-module-scope, under which IR of all functions should be printed.
Test Plan:
make check-clang
make check-llvm
Reviewers: wenlei
Reviewed By: wenlei
Subscribers: wenlei, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D74814
Leave the gather/scatter subclasses, but make them inherit from
MemIntrinsicSDNode and delete their constructor and destructor.
This way we can still have the getIndex, getMask, etc. convenience
functions.
In order to build the Linux kernel, the back chain must be supported with
packed-stack. The back chain is then stored topmost in the register save
area.
Review: Ulrich Weigand
Differential Revision: https://reviews.llvm.org/D74506
This version fixes a buildbot failure cause by picking the wrong insert
point for XORs. We cannot pick the XOR binary operator as insert point,
as it is not guaranteed that both input operands for the overflow
intrinsic are defined before it.
This reverts the revert commit
c7fc0e5da6c3c36eb5f3a874a6cdeaedb26856e0.
A question about this behavior came up on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2020-February/139003.html
...and as part of backend improvements in D73978.
We decided not to implement a more general change that would have
folded any FP binop with nearly arbitrary constant + undef operand
to undef because that is not theoretically correct (even if it is
practically correct).
This is the SDAG-equivalent to the IR change in D74713.
Add a map from BasicBlocks to overlap intervals. For partial writes, we
can keep track of those in IOLs. We only add candidates that are valid
for eliminations.
Reviewers: dmgreen, bryant, asbirlea, Tyker
Reviewed By: asbirlea
Differential Revision: https://reviews.llvm.org/D73757
If a simplication occurs the operand will be added to the worklist.
But since the demanded mask was based on N, we need to make sure
we revisit N in case there are more simplifications to be done.
Returning SDValue(N, 0) as we do, only tells DAG combine that
something changed, but that won't make it add anything to the
worklist.
Found while playing around with using VEXTRACT_STORE in more cases.
But I guess this doesn't affect any of our existing tests.
We can use MOVLPS which will load 64 bits, but we need a v4f32
result type. We already have isel patterns for this.
The code here is a little hacky. We can probably improve it with
more isel patterns.
This is similar to using movd which we do for sse2 targets.
I've added a DAG combine for VEXTRACT_STORE to use SimplifyDemandedVectorElts
to clean up some artifacts from type legalization.
The GenericLLVMIRPlatformSupport class runs a transform on all LLVM IR added to
the LLJIT instance to replace instances of llvm.global_ctors with a specially
named function that runs the corresponing static initializers (See
(GlobalCtorDtorScraper from lib/ExecutionEngine/Orc/LLJIT.cpp). This patch
updates the GenericIRPlatform class to check for this specially named function
in other materialization units that are added to the JIT and, if found, add
the function to the initializer work queue. Doing this allows object files
that were compiled from IR and cached to be reloaded in subsequent JIT sessions
without their initializers being skipped.
To enable testing this patch also updates the lli tool's -jit-kind=orc-lazy mode
to respect the -enable-cache-manager and -object-cache-dir options, and modifies
the CompileOnDemandLayer to rename extracted submodules to include a hash of the
names of their symbol definitions. This allows a simple object caching scheme
based on module names (which was already implemented in lli) to work with the
lazy JIT.
This patch adds new errors and error checking to the ObjectLinkingLayer to
catch cases where a compiled or loaded object either:
(1) Contains definitions not covered by its responsibility set, or
(2) Is missing definitions that are covered by its responsibility set.
Proir to this patch providing the correct set of definitions was treated as
an API contract requirement, however this requires that the client be confident
in the correctness of the whole compiler / object-cache pipeline and results
in difficult-to-debug assertions upon failure. Treating this as a recoverable
error results in clearer diagnostics.
The performance overhead of this check is one comparison of densemap keys
(symbol string pointers) per linking object, which is minimal. If this overhead
ever becomes a problem we can add the check under a flag that can be turned off
if the client fully trusts the rest of the pipeline.
With this --shuffle-sections=seed produces the same result in every
host.
Reviewed By: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D74971
I've noticed that it is not convenient to create YAMLs from
binaries (using obj2yaml) that have to be test cases for obj2yaml
later (after applying yaml2obj).
The problem, for example is that obj2yaml emits "DynamicSymbols:"
key instead of .dynsym. It also does not create .dynstr.
And when a YAML document without explicitly defined .dynsym/.dynstr
is given to yaml2obj, we have issues:
1) These sections are placed after non-allocatable sections (I've fixed it in D74756).
2) They have VA == 0. User needs create descriptions for such sections explicitly manually
to set a VA.
This patch addresses (2). I suggest to let yaml2obj assign virtual addresses by itself.
It makes an output binary to be much closer to "normal" ELF.
(It is still possible to use "Address: 0x0" for a section to get the original behavior
if it is needed)
Differential revision: https://reviews.llvm.org/D74764
Similar to what do for other operations that use a subset of bits.
Allows us to remove a pattern that shrinks a load. Which was
incorrect if the load was volatile.
Added two flags to omit uncommon or dead paths in the CFG graphs:
-cfg-hide-unreachable-paths
-cfg-hide-deoptimize-paths
The main purpose is performance analysis when such block are not
"interesting" from perspective of common path performance.
Reviewed By: apilipenko, davidxl
Differential Revision: https://reviews.llvm.org/D74346
Summary:
We already sorted the blocks when fixing up a set of mutual
loop entries, however, there can be multiple sets of such
mutual loop entries, and the order we encounter them
should not be random, so sort them too.
Fixes https://bugs.llvm.org/show_bug.cgi?id=44982
Patch by Alon Zakai (kripken)
Reviewers: aheejin, sbc100, dschuff
Subscribers: mgrang, sunfish, hiraditya, jgravelle-google, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74999
This reverts commit 977cd661cf019039dec7ffdd15bf0ac500828c87.
It breaks OpenCL testing. OpenCL Runtime is using PT_LOAD information
to calculate memory for global variables. This commit should be relanded once
the OpenCL runtime stops relying on PT_LOAD information for calculating global
variable memory size.
Differential Revision: https://reviews.llvm.org/D74995
Heads-up message: https://lists.llvm.org/pipermail/llvm-dev/2020-February/139390.html
GNU as started to emit warnings for changed sh_type or sh_flags in 2000.
GNU as>=2.35 will emit errors for most sh_type/sh_flags change, and error for entsize change.
Some cases remain warnings for legacy reasons:
.section .init_array,"ax", @progbits
.section .init_array,"ax", @init_array
# And some obscure sh_flags changes (OS/Processor specific flags)
The rationale of a diagnostic (warning or error) is that sh_type,
sh_flags or sh_entsize changes usually indicate user errors. The values
are taken from the first .section directive. Successive directives are ignored.
We just try to be rigid and emit errors for all sh_type/sh_flags/sh_entsize change.
A possible improvement in the future is to reuse
llvm-readobj/ELFDumper.cpp:getSectionTypeString so that we can name the
type in the diagnostics.
Reviewed By: psmith
Differential Revision: https://reviews.llvm.org/D73999
Summary:
GNU objdump prints the method name in disassembly output, and upon further investigation this seems to come from debug info, not the symbol table.
Some additional refactoring is necessary to make this work even when the line number is 0/the filename is unknown. The added test case includes a note for this scenario.
See http://llvm.org/PR41341 for more info.
Reviewers: dblaikie, MaskRay, jhenderson
Reviewed By: MaskRay
Subscribers: ormris, jvesely, aprantl, kerbowa, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D74507
Recently I had to use it and although one assumes it returns null if
there's no parent loop, I think it helps to doc it.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D74890
To unblock the builders this disables a test for which the CHECK lines
need to be updated. The patch causing the failure was not reverted
because it is needed for a different problem we are investigating. Here
we just need to update the CHECK lines which will happen in the
meantime.
This patch adds a cache that is valid only for the duration of a call
to getKnownBits. With such short lived cache we avoid all the problems
of cache invalidation while still getting the benefits of reusing
the information we already computed.
This cache is useful whenever an instruction occurs more than once
in a chain of computation.
E.g.,
v0 = G_ADD v1, v2
v3 = G_ADD v0, v1
Previously we would compute the known bits for:
v1, v2, v0, then v1 again and finally v3.
With the patch, now we won't have to recompute v1 again.
NFC
Summary:
Blocks in a loop can be in any order as long as the loop header is the
first block in Blocks.
With some order of Blocks, cloneLoopWithPreheader would trigger the
assertion in addBasicBlockToLoop.
Example:
define void @test(i64 %N) {
preheader.i:
br label %header.i
header.i:
%i = phi i64 [ 0, %preheader.i ], [ %inc.i, %latch.i ]
br label %header.j
header.j:
%j = phi i64 [ 0, %header.i ], [ %inc.j, %latch.j ]
br label %header.k
header.k:
%k = phi i64 [ 0, %header.j ], [ %inc.k, %latch.k ]
call void @baz(i64 %i, i64 %j, i64 %k)
br label %latch.k
latch.k:
%inc.k = add nsw i64 %k, 1
%cmp.k = icmp slt i64 %inc.k, %N
br i1 %cmp.k, label %header.k, label %latch.j
latch.j:
%inc.j = add nsw i64 %j, 1
%cmp.j = icmp slt i64 %inc.j, %N
br i1 %cmp.j, label %header.j, label %latch.i
latch.i:
%inc.i = add nsw i64 %i, 1
%cmp.i = icmp slt i64 %inc.i, %N
br i1 %cmp.i, label %header.i, label %exit.i
exit.i:
ret void
}
declare void @baz(i64, i64, i64)
If the blocks of loop-i is in the order: header.i, latch.k, header.k,
header.j, latch.j, latch.i,
then cloneLoopWithPreheader would trigger the assertion in
addBasicBlockToLoop
assert(contains(SameHeader) && getHeader() == SameHeader->getHeader() &&
"Incorrect LI specified for this loop!");
As latch.k is in both loop-j and loop-k, it would be set as the header
of both loops after adding latch.k.
If we update loop headers during cloning blocks, then after adding
header.k,
the header of loop-k would be updated with header.k,
while the header of loop-j stays as latch.k.
When adding header.j, SameHeader is loop-k, SameHeader->getHeader() is
header.k, but getHeader() is latch.k, which trigger the assertion.
Reviewer: jdoerfert, Meinersbur, fhahn, kbarton, hfinkel, bmahjour,
etiotto
Reviewed By: Meinersbur
Subscribers: hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D74382
We want flag users to check individual fast-math flags,
not that all of them are set. This was also probably
not working as intended because NoFPExcept isn't always
set on non-strict nodes.