Summary: Change the old form of G->getType()->getAddressSpace() to the new G->getAddressSpace() (underneath does the same).
Patch by Ehud Katz <ehudkatz@gmail.com>
Reviewers: tejohnson, chandlerc
Reviewed By: tejohnson
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69550
This is a partial fix for the issues described in commit message of 027aa27 (the revert of G24609). Unfortunately, I can't provide test coverage for it on it's own as the only (known) wrong example is still wrong, but due to a separate issue.
These fixes are cases where when performing unrelated DAG combines, we were dropping the atomicity flags entirely.
Summary:
This patch adds MIR parsing and printing for heap alloc markers, which were
added in D69136. They are printed as an operand similar to pre-/post-instr
symbols, with a heap-alloc-marker token and a metadata node.
Reviewers: rnk
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69864
This was added to inhibit a warning from gcc 7.3 according to
the comment. However, it triggers warning from PVS. In addition
I cannot reproduce it with gcc 7.4 and I also cannot reproduce
it with gcc 7.3 using compiler explorer.
Differential Revision: https://reviews.llvm.org/D69863
When writing an email for a follow up proposal, I realized one of the diffs in the committed change was incorrect. Digging into it revealed that the fix is complicated enough to require some thought, so reverting in the meantime.
The problem is visible in this diff (from the revert):
; X64-SSE-LABEL: store_fp128:
; X64-SSE: # %bb.0:
-; X64-SSE-NEXT: movaps %xmm0, (%rdi)
+; X64-SSE-NEXT: subq $24, %rsp
+; X64-SSE-NEXT: .cfi_def_cfa_offset 32
+; X64-SSE-NEXT: movaps %xmm0, (%rsp)
+; X64-SSE-NEXT: movq (%rsp), %rsi
+; X64-SSE-NEXT: movq {{[0-9]+}}(%rsp), %rdx
+; X64-SSE-NEXT: callq __sync_lock_test_and_set_16
+; X64-SSE-NEXT: addq $24, %rsp
+; X64-SSE-NEXT: .cfi_def_cfa_offset 8
; X64-SSE-NEXT: retq
store atomic fp128 %v, fp128* %fptr unordered, align 16
ret void
The problem here is three fold:
1) x86-64 doesn't guarantee atomicity of anything larger than 8 bytes. Some platforms observably break this guarantee, others don't, but the codegen isn't considering this, so it's wrong on at least some platforms.
2) When I started to track down the problem, I discovered that DAGCombiner had stripped the atomicity off the store entirely. This comes down to idiomatic usage of DAG.getStore passing all MMO components separately as opposed to just passing the MMO.
3) On x86 (not -64), there are cases where 8 byte atomiciy is supported, but only for floating point operations. This would seem to imply that operation typing matters for correctness, and DAGCombine happily folds away bitcasts. I'm not 100% sure there's a problem here, but I'm not entirely sure there isn't either.
I plan on returning to each issue in turn; sorry for the churn here.
llvm-objdump -D this file:
int a[100000];
int main() { return 0; }
Will produce an error: "The end of the file was unexpectedly encountered".
This happens because of a check in Binary.h checkOffset. (Addr + Size > M.getBufferEnd()).
The sh_offset and sh_size fields can be ignored for SHT_NOBITS sections.
Fix the error by changing ELFObjectFile<ELFT>::getSectionContents to use
the file base for SHT_NOBITS sections.
Reviewed By: grimar, MaskRay
Differential Revision: https://reviews.llvm.org/D69192
Without this patch, when using lit's internal shell, if `not` on a lit
RUN line calls `env`, `diff`, or any of the other in-process shell
builtins that lit implements, lit accidentally searches for the latter
as an external executable. What's worse is that works fine when a
developer is testing on a platform where those executables are
available and behave as expected, but it then breaks on other
platforms.
`not` seems useful for some builtins, such as `diff`, so this patch
supports such uses. `not --crash` does not seem useful for builtins,
so this patch diagnoses such uses. In all cases, this patch ensures
shell builtins are found behind any sequence of `env` and `not`
commands.
`not` calling `env` calling an external command appears useful when
the `env` and external command are part of a lit substitution, as in
D65156. This patch supports that by looking through any sequence of
`env` and `not` commands, building the environment from the `env`s,
and storing the `not`s. The `not`s are then added back to the command
line without the `env`s to execute externally. This avoids the need
to replicate the `not` implementation, in particular the `--crash`
option, in lit.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D66531
Summary:
G_GEP is rather poorly named. It's a simple pointer+scalar addition and
doesn't support any of the complexities of getelementptr. I therefore
propose that we rename it. There's a G_PTR_MASK so let's follow that
convention and go with G_PTR_ADD
Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm
Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69734
addOperand() method of AMDGPU disassembler returns SoftFail
on error. All instances which may lead to that place are
an impossible encdoing, not something which is possible to
encode, but semantically incorrect as described for SoftFail.
Then tablegen generates a check of the following form:
if (Decode...(..) == MCDisassembler::Fail) { return MCDisassembler::Fail; }
Since we can only return Success and SoftFail that is dead
code as detected by the static code analyzer.
Solution: return Fail as it should be.
See https://bugs.llvm.org/show_bug.cgi?id=43886
Differential Revision: https://reviews.llvm.org/D69819
Summary:
This is largely based off of the slides from the keynote
Depends on D69545
Reviewers: volkan, rovka, arsenm
Subscribers: wdng, arphaman, Petar.Avramovic, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69644
Summary:
This patch factors out code to merge a basic block with its sole
successor -- partly for readability and partly to facilitate an
upcoming patch of my own.
Reviewers: wmi
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69852
Rewrite one of the invalid macho test input file with YAML file. The
original invalid macho is breaking our internal test infrastusture
because it is too broken to be copy around.
rdar://problem/56879982
--only-keep-debug produces a debug file as the output that only
preserves contents of sections useful for debugging purposes (the
binutils implementation preserves SHT_NOTE and non-SHF_ALLOC sections),
by changing their section types to SHT_NOBITS and rewritting file
offsets.
See https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html
The intended use case is:
```
llvm-objcopy --only-keep-debug a a.dbg
llvm-objcopy --strip-debug a b
llvm-objcopy --add-gnu-debuglink=a.dbg b
```
The current layout algorithm is incapable of deleting contents and
shrinking segments, so it is not suitable for implementing the
functionality.
This patch adds a new algorithm which assigns sh_offset to sections
first, then modifies p_offset/p_filesz of program headers. It bears a
resemblance to lld/ELF/Writer.cpp.
Reviewed By: jhenderson, jakehehrlich
Differential Revision: https://reviews.llvm.org/D67137
`llvm::objcopy:🧝:*Section::classof` matches Type and Flags, yet Type
and Flags are mutable (by setSectionFlagsAndTypes and upcoming
--only-keep-debug feature). Add OriginalType & OriginalFlags to be used
in classof, to prevent classof results from changing.
Reviewed By: jakehehrlich, jhenderson, alexshap
Differential Revision: https://reviews.llvm.org/D69739
Summary:
To drive the automaton we used a uint64_t as an action type. This
contained the transition's resource requirements as a conjunction:
(a OR b) AND (b OR c)
We encoded this conjunction as a sequence of four 16-bit bitmasks.
This limited the number of addressable functional units to 16, which
is quite low and has bitten many people in the past.
Instead, the DFAEmitter now generates a lookup table from InstrItinerary
class (index of the ItinData inside the ProcItineraries) to an internal
action index which is essentially a dense embedding of the conjunctive
form. Because we never materialize the conjunctive form, we no longer
have the 16 FU restriction.
In this patch we limit to 64 functional units due to using a uint64_t
bitmask in the DFAEmitter. Now that we've decoupled these representations
we can increase this in future.
Reviewers: ThomasRaoux, kparzysz, majnemer
Reviewed By: ThomasRaoux
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69110
This recommits 2be17087f8c38934b7fc9208ae6cf4e9b4d44f4b (reverted in
d3ec06d219788801380af1948c7f7ef9d3c6100b for heap-use-after-free) with a fix
in IAI's reset() which was not clearing the set of interleave groups after
deleting them.
When eliminating a pair of
`llvm.objc.autoreleaseReturnValue`
followed by
`llvm.objc.retainAutoreleasedReturnValue`
we need to make sure that the instructions in between are safe to
ignore.
Other than bitcasts and useless GEPs, it's also safe to ignore lifetime
markers for both static allocas (lifetime.start/lifetime.end) and dynamic
allocas (stacksave/stackrestore).
These get added by the inliner as part of the return sequence and can
prevent the transformation from happening in practice.
Differential Revision: https://reviews.llvm.org/D69833
Summary:
This patch factors out common code to update the SSA form in
JumpThreading.cpp -- partly for readability and partly to facilitate
an coming patch of my own.
Reviewers: wmi
Subscribers: hiraditya, jfb, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69811
This adds AA to Post-RA Machine Scheduling, allowing the pass more
freedom when handling memory operations.
My understanding is that this was just never done, not that it is
inherently incorrect to do so. The older PostRA List scheduler already
makes use of AA, it's just that the MI PostRA Scheduler was never taught
to use it.
Differential Revision: https://reviews.llvm.org/D69814
Summary:
Functions replaceStoreOfFPConstant() and OptimizeFloatStore() both
replace store of float by a store of an integer unconditionally. However
this generates wrong code when the store that is replaced is an indexed
or truncating store. This commit solves this issue by adding an early
return in these functions when the store being considered is not a
normal store.
Bug was only observed on out of tree targets, hence the lack of testcase
in this commit.
Reviewers: efriedma
Subscribers: hiraditya, arphaman, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D68420
This feature controls whether AA is used into the backend, and was
previously turned on for certain subtargets to help create less
constrained scheduling graphs. This patch turns it on for all
subtargets, so that they can all make use of the extra information to
produce better code.
Differential Revision: https://reviews.llvm.org/D69796
In the ARM backend, for historical reasons we have only some targets
using Machine Scheduling. The rest use the old list scheduler as they
are using itinaries and the list scheduler seems to produce better code
(and not crash running out of register on v6m codes). So whether to use
the MIScheduler or not is checked at runtime from the subtarget
features.
This is fine, except for post-ra scheduling. Whether to use the old
post-ra list scheduler or the post-ra machine schedule is decided as the
pass manager is set up, in arms case from a newly constructed subtarget.
Under some situations, like LTO, this won't include the correct cpu so
can pick the wrong option. This can have a surprising effect on
performance.
To fix that, this patch overrides targetSchedulesPostRAScheduling and
addPreSched2 in the ARM backend, adding _both_ post-ra schedulers and
picking at runtime which to execute. To pick between the two I've had to
add a enablePostRAMachineScheduler() method that normally returns
enableMachineScheduler() && enablePostRAScheduler(), which can be
overridden to enable just one of PostRAMachineScheduler vs
PostRAScheduler.
Thanks to David Penry for the identifying this problem.
Differential Revision: https://reviews.llvm.org/D69775
Summary:
That fold keeps growing and growing :(
I think this may be one of the last pieces for it.
Since D67677/D67725, the fold knowns the general form
of the pattern - where some masking is needed:
https://rise4fun.com/Alive/F5Rhttps://rise4fun.com/Alive/gslRa
But there is one more huge piece missing - if you are extracting some bits,
it is not impossible that the origin is wider than the extraction,
i.e. there may be a truncation. And we don't deal with that yet.
But we can, and the generalization remains fully identical:
https://rise4fun.com/Alive/Uarhttps://rise4fun.com/Alive/5SW
After a preparatory cleanup i think the diff looks rather clean.
One missing piece is that in some patterns (especially pat. b),
`-1` only needs to be `-1` in final type, but that is for later..
https://bugs.llvm.org/show_bug.cgi?id=42563
Reviewers: spatel, nikic
Reviewed By: spatel
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69125