x86_mmx is conceptually a vector already. Don't introduce an extra conversion between it and scalar i64.
I'm using VectorType::isValidElementType which checks for floating point, integer, and pointers to hopefully make this more readable than just blacklisting x86_mmx.
Differential Revision: https://reviews.llvm.org/D69964
Summary:
I need to make use of this pass from a driver program that isn't opt.
Therefore this patch moves this pass into the LLVM library so that it is
available for use elsewhere.
There was one function I kept in tools/opt which is exportDebugifyStats()
this is because it's serializing the statistics into a human readable
format and this seemed more in keeping with opt than a library function
Reviewers: vsk, aprantl
Subscribers: mgorny, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69926
Without this change, when a nested tag type of any kind (enum, class,
struct, union) is used as a variable type, it is emitted without
emitting the parent type. In CodeView, parent types point to their inner
types, and inner types do not point back to their parents. We already
walk over all of the parent scopes to build the fully qualified name.
This change simply requests their type indices as we go along to enusre
they are all emitted.
Fixes PR43905
Reviewers: akhuang, amccarth
Differential Revision: https://reviews.llvm.org/D69924
As the test cases show, we end up with an insert/extract and a
bitcast to/from i64. x86_mmx is for some purposes conceptually a
vector. We shouldn't be adding scalar conversions around it.
Since _m64 is defined as <1 x i64> and intrinsics use x86_mmx
as their input/output these extra scalar operations prevent
the X86 backend from generating good code especially on 32-bit
targets where i64 gets split.
Instcombiner pass was erasing trivially dead instruction without updating dependent llvm.dbg.value.
which was not showing programmer current state of variables while debugging.
As a part of this fix I did following,
Iterate throught all the users (llvm.dbg) of a instruction which is trivially dead and set each if them undef, Before deleting the instruction.
Now user will see optimized out, when try to print those variables.
This fixes
https://bugs.llvm.org/show_bug.cgi?id=43893
This is my first fix to llvm.
Patch by kamlesh kumar!
Differential Revision: https://reviews.llvm.org/D69809
Summary: The 'BB->getParent()' pointer was utilized before it was verified against nullptr. Check lines: 3567, 3581.
Reviewers: jyknight, RKSimon
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69751
This reverts commit c989993ba1a666f04f7aee7df51d9f4de0588b71.
maskray already fixed the explicit instantiation definition in the .cpp
file, and these extern template declarations seem to be causing
warnings that I don't understand.
I happen to be using clang-cl+lld-link locally, and I get these link
errors:
lld-link: error: undefined symbol: public: unsigned short __cdecl llvm::object::XCOFFSectionHeader<struct llvm::object::XCOFFSectionHeader64>::getSectionType(void) const
>>> referenced by C:\src\llvm-project\llvm\tools\llvm-readobj\XCOFFDumper.cpp:106
>>> tools\llvm-readobj\CMakeFiles\llvm-readobj.dir\XCOFFDumper.cpp.obj:(public: virtual void __cdecl `anonymous namespace'::XCOFFDumper::printSectionHeaders(void))
I suspect this is because the explicit template instaniation appears
before the inline method definitions in the .cpp file, so they aren't
available at the point of instantiation. Move the explicit instantiation
later.
Also, forward declare the explicit instantiation for good measure.
shift (logic (shift X, C0), Y), C1 --> logic (shift X, C0+C1), (shift Y, C1)
This is an IR translation of an existing SDAG transform added here:
rL370617
So we again have 9 possible patterns with a commuted IR variant of each pattern:
https://rise4fun.com/Alive/VlIhttps://rise4fun.com/Alive/n1mhttps://rise4fun.com/Alive/1Vn
Part of the motivation is to allow easier recognition and subsequent
canonicalization of bswap patterns as discussed in PR43146:
https://bugs.llvm.org/show_bug.cgi?id=43146
We had to delay this transform because it used to allow the SLP vectorizer
to create awful reductions out of simple load-combines.
That problem was fixed with:
rL375025
(we'll bring back load combining in IR someday...)
The backend is also better equipped to deal with these patterns now
using hooks like TLI.getShiftAmountThreshold().
The only remaining potential controversy is that the -reassociate pass
tends to reverse this kind of pattern (to help GVN?). But since -reassociate
doesn't do anything with these specific patterns, there is no conflict currently.
Finally, there's a new pass proposal at D67383 for general tree-height-reduction
reassociation, and it could use a cost model to decide how to optimally rearrange
these kinds of ops for a target. That patch appears to be stalled.
Differential Revision: https://reviews.llvm.org/D69842
SUMMARY:
According to https://reviews.llvm.org/D68575#inline-617586, Create a NFC patch for it.
Using crtp to refactor the xcoff section header
Move the define of SectionFlagsReservedMask and SectionFlagsTypeMask from XCOFFDumper.cpp to XCOFFObjectFile.h
Reviewers: hubert.reinterpretcast,jasonliu
Subscribers: rupprecht, seiyai,hiraditya
Differential Revision: https://reviews.llvm.org/D69131
Add options to control floating point behavior: trapping and
exception behavior, rounding, and control of optimizations that affect
floating point calculations. More details in UsersManual.rst.
Reviewers: rjmccall
Differential Revision: https://reviews.llvm.org/D62731
Patch allows importing declarations of functions and variables, referenced
by the initializer of some other readonly variable.
Differential revision: https://reviews.llvm.org/D69561
We have a vector compare reduction problem seen in PR39665 comment 2:
https://bugs.llvm.org/show_bug.cgi?id=39665#c2
Or slightly reduced here:
define i1 @cmp2(<2 x double> %a0) {
%a = fcmp ogt <2 x double> %a0, <double 1.0, double 1.0>
%b = extractelement <2 x i1> %a, i32 0
%c = extractelement <2 x i1> %a, i32 1
%d = and i1 %b, %c
ret i1 %d
}
SLP would not attempt to turn this into a vector reduction because there is an
artificial lower limit on that transform. We can not completely remove that limit
without inducing regressions though, so this patch just hacks an extra attempt at
creating a 2-way reduction to the end of the analysis.
As shown in the test file, we are still not getting some of the motivating cases,
so follow-on patches will be needed to solve those cases.
Differential Revision: https://reviews.llvm.org/D59710
`saa` and `saad` are 32-bit and 64-bit store atomic add instructions.
memory[base] = memory[base] + rt
These instructions are available for "Octeon+" CPU. The patch adds support
for both instructions to MIPS assembler and diassembler and introduces new
CPU type - "octeon+".
Next patches will implement `.set arch=octeon+` directive and `AFL_EXT_OCTEONP`
ISA extension flag support.
Differential Revision: https://reviews.llvm.org/D69849
It broke Chromium, causing "Instruction does not dominate all uses!" errors.
See https://bugs.chromium.org/p/chromium/issues/detail?id=1022297#c1 for a
reproducer.
> If the recurrence PHI node has a single user, we can sink any
> instruction without side effects, given that all users are dominated by
> the instruction computing the incoming value of the next iteration
> ('Previous'). We can sink instructions that may cause traps, because
> that only causes the trap to occur later, but not on any new paths.
>
> With the relaxed check, we also have to make sure that we do not have a
> direct cycle (meaning PHI user == 'Previous), which indicates a
> reduction relation, which potentially gets missed by
> ReductionDescriptor.
>
> As follow-ups, we can also sink stores, iff they do not alias with
> other instructions we move them across and we could also support sinking
> chains of instructions and multiple users of the PHI.
>
> Fixes PR43398.
>
> Reviewers: hsaito, dcaballe, Ayal, rengolin
>
> Reviewed By: Ayal
>
> Differential Revision: https://reviews.llvm.org/D69228
This caused Chromium builds to fail with "inlinable function call in a function
with debug info must have a !dbg location" errors. See
https://bugs.chromium.org/p/chromium/issues/detail?id=1022296#c1 for a
reproducer.
> Currently, clang emits subprograms for declared functions when the
> target debugger or DWARF standard is known to support entry values
> (DW_OP_entry_value & the GNU equivalent).
>
> Treat DW_AT_tail_call the same way to allow debuggers to follow cross-TU
> tail calls.
>
> Pre-patch debug session with a cross-TU tail call:
>
> ```
> * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
> frame #1: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> Post-patch (note that the tail-calling frame, "helper", is visible):
>
> ```
> * frame #0: 0x0000000100000fa4 main`target at b.c:4:3 [opt]
> frame #1: 0x0000000100000f80 main`helper [opt] [artificial]
> frame #2: 0x0000000100000f99 main`main at a.c:8:10 [opt]
> ```
>
> rdar://46577651
>
> Differential Revision: https://reviews.llvm.org/D69743
It converts binary contents of .hash and .gnu.hash that were generated by a linker
to YAML descriptions.
I've also dropped Shift2 and BloomFilter values because they are not needed here.
Differential revision: https://reviews.llvm.org/D69881
Summary:
When adjusting function entry counts after inlining, Funciton::setEntryCount is called without providing an import function list. The side effect of that is the previously set import function list will be dropped. The import function list is used by ThinLTO to help import hot cross module callee for LTO inlining, so dropping that during ThinLTO pre-link may adversely affect LTO inlining. The fix is to keep the list while updating entry counts for inlining.
Reviewers: wmi, davidxl, tejohnson
Subscribers: mehdi_amini, hiraditya, dexonsmith, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D69736