As we emit different linetables format on different operating
systems, this currently fails on linux. Speculative commit
to fix the bots.
llvm-svn: 319997
dsymutil doesn't yet understand the new format and the change,
among others, breaks a large fraction of the debugger tests on
mac OS.
rdar://problem/35856354
llvm-svn: 319995
This extends r319391. It teaches the segment builder to emit the right
completed segment when more than one region ends at the same location.
Fixes PR35495.
llvm-svn: 319990
Instead of having .o files contain linear-memory and function table
definitions, use imports. This is more consistent with the stack pointer
being imported, and it's consistent with the linker being the one to
decide whether linear memory and function table are imported or defined
in the linked output. This implements tool-conventions #23.
Differential Revision: https://reviews.llvm.org/D40875
llvm-svn: 319989
Summary:
This patch adds MachineCombiner patterns for transforming
(fsub (fmul x y) z) into (fma x y (fneg z)). This has a lower
latency on micro architectures where fneg is cheap.
Patch based on work by George Steed.
Reviewers: rengolin, joelkevinjones, joel_k_jones, evandro, efriedma
Reviewed By: evandro
Subscribers: aemerson, javed.absar, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D40306
llvm-svn: 319980
As a new access is generated spanning across multiple fields, we need to
propagate alias info from all the fields to form the most generic alias info.
rdar://35602528
Differential Revision: https://reviews.llvm.org/D40617
llvm-svn: 319979
This patch factors out the main code transformation utilities in the pgo-driven
indirect call promotion pass and places them in Transforms/Utils. The change is
intended to be a non-functional change, letting non-pgo-driven passes share a
common implementation with the existing pgo-driven pass.
The common utilities are used to conditionally promote indirect call sites to
direct call sites. They perform the underlying transformation, and do not
consider profile information. The pgo-specific details (e.g., the computation
of branch weight metadata) have been left in the indirect call promotion pass.
Differential Revision: https://reviews.llvm.org/D40658
llvm-svn: 319963
Summary:
When calculating the RootLatency, we add up all the latencies of the
deleted instructions. But for NewRootLatency we only add the latency of
the new root instructions, ignoring the latencies of the other
instructions inserted. This leads the combiner to underestimate the cost
of patterns which add multiple instructions. This patch fixes that by
summing up the latencies of all new instructions. For NewRootNode, the
more complex getLatency function is used.
Note that we may be slightly more precise than just summing up
all latencies. For example, consider a pattern like
r1 = INS1 ..
r2 = INS2 ..
r3 = INS3 r1, r2
I think in some other places, the total latency of the pattern would be
estimated as lat(INS3) + max(lat(INS1), lat(INS2)). If you consider
that worth changing, I think it would be best to do in a follow-up
patch.
Reviewers: Gerolf, sebpop, spop, fhahn
Reviewed By: fhahn
Subscribers: evandro, llvm-commits
Differential Revision: https://reviews.llvm.org/D40307
llvm-svn: 319951
Summary:
There is no need to replace the original call instruction if no
VarArgs need to be forwarded.
Reviewers: davide, rnk, majnemer, efriedma
Reviewed By: efriedma
Subscribers: eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D40412
llvm-svn: 319947
Patch by David Major.
The NSS project's .def files make heavy use of semicolons in a
frightening attempt at portability:
https://hg.mozilla.org/projects/nss/raw-file/tip/lib/ckfw/capi/nsscapi.def
lld-link was treating the semicolon as part of the export name,
resulting in unresolved symbols. This patch includes ';' in the list of
characters to split on.
Differential Revision: https://reviews.llvm.org/D39968
llvm-svn: 319933
Previously the lambda for AVX512 passed out a flag that indicated whether AVX512BW was required and that was checked against the AVX512BW subtarget flag outside.
This patch changes the interface to pass the AVX512BW subtarget bit in and return its value if we detect 16 or 8 bit types.
llvm-svn: 319919
The header include was required to work around PR19898, as noted in that
comment. That PR has since been marked resolved fixed, and the
configuration check passes without the header inclusion both when
compiling on Windows with cl and when cross-compiling on Linux using
clang-cl.
I noticed this because the inclusion was cased incorrectly (Intrin.h
instead of intrin.h), which when cross-compiling on a case sensitive
file system would cause the intrin.h from the Windows SDK to be included
(which LLVM can't handle) instead of the one from clang's resource
directory, making the check fail. This is the same issue as r309980.
Correcting the case of the inclusion makes the check pass when cross
compiling, but it seems better to get rid of the inclusion entirely,
since it appears to be unnecessary now.
Differential Revision: https://reviews.llvm.org/D40910
llvm-svn: 319917
Summary:
An undef extract index can be arbitrarily chosen to be an
out-of-range index value, which would result in the instruction being undef.
This change closes a gap identified while working on lowering vector permute intrinsics
with variable index vectors to pure LLVM IR.
Reviewers: arsenm, spatel, majnemer
Reviewed By: arsenm, spatel
Subscribers: fhahn, nhaehnle, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D40231
llvm-svn: 319910
Tabort (transaction abort) does not load from memory.
mayLoad flag removed from corresponding TABORT machine instruction.
Review: Ulrich Weigand
llvm-svn: 319905
Most likely, this is not how we want to handle this in the long term. This
code should probably be in the Swift repo and somehow plugged into the
opt-viewer. This is still however very experimental at this point so I don't
want to over-engineer it at this point.
llvm-svn: 319902
they can be overridden when cross compiling.
Summary:
Since CROSS_TOOLCHAN_FLAGS can set CMAKE_(C|CXX)_COMPILER
variables, move the compiler variables up front so they can be
overridden.
This is a followup to https://reviews.llvm.org/D40229 committed in rL319620.
Thanks to Pavel Labath for reporting this issue.
Reviewers: labath, beanz
Subscribers: mgorny, llvm-commits
Differential Revision: https://reviews.llvm.org/D40896
llvm-svn: 319898
Csmith discovered a program that caused wrong code generation with -O0:
When handling a SIGN_EXTEND in expandRxSBG(), RxSBG.BitSize may be less than
the Input width (if a truncate was previously traversed), so maskMatters()
should be called with a masked based on the width of the sign extend result
instead.
Review: Ulrich Weigand
llvm-svn: 319892