A function can be marked as norecurse if:
* The SCC to which it belongs has cardinality 1; and either
a) It does not call any non-norecurse function. This includes self-recursion; or
b) It only has one callsite and the function that callsite is within is marked norecurse.
a) is best propagated bottom-up and b) is best propagated top-down.
We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down.
llvm-svn: 252862
This encompasses several changes which are all interconnected:
- Use the MC framework for printing almost all instructions.
- AsmStrings are now live.
- This introduces an indirection between LLVM vregs and WebAssembly registers,
and a new pass, WebAssemblyRegNumbering, for computing a basic the mapping.
This addresses some basic issues with argument registers and unused registers.
- The way ARGUMENT instructions are handled no longer generates redundant
get_local+set_local for every argument.
This also changes the assembly syntax somewhat; most notably, MC's printing
use sigils on label names, so those are no longer present, and push/pop now
have a sigil to keep them unambiguous.
The usage of set_local/get_local/$push/$pop will continue to evolve
significantly. This patch is just one step of a larger change.
llvm-svn: 252858
Summary:
The option outputs statistics in CSV format preceded by 1 header line.
This is intended for machine processing of the output.
-verbosity=0 should likely be set.
Differential Revision: http://reviews.llvm.org/D14600
llvm-svn: 252856
- Factor out code to query and modify the sign bit of a floatingpoint
value as an integer. This also works if none of the targets integer
types is big enough to hold all bits of the floatingpoint value.
- Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available,
otherwise perform bit manipulation on the sign bit. The previous code
used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also
takes 34 instructions on ARM Cortex-M4. With this patch we only
require 5:
vldr d0, LCPI0_0
vmov r2, r3, d0
lsrs r2, r3, #31
bfi r1, r2, #31, #1
bx lr
(This could be further improved if the compiler would recognize that
r2, r3 is zero).
- Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is
available otherwise perform bit manipulation on the sign bit.
- Perform the sign(x) test by masking out the sign bit and comparing
with 0 rather than shifting the sign bit to the highest position and
testing for "<s 0". For x86 copysignl (on 80bit values) this gets us:
testl $32768, %eax
rather than:
shlq $48, %rax
sets %al
testb %al, %al
Differential Revision: http://reviews.llvm.org/D11172
llvm-svn: 252839
First create a list of candidates, then transform. This simplifies the code in
that you have don't have to worry that you may be using an invalidated
iterator.
Previously, each time we created a memset/memcpy we would reevaluate the entire
loop potentially resulting in lots of redundant work for large basic blocks.
llvm-svn: 252817
When working with tokens, it is often the case that one has instructions
which consume a token and produce a new token. Currently, we have no
mechanism to represent an initial token state.
Instead, we can create a notional "empty token" by inventing a new
constant which captures the semantics we would like. This new constant
is called ConstantTokenNone and is written textually as "token none".
Differential Revision: http://reviews.llvm.org/D14581
llvm-svn: 252811
Summary:
This change introduces the notion of "deoptimization" operand bundles.
LLVM can recognize and optimize these in more precise ways than it can a
generic "unknown" operand bundles.
The current form of this special recognition / optimization is an enum
entry in LLVMContext, a LangRef blurb and a verifier rule. Over time we
will teach LLVM to do more aggressive optimization around deoptimization
operand bundles, exploiting known facts about kinds of state
deoptimization operand bundles are allowed to track.
Reviewers: reames, majnemer, chandlerc, dexonsmith
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14551
llvm-svn: 252806
This is a step towards consolidating some of the information regarding
attributes in a single place.
This patch moves the enum attributes in Attributes.h to the table-gen
file. Additionally, it adds definitions of target independent string
attributes that will be used in follow-up commits by the inliner to
check attribute compatibility.
rdar://problem/19836465
llvm-svn: 252796
This is a follow-up from the previous discussion on the thread:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151019/307763.html
The LibLTO lto_get_error_message() API reads error messages from a std::string
sLastErrorString. Instead of passing this string around as an argument, this
patch creates a diagnostic handler and then sends this handler to the
constructor of LTOCodeGenerator.
Differential Revision: http://reviews.llvm.org/D14313
llvm-svn: 252791
Summary:
Don't fold
(zext (and (load x), cst)) -> (and (zextload x), (zext cst))
if
(and (load x) cst)
will match as a zextload already and has additional users.
For example, the following IR:
%load = load i32, i32* %ptr, align 8
%load16 = and i32 %load, 65535
%load64 = zext i32 %load16 to i64
store i32 %load16, i32* %dst1, align 4
store i64 %load64, i64* %dst2, align 8
used to produce the following aarch64 code:
ldr w8, [x0]
and w9, w8, #0xffff
and x8, x8, #0xffff
str w9, [x1]
str x8, [x2]
but with this change produces the following aarch64 code:
ldrh w8, [x0]
str w8, [x1]
str x8, [x2]
Reviewers: resistor, mcrosier
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14340
llvm-svn: 252789
Summary: Other personalities don't use this special frame slot.
Reviewers: majnemer, andrew.w.kaylor, rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D14580
llvm-svn: 252778
This patch adds DWARF values for the Delphi language and Borland C++
language extensions.
Reviewed by: dblaikie
Subscribers: llvm-commits, majnemer
Differential Revision: http://reviews.llvm.org/D14522
llvm-svn: 252776
Summary: Inlined callsites need to be emitted in debug info so that sample profile can be annotated to the correct inlined instance.
Reviewers: dnovillo, dblaikie
Subscribers: dblaikie, llvm-commits
Differential Revision: http://reviews.llvm.org/D14511
llvm-svn: 252768
The discriminators pass relied on the presence of llvm.dbg.cu to decide
whether to add discriminators, but this fails in the case where debug
info is only enabled partially when -fprofile-sample-use is active.
The reason llvm.dbg.cu is not present in these cases is to prevent
codegen from emitting debug info (as it is only used for the sample
profile pass).
This changes the discriminators pass to also emit discriminators even
when debug info is not being emitted.
llvm-svn: 252763
MIPS32 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any MIPS32
implementation.
The net result of allowing this speculation for the regression tests in this patch is
that we get this code:
ctlz:
jr $ra
clz $2, $4
cttz:
addiu $1, $4, -1
not $2, $4
and $1, $2, $1
clz $1, $1
addiu $2, $zero, 32
jr $ra
subu $2, $2, $1
Instead of:
ctlz:
beqz $4, $BB0_2
addiu $2, $zero, 32
clz $2, $4
$BB0_2:
jr $ra
nop
cttz:
beqz $4, $BB1_2
addiu $2, $zero, 32
addiu $1, $4, -1
not $2, $4
and $1, $2, $1
clz $1, $1
addiu $2, $zero, 32
subu $2, $2, $1
$BB1_2:
jr $ra
nop
See D14469 for the larger motivation.
Differential Revision: http://reviews.llvm.org/D14500
llvm-svn: 252755
The committer didn't respond at http://reviews.llvm.org/D14338, so we've got to fix this for them.
This test doesn't pass with thumbv6, so I suppose what they meant is thumbv7.
llvm-svn: 252754
Summary:
Only DSPr2 is present because it appears we've never added DSPr1 tests.
We'll have to correct that in a later patch.
Reviewers: vkalintiris
Subscribers: llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D14448
llvm-svn: 252752
I missed the side-effects of ParseBFI in my previous attempt (r252748).
Thanks dblaikie for the suggestion of adding a void use of the unused
variable instead.
llvm-svn: 252751
Summary:
This patch adds a new CMake module for working with ExternalProjects. This wrapper for ExternalProject supports using just-built tools and can hook up dependencies properly so that projects get cleared out.
The example usage here is for the llvm test-suite. In this example, the test-suite is setup as dependent on clang and lld if they are in-tree. If the clang or lld binaries change the test-suite is re-configured, cleaned, and rebuilt.
This cleanup and abstraction wrapping ExternalProject can be extended and applied to other runtime libraries like compiler-rt and libcxx.
Reviewers: samsonov, jroelofs, rengolin, jmolloy
Subscribers: jmolloy, llvm-commits
Differential Revision: http://reviews.llvm.org/D14513
llvm-svn: 252747