When -ffast-math is in effect (on Linux, at least), clang defines
__FINITE_MATH_ONLY__ > 0 when including <math.h>. This causes the
preprocessor to include <bits/math-finite.h>, which renames the sqrt functions.
For instance, "sqrt" is renamed as "__sqrt_finite".
This patch adds the 3 new names in such a way that they will be treated
as equivalent to their respective original names.
llvm-svn: 182739
isConsecutiveLS is a slightly more general form of
SelectionDAG::isConsecutiveLoad. Aside from also handling stores, it also does
not assume equality of the chain operands is necessary. In the case of the PPC
backend, this chain condition is checked in a more general way by the
surrounding code.
Mostly, this part of the refactoring in preparation for supporting optimized
unaligned stores.
llvm-svn: 182723
When expanding unaligned Altivec loads, we use the decremented offset trick to
prevent page faults. Unfortunately, if we have a sequence of consecutive
unaligned loads, this leads to suboptimal code generation because the 'extra'
load from the first unaligned load can be combined with the base load from the
second (but only if the decremented offset trick is not used for the first).
Search up and down the chain, through loads and token factors, looking for
consecutive loads, and if one is found, don't use the offset reduction trick.
These duplicate loads are later combined to yield the desired sequence (in the
future, we might want a more-powerful chain search, but that will require some
changes to allow the combiner routines to access the AA object).
This should complete the initial implementation of the optimized unaligned
Altivec load expansion. There is some refactoring that should be done, but
that will happen when the unaligned store expansion is added.
llvm-svn: 182719
reject things like: "for (auto Entry : SomeStringMap)". Previously
this would copy the value but not the tail allocated string data
(the key).
llvm-svn: 182713
The lvsl permutation control instruction is a function only of the alignment of
the pointer operand (relative to the 16-byte natural alignment of Altivec
vectors). As a result, multiple lvsl intrinsics where the operands differ by a
multiple of 16 can be combined.
llvm-svn: 182708
Use a field in the SelectionDAGNode object to track its IR ordering.
This adds fields and utility classes without changing existing
interfaces or functionality.
llvm-svn: 182701
Altivec only directly supports aligned loads, but the loads have a strange
property: If given an unaligned address, they truncate the address to the next
lower aligned address, and load from there. This property, along with an extra
load and some special-purpose permutation-control instructions that generate
the appropriate permutations from the original unaligned address, allow
efficient lowering of aligned loads. This code uses the trick explained in the
Apple Velocity Engine optimization overview document to prevent the needed
extra load from possibly causing a page fault if the original address happens
to be aligned.
As noted in the FIXMEs, there are several additional optimizations that can be
performed to reduce the cost of these loads even more. These will be
implemented in future commits.
llvm-svn: 182691
Previously, an invalid instruction like:
foo %r1, %r0
would generate the rather odd error message:
....: error: unknown token in expression
foo %r1, %r0
^
We now get the more informative:
....: error: invalid instruction
foo %r1, %r0
^
The same would happen if an address were used where a register was expected.
We now get "invalid operand for instruction" instead.
llvm-svn: 182644
The idea is to make sure that:
(1) "register expected" is restricted to cases where ParseRegister()
is called and the token obviously isn't a register.
(2) "invalid register" is restricted to cases where a register-like "%..."
sequence is found, but the "..." makes no sense.
(3) the generic "invalid operand for instruction" is used in cases where
the wrong register type is used (GPR instead of FPR, etc.).
(4) the new "invalid register pair" is used if the register has the right type,
but is not a valid register pair.
Testing of (1)-(3) is now restricted to regs-bad.s. It uses a representative
instruction for each register class to make sure that only registers from
that class are accepted.
(4) is tested by both regs-bad.s (which checks all invalid register pairs)
and insn-bad.s (which tests one invalid pair for each instruction that
requires a pair).
While there, I changed "Number" to "Num" for consistency with the
operand class.
llvm-svn: 182643
as the BinaryOperator, *not* in the block where the IRBuilder is currently
inserting into. Fixes a bug where scalarizePHI would create instructions
that would not dominate all uses.
llvm-svn: 182639
Other than recognizing the attribute, the patch does little else.
It changes the branch probability analyzer so that edges into
blocks postdominated by a cold function are given low weight.
Added analysis and code generation tests. Added documentation for the
new attribute.
llvm-svn: 182638
There was exactly one caller using this API right, the others were relying on
specific behavior of the default implementation. Since it's too hard to use it
right just remove it and standardize on the default behavior.
Defines away PR16132.
llvm-svn: 182636
In these builds, the asserts() are completely compiled out of the code
leaving "End" unused. Directly accessing it, should not have a
performance impact, as it is just a data member.
llvm-svn: 182634
This patch builds on some existing code to do CFG reconstruction from
a disassembled binary:
- MCModule represents the binary, and has a list of MCAtoms.
- MCAtom represents either disassembled instructions (MCTextAtom), or
contiguous data (MCDataAtom), and covers a specific range of addresses.
- MCBasicBlock and MCFunction form the reconstructed CFG. An MCBB is
backed by an MCTextAtom, and has the usual successors/predecessors.
- MCObjectDisassembler creates a module from an ObjectFile using a
disassembler. It first builds an atom for each section. It can also
construct the CFG, and this splits the text atoms into basic blocks.
MCModule and MCAtom were only sketched out; MCFunction and MCBB were
implemented under the experimental "-cfg" llvm-objdump -macho option.
This cleans them up for further use; llvm-objdump -d -cfg now generates
graphviz files for each function found in the binary.
In the future, MCObjectDisassembler may be the right place to do
"intelligent" disassembly: for example, handling constant islands is just
a matter of splitting the atom, using information that may be available
in the ObjectFile. Also, better initial atom formation than just using
sections is possible using symbols (and things like Mach-O's
function_starts load command).
This brings two minor regressions in llvm-objdump -macho -cfg:
- The printing of a relocation's referenced symbol.
- An annotation on loop BBs, i.e., which are their own successor.
Relocation printing is replaced by the MCSymbolizer; the basic CFG
annotation will be superseded by more related functionality.
llvm-svn: 182628