fsub X, +0 ==> X
fsub X, -0 ==> X, when we know X is not -0
fsub +/-0.0, (fsub -0.0, X) ==> X
fsub nsz +/-0.0, (fsub +/-0.0, X) ==> X
fsub nnan ninf X, X ==> 0.0
fadd nsz X, 0 ==> X
fadd [nnan ninf] X, (fsub [nnan ninf] 0, X) ==> 0
where nnan and ninf have to occur at least once somewhere in this expression
fmul X, 1.0 ==> X
llvm-svn: 169940
m_ConstantFP - match and bind a float constant
m_SpecificConstantFP - match a specific floating point value or vector of floats of that value
m_FPOne - match a floating point 1.0 or vector of 1.0s
m_NegZero - match -0.0
m_AnyZero - match 0 or -0.0
llvm-svn: 169939
ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined
term for something like integer immediate materialization. It is always possible
to materialize an integer immediate. Whether to use it for memcpy expansion is
more a "cost" conceern.
llvm-svn: 169929
Given a thread-local symbol x with global-dynamic access, the generated
code to obtain x's address is:
Instruction Relocation Symbol
addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x
addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x
bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x
R_PPC64_REL24 __tls_get_addr
nop
<use address in r3>
The implementation borrows from the medium code model work for introducing
special forms of ADDIS and ADDI into the DAG representation. This is made
slightly more complicated by having to introduce a call to the external
function __tls_get_addr. Using the full call machinery is overkill and,
more importantly, makes it difficult to add a special relocation. So I've
introduced another opcode GET_TLS_ADDR to represent the function call, and
surrounded it with register copies to set up the parameter and return value.
Most of the code is pretty straightforward. I ran into one peculiarity
when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like
BL8_NOP_ELF except that it takes another parameter to represent the symbol
("x" above) that requires a relocation on the call. Something in the
TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated
identically during the emit phase, so this second operand was never
visited to generate relocations. This is the reason for the slightly
messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding().
Two new tests are included to demonstrate correct external assembly and
correct generation of relocations using the integrated assembler.
Comments welcome!
Thanks,
Bill
llvm-svn: 169910
because that method is only getting called for MCInstFragment. These
fragments aren't even generated when RelaxAll is set, which is why the
flag reference here is superfluous. Removing it simplifies the code
with no harmful effects.
An assertion is added higher up to make sure this path is never
reached.
llvm-svn: 169886
Since now we have an autogenerated TOC, a manually written table of all passes
was removed.
Patch by Anthony Mykhailenko with small fixes by me.
llvm-svn: 169867
Use explicitely aligned store and load instructions to deal with argument and
retval shadow. This matters when an argument's alignment is higher than
__msan_param_tls alignment (which is the case with __m128i).
llvm-svn: 169859
instead of the instruction. I've left a forwarding wrapper for the
instruction so users with the instruction don't need to create
a GEPOperator themselves.
This lets us remove the copy of this code in instsimplify.
I've looked at most of the other copies of similar code, and this is the
only one I've found that is actually exactly the same. The one in
InlineCost is very close, but it requires re-mapping non-constant
indices through the cost analysis value simplification map. I could add
direct support for this to the generic routine, but it seems overly
specific.
llvm-svn: 169853
the GEP instruction class.
This is part of the continued refactoring and cleaning of the
infrastructure used by SROA. This particular operation is also done in
a few other places which I'll try to refactor to share this
implementation.
llvm-svn: 169852
Accordingly, add helper funtions getSimpleValueType (in parallel to
getValueType) in SDValue, SDNode, and TargetLowering.
This is the first, in a series of patches.
llvm-svn: 169837
try to reduce the width of this load, and would end up transforming:
(truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
to
(truncate (zextload i32 <ptr+4> as i64) to i32)
We lost the sext attached to the load while building the narrower i32
load, and replaced it with a zext because lshr always zext's the
results. Instead, bail out of this combine when there is a conflict
between a sextload and a zext narrowing. The rest of the DAG combiner
still optimize the code down to the proper single instruction:
movswl 6(...),%eax
Which is exactly what we wanted. Previously we read past the end *and*
missed the sign extension:
movl 6(...), %eax
llvm-svn: 169802