The pattern matching failed to recognize all instances of "-1", because when
comparing against "-1" we didn't use an APInt of the same bitwidth.
This commit fixes this and also adds inverse versions of the conditon to catch
more cases.
llvm-svn: 222722
This handles cases where we are comparing a masked value against itself.
The analysis could be further improved by making it recursive but such
expense is not currently justified.
llvm-svn: 222716
The attn instruction is not part of the Power ISA, but is documented in the A2
user manual, and is accepted by the GNU assembler for the A2 and the POWER4+.
Reported as part of PR21650.
llvm-svn: 222712
This does not matter on newer cores (where we can use reciprocal estimates in
fast-math mode anyway), but for older cores this allows us to generate better
fast-math code where we have multiple FDIVs with a common divisor.
llvm-svn: 222710
We were matching against the assume intrinsic in every check. Since we know that it must be an assume, this is just wasted work. Somewhat surprisingly, matching an intrinsic id is actually relatively expensive. It devolves to a string construction and comparison in Function::isIntrinsic.
I originally spotted this because it showed up in a performance profile of my compiler. I've since discovered a separate issue which seems to be the actual root cause, but this is minor perf goodness regardless.
I'm likely to follow up with another change to factor out the comparison matching. There's no need to match the compare instruction in every single one of the tests.
Differential Revision: http://reviews.llvm.org/D6312
llvm-svn: 222709
Clarify the wording around !invariant.load to properly reflect the semantics of such loads with respect to control dependence and location lifetime. To the best of my knowledge, the revised wording respects the actual implementation and understanding of issues involved highlighted in the recent 'Optimization hints for "constant" loads' thread on LLVMDev.
In particular, I'm aiming for the following results:
- To clarify that an invariant.load can fault and must respect control dependence. In particular, it is not sound to unconditionally pull an invariant load out of a loop if that loop would potentially never execute.
- To clarify that the invariant nature of a given pointer does not preclude the modification of that location through a pointer which is unrelated to the load operand. In particular, initializing a location and then passing a pointer through an opaque intrinsic which produces a new unrelated pointer, should behave as expected provided that the intrinsic is memory dependent on the initializing store.
- To clarify that storing a value to an invariant location is defined. It can not, for example, be considered unreachable. The value stored can be assumed to be equal to the value of any previous (or following!) invariant load, but the store itself is defined.
I recommend that anyone interested in using !invariant.load, or optimizing for them, read over the discussion in the review thread. A number of motivating examples are discussed.
Differential Revision: http://reviews.llvm.org/D6346
llvm-svn: 222700
When processing an assignment in the integrated assembler that sets
a symbol to the value of another symbol, we need to copy the st_other
bits that encode the local entry point offset.
Modeled after MipsTargetELFStreamer::emitAssignment handling of the
ELF::STO_MIPS_MICROMIPS flag.
llvm-svn: 222672
We would create an instruction but not inserting it.
Not inserting the unused instruction would lead us to verification
failure.
This fixes PR21653.
llvm-svn: 222659
Fix JRADDIUSP instruction, remove delay slot flag because this instruction
doesn't have delay slot.
Differential Revision: http://reviews.llvm.org/D6365
llvm-svn: 222658
With the help of new method readInstruction16() two bytes are read and
decodeInstruction() is called with DecoderTableMicroMips16, if this fails
four bytes are read and decodeInstruction() is called with
DecoderTableMicroMips32.
Differential Revision: http://reviews.llvm.org/D6149
llvm-svn: 222648
This patch teaches function 'transformVSELECTtoBlendVECTOR_SHUFFLE' how to
convert VSELECT dag nodes to shuffles on targets that do not have SSE4.1.
On pre-SSE4.1 targets, we can still perform blend operations using movss/movsd.
Also, removed a target specific combine that performed a premature lowering of
VSELECT nodes to target specific MOVSS/MOVSD nodes.
llvm-svn: 222647
We tried to get the result of DataLayout::getLargestLegalIntTypeSize but
we didn't have a DataLayout. This resulted in opt crashing.
This fixes PR21651.
llvm-svn: 222645
Fill in omission of `cast_or_null<>` and `dyn_cast_or_null<>` for types
that wrap pointers (e.g., smart pointers).
Type traits need to be slightly stricter than for `cast<>` and
`dyn_cast<>` to resolve ambiguities with simple types.
There didn't seem to be any unit tests for pointer wrappers, so I tested
`isa<>`, `cast<>`, and `dyn_cast<>` while I was in there.
This only supports pointer wrappers with a conversion to `bool` to check
for null. If in the future it's useful to support wrappers without such
a conversion, it should be a straightforward incremental step to use the
`simplify_type` machinery for the null check. In that case, the unit
tests should be updated to remove the `operator bool()` from the
`pointer_wrappers::PTy`.
llvm-svn: 222644