Summary: This way we can get rid of one of the fields in the .def file.
Reviewers: llvm-commits
Subscribers: zturner
Differential Revision: http://reviews.llvm.org/D20251
llvm-svn: 269461
For BITREVERSE, bit shifting/masking every bit in a vector element is a very lengthy procedure.
If the input vector type is a whole multiple of bytes wide then we can split this into a BSWAP shuffle stage (to reverse at the byte level) and then a BITREVERSE stage applied to each byte. Most vector capable targets can efficiently BSWAP using shuffles resulting in a considerable reduction in instructions.
With this patch targets would only need to implement a target specific vXi8 BITREVERSE implementation to efficiently reverse most legal vector types.
Differential Revision: http://reviews.llvm.org/D19978
llvm-svn: 269290
Currently cost based loop rotation algo can only be turned on with
two conditions: the function has real profile data, and -precise-rotation-cost
flag is turned on. This is not convenient for developers to experiment
when profile is not available. Add a new option to force the new
rotation algorithm -force-precise-rotation-cost
llvm-svn: 269266
This is to fix the bug in https://llvm.org/bugs/show_bug.cgi?id=27612.
When spill is hoisted to a BB with landingpad successor, and if the VNI
of the spill reg lives into the landingpad successor, the spill should be
inserted before the call which may throw exception. InsertPointAnalysis
is used to compute the safe insert point.
http://reviews.llvm.org/D20027 is a preparing patch for this patch.
Differential Revision: http://reviews.llvm.org/D19884.
llvm-svn: 269249
InsertPointAnalysis.
Because both split and spill hoisting want to use LastSplitPoint computation
result, extract the LastSplitPoint computation from SplitAnalysis class which
also contains a bunch of other analysises only related to split.
Differential Revision: http://reviews.llvm.org/D20027.
llvm-svn: 269248
It's awkward to force callers of SelectNodeTo to figure out whether
the node was morphed or CSE'd. Update uses here instead of requiring
callers to (sometimes) do it.
llvm-svn: 269235
This means SelectCode unconditionally returns nullptr now. I'll follow
up with a change to make that return void as well, but it seems best
to keep that one very mechanical.
This is part of the work to have Select return void instead of an
SDNode *, which is in turn part of llvm.org/pr26808.
llvm-svn: 269136
Usually subregister definitions are consider uses of the remaining
lanes that did not get defined. Add a comment why the code in
ScheduleDAGInstrs does not add use dependencies regardless.
llvm-svn: 269107
for the same subprogram.
This fixes a bug where DW_AT_abstract_origin is being emitted twice for
the same subprogram if a function is both inlined and emitted in the same
translation unit, by restoring the pre-r266446 behavior.
http://reviews.llvm.org/D20072
llvm-svn: 269103
Summary:
While setting kill flags on instructions inside a BUNDLE, we bail out as soon
as we set kill flag on a register. But we are missing a check when all the
registers already have the correct kill flag set. We need to bail out in that
case as well.
This patch refactors the old code and simply makes use of the addRegisterKilled
function in MachineInstr.cpp in order to determine whether to set/remove kill
on an instruction.
Reviewers: apazos, t.p.northover, pete, MatzeB
Subscribers: MatzeB, davide, llvm-commits
Differential Revision: http://reviews.llvm.org/D17356
llvm-svn: 269092
An example from Hexagon where things went wrong:
%R0<def> = L2_loadrigp <ga:@fp04> ; load function address
J2_callr %R0<kill>, ..., %R0<imp-def> ; call *R0, return value in R0
ScheduleDAGInstrs::buildSchedGraph would visit all instructions going
backwards, and in each instruction it would visit all operands in their
order on the operand list. In the case of this call, it visited the use
of R0 first, then removed it from the set Uses after it visited the def.
This caused the DAG to be missing the data dependence edge on R0 between
the load and the call.
Differential Revision: http://reviews.llvm.org/D20102
llvm-svn: 269076
Currently, SelectionDAG assumes 8/16-bit cmpxchg returns either a sign
extended result, or a zero extended result. SystemZ takes a third
option by returning junk in the high bits (rotated contents of the other
bytes in the memory word). In that case, don't use Assert*ext, and
zero-extend the result ourselves if a comparison is needed.
Differential Revision: http://reviews.llvm.org/D19800
llvm-svn: 269075
SystemZ (and probably other targets as well) can fold a memory operand
by changing the opcode into a new instruction that as a side-effect
also clobbers the CC-reg.
In order to do this, liveness of that reg must first be checked. When
LIS is passed, getRegUnit() can be called on it and the right
LiveRange is computed on demand.
Reviewed by Matthias Braun.
http://reviews.llvm.org/D19861
llvm-svn: 269026
We now use LiveRangeCalc::extendToUses() instead of a specially designed
algorithm in constructMainRangeFromSubranges():
- The original motivation for constructMainRangeFromSubranges() were
differences between the main liverange and subranges because of hidden
dead definitions. This case however cannot happen anymore with the
DetectDeadLaneMasks pass in place.
- It simplifies the code.
- This fixes a longstanding bug where we did not properly create new SSA
values on merging control flow (the MachineVerifier missed most of
these cases).
- Move constructMainRangeFromSubranges() to LiveIntervalAnalysis and
LiveRangeCalc to better match the implementation/available helper
functions.
llvm-svn: 269016
Move the register stackification and coloring passes to run very late, after
PEI, tail duplication, and most other passes. This means that all code emitted
and expanded by those passes is now exposed to these passes. This also
eliminates the need for prologue/epilogue code to be manually stackified,
which significantly simplifies the code.
This does require running LiveIntervals a second time. It's useful to think
of these late passes not as late optimization passes, but as a domain-specific
compression algorithm based on knowledge of liveness information. It's used to
compress the code after all conventional optimizations are complete, which is
why it uses LiveIntervals at a phase when actual optimization passes don't
typically need it.
Differential Revision: http://reviews.llvm.org/D20075
llvm-svn: 269012
Many files include Passes.h but only a fraction needs to know about the
TargetPassConfig class. Move it into an own header. Also rename
Passes.cpp to TargetPassConfig.cpp while we are at it.
llvm-svn: 269011
Add convenience function to create MachineModuleInfo and
MachineFunctionAnalysis passes and add them to a pass manager.
Despite factoring out some shared code in
LiveIntervalTest/LLVMTargetMachine this will be used by my upcoming llc
change.
llvm-svn: 269002
After looking at D19087 again, it occurred to me that we can do better. If we consolidate
the valueHasExactlyOneBitSet() transforms, we won't incur extra overhead from calling it a
2nd time, and we can shrink SimplifySetCC() a bit. No functional change intended.
Differential Revision: http://reviews.llvm.org/D20050
llvm-svn: 268932
In case of COPY-like instruction we may be able to deduce that a certain
input is unused, based on the used lanes of the register defined by the
instruction.
This even works accross otherwise incompatible copies (no need to have
compatible lanemasks, completely unused operands are still completely
unused). It even makes sense to redo the analysis in this case since we
gained information for a case we previously stopped at because of the
incompatible masks.
llvm-svn: 268815
Relying on the caller to clean up after we've replaced all uses of a
node won't work when we've migrated to the `void Select(...)` API.
llvm-svn: 268774
If we don't, values that aren't precisely representable in f16 could
be used as-is in a promoted f32 operation, which would produce
incorrect results.
AArch64 had the correct behavior; add a focused test.
Fixes http://llvm.org/PR26871
llvm-svn: 268700
This is a step towards removing the rampant undefined behaviour in
SelectionDAG, which is a part of llvm.org/PR26808.
We rename SelectionDAGISel::Select to SelectImpl and update targets to
match, and then change Select to return void and consolidate the
sketchy behaviour we're trying to get away from there.
Next, we'll update backends to implement `void Select(...)` instead of
SelectImpl and eventually drop the base Select implementation.
llvm-svn: 268693
This opcode never happens in practice, and yet the logic we have in
place to handle it would be undefined behaviour if we ever executed
it. Remove it rather than trying to refactor code that's never
reached.
llvm-svn: 268692
Some vector bit operations are promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use a new TLI helper isOperationLegalOrCustomOrPromote instead, allowing the SSE implementations to stay on the simd unit.
Differential Revision: http://reviews.llvm.org/D19805
llvm-svn: 268561
Vector bit operations are typically promoted instead of having custom lowering. This patch changes the isOperationLegalOrCustom tests for vector AND/OR operations to use isOperationLegalOrPromote instead, allowing the SSE implementations to stay on the simd unit.
Differential Revision: http://reviews.llvm.org/D19805
llvm-svn: 268504
The replaced load may have implicit-defs and those defs may be used
in the block of the original load. Make sure to update the liveness
accordingly.
This is a generalization of r267817.
llvm-svn: 268412