If the SI_KILL operand is constant, we can either clear the exec mask if
the operand is negative, or do nothing otherwise.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 202337
any ranges to the list of ranges for the CU as we don't want to emit
them anyway. This ensures that we will still emit ranges if we have
a compile unit compiled with only line tables and one compiled with
full debug info requested (we'll emit for the one with full debug info).
Update testcase metadata accordingly to continue emitting ranges.
llvm-svn: 202333
and update everything accordingly. This can be used to conditionalize
the amount of output in the backend based on the amount of debug
requested/metadata emission scheme by a front end (e.g. clang).
Paired with a commit to clang.
llvm-svn: 202332
This handles pathological cases in which we see 2x increase in spill
code for large blocks (~50k instructions). I don't have a unit test
for this behavior.
Fixes rdar://16072279.
llvm-svn: 202304
The current approach to lower a vsetult is to flip the sign bit of the
operands, swap the operands and then use a (signed) pcmpgt. psubus (unsigned
saturating subtract) can be used to emulate a vsetult more efficiently:
+ case ISD::SETULT: {
+ // If the comparison is against a constant we can turn this into a
+ // setule. With psubus, setule does not require a swap. This is
+ // beneficial because the constant in the register is no longer
+ // destructed as the destination so it can be hoisted out of a loop.
I also enable lowering via psubus in a few other cases where it's clearly
beneficial: setule and setuge if minu/maxu cannot be used.
rdar://problem/14338765
Patch by Adam Nemet <anemet@apple.com>.
llvm-svn: 202301
The aggressive anti-dependency breaker scans instructions, bottom-up, within the
scheduling region in order to find opportunities where register renaming can
be used to break anti-dependencies.
Unfortunately, the aggressive anti-dep breaker was treating a register definition
as defining all of that register's aliases (including super registers). This behavior
is incorrect when the super register is live and there are other definitions of
subregisters of the super register.
For example, given the following sequence:
%CR2EQ<def> = CROR %CR3UN, %CR3UN<kill>
%CR2GT<def> = IMPLICIT_DEF
%X4<def> = MFOCRF8 %CR2
the analysis of the first subregister definition would work as expected:
Anti: %CR2GT<def> = IMPLICIT_DEF
Def Groups: CR2GT=g194->g0(via CR2)
Antidep reg: CR2GT (zero group)
Use Groups:
but the analysis of the second one would not:
Anti: %CR2EQ<def> = CROR %CR3UN, %CR3UN<kill>
Def Groups: CR2EQ=g195
Antidep reg: CR2EQ
Rename Candidates for Group g195: ...
because, when processing the %CR2GT<def>, we'd mark all super registers of
%CR2GT (%CR2 in this case) as defined. As a result, when processing
%CR2EQ<def>, %CR2 no longer appears to be live, and %CR2EQ<def>'s group is not
%unioned with the %CR2 group.
I don't have an in-tree test case for this yet (and even if I did, I don't have
a small one).
llvm-svn: 202294
We should apply fastcc whenever profitable. We can expand this list,
but there are lots of conventions with performance implications that we
don't want to change.
Differential Revision: http://llvm-reviews.chandlerc.com/D2705
llvm-svn: 202293
COFF object files with 0 as string table size are currently rejected. This
prevents us from reading object files written by tools like cvtres that
violate the PECOFF spec and write 0 instead of 4 for the size of an empty
string table.
llvm-svn: 202292
We don't have any test with more than 6 address spaces, so a DenseMap is
probably not the correct answer.
An unsorted array would also be OK, but we have to sort it for printing anyway.
llvm-svn: 202275
This includes instructions with aggregate operands (insert/extract), instructions with vector operands (insert/extract/shuffle), binary arithmetic and bitwise instructions, conversion instructions and terminators.
Work was done by lama.saba@intel.com.
llvm-svn: 202262
The table argument is always 128-bit (and interpreted as <16 x i8>) so the
extra specifier for it is just clutter.
No user-visible behaviour change, so no tests.
llvm-svn: 202258
Summary:
Fixes an issue where a test attempts to use -mcpu=cortex-a15 on non-ARM targets.
This triggers an assertion on MIPS since it doesn't know what ABI to use by default for
unrecognized processors.
Reviewers: rengolin
Reviewed By: rengolin
CC: llvm-commits, aemerson, rengolin
Differential Revision: http://llvm-reviews.chandlerc.com/D2876
llvm-svn: 202256
Summary:
This should fix the MCJIT unit tests that were broken by r201792 on the MIPS buildbot.
MIPS currently uses the default implementation of sys::getHostCPUName() which
always returns "generic". For now, we will accept "generic" and coerce it to
"mips32" or "mips64" depending on the target architecture like we do for empty
CPU names.
Reviewers: jacksprat, matheusalmeida
Reviewed By: jacksprat
Differential Revision: http://llvm-reviews.chandlerc.com/D2878
llvm-svn: 202253
the default.
Based on the patch by Matt Arsenault, D1764!
I switched one place to use the more direct pointer type to compute the
desired address space, and I reworked the memcpy rewriting section to
reflect significant refactorings that this patch helped inspire.
Thanks to several of the folks who helped review and improve the patch
as well.
llvm-svn: 202247
to work independently for the slice side and the other side.
This allows us to only compute the minimum of the two when we actually
rewrite to a memcpy that needs to take the minimum, and preserve higher
alignment for one side or the other when rewriting to loads and stores.
This fix was inspired by seeing the result of some refactoring that
makes addrspace handling better.
llvm-svn: 202242
target_link_libraries(INTERFACE) doesn't bring inter-target dependencies in add_library,
although final targets have dependencies to whole dependent libraries.
It makes most libraries can be built in parallel.
target_link_libraries(PRIVATE) is used to shaared library.
Each dependent library is linked to the target.so, and its user will not see its grandchildren.
For example,
- libclang.so has sufficient libclang*.a(s).
- c-index-test requires just only libclang.so.
FIXME: lld is tweaked minimally. Adding INTERFACE in each library would be better thing.
llvm-svn: 202241
For now, use both keywords, INTERFACE and PRIVATE via the variable,
- ${cmake_2_8_12_INTERFACE}
- ${cmake_2_8_12_PRIVATE}
They could be cleaned up when we introduce 2.8.12.
llvm-svn: 202239