1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

2212 Commits

Author SHA1 Message Date
Craig Topper
af680f3cea [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.

This is largely a mechanical transformation from KnownZero to Known.Zero.

Differential Revision: https://reviews.llvm.org/D32569

llvm-svn: 301620
2017-04-28 05:31:46 +00:00
Craig Topper
566764bb04 [APInt] Use inplace shift methods where possible. NFCI
llvm-svn: 301612
2017-04-28 03:36:24 +00:00
Krzysztof Parzyszek
ce1e95e40d Move size and alignment information of regclass to TargetRegisterInfo
1. RegisterClass::getSize() is split into two functions:
   - TargetRegisterInfo::getRegSizeInBits(const TargetRegisterClass &RC) const;
   - TargetRegisterInfo::getSpillSize(const TargetRegisterClass &RC) const;
2. RegisterClass::getAlignment() is replaced by:
   - TargetRegisterInfo::getSpillAlignment(const TargetRegisterClass &RC) const;

This will allow making those values depend on subtarget features in the
future.

Differential Revision: https://reviews.llvm.org/D31783

llvm-svn: 301221
2017-04-24 18:55:33 +00:00
Sjoerd Meijer
02cf92d4d2 [Arch64AsmParser] better diagnostic for isb
Instruction isb takes as an operand either 'sy' or an immediate value. This
improves the diagnostic when the string is not 'sy' and adds a test case for
this which was missing. This also adds tests to check invalid inputs for dsb
and dmb.

Differential Revision: https://reviews.llvm.org/D32227

llvm-svn: 301165
2017-04-24 08:22:20 +00:00
Renato Golin
74c3c2a375 Revert "[APInt] Fix a few places that use APInt::getRawData to operate within the normal API."
This reverts commit r301105, 4, 3 and 1, as a follow up of the previous
revert, which broke even more bots.

For reference:
Revert "[APInt] Use operator<<= where possible. NFC"
Revert "[APInt] Use operator<<= instead of shl where possible. NFC"
Revert "[APInt] Use ashInPlace where possible."

PR32754.

llvm-svn: 301111
2017-04-23 12:15:30 +00:00
Craig Topper
fa57b3748d [APInt] Use operator<<= instead of shl where possible. NFC
llvm-svn: 301103
2017-04-23 05:18:31 +00:00
Daniel Sanders
4cd719403f [globalisel][tablegen] Revise API for ComplexPattern operands to improve flexibility.
Summary:
Some targets need to be able to do more complex rendering than just adding an
operand or two to an instruction. For example, it may need to insert an
instruction to extract a subreg first, or it may need to perform an operation
on the operand.

In SelectionDAG, targets would create SDNode's to achieve the desired effect
during the complex pattern predicate. This worked because SelectionDAG had a
form of garbage collection that would take care of SDNode's that were created
but not used due to a later predicate rejecting a match. This doesn't translate
well to GlobalISel and the churn was wasteful.

The API changes in this patch enable GlobalISel to accomplish the same thing
without the waste. The API is now:
	InstructionSelector::OptionalComplexRendererFn selectArithImmed(MachineOperand &Root) const;
where Root is the root of the match. The return value can be omitted to
indicate that the predicate failed to match, or a function with the signature
ComplexRendererFn can be returned. For example:
	return OptionalComplexRendererFn(
	       [=](MachineInstrBuilder &MIB) { MIB.addImm(Immed).addImm(ShVal); });
adds two immediate operands to the rendered instruction. Immed and ShVal are
captured from the predicate function.

As an added bonus, this also reduces the amount of information we need to
provide to GIComplexOperandMatcher.

Depends on D31418

Reviewers: aditya_nandakumar, t.p.northover, qcolombet, rovka, ab, javed.absar

Reviewed By: ab

Subscribers: dberris, kristof.beyls, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D31761

llvm-svn: 301079
2017-04-22 15:11:04 +00:00
Matthias Braun
f8901da1df AArch64FrameLowering: Check if the ExtraCSSpill register is actually unused
The code assumed that when saving an additional CSR register
(ExtraCSSpill==true) we would have a free register throughout the
function. This was not true if this CSR register is also used to pass
values as in the swiftself case.

rdar://31451816

llvm-svn: 301057
2017-04-21 22:42:08 +00:00
Hans Wennborg
9099fd3422 Re-commit r301040 "X86: Don't emit zero-byte functions on Windows"
In addition to the original commit, tighten the condition for when to
pad empty functions to COFF Windows.  This avoids running into problems
when targeting e.g. Win32 AMDGPU, which caused test failures when this
was committed initially.

llvm-svn: 301047
2017-04-21 21:48:41 +00:00
Hans Wennborg
c49631b624 Revert r301040 "X86: Don't emit zero-byte functions on Windows"
This broke almost all bots. Reverting while fixing.

llvm-svn: 301041
2017-04-21 21:10:37 +00:00
Hans Wennborg
7edac5718c X86: Don't emit zero-byte functions on Windows
Empty functions can lead to duplicate entries in the Guard CF Function
Table of a binary due to multiple functions sharing the same RVA,
causing the kernel to refuse to load that binary.

We had a terrific bug due to this in Chromium.

It turns out we were already doing this for Mach-O in certain
situations. This patch expands the code for that in
AsmPrinter::EmitFunctionBody() and renames
TargetInstrInfo::getNoopForMachoTarget() to simply getNoop() since it
seems it was used for not just Mach-O anyway.

Differential Revision: https://reviews.llvm.org/D32330

llvm-svn: 301040
2017-04-21 20:58:12 +00:00
Akira Hatanaka
6e48df725c [AArch64] Improve code generation for logical instructions taking
immediate operands.

This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.

This recommits r300932 and r300930, which was causing dag-combine to
loop forever. The problem was that optimizeLogicalImm was returning
true even when there was no change to the immediate node (which happened
when the immediate was all zeros or ones), which caused dag-combine to
push and pop the same node to the work list over and over again without
making any progress.

This commit fixes the bug by returning false early in optimizeLogicalImm
if the immediate is all zeros or ones. Also, it changes the code to
compare the immediate with 0 or Mask rather than calling
countPopulation.

rdar://problem/18231627

Differential Revision: https://reviews.llvm.org/D5591

llvm-svn: 301019
2017-04-21 18:53:12 +00:00
Joel Jones
80a708bab0 [AArch64] Refactor instruction selection lowering for addresses. NFCI
Factor out the common code used for generating addresses into common
templated functions that call overloaded versions of a new function,
getTargetNode.

Tested with make check-llvm with targets AArch64.

Differential Revision: https://reviews.llvm.org/D32169

llvm-svn: 301005
2017-04-21 17:31:03 +00:00
Daniel Sanders
a3de070727 [globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).

Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.

Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab

Reviewed By: rovka

Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D31418

llvm-svn: 300993
2017-04-21 15:59:56 +00:00
Chad Rosier
097b9bdbd2 [AArch64][Falkor] Refine modeling of store-release exclusive instructions.
llvm-svn: 300987
2017-04-21 14:58:32 +00:00
Chad Rosier
ca140999d4 [AArch64][Falkor] Refine resource needs of STRQ with register offset.
llvm-svn: 300984
2017-04-21 14:33:13 +00:00
Daniel Sanders
0deb184c59 Revert r300964 + r300970 - [globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
It's causing llvm-clang-x86_64-expensive-checks-win to fail to compile and I
haven't worked out why. Reverting to make it green while I figure it out.

llvm-svn: 300978
2017-04-21 14:09:20 +00:00
Chad Rosier
633c2c7f8f [AArch64][Falkor] Refine loads/stores that require an extra LD pipe.
llvm-svn: 300976
2017-04-21 13:55:41 +00:00
Chad Rosier
6aaa4fd30c [AArch64][Falkor] Fix number of microops for WriteSTIdx missed in r300892.
llvm-svn: 300975
2017-04-21 13:37:01 +00:00
Chad Rosier
6266ca1e87 [AArch64] Fix a few missed pre/post-inc in Falkor.
llvm-svn: 300974
2017-04-21 13:36:57 +00:00
Daniel Sanders
9e0319164d [globalisel][tablegen] Import SelectionDAG's rule predicates and support the equivalent in GIRule.
Summary:
The SelectionDAG importer now imports rules with Predicate's attached via
Requires, PredicateControl, etc. These predicates are implemented as
bitset's to allow multiple predicates to be tested together. However,
unlike the MC layer subtarget features, each target only pays for it's own
predicates (e.g. AArch64 doesn't have 192 feature bits just because X86
needs a lot).

Both AArch64 and X86 derive at least one predicate from the MachineFunction
or Function so they must re-initialize AvailableFeatures before each
function. They also declare locals in <Target>InstructionSelector so that
computeAvailableFeatures() can use the code from SelectionDAG without
modification.

Reviewers: rovka, qcolombet, aditya_nandakumar, t.p.northover, ab

Reviewed By: rovka

Subscribers: aemerson, rengolin, dberris, kristof.beyls, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D31418

llvm-svn: 300964
2017-04-21 10:27:20 +00:00
Akira Hatanaka
3808b889a6 Revert r300932 and r300930.
It seems that r300930 was creating an infinite loop in dag-combine when
compling the following file:

MultiSource/Benchmarks/MiBench/consumer-typeset/z21.c

llvm-svn: 300940
2017-04-21 01:31:50 +00:00
Akira Hatanaka
2926db7287 [AArch64] Use suffix ULL to shift a 64-bit value.
llvm-svn: 300932
2017-04-21 00:35:27 +00:00
Akira Hatanaka
62264dba17 [AArch64] Improve code generation for logical instructions taking
immediate operands.

This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.

This recommits r300913, which broke bots because I didn't fix a call to
ShrinkDemandedConstant in SIISelLowering.cpp after changing the APIs of
TargetLoweringOpt and TargetLowering.

rdar://problem/18231627

Differential Revision: https://reviews.llvm.org/D5591

llvm-svn: 300930
2017-04-21 00:05:16 +00:00
Akira Hatanaka
e8da9bdf75 Revert "[AArch64] Improve code generation for logical instructions taking"
This reverts r300913.

This broke bots.

llvm-svn: 300916
2017-04-20 23:03:30 +00:00
Akira Hatanaka
2966c6087f [AArch64] Improve code generation for logical instructions taking
immediate operands.

This commit adds an AArch64 dag-combine that optimizes code generation
for logical instructions taking immediate operands. The optimization
uses demanded bits to change a logical instruction's immediate operand
so that the immediate can be folded into the immediate field of the
instruction.

rdar://problem/18231627

Differential Revision: https://reviews.llvm.org/D5591

llvm-svn: 300913
2017-04-20 22:47:56 +00:00
Tim Northover
4339e74b9f AArch64: lower "fence singlethread" to a pure compiler barrier.
Single-threaded fences aren't required to provide any synchronization with
other processing elements so there's no need for a DMB. They should still be a
barrier for compiler optimizations though.

llvm-svn: 300905
2017-04-20 21:57:45 +00:00
Chad Rosier
a31719b7de [AArch64] Whitespace/ordering fixes for Falkor machine description. NFC.
llvm-svn: 300893
2017-04-20 21:11:17 +00:00
Chad Rosier
97e4be3cf1 [AArch64] Refine Falkor machine description for pre/post-inc and stores.
llvm-svn: 300892
2017-04-20 21:11:09 +00:00
Chad Rosier
144bd62acc [AArch64] Improve scheduling of logical operations on Falkor.
llvm-svn: 300871
2017-04-20 18:50:21 +00:00
John Brawn
c166bdc7f9 [AArch64] Fix handling of zero immediate in fmov instructions
Currently fmov #0 with a vector destination is handle incorrectly and results in
fmov #-1.9375 being emitted but should instead give an error. This is due to the
way we cope with fmov #0 with a scalar destination being an alias of fmov zr, so
fix this by actually doing it through an alias.

Differential Revision: https://reviews.llvm.org/D31949

llvm-svn: 300830
2017-04-20 10:13:54 +00:00
John Brawn
7a9b2d5f6f [AArch64] Fix handling of integer fp immediates
When an integer is used as an fp immediate we're failing to check the return
value of getFP64Imm, so invalid values are silently permitted. Fix this by
merging together the integer and real handling.

llvm-svn: 300828
2017-04-20 10:10:10 +00:00
Aditya Nandakumar
528682bdac [GISEL]: Move getConstantVReg to Utils
NFCI

llvm-svn: 300751
2017-04-19 20:48:50 +00:00
Kristof Beyls
47f9c68a6b [GlobalISel] Support vector-of-pointers in LLT
This fixes PR32471.

As comment 10 on that bug report highlights
(https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a
few different defendable design tradeoffs that could be made, including
not representing pointers at all in LLT.

I decided to go for representing vector-of-pointer as a concept in LLT,
while keeping the size of the LLT type 64 bits (this is an increase from
48 bits before). My rationale for keeping pointers explicit is that on
some targets probably it's very handy to have the distinction between
pointer and non-pointer (e.g. 68K has a different register bank for
pointers IIRC). If we keep a scalar pointer, it probably is easiest to
also have a vector-of-pointers to keep LLT relatively conceptually clean
and orthogonal, while we don't have a very strong reason to break that
orthogonality.  Once we gain more experience on the use of LLT, we can
of course reconsider this direction.

Rejecting vector-of-pointer types in the IRTranslator is also an option
to avoid the crash reported in PR32471, but that is only a very
short-term solution; also needs quite a bit of code tweaks in places,
and is probably fragile. Therefore I didn't consider this the best
option.

llvm-svn: 300664
2017-04-19 07:23:57 +00:00
Matt Arsenault
1fd25dc5d0 DAG: Make mayBeEmittedAsTailCall parameter const
llvm-svn: 300603
2017-04-18 21:16:46 +00:00
Craig Topper
debb291669 [APInt] Use lshrInPlace to replace lshr where possible
This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result.

This adds an lshrInPlace(const APInt &) version as well.

Differential Revision: https://reviews.llvm.org/D32155

llvm-svn: 300566
2017-04-18 17:14:21 +00:00
Kristof Beyls
1c3a629c41 Revert "[GlobalISel] Support vector-of-pointers in LLT"
This reverts r300535 and r300537.
The newly added tests in test/CodeGen/AArch64/GlobalISel/arm64-fallback.ll
produces slightly different code between LLVM versions being built with different compilers.
E.g., dependent on the compiler LLVM is built with, either one of the following
can be produced:

remark: <unknown>:0:0: unable to legalize instruction: %vreg0<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg2; (in function: vector_of_pointers_extractelement)
remark: <unknown>:0:0: unable to legalize instruction: %vreg2<def>(p0) = G_EXTRACT_VECTOR_ELT %vreg1, %vreg0; (in function: vector_of_pointers_extractelement)

Non-determinism like this is clearly a bad thing, so reverting this until
I can find and fix the root cause of the non-determinism.

llvm-svn: 300538
2017-04-18 09:26:36 +00:00
Kristof Beyls
507db3264b [GlobalISel] Support vector-of-pointers in LLT
This fixes PR32471.

As comment 10 on that bug report highlights
(https://bugs.llvm.org//show_bug.cgi?id=32471#c10), there are quite a
few different defendable design tradeoffs that could be made, including
not representing pointers at all in LLT.

I decided to go for representing vector-of-pointer as a concept in LLT,
while keeping the size of the LLT type 64 bits (this is an increase from
48 bits before). My rationale for keeping pointers explicit is that on
some targets probably it's very handy to have the distinction between
pointer and non-pointer (e.g. 68K has a different register bank for
pointers IIRC). If we keep a scalar pointer, it probably is easiest to
also have a vector-of-pointers to keep LLT relatively conceptually clean
and orthogonal, while we don't have a very strong reason to break that
orthogonality. Once we gain more experience on the use of LLT, we can
of course reconsider this direction.

Rejecting vector-of-pointer types in the IRTranslator is also an option
to avoid the crash reported in PR32471, but that is only a very
short-term solution; also needs quite a bit of code tweaks in places,
and is probably fragile. Therefore I didn't consider this the best
option.

llvm-svn: 300535
2017-04-18 08:12:45 +00:00
Davide Italiano
8ba440ea9a [Target] Use hasOneUse() instead of getNumUses().
The latter does a liner scan over a linked list, therefore is
much more expensive.

llvm-svn: 300518
2017-04-18 00:29:54 +00:00
Tim Northover
754e0805e4 AArch64: put nonlazybind special handling behind a flag for now.
It's basically a terrible idea anyway but objc_msgSend gets emitted like that.
We can decide on a better way to deal with it in the unlikely event that anyone
actually uses it.

llvm-svn: 300474
2017-04-17 18:18:47 +00:00
Konstantin Zhuravlyov
003829fb86 Distinguish between code pointer size and DataLayout::getPointerSize() in DWARF info generation
llvm-svn: 300463
2017-04-17 17:41:25 +00:00
Tim Northover
1091adce40 AArch64: support nonlazybind
It's almost certainly not a good idea to actually use it in most cases (there's
a pretty large code size overhead on AArch64), but we can't do those
experiments until it's supported.

llvm-svn: 300462
2017-04-17 17:27:56 +00:00
Andrew V. Tischenko
b1cf1f0925 This patch closes PR#32216: Better testing of schedule model instruction latencies/throughputs.
The details are here: https://reviews.llvm.org/D30941

llvm-svn: 300311
2017-04-14 07:44:23 +00:00
Adam Nemet
83409b8174 [AArch64] Avoid partial register writes on lane 0 of BUILD_VECTOR for i8/i16/f16
This further improves Ahmed's change in rL299482.  See the new comment for the
rationale.

The patch recovers most of the regression for bzip2 after D31965. We're down
to +2.68% from +6.97%.

Differential Revision: https://reviews.llvm.org/D32028

llvm-svn: 300276
2017-04-13 23:32:47 +00:00
Jonas Paulsson
90b172efa0 [SystemZ] TargetTransformInfo cost functions implemented.
getArithmeticInstrCost(), getShuffleCost(), getCastInstrCost(),
getCmpSelInstrCost(), getVectorInstrCost(), getMemoryOpCost(),
getInterleavedMemoryOpCost() implemented.

Interleaved access vectorization enabled.

BasicTTIImpl::getCastInstrCost() improved to check for legal extending loads,
in which case the cost of the z/sext instruction becomes 0.

Review: Ulrich Weigand, Renato Golin.
https://reviews.llvm.org/D29631

llvm-svn: 300052
2017-04-12 11:49:08 +00:00
Balaram Makam
81c7716817 [AArch64] Fix scheduling info for INS(vector, general) instruction.
llvm-svn: 299994
2017-04-11 22:14:10 +00:00
Evandro Menezes
b91f66143d [AArch64] Simplify MacroFusion
This patch assumes that the dependents to be scanned for the ExitSU are its
predecessors; otherwise, the successors of the instr are scanned.

Furthermore, sometimes the ExitSU was being fused twice, since it may be
fused once when scanning the successors from the beginning of the BB and
then again when scanning the predecessors of ExitSU.  Thus, when scanning
the successors of an instr, skip the ExitSU.

llvm-svn: 299974
2017-04-11 19:13:11 +00:00
Matthew Simpson
ad553a16e2 [ARM/AArch64] Ensure valid vector element types for interleaved accesses
This patch refactors and strengthens the type checks performed for interleaved
accesses. The primary functional change is to ensure that the interleaved
accesses have valid element types. The added test cases previously failed
because the element type is f128.

Differential Revision: https://reviews.llvm.org/D31817

llvm-svn: 299864
2017-04-10 18:34:37 +00:00
Balaram Makam
e15b94c10a [AArch64] Refine Falkor Machine Model - Part 3
This concludes the refinements to Falkor Machine Model.
  It includes SchedPredicates for immediate zero and LSL Fast.
  Forwarding logic is also modeled for vector multiply and
  accumulate only.

llvm-svn: 299810
2017-04-08 03:30:15 +00:00
Petr Hosek
d89c86a2d9 [AArch64] Allow global register asm("x18") or asm("w18") under -ffixed-x18
When using -ffixed-x18, the x18 (or w18) register can safely be used
with the "global register variable" GCC extension, but the backend
fails to recognize it.

Patch by Roland McGrath.

Differential Revision: https://reviews.llvm.org/D31793

llvm-svn: 299799
2017-04-07 20:41:58 +00:00