1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 02:52:53 +02:00
Commit Graph

19555 Commits

Author SHA1 Message Date
Rafael Espindola
2c21fe4650 Stop producing .data.rel sections.
If a section is rw, it is irrelevant if the dynamic linker will write to
it or not.

It looks like llvm implemented this because gcc was doing it. It looks
like gcc implemented this in the hope that it would put all the
relocated items close together and speed up the dynamic linker.

There are two problem with this:
* It doesn't work. Both bfd and gold will map .data.rel to .data and
  concatenate the input sections in the order they are seen.
* If we want a feature like that, it can be implemented directly in the
  linker since it knowns where the dynamic relocations are.

llvm-svn: 253436
2015-11-18 06:02:15 +00:00
Cong Hou
db19e7bb44 Remove a redundant assertion in MachineBasicBlock.cpp. NFC.
llvm-svn: 253426
2015-11-18 01:55:56 +00:00
Cong Hou
cc74715856 Remove redundant code in MachineBasicBlock.cpp. NFC.
llvm-svn: 253425
2015-11-18 01:45:10 +00:00
Cong Hou
f11a4c60a1 Improving edge probabilities computation when choosing the best successor in machine block placement.
When looking for the best successor from the outer loop for a block
belonging to an inner loop, the edge probability computation can be
improved so that edges in the inner loop are ignored. For example,
suppose we are building chains for the non-loop part of the following
code, and looking for B1's best successor. Assume the true body is very
hot, then B3 should be the best candidate. However, because of the
existence of the back edge from B1 to B0, the probability from B1 to B3
can be very small, preventing B3 to be its successor. In this patch, when
computing the probability of the edge from B1 to B3, the weight on the
back edge B1->B0 is ignored, so that B1->B3 will have 100% probability.

if (...)
  do {
    B0;
    ... // some branches
    B1;
  } while(...);
else
  B2;
B3;


Differential revision: http://reviews.llvm.org/D10825

llvm-svn: 253414
2015-11-18 00:52:52 +00:00
David Blaikie
133cb3aa66 Generalize ownership/passing semantics to allow dsymutil to own abbreviations via unique_ptr
While still allowing CodeGen/AsmPrinter in llvm to own them using a bump
ptr allocator. (might be nice to replace the pointers there with
something that at least automatically calls their dtors, if that's
necessary/useful, rather than having it done explicitly (I think a typed
BumpPtrAllocator already does this, or maybe a unique_ptr with a custom
deleter, etc))

llvm-svn: 253409
2015-11-18 00:34:10 +00:00
Reid Kleckner
00daa6cd20 [WinEH] Move WinEHFuncInfo from MachineModuleInfo to MachineFunction
Summary:
Now that there is a one-to-one mapping from MachineFunction to
WinEHFuncInfo, we don't need to use a DenseMap to select the right
WinEHFuncInfo for the current funclet.

The main challenge here is that X86WinEHStatePass is an IR pass that
doesn't have access to the MachineFunction. I gave it its own
WinEHFuncInfo object that it uses to calculate state numbers, which it
then throws away. As long as nobody creates or removes EH pads between
this pass and SDAG construction, we will get the same state numbers.

The other thing X86WinEHStatePass does is to mark the EH registration
node. Instead of communicating which alloca was the registration through
WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic.  This
intrinsic generates no code and simply marks the alloca in use.

Reviewers: JCTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14668

llvm-svn: 253378
2015-11-17 21:10:25 +00:00
Pat Gavlin
189930fea1 Lower statepoints with multi-def targets.
Statepoint lowering currently expects that the target method of a
statepoint only defines a single value. This precludes using
statepoints with ABIs that return values in multiple registers
(e.g. the SysV AMD64 ABI). This change adds support for lowering
statepoints with mutli-def targets.

llvm-svn: 253339
2015-11-17 16:04:21 +00:00
Dan Gohman
3ee3e60f77 Use TargetRegisterInfo for printing MachineOperand register comments
Several places in AsmPrinter.cpp print comments describing MachineOperand
registers using MCRegisterInfo, which uses MCOperand-oriented names. This
doesn't work for targets that use virtual registers exclusively, as
WebAssembly does, since virtual registers are represented and printed
differently.

This patch preserves what seems to be the spirit of r229978, avoiding the
use of TM.getSubtargetImpl(), while still using MachineOperand-oriented
printing for MachineOperands.

Differential Revision: http://reviews.llvm.org/D14709

llvm-svn: 253338
2015-11-17 16:01:28 +00:00
Rafael Espindola
47008fdea7 Drop prelink support.
The way prelink used to work was

* The compiler decides if a given section only has relocations that
are know to point to the same DSO. If so, it names it
.data.rel.ro.local<something>.
* The static linker puts all of these together.
* The prelinker program assigns addresses to each library and resolves
the local relocations.

There are many problems with this:
* It is incompatible with address space randomization.
* The information passed by the compiler is redundant. The linker
knows if a given relocation is in the same DSO or not. If could sort
by that if so desired.
* There are newer ways of speeding up DSO (gnu hash for example).
* Even if we want to implement this again in the compiler, the previous
  implementation is pretty broken. It talks about relocations that are
  "resolved by the static linker". If they are resolved, there are none
  left for the prelinker. What one needs to track is if an expression
  will require only dynamic relocations that point to the same DSO.

At this point it looks like the prelinker is an historical curiosity.
For example, fedora has retired it because it failed to build for two
releases
(http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f)

This patch removes support for it. That is, it stops printing the
".local" sections.

llvm-svn: 253280
2015-11-17 00:51:23 +00:00
Matthias Braun
088a172bf6 Assume lane masks are always precise
Allowing imprecise lane masks in case of more than 32 sub register lanes
lead to some tricky corner cases, and I need another bugfix for another
one. Instead I rather declare lane masks as precise and let tablegen
abort if we do not have enough bits.

This does not affect any in-tree target, even AMDGPU only needs 16 lanes
at the moment. If the 32 lanes turn out to be a problem in the future,
then we can easily change the LaneBitmask typedef to uint64_t.

Differential Revision: http://reviews.llvm.org/D14557

llvm-svn: 253279
2015-11-17 00:50:55 +00:00
Keno Fischer
0118e7cd94 [DIBuilder] Make createReferenceType take size and align
Summary: Since we're passing references to dbg.value as pointers,
we need to have the frontend properly declare their sizes and
alignments (as it already does for regular pointers) in preparation
for my upcoming patch to have the verifer check that the sizes agree.

Also augment the backend logic that skips actually emitting this
information into DWARF such that it also handles reference types.

Reviewers: aprantl, dexonsmith, dblaikie

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14275

llvm-svn: 253186
2015-11-16 07:57:32 +00:00
Akira Hatanaka
0fb89f87e0 Reduce the size of MCRelaxableFragment.
MCRelaxableFragment previously kept a copy of MCSubtargetInfo and
MCInst to enable re-encoding the MCInst later during relaxation. A copy
of MCSubtargetInfo (instead of a reference or pointer) was needed
because the feature bits could be modified by the parser.

This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment
with a constant reference to MCSubtargetInfo. The copies of
MCSubtargetInfo are kept in MCContext, and the target parsers are now
responsible for asking MCContext to provide a copy whenever the feature
bits of MCSubtargetInfo have to be toggled.
 
With this patch, I saw a 4% reduction in peak memory usage when I
compiled verify-uselistorder.lto.bc using llc.

rdar://problem/21736951

Differential Revision: http://reviews.llvm.org/D14346

llvm-svn: 253127
2015-11-14 06:35:56 +00:00
Quentin Colombet
a869c64da5 [ShrinkWrapping] Disable the optimization for functions with sanitize like
attribute.

Even if the target supports shrink-wrapping, the prologue and epilogue
must not move because a crash can happen anywhere and sanitizers need
to be able to unwind from the PC of the crash.

llvm-svn: 253116
2015-11-14 01:55:17 +00:00
Matthias Braun
59fc898d10 MachineScheduler: Print initial pressure in debug dump
llvm-svn: 253097
2015-11-13 22:30:31 +00:00
Matthias Braun
b2e09e4430 MachineScheduler: Improve debug output for "only one node in readyset"
When there is only 1 node left in the ready queue and it is picked call
the reason "ONLY1" instead of "NOCAND".

llvm-svn: 253096
2015-11-13 22:30:29 +00:00
James Molloy
bbd6827ba1 [SDAG] Fix expansion of BITREVERSE
Richard Trieu noted that UBSan detected an overflowing shift, and the obvious fix caused a crash.

What was happening was that the shiftee (1U) was indeed too small for the possible range of shifts it had to handle, but also we were using "VT.getSizeInBits()" to get the maximum type bitwidth, but we wanted "VT.getScalarSizeInBits()" to get the vector lane size instead of the entire vector size.

Use an APInt for the shift and VT.getScalarSizeInBits().

llvm-svn: 253023
2015-11-13 10:02:36 +00:00
Sanjoy Das
1bbbcc7ad0 [ImplicitNulls] Add some clarifying comments; NFC
llvm-svn: 253020
2015-11-13 08:14:00 +00:00
Tom Stellard
375168229e Revert "Remove unnecessary call to getAllocatableRegClass"
This reverts commit r252565.

This also includes the revert of the commit mentioned below in order to
avoid breaking tests in AMDGPU:

Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64"

This reverts commit r252674.

llvm-svn: 252956
2015-11-12 21:43:25 +00:00
Sanjoy Das
154f32f3a8 [ImplicitNulls] Fix wrapping by breaking up a condition, NFC
llvm-svn: 252947
2015-11-12 20:51:49 +00:00
Sanjoy Das
3c4432b86c [ImplicitNull] Extract out a HazardDetector class, NFC
This will make later functional changes easier to follow.

llvm-svn: 252946
2015-11-12 20:51:44 +00:00
Quentin Colombet
33ea7dc043 [ShrinkWrap] Fix a typo in a comment.
llvm-svn: 252918
2015-11-12 18:16:27 +00:00
Quentin Colombet
fd63272890 [ShrinkWrap] Make sure we do not mess up with EH funclet lowering.
ShrinkWrapping does not understand exception handling constraints for now, so
make sure we do not mess with them by aborting on functions that use EH
funclets.

llvm-svn: 252917
2015-11-12 18:13:42 +00:00
Andrew Kaylor
084c26f59e [WinEH] Fix problem with removing an element from a SetVector while iterating.
Patch provided by Yaron Keren. (Thanks!)

llvm-svn: 252913
2015-11-12 17:36:03 +00:00
James Molloy
c197f50184 [SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsic
Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target.

This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node ISD::BITREVERSE, with simple expansion and promotion support.

The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently).

llvm-svn: 252878
2015-11-12 12:29:09 +00:00
Matthias Braun
f78978fb25 LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization
- Factor out code to query and modify the sign bit of a floatingpoint
  value as an integer. This also works if none of the targets integer
  types is big enough to hold all bits of the floatingpoint value.

- Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available,
  otherwise perform bit manipulation on the sign bit. The previous code
  used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also
  takes 34 instructions on ARM Cortex-M4. With this patch we only
  require 5:
    vldr d0, LCPI0_0
    vmov r2, r3, d0
    lsrs r2, r3, #31
    bfi r1, r2, #31, #1
    bx lr
  (This could be further improved if the compiler would recognize that
   r2, r3 is zero).

- Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is
  available otherwise perform bit manipulation on the sign bit.

- Perform the sign(x) test by masking out the sign bit and comparing
  with 0 rather than shifting the sign bit to the highest position and
  testing for "<s 0". For x86 copysignl (on 80bit values) this gets us:
    testl $32768, %eax
  rather than:
    shlq $48, %rax
    sets %al
    testb %al, %al

Differential Revision: http://reviews.llvm.org/D11172

llvm-svn: 252839
2015-11-12 01:02:47 +00:00
Reid Kleckner
9b299cbd1c [WinEH] Don't forward branches across empty EH pad BBs
For really simple SEH catchpads, we tried to forward the invoke unwind
edge across the empty block.

llvm-svn: 252822
2015-11-11 23:09:31 +00:00
Geoff Berry
b6376cdd2e [DAGCombiner] Improve zextload optimization.
Summary:
Don't fold
  (zext (and (load x), cst)) -> (and (zextload x), (zext cst))
if
  (and (load x) cst)
will match as a zextload already and has additional users.

For example, the following IR:

  %load = load i32, i32* %ptr, align 8
  %load16 = and i32 %load, 65535
  %load64 = zext i32 %load16 to i64
  store i32 %load16, i32* %dst1, align 4
  store i64 %load64, i64* %dst2, align 8

used to produce the following aarch64 code:

	ldr		w8, [x0]
	and	w9, w8, #0xffff
	and	x8, x8, #0xffff
	str		w9, [x1]
	str		x8, [x2]

but with this change produces the following aarch64 code:

	ldrh		w8, [x0]
	str		w8, [x1]
	str		x8, [x2]

Reviewers: resistor, mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14340

llvm-svn: 252789
2015-11-11 19:42:52 +00:00
Matt Arsenault
18946e0a04 Add target preference for GatherAllAliases max depth
llvm-svn: 252775
2015-11-11 18:44:33 +00:00
Dehao Chen
beb2088014 clang-format lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
llvm-svn: 252769
2015-11-11 18:09:47 +00:00
Dehao Chen
96e036f1dd Emit discriminator for inlined callsites.
Summary: Inlined callsites need to be emitted in debug info so that sample profile can be annotated to the correct inlined instance.

Reviewers: dnovillo, dblaikie

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14511

llvm-svn: 252768
2015-11-11 18:08:18 +00:00
Matthias Braun
2e3ca91011 MachineInstr: addRegisterDefReadUndef() => setRegisterDefReadUndef()
This way we can not only add but also remove read undef flags.

llvm-svn: 252678
2015-11-11 00:41:58 +00:00
Matthias Braun
05ca020506 TableGen: Emit LaneMask for register classes without subregisters as ~0u
This makes it slightly easier to handle classes with and without
subregister uniformly.

llvm-svn: 252671
2015-11-10 23:23:05 +00:00
Sanjay Patel
76de0627f7 less indent; NFCI
llvm-svn: 252643
2015-11-10 20:09:02 +00:00
Matt Arsenault
35ac43f257 LegalizeDAG: Implement promote for scalar_to_vector
This allows avoiding the default Expand behavior which
introduces stack usage. Bitcast the scalar and replace
the missing elements with undef.

This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252632
2015-11-10 18:48:11 +00:00
Matt Arsenault
40270b244e LegalizeDAG: Implement promote for insert_vector_elt
This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252631
2015-11-10 18:48:08 +00:00
Matt Arsenault
bf0861c559 LegalizeDAG: Implement promote for extract_vector_elt
This is for AMDGPU to implement v2i64 extract as extract of
half of a v4i32.

This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252630
2015-11-10 18:48:04 +00:00
Sanjay Patel
da6034c9d4 add 'MustReduceDepth' as an objective/cost-metric for the MachineCombiner
This is one of the problems noted in PR25016:
https://llvm.org/bugs/show_bug.cgi?id=25016
and:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html

The spilling problem is independent and not addressed by this patch.

The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. 
This is caused by inclusion of the "slack" factor when calculating the critical path of the original
code sequence. If we don't add that, then we have a more conservative cost comparison of the old code
sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64
MULADD patterns because benchmark regressions were observed without that.

The two failing test cases now have identical asm that does what we want:
a + b + c + d ---> (a + b) + (c + d)

Differential Revision: http://reviews.llvm.org/D13417

llvm-svn: 252616
2015-11-10 16:48:53 +00:00
Andy Ayers
abf4852bd1 Support for emitting inline stack probes
For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages
between the current stack limit and the desired new stack pointer location. This implements support for
the inline expansion on x64.

For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call
is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications
that arise when introducing new machine basic blocks during prolog and epilog creation.

Added a new test case, modified an existing one to exclude non-x64 coreclr (for now).

Add test case

Fix tests

llvm-svn: 252578
2015-11-10 01:50:49 +00:00
Matt Arsenault
84707b59d1 Remove unnecessary call to getAllocatableRegClass
I'm not sure what the point of this was. I'm not sure why
you would ever define an instruction that produces an unallocatable
register class. No tests fail with this removed, and it seems like
it should be a verifier error to define such an instruction.

This was problematic for AMDGPU because it would make bad decisions
by arbitrarily changing the register class when unsetting isAllocatable
for VS_32/VS_64, which is currently set as a workaround to this problem.

AMDGPU uses the VS_32/VS_64 register classes to represent operands which
can use either VGPRs or SGPRs. When  isAllocatable is unset for these,
this would need to pick  either the SGPR or VGPR class and insert either
a copy we don't want, or an illegal copy we would need to deal with
later. A semi-arbitrary register class ordering decision is made in tablegen,
which resulted in always picking a VGPR class because it happens to have
more registers than the SGPR register class. We really just want to
use whatever register class the original register had.

llvm-svn: 252565
2015-11-10 00:30:14 +00:00
Matthias Braun
4741d0521e MachineVerifier: Streamline live interval related error reporting
Simply perform additional report_context() calls after a report()
instead of adding more and more overloaded variations of report().  Also
improve several instances where information was output in an ad-hoc way
probably because no matching report() overload was available.

llvm-svn: 252552
2015-11-09 23:59:33 +00:00
Matthias Braun
12671c208c MachineVerifier: Add missing linebreak
MachineInstr::print() with SkipOppers==true does not produce a
linebreak, so we have to do that in MachineVerifier::report().

llvm-svn: 252551
2015-11-09 23:59:29 +00:00
Matthias Braun
a06515f6f6 MachineVerifier: MI::print has no TargetMachine overload
The code was passing a target machine pointer which degraded to a true
operand to SkipOppers.

llvm-svn: 252550
2015-11-09 23:59:25 +00:00
Matthias Braun
8a14eebeb0 MachineVerifier: print list of live intervals if available
llvm-svn: 252549
2015-11-09 23:59:23 +00:00
Sanjay Patel
a712e76470 add a SelectionDAG method to check if no common bits are set in two nodes; NFCI
This was suggested in:
http://reviews.llvm.org/D13956

and is a follow-on to:
http://reviews.llvm.org/rL252515
http://reviews.llvm.org/rL252519

This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG.

A corresponding function for IR instructions already exists in ValueTracking.

llvm-svn: 252539
2015-11-09 23:31:38 +00:00
David Majnemer
624c46d055 [WinEH] Don't emit CATCHRET from visitCatchPad
Instead, emit a CATCHPAD node which will get selected to a target
specific sequence.

llvm-svn: 252528
2015-11-09 23:07:48 +00:00
Reid Kleckner
f7f2cdef8b [WinEH] Tweak funclet prologue/epilogue insertion to pass verifier
For some reason we'd never run MachineVerifier on WinEH code, and you
explicitly have to ask for it with llc. I added it to a few test cases
to get some coverage.

Fixes PR25461.

llvm-svn: 252512
2015-11-09 21:04:00 +00:00
Andrew Kaylor
766dfcbfcc [WinEH] Re-committing r252249 (Clone funclets with multiple parents) with additional fixes for determinism problems
Differential Revision: http://reviews.llvm.org/D14454

llvm-svn: 252508
2015-11-09 19:59:02 +00:00
Oliver Stannard
62d3e33afa [CodeGen] Always promote f16 if not legal
We don't currently have any runtime library functions for operations on
f16 values (other than conversions to and from f32 and f64), so we
should always promote it to f32, even if that is not a legal type. In
that case, the f32 values would be softened to f32 library calls.

SoftenFloatRes_FP_EXTEND now needs to check the promoted operand's type,
as it may ne a no-op or require a different library call.

getCopyFromParts and getCopyToParts now need to cope with a
floating-point value stored in a larger integer part, as is the case for
any target that needs to store an f16 value in a 32-bit integer
register.

Differential Revision: http://reviews.llvm.org/D12856

llvm-svn: 252459
2015-11-09 11:03:18 +00:00
Yaron Keren
08120f8736 Erase unused FunctionDIs variables after r252219.
llvm-svn: 252401
2015-11-07 10:21:25 +00:00
Joseph Tremoulet
6a3a04f00f [WinEH] Update exception pointer registers
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.

Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.

Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.


Reviewers: pgavlin, majnemer, rnk

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D14344

llvm-svn: 252383
2015-11-07 01:11:31 +00:00