1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-28 14:32:51 +01:00
Commit Graph

14199 Commits

Author SHA1 Message Date
Nadav Rotem
12105d6078 In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other.
rdar://12513091

llvm-svn: 166196
2012-10-18 18:06:48 +00:00
Bob Wilson
b6adb70bdd Temporarily revert the TargetTransform changes.
The TargetTransform changes are breaking LTO bootstraps of clang.  I am
working with Nadav to figure out the problem, but I am reverting it for now
to get our buildbots working.

This reverts svn commits: 165665 165669 165670 165786 165787 165997
and I have also reverted clang svn 165741

llvm-svn: 166168
2012-10-18 05:43:52 +00:00
Michael Liao
f58d16a933 Revert part of r166049 back and enable test case in r166125.
- Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid
  when '...' are all 'undef's.
- r166125 relies on this transformation.

llvm-svn: 166155
2012-10-17 23:45:54 +00:00
Michael Liao
3accf514d6 Revert r166049
- In general, it's unsafe for this transformation.

llvm-svn: 166135
2012-10-17 22:41:15 +00:00
Michael Liao
b168cd6995 Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i
- If the extracted vector has the same type of all vectored being concatenated
  together, it should be simplified directly into v_i, where i is the index of
  the element being extracted.

llvm-svn: 166125
2012-10-17 20:48:33 +00:00
Jakob Stoklund Olesen
1eec126712 Switch MRI::UsedPhysRegs to a register unit bit vector.
This is a more compact, less redundant representation, and it avoids
scanning long lists of aliases for ARM D-registers, for example.

llvm-svn: 166124
2012-10-17 20:26:33 +00:00
Evan Cheng
b5e95007fe Add a really faster pre-RA scheduler (-pre-RA-sched=linearize). It doesn't use
any scheduling heuristics nor does it build up any scheduling data structure
that other heuristics use. It essentially linearize by doing a DFA walk but
it does handle glues correctly.

IMPORTANT: it probably can't handle all the physical register dependencies so
it's not suitable for x86. It also doesn't deal with dbg_value nodes right now
so it's definitely is still WIP.

rdar://12474515

llvm-svn: 166122
2012-10-17 19:39:36 +00:00
Jakob Stoklund Olesen
19bfbc3745 Merge MRI::isPhysRegOrOverlapUsed() into isPhysRegUsed().
All callers of these functions really want the isPhysRegOrOverlapUsed()
functionality which also checks aliases. For historical reasons, targets
without register aliases were calling isPhysRegUsed() instead.

Change isPhysRegUsed() to also check aliases, and switch all
isPhysRegOrOverlapUsed() callers to isPhysRegUsed().

llvm-svn: 166117
2012-10-17 18:44:18 +00:00
Andrew Trick
9422ce72bc misched: Better handling of invalid latencies in the machine model
llvm-svn: 166107
2012-10-17 17:27:10 +00:00
Jakob Stoklund Olesen
bd6db6eeb9 Use a SparseSet instead of a BitVector for UsedInInstr in RAFast.
This is just as fast, and it makes it possible to avoid leaking the
UsedPhysRegs BitVector implementation through
MachineRegisterInfo::addPhysRegsUsed().

llvm-svn: 166083
2012-10-17 01:37:59 +00:00
Jakob Stoklund Olesen
96ecdfd8dd Avoid rematerializing a redef immediately after the old def.
PR14098 contains an example where we would rematerialize a MOV8ri
immediately after the original instruction:

  %vreg7:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7
  %vreg22:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7

Besides being pointless, it is also wrong since the original instruction
only redefines part of the register, and the value read by the new
instruction is wrong.

The problem was the LiveRangeEdit::allUsesAvailableAt() didn't
special-case OrigIdx == UseIdx and found the wrong SSA value.

llvm-svn: 166068
2012-10-16 22:51:58 +00:00
Jakob Stoklund Olesen
1cfbe5c549 Revert r166046 "Switch back to the old coalescer for now to fix the 32 bit bit"
A fix for PR14098, including the test case is in the next commit.

llvm-svn: 166067
2012-10-16 22:51:55 +00:00
Michael Liao
ee2ce36cda Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x)
llvm-svn: 166049
2012-10-16 19:38:35 +00:00
Rafael Espindola
2f08719190 Switch back to the old coalescer for now to fix the 32 bit bit
llvm+clang+compiler-rt bootstrap.

llvm-svn: 166046
2012-10-16 19:34:06 +00:00
Stepan Dyatkovskiy
09c6b0a273 Issue:
Stack is formed improperly for long structures passed as byval arguments for
EABI mode.

If we took AAPCS reference, we can found the next statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have structure with doubles (9 double fields) and 3 Core unused
registers (r1, r2, r3): caller should use r2 and r3 registers only.
Currently r1,r2,r3 set is used, but it is invalid.

Callee VA routine should also use r2 and r3 regs only. All is ok here. This
behaviour is guessed by rounding up SP address with ADD+BFC operations.

Fix:
Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and
8 byte alignment, we waste odd registers then.

P.S.:
I also improved LDRB_POST_IMM regression test. Since ldrb instruction will
not generated by current regression test after this patch. 

llvm-svn: 166018
2012-10-16 07:16:47 +00:00
Andrew Trick
af9fb59623 misched: Added handleMove support for updating all kill flags, not just for allocatable regs.
This is a medium term workaround until we have a more robust solution
in the form of a register liveness utility for postRA passes.

llvm-svn: 166001
2012-10-16 00:22:51 +00:00
Jakob Stoklund Olesen
bf5a17c340 Remove unused BitVectors from getAllocatableSet().
llvm-svn: 165999
2012-10-16 00:05:06 +00:00
Jakob Stoklund Olesen
c808be0c56 Remove RegisterClassInfo::isReserved() and isAllocatable().
Clients can use the equivalent functions in MRI.

llvm-svn: 165990
2012-10-15 22:41:03 +00:00
Jakob Stoklund Olesen
bde4d183c1 Remove LIS::isAllocatable() and isReserved() helpers.
All callers can simply use the corresponding MRI functions.

llvm-svn: 165985
2012-10-15 22:14:34 +00:00
Jakob Stoklund Olesen
56bb584754 Switch most getReservedRegs() clients to the MRI equivalent.
Using the cached bit vector in MRI avoids comstantly allocating and
recomputing the reserved register bit vector.

llvm-svn: 165983
2012-10-15 21:57:41 +00:00
Jakob Stoklund Olesen
677503ea4e Freeze the reserved registers as soon as isel is complete.
Also provide an MRI::getReservedRegs() function to access the frozen
register set, and isReserved() and isAllocatable() methods to test
individual registers.

The various implementations of TRI::getReservedRegs() are quite
complicated, and many passes need to look at the reserved register set.
This patch makes it possible for these passes to use the cached copy in
MRI, avoiding a lot of malloc traffic and repeated calculations.

llvm-svn: 165982
2012-10-15 21:33:06 +00:00
Bill Wendling
7a89835ee4 Move the Attributes::Builder outside of the Attributes class and into its own class named AttrBuilder. No functionality change.
llvm-svn: 165960
2012-10-15 20:35:56 +00:00
Rafael Espindola
03c701aad3 Make sure we iterate over newly created instructions. Fixes pr13625. Testcase to
follow in one sec.

llvm-svn: 165951
2012-10-15 18:21:07 +00:00
Andrew Trick
a5e2aeb12b misched: ILP scheduler for experimental heuristics.
llvm-svn: 165950
2012-10-15 18:02:27 +00:00
Micah Villmow
272663afc2 Resubmit the changes to llvm core to update the functions to support different pointer sizes on a per address space basis.
llvm-svn: 165941
2012-10-15 16:24:29 +00:00
Bill Wendling
ea1286d8bf Remove the bitwise XOR operator from the Attributes class. Replace it with the equivalent from the builder class.
llvm-svn: 165893
2012-10-14 06:56:13 +00:00
Jakob Stoklund Olesen
4aac24404c Drop <def,dead> flags when merging into an unused lane.
The new coalescer can merge a dead def into an unused lane of an
otherwise live vector register.

Clear the <dead> flag when that happens since the flag refers to the
full virtual register which is still live after the partial dead def.

This fixes PR14079.

llvm-svn: 165877
2012-10-13 17:26:47 +00:00
Jakob Stoklund Olesen
533711462c Allow for loops in LiveIntervals::pruneValue().
It is possible that the live range of the value being pruned loops back
into the kill MBB where the search started. When that happens, make sure
that the beginning of KillMBB is also pruned.

Instead of starting a DFS at KillMBB and skipping the root of the
search, start a DFS at each KillMBB successor, and allow the search to
loop back to KillMBB.

This fixes PR14078.

llvm-svn: 165872
2012-10-13 16:15:31 +00:00
Jakob Stoklund Olesen
74661b859d Use a transposed algorithm for handleMove().
Completely update one interval at a time instead of collecting live
range fragments to be updated. This avoids building data structures,
except for a single SmallPtrSet of updated intervals.

Also share code between handleMove() and handleMoveIntoBundle().

Add support for moving dead defs across other live values in the
interval. The MI scheduler can do that.

llvm-svn: 165824
2012-10-12 21:31:57 +00:00
Jakob Stoklund Olesen
8a4d65228b Fix coalescing with IMPLICIT_DEF values.
PHIElimination inserts IMPLICIT_DEF instructions to guarantee that all
PHI predecessors have a live-out value. These IMPLICIT_DEF values are
not considered to be real interference when coalescing virtual
registers:

  %vreg1 = IMPLICIT_DEF
  %vreg2 = MOV32r0

When joining %vreg1 and %vreg2, the IMPLICIT_DEF instruction and its
value number should simply be erased since the %vreg2 value number now
provides a live-out value for the PHI predecesor block.

llvm-svn: 165813
2012-10-12 18:03:04 +00:00
Ulrich Weigand
dd9a6100a0 Fix big-endian codegen bug in DAGTypeLegalizer::ExpandRes_BITCAST
On PowerPC, a bitcast of <16 x i8> to i128 may run through a code
path in ExpandRes_BITCAST that attempts to do an intermediate
bitcast to a <4 x i32> vector, and then construct the Hi and Lo parts
of the resulting i128 by pairing up two of those i32 vector elements
each.  The code already recognizes that on a big-endian system, the
first two vector elements form the Hi part, and the final two vector
elements form the Lo part (vice-versa from the little-endian situation).

However, we also need to take endianness into account when forming each
of those separate pairs:  on a big-endian system, vector element 0 is
the *high* part of the pair making up the Hi part of the result, and
vector element 1 is the low part of the pair.  The code currently always
uses vector element 0 as the low part and vector element 1 as the high
part, as is appropriate for little-endian platforms only.

This patch fixes this by swapping the vector elements as they are
paired up as appropriate.

llvm-svn: 165802
2012-10-12 15:42:58 +00:00
Evan Cheng
76b896ec91 Legalizer optimize a pair of div / mod to a call to divrem libcall if they are
not legal. However, it should use a div instruction + mul + sub if divide is
legal. The rem legalization code was missing a check and incorrectly uses a
divrem libcall even when div is legal.

rdar://12481395

llvm-svn: 165778
2012-10-12 01:15:47 +00:00
Sean Silva
2174f713f4 Remove unnecessary classof()'s
isa<> et al. automatically infer when the cast is an upcast (including a
self-cast), so these are no longer necessary.

llvm-svn: 165767
2012-10-11 23:30:49 +00:00
Micah Villmow
4eb108750d Revert 165732 for further review.
llvm-svn: 165747
2012-10-11 21:27:41 +00:00
Micah Villmow
d8b76fdc50 Add in the first iteration of support for llvm/clang/lldb to allow variable per address space pointer sizes to be optimized correctly.
llvm-svn: 165726
2012-10-11 17:21:41 +00:00
Jakob Stoklund Olesen
a7a8e389d2 Pass an explicit operand number to addLiveIns.
Not all instructions define a virtual register in their first operand.
Specifically, INLINEASM has a different format.

<rdar://problem/12472811>

llvm-svn: 165721
2012-10-11 16:46:07 +00:00
Michael Liao
4b4368007b Follow the same routine to add target float expansion hook
llvm-svn: 165707
2012-10-11 07:22:01 +00:00
Andrew Trick
2904a19a19 misched: Handle "transient" non-instructions.
llvm-svn: 165701
2012-10-11 05:37:06 +00:00
Nadav Rotem
b82a3821f7 Add a new interface to allow IR-level passes to access codegen-specific information.
llvm-svn: 165665
2012-10-10 22:04:55 +00:00
Micah Villmow
5f526cf767 Add in support for expansion of all of the comparison operations to the absolute minimum required set. This allows a backend to expand any arbitrary set of comparisons as long as a minimum set is supported.
The minimum set of required instructions is ISD::AND, ISD::OR, ISD::SETO(or ISD::SETOEQ) and ISD::SETUO(or ISD::SETUNE). Everything is expanded into one of two patterns:
Pattern 1: (LHS CC1 RHS) Opc (LHS CC2 RHS)
Pattern 2: (LHS CC1 LHS) Opc (RHS CC2 RHS)

llvm-svn: 165655
2012-10-10 20:50:51 +00:00
Michael Liao
c434bfd7e4 Add alternative support for FP_ROUND from v2f32 to v2f64
- Due to the current matching vector elements constraints in ISD::FP_EXTEND,
  rounding from v2f32 to v2f64 is scalarized. Add a customized v2f32 widening
  to convert it into a target-specific X86ISD::VFPEXT to work around this
  constraints. This patch also reverts a previous attempt to fix this issue by
  recovering the scalarized ISD::FP_EXTEND pattern and thus significantly
  reduces the overhead of supporting non-power-2 vector FP extend.

llvm-svn: 165625
2012-10-10 16:32:15 +00:00
Stepan Dyatkovskiy
5182bb8695 Issue description:
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.

Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.

Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values. 

llvm-svn: 165616
2012-10-10 11:37:36 +00:00
Bill Wendling
f3c4f64b79 Remove the final bits of Attributes being declared in the Attribute
namespace. Use the attribute's enum value instead. No functionality change
intended.

llvm-svn: 165610
2012-10-10 07:36:45 +00:00
Lang Hames
bda4fef456 My earlier "fix" for PBQP (see r165201) was incorrect. The real issue was that
checkRegMaskInterference only initializes the bitmask on the first interference.

This fixes PR14027 and (re)fixes PR13945.

llvm-svn: 165608
2012-10-10 06:39:48 +00:00
Andrew Trick
9ba6a8d7ea misched: fall-back to a target hook for instr bundles.
llvm-svn: 165606
2012-10-10 05:43:18 +00:00
Andrew Trick
4ca94d939c misched: Use the TargetSchedModel interface wherever possible.
Allows the new machine model to be used for NumMicroOps and OutputLatency.

Allows the HazardRecognizer to be disabled along with itineraries.

llvm-svn: 165603
2012-10-10 05:43:09 +00:00
Andrew Trick
6ef4c5cf64 misched: Add computeInstrLatency to TargetSchedModel.
llvm-svn: 165566
2012-10-09 23:44:32 +00:00
Andrew Trick
0a8af76cb4 misched: Allow flags to disable hasInstrSchedModel/hasInstrItineraries for external users of TargetSchedule.
llvm-svn: 165564
2012-10-09 23:44:26 +00:00
Andrew Trick
54d088900c misched: Remove LoopDependencies heuristic.
This wasn't contributing anything significant to postRA heuristics except compile time (by my measurements) and will be replaced by a more general heuristic for cross-region dependencies within the scheduler itself.

llvm-svn: 165563
2012-10-09 23:44:23 +00:00
Bill Wendling
04e6cf2045 Use the attribute enums to query if a parameter has an attribute.
llvm-svn: 165550
2012-10-09 21:38:14 +00:00