1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 21:42:54 +02:00
Commit Graph

64146 Commits

Author SHA1 Message Date
Craig Topper
6670549a2c Lift alignment restrictions on load/store folding of VEXTRACTI128/VINSERTI128.
llvm-svn: 191073
2013-09-20 05:37:49 +00:00
Andrew Trick
439918c874 Allow subtarget selection of the default MachineScheduler and document the interface.
The global registry is used to allow command line override of the
scheduler selection, but does not work well as the normal selection
API. For example, the same LLVM process should be able to target
multiple targets or subtargets.

llvm-svn: 191071
2013-09-20 05:14:41 +00:00
Richard Smith
6dfe60433d Revert r191062; the build break was also fixed in a different (incompatible) way in r191060.
llvm-svn: 191065
2013-09-20 01:24:10 +00:00
Richard Smith
255a969b21 Unbreak Clang build after r191050: don't pass a StringRef to snprintf.
llvm-svn: 191062
2013-09-20 00:38:18 +00:00
David Blaikie
8dd7f9c9a7 DebugInfo: GDBIndexEntry*String conversion functions now return const char* for easy llvm::formating
This was previously invoking UB by passing a user-defined type to
format. Thanks to Jordan Rose for pointing this out.

llvm-svn: 191060
2013-09-20 00:33:15 +00:00
David Blaikie
7e8457a199 Add braces to suppress Clang's dangling-else warning.
These violations were introduced in r191049

llvm-svn: 191059
2013-09-20 00:33:11 +00:00
David Blaikie
954130581b DebugInfo: constrain gnu pubnames test further
Ensures that the pubnames entries actually refer to the intended
entities. This test could be more flexible if there was a way to do
multiline FileCheck matches with captures (in that way the test wouldn't
need to have hardcoded offset values and would thus be resilient to
changes in the layout of the DIEs in this CU).

llvm-svn: 191055
2013-09-19 23:43:46 +00:00
Richard Mitton
fe7ffd01f6 Added support for generate DWARF .debug_aranges sections automatically.
llvm-svn: 191052
2013-09-19 23:21:01 +00:00
Andrew Trick
6fd8b52af9 Rename ConvergingScheduler to GenericScheduler.
This was an experimental scheduler a year ago. It's now used by
several subtargets, both in-order and out-of-order, and it
is about to be enabled by default for x86 and armv7. It will be the
new GenericScheduler for subtargets that don't provide their own
SchedulingStrategy.

llvm-svn: 191051
2013-09-19 23:10:59 +00:00
David Blaikie
f60a2aeec3 DebugInfo: llvm-dwarfdump support for gnu_pubnames section
llvm-svn: 191050
2013-09-19 23:01:29 +00:00
Kai Nacke
64a3fccd60 PR16726: extend rol/ror matching
C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.

This commit extends the DAGCombiner in the way that the pattern

(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))

is folded into

([az]ext (rotl x, y))

The matching is restricted to aext and zext because in this cases the upper
bits are either undefined or known. Test case is included.

This fixes PR16726.

llvm-svn: 191049
2013-09-19 23:00:28 +00:00
Kai Nacke
21c0476931 Revert PR16726: extend rol/ror matching
There is a buildbot failure. Need to investigate this.

llvm-svn: 191048
2013-09-19 22:53:36 +00:00
Kai Nacke
fe02753846 PR16726: extend rol/ror matching
C-like languages promote types like unsigned short to unsigned int before
performing an arithmetic operation. Currently the rotate matcher in the
DAGCombiner does not consider this situation.

This commit extends the DAGCombiner in the way that the pattern

(or (shl ([az]ext x), (*ext y)), (srl ([az]ext x), (*ext (sub 32, y))))

is folded into

([az]ext (rotl x, y))

The matching is restricted to aext and zext because in this cases the upper
bits are either undefined or known. Test case is included.

This fixes PR16726.

llvm-svn: 191045
2013-09-19 22:36:39 +00:00
David Blaikie
7d5a020241 DebugInfo: Improve IR annotation comments for GNU pubthings.
llvm-svn: 191043
2013-09-19 22:19:37 +00:00
Shuxin Yang
44b8e62121 [Fast-math] Disable "(C1/X)*C2 => (C1*C2)/X" if C1/X has multiple uses.
If "C1/X" were having multiple uses, the only benefit of this
transformation is to potentially shorten critical path. But it is at the
cost of instroducing additional div.

  The additional div may or may not incur cost depending on how div is
implemented. If it is implemented using Newton–Raphson iteration, it dosen't
seem to incur any cost (FIXME). However, if the div blocks the entire
pipeline, that sounds to be pretty expensive. Let CodeGen to take care 
this transformation.

  This patch sees 6% on a benchmark.

rdar://15032743

llvm-svn: 191037
2013-09-19 21:13:46 +00:00
Benjamin Kramer
b681a45715 InstCombine: Don't allow turning vector-of-pointer loads into vector-of-integer.
The code below can't handle any pointers. PR17293.

llvm-svn: 191036
2013-09-19 20:59:04 +00:00
David Blaikie
a8dfbd9e12 Unshift the GDB index/GNU pubnames constants modified in r191025
Based on code review feedback from Eric Christopher, unshifting these
constants as they can appear in the gdb_index itself, shifted a further
24 bits. This means that keeping them preshifted is a bit inflexible, so
let's not do that.

Given the motivation, wrap up some nicer enums, more type safety, and
some utility functions.

llvm-svn: 191035
2013-09-19 20:40:26 +00:00
Anders Waldenborg
c34c43d63b llvm-c: Add LLVMGetPointerToFunction
Differential Revision: http://llvm-reviews.chandlerc.com/D1715

llvm-svn: 191030
2013-09-19 19:55:06 +00:00
Eric Christopher
61eb11007b Remove extraneous space, the asm printing infrastructure adds a space
in normally.

llvm-svn: 191026
2013-09-19 18:41:40 +00:00
David Blaikie
d4db76886f DebugInfo: Simplify gnu_pubnames index computation.
Names open to bikeshedding. Could switch back to the constants being
unshifted, but this way seems a bit easier to work with.

llvm-svn: 191025
2013-09-19 18:39:59 +00:00
Yi Jiang
ed81b17719 X86 horizontal vector reduction cost model
llvm-svn: 191021
2013-09-19 17:48:48 +00:00
David Blaikie
907e26020c Remove unnecessary conditional operators performing bool->bool conversion.
llvm-svn: 191020
2013-09-19 17:33:35 +00:00
David Blaikie
af845722b2 Fix a typo and simplify a boolean expression.
llvm-svn: 191018
2013-09-19 17:27:48 +00:00
Shuxin Yang
76ccefc5f8 GVN proceeds in the presence of dead code.
This is how it ignores the dead code:
1) When a dead branch target, say block B, is identified, all the
    blocks dominated by B is dead as well.

2) The PHIs of those blocks in dominance-frontier(B) is updated such
   that the operands corresponding to dead predecessors are replaced
   by "UndefVal".

   Using lattice's jargon, the "UndefVal" is the "Top" in essence.
   Phi node like this "phi(v1 bb1, undef xx)" will be optimized into
   "v1" if v1 is constant, or v1 is an instruction which dominate this
   PHI node.

3) When analyzing the availability of a load L, all dead mem-ops which
   L depends on disguise as a load which evaluate exactly same value as L.

4) The dead mem-ops will be materialized as "UndefVal" during code motion.

llvm-svn: 191017
2013-09-19 17:22:51 +00:00
Evgeniy Stepanov
d26ac53a42 [msan] Wrap indirect functions.
Adds a flag to the MemorySanitizer pass that enables runtime rewriting of
indirect calls. This is part of MSanDR implementation and is needed to return
control to the DynamiRio-based helper tool on transition between instrumented
and non-instrumented modules. Disabled by default.

llvm-svn: 191006
2013-09-19 15:22:35 +00:00
Benjamin Kramer
f939b2b330 DAGCombiner: Don't fold vector muls with constants that look like a splat of a power of 2 but differ in bit width.
PR17283.

llvm-svn: 191000
2013-09-19 13:28:20 +00:00
Justin Holewinski
69db0d8365 [NVPTX] Support constant vector globals
llvm-svn: 190997
2013-09-19 12:51:46 +00:00
Amara Emerson
7ad0409c56 [ARMv8] Add support for the v8 cryptography extensions.
llvm-svn: 190996
2013-09-19 11:59:01 +00:00
Tim Northover
89d57eb12b X86: FrameIndex addressing modes do have a base register.
When selecting the DAG (add (WrapperRIP ...), (FrameIndex ...)), X86 code had
spotted the FrameIndex possibility and was working out whether it could fold
the WrapperRIP into this.

The test for forming a %rip version is notionally whether we already have a
base or index register (%rip precludes both), but we were forgetting to account
for the register that would be inserted later to access the frame.

rdar://problem/15024520

llvm-svn: 190995
2013-09-19 11:33:53 +00:00
Andrew Trick
e482b777ac Revert "Encapsulate PassManager debug flags to avoid static init and cxa_exit."
Working on a better solution to this.

This reverts commit 7d4e9934e7ca83094c5cf41346966c8350179ff2.

llvm-svn: 190990
2013-09-19 06:02:43 +00:00
Andrew Trick
2a04344d58 Encapsulate PassManager debug flags to avoid static init and cxa_exit.
This puts all the global PassManager debugging flags, like
-print-after-all and -time-passes, behind a managed static. This
eliminates their static initializers and, more importantly, exit-time
destructors.

The only behavioral change I anticipate is that tools need to
initialize the PassManager before parsing the command line in order to
export these options, which makes sense. Tools that already initialize
the standard passes (opt/llc) don't need to do anything new.

llvm-svn: 190974
2013-09-18 23:31:16 +00:00
Andrew Trick
b62e54c1ca whitespace
llvm-svn: 190973
2013-09-18 23:31:10 +00:00
Reed Kotler
03a3269d15 Fix two issues regarding Got pointer (GP) setup.
1) make sure that the first two instructions of the sequence cannot
separate from each other. The linker requires that they be sequential.
If they get separated, it can still work but it will not work in all
cases because the first of the instructions mostly involves the hi part
of the pc relative offset and that part changes slowly. You would have
to be at the right boundary for this to matter.
2) make sure that this sequence begins  on a longword boundary. 
There appears to be a bug in binutils which makes some of these calculations
get messed up if the instruction sequence does not begin on a longword
boundary. This is being investigated with the appropriate binutils folks.

llvm-svn: 190966
2013-09-18 22:46:09 +00:00
Adrian Prantl
0dcd042673 Debug info: Get rid of the VLA indirection hack in FastISel.
Use the DIVariable::isIndirect() flag set by the frontend instead of
guessing whether to set the machine location's indirection bit.
Paired commit with CFE.

llvm-svn: 190961
2013-09-18 22:08:59 +00:00
Filip Pizlo
f7fbb11b4c Make DynamicLibrary use ManagedStatic. This is pretty simple and should just work as
advertised - but it does have the caveat that calls to DynamicLibrary::AddSymbol will 
"reset" if you shutdown llvm and try to come back for seconds.  This is a subtle 
behavior change, but I'm assuming that nobody is affected by it.

llvm-svn: 190946
2013-09-18 16:40:14 +00:00
Chandler Carruth
264c13d042 More XCore TTI cleanup -- remove an unused private field flagged by
-Wunused-private-field with Clang.

llvm-svn: 190941
2013-09-18 14:11:11 +00:00
Kostya Serebryany
9e042a92a5 [asan] call __asan_stack_malloc_N only if use-after-return detection is enabled with the run-time option
llvm-svn: 190939
2013-09-18 14:07:14 +00:00
NAKAMURA Takumi
222f693048 Target/XCore/CMakeLists.txt: Add XCoreTargetTransformInfo.cpp.
llvm-svn: 190937
2013-09-18 12:59:41 +00:00
Robert Lytton
b41e4ff222 Prevent LoopVectorizer and SLPVectorizer running if the target has no vector registers.
XCore target: Add XCoreTargetTransformInfo
This is where getNumberOfRegisters() resides, which in turn returns the
number of vector registers (=0).

llvm-svn: 190936
2013-09-18 12:43:35 +00:00
Richard Sandiford
76d1801e90 [SystemZ] Add unsigned compare-and-branch instructions
For some reason I never got around to adding these at the same time as
the signed versions.  No idea why.

I'm not sure whether this SystemZII::BranchC* stuff is useful, or whether
it should just be replaced with an "is normal" flag.  I'll leave that
for later though.

There are some boundary conditions that can be tweaked, such as preferring
unsigned comparisons for equality with [128, 256), and "<= 255" over "< 256",
but again I'll leave those for a separate patch.

llvm-svn: 190930
2013-09-18 09:56:40 +00:00
Joey Gouly
e14ac63b96 [ARMv8] Add CRC instructions.
Patch by Bradley Smith!

llvm-svn: 190928
2013-09-18 09:45:55 +00:00
Filip Pizlo
0690c81bd3 Revert r190921. It broke Windows.
I'll roll it back in when I have a chance to look at it in detail.

llvm-svn: 190923
2013-09-18 06:37:55 +00:00
Filip Pizlo
b12b6ccfd3 Make DynamicLibrary use ManagedStatic. This is pretty simple and should just work as
advertised - but it does have the caveat that calls to DynamicLibrary::AddSymbol will 
"reset" if you shutdown llvm and try to come back for seconds.  This is a subtle 
behavior change, but I'm assuming that nobody is affected by it.

llvm-svn: 190921
2013-09-18 06:03:27 +00:00
Craig Topper
7ef60f689f Prevent extra calls to ToggleFeature for Feature64Bit and FeatureCMOV if they've already been enabled. The extra call ends up clearing the bit in FeatureBits since its a 'toggle'. Can't prove that anything was broken because of this since I don't think the FeatureBits for these are used.
llvm-svn: 190920
2013-09-18 06:01:53 +00:00
Craig Topper
01e805a531 Fix X86 subtarget to not overwrite the autodetected features by calling InitMCProcessorInfo right after detecting them. Instead add a new function that only updates the scheduling model and call that.
llvm-svn: 190919
2013-09-18 05:54:09 +00:00
Craig Topper
194d1e2a5a Revert accidental commit I had to make to get the test case in PR17268 to still work correctly.
llvm-svn: 190917
2013-09-18 04:10:17 +00:00
Craig Topper
5d022196de Lift alignment restrictions for load/store folding on VINSERTF128/VEXTRACTF128. Fixes PR17268.
llvm-svn: 190916
2013-09-18 03:55:53 +00:00
David Blaikie
8914db31b9 ifndef NDEBUG-out an asserts-only constant committed in r190863
llvm-svn: 190905
2013-09-18 00:11:27 +00:00
Matt Arsenault
e36237bda3 Fix a constant folding address space place I missed.
If address space 0 was smaller than the address space
in a constant inttoptr/ptrtoint pair, the wrong mask size
would be used.

llvm-svn: 190899
2013-09-17 23:23:16 +00:00
Reid Kleckner
130539949d COFF: Ensure that objects produced by LLVM link with /safeseh
Summary:
We indicate that the object files are safe by emitting a @feat.00
absolute address symbol.  The address is presumably interpreted as a
bitfield of features that the compiler would like to enable.  Bit 0 is
documented in the PE COFF spec to opt in to "registered SEH", which is
what /safeseh enables.

LLVM's object files are safe by default because LLVM doesn't know how to
produce SEH handlers.

Reviewers: Bigcheese

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1691

llvm-svn: 190898
2013-09-17 23:18:05 +00:00
Quentin Colombet
1950396a9e Revert the load slicing done in r190870.
To avoid regressions with bitfield optimizations, this slicing should take place
later, like ISel time.

llvm-svn: 190891
2013-09-17 22:01:26 +00:00
Reid Kleckner
9df46bd1f5 COFF: Emit all MCSymbols rather than filtering out some of them
In particular, this means we emit non-external symbols defined to
variables, such as aliases or absolute addresses.

This is needed to implement /safeseh, and it appears there was some
confusion about what symbols to emit previously.

llvm-svn: 190888
2013-09-17 21:24:44 +00:00
Reid Kleckner
d5bafd6e0c COFF: Remove ExportSection, which has been dead since r114823
llvm-svn: 190887
2013-09-17 21:24:02 +00:00
Eric Christopher
cf815a772b Move variable into assert to avoid unused variable warning.
llvm-svn: 190886
2013-09-17 21:13:57 +00:00
Matt Arsenault
6fd5ad85d0 Cleanup handling of constant function casts.
Some of this code is no longer necessary since int<->ptr casts are no
longer occur as of r187444.

This also fixes handling vectors of pointers, and adds a bunch of new
testcases for vectors and address spaces.

llvm-svn: 190885
2013-09-17 21:10:14 +00:00
Bill Schmidt
ae6c256723 [PowerPC] Add a FIXME.
Documenting a design choice to generate only medium model sequences for TLS
addresses at this time.  Small and large code models could be supported if
necessary.

llvm-svn: 190883
2013-09-17 20:22:05 +00:00
Bill Schmidt
f26512486a [PowerPC] Fix problems with large code model (PR17169).
Large code model on PPC64 requires creating and referencing TOC entries when
using the addis/ld form of addressing.  This was not being done in all cases.
The changes in this patch to PPCAsmPrinter::EmitInstruction() fix this.  Two
test cases are also modified to reflect this requirement.

Fast-isel was not creating correct code for loading floating-point constants
using large code model.  This also requires the addis/ld form of addressing.
Previously we were using the addis/lfd shortcut which is only applicable to
medium code model.  One test case is modified to reflect this requirement.

llvm-svn: 190882
2013-09-17 20:03:25 +00:00
Arnold Schwaighofer
eabde1ffce Costmodel: Add support for horizontal vector reductions
Upcoming SLP vectorization improvements will want to be able to estimate costs
of horizontal reductions. Add infrastructure to support this.

We model reductions as a series of (shufflevector,add) tuples ultimately
followed by an extractelement. For example, for an add-reduction of <4 x float>
we could generate the following sequence:

 (v0, v1, v2, v3)
   \   \  /  /
     \  \  /
       +  +

 (v0+v2, v1+v3, undef, undef)
    \      /
 ((v0+v2) + (v1+v3), undef, undef)

 %rdx.shuf = shufflevector <4 x float> %rdx, <4 x float> undef,
                           <4 x i32> <i32 2, i32 3, i32 undef, i32 undef>
 %bin.rdx = fadd <4 x float> %rdx, %rdx.shuf
 %rdx.shuf7 = shufflevector <4 x float> %bin.rdx, <4 x float> undef,
                          <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
 %bin.rdx8 = fadd <4 x float> %bin.rdx, %rdx.shuf7
 %r = extractelement <4 x float> %bin.rdx8, i32 0

This commit adds a cost model interface "getReductionCost(Opcode, Ty, Pairwise)"
that will allow clients to ask for the cost of such a reduction (as backends
might generate more efficient code than the cost of the individual instructions
summed up). This interface is excercised by the CostModel analysis pass which
looks for reduction patterns like the one above - starting at extractelements -
and if it sees a matching sequence will call the cost model interface.

We will also support a second form of pairwise reduction that is well supported
on common architectures (haddps, vpadd, faddp).

 (v0, v1, v2, v3)
  \   /    \  /
 (v0+v1, v2+v3, undef, undef)
    \     /
 ((v0+v1)+(v2+v3), undef, undef, undef)

  %rdx.shuf.0.0 = shufflevector <4 x float> %rdx, <4 x float> undef,
        <4 x i32> <i32 0, i32 2 , i32 undef, i32 undef>
  %rdx.shuf.0.1 = shufflevector <4 x float> %rdx, <4 x float> undef,
        <4 x i32> <i32 1, i32 3, i32 undef, i32 undef>
  %bin.rdx.0 = fadd <4 x float> %rdx.shuf.0.0, %rdx.shuf.0.1
  %rdx.shuf.1.0 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
        <4 x i32> <i32 0, i32 undef, i32 undef, i32 undef>
  %rdx.shuf.1.1 = shufflevector <4 x float> %bin.rdx.0, <4 x float> undef,
        <4 x i32> <i32 1, i32 undef, i32 undef, i32 undef>
  %bin.rdx.1 = fadd <4 x float> %rdx.shuf.1.0, %rdx.shuf.1.1
  %r = extractelement <4 x float> %bin.rdx.1, i32 0

llvm-svn: 190876
2013-09-17 18:06:50 +00:00
Arnold Schwaighofer
43c2040076 SLPVectorizer: Don't vectorize phi nodes that use invoke values
We can't insert an insertelement after an invoke. We would have to split a
critical edge. So when we see a phi node that uses an invoke we just give up.

radar://14990770

llvm-svn: 190871
2013-09-17 17:03:29 +00:00
Quentin Colombet
3d996b9289 [InstCombiner] Slice a big load in two loads when the elements are next to each
other in memory.

The motivation was to get rid of truncate and shift right instructions that get
in the way of paired load or floating point load.
E.g.,
Consider the following example:
struct Complex {
  float real;
  float imm;
};

When accessing a complex, llvm was generating a 64-bits load and the imm field
was obtained by a trunc(lshr) sequence, resulting in poor code generation, at
least for x86.

The idea is to declare that two load instructions is the canonical form for
loading two arithmetic type, which are next to each other in memory.

Two scalar loads at a constant offset from each other are pretty
easy to detect for the sorts of passes that like to mess with loads. 

<rdar://problem/14477220>

llvm-svn: 190870
2013-09-17 16:57:34 +00:00
Preston Gurd
3d32b70ccf Remove unused code, which had been commented out.
llvm-svn: 190869
2013-09-17 16:53:36 +00:00
Serge Pavlov
3d5c20a228 Added documentation to getMemsetStores.
llvm-svn: 190866
2013-09-17 16:24:42 +00:00
Ben Langmuir
c0ab36fe2e Add llvm.x86.* intrinsics for Intel SHA Extensions
Add llvm.x86.* intrinsics for all of the Intel SHA Extensions instructions, as
well as tests. Also remove mayLoad and hasSideEffects, which can be inferred
from the instruction patterns.

llvm-svn: 190864
2013-09-17 13:44:39 +00:00
Kostya Serebryany
f9c84976da [asan] inline the calls to __asan_stack_free_* with small sizes. Yet another 10%-20% speedup for use-after-return
llvm-svn: 190863
2013-09-17 12:14:50 +00:00
Joey Gouly
10dd14fb9e [ARM] Fix the deprecation of MCR encodings that map to CP15{ISB,DSB,DMB}.
llvm-svn: 190862
2013-09-17 09:54:57 +00:00
Stepan Dyatkovskiy
124d49fc91 Bugfix for PR17099:
Wrong cast operation.
MergeFunctions emits Bitcast instead of pointer-to-integer operation.
Patch fixes MergeFunctions::writeThunk function. It replaces
unconditional Bitcast creation with "Value* createCast(...)" method, that
checks operand types and selects proper instruction.
See unit-test as example.

llvm-svn: 190859
2013-09-17 09:36:11 +00:00
Elena Demikhovsky
28417de9de AVX-512: Converted to Unix style
llvm-svn: 190851
2013-09-17 07:34:34 +00:00
Craig Topper
ece6095ce4 Add AES and SHA instructions to the load folding tables.
llvm-svn: 190850
2013-09-17 06:50:11 +00:00
Craig Topper
4b8534b86a Fix column alignment. No functional change.
llvm-svn: 190849
2013-09-17 06:05:17 +00:00
Kevin Qin
3be5824550 Implement 3 AArch64 neon instructions : umov smov ins.
llvm-svn: 190839
2013-09-17 02:21:02 +00:00
Quentin Colombet
475709bdc2 [SelectionDAG] Teach the vector scalarizer about TRUNCATE.
When a truncate node defines a legal vector type but uses an illegal
vector type, the legalization process was splitting the vector until
<1 x vector> type, but then it was failing to scalarize the node because
it did not know how to handle TRUNCATE.

<rdar://problem/14989896>

llvm-svn: 190830
2013-09-17 00:26:56 +00:00
Adrian Prantl
dfcc535bbe Debug info: Fix PR16736 and rdar://problem/14990587.
A DBG_VALUE is register-indirect iff the first operand is a register
_and_ the second operand is an immediate.

llvm-svn: 190821
2013-09-16 23:29:03 +00:00
Matt Arsenault
513e7539be MemCpyOptimizer: Use max legal int size instead of pointer size
If there are no legal integers, assume 1 byte.

This makes more sense than using the pointer size as
a guess for the maximum GPR width.

It is conceivable to want to use some 64-bit pointers
on a target where 64-bit integers aren't legal.

llvm-svn: 190817
2013-09-16 22:43:16 +00:00
Jakub Staszak
84872fcf98 Use reference instead of copy.
llvm-svn: 190813
2013-09-16 22:03:38 +00:00
Bill Schmidt
b9dc991f55 [PowerPC] Fix PR17155 - Ignore COPY_TO_REGCLASS during emit.
Fast-isel generates a COPY_TO_REGCLASS for widening f32 to f64, which
is a nop on PPC64.  This is needed to keep the register class system
happy, but on the fast-isel path it is not removed before emit as it
is for DAG select.  Ignore this op when emitting instructions.

llvm-svn: 190795
2013-09-16 17:25:12 +00:00
Arnold Schwaighofer
11f318e34c Don't vectorize if there are outside loop users of the induction variable.
We would have to compute the pre increment value, either by computing it on
every loop iteration or by splitting the edge out of the loop and inserting a
computation for it there.

For now, just give up vectorizing such loops.

Fixes PR17179.

llvm-svn: 190790
2013-09-16 16:17:24 +00:00
Evgeniy Stepanov
f45c97a722 [msan] Check return value of main().
llvm-svn: 190782
2013-09-16 13:24:32 +00:00
Vladimir Medic
c143ef4d9e This patch implements Mips load/store instructions from/to coprocessor 2. Test cases are added.
llvm-svn: 190780
2013-09-16 10:29:42 +00:00
Benjamin Kramer
a65d61ae4b ARM: Deduplicate ConstantPoolValues.
llvm-svn: 190779
2013-09-16 10:17:31 +00:00
Richard Sandiford
ea2b1a8b94 [SystemZ] Improve extload handling
The port originally had special patterns for extload, mapping them to the
same instructions as sextload.  It seemed neater to have patterns that
match "an extension that is allowed to be signed" and "an extension that
is allowed to be unsigned".

This was originally meant to be a clean-up, but it does improve the handling
of promoted integers a little, as shown by args-06.ll.

llvm-svn: 190777
2013-09-16 09:03:10 +00:00
Craig Topper
4edc623bda Make F16C feature flag imply AVX rather than just checking both at the patterns.
llvm-svn: 190775
2013-09-16 04:29:58 +00:00
Peter Collingbourne
cf3b1a2910 Implement function prefix data as an IR feature.
Previous discussion:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-July/063909.html

Differential Revision: http://llvm-reviews.chandlerc.com/D1191

llvm-svn: 190773
2013-09-16 01:08:15 +00:00
Hal Finkel
5bb449bca0 PPC: Don't restrict lvsl generation to after type legalization
This is a re-commit of r190764, with an extra check to make sure that we're not
performing the transformation on illegal types (a small test case has been
added for this as well).

Original commit message:

The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.

llvm-svn: 190771
2013-09-15 22:09:58 +00:00
Benjamin Kramer
b6950f09cc Replace some unnecessary vector copies with references.
llvm-svn: 190770
2013-09-15 22:04:42 +00:00
Benjamin Kramer
e8d495c088 ELF: Add support for the exclude section bit for gas compat.
llvm-svn: 190769
2013-09-15 19:53:20 +00:00
David Majnemer
29e93ff017 MC: Add support for '?' flags in .section directives
Summary:
The '?' flag uses the last section group if the last had a section
group.  We treat combining an explicit section group and the '?' as a
hard error.

This fixes PR17198.

Reviewers: rafael, bkramer

Reviewed By: bkramer

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1686

llvm-svn: 190768
2013-09-15 19:24:16 +00:00
Kai Nacke
9c30babbf2 Fix alignment of unwind data.
For alignment purposes, the instruction array will always have an even
number of entries, with the final entry potentially unused (in which
case the array will be one longer than indicated by the count of unwind
codes field).

Reviewed by Anton Korobeynikov, Charles Davis and Nico Rieck.

llvm-svn: 190767
2013-09-15 18:01:09 +00:00
Kai Nacke
1fb6384f6d Generate IMAGE_REL_AMD64_ADDR32NB relocations for SEH
data structures.

The Win64 EH data structures must be of type IMAGE_REL_AMD64_ADDR32NB
instead of IMAGE_REL_AMD64_ADDR32. This is easiely achieved by adding
the VK_COFF_IMGREL32 modifier to the symbol reference.
Change also references to start and end of the SEH range of a function
as offsets to start of the function.

Reviewed by Jim Grosbach, Charles Davis and Nico Rieck.

llvm-svn: 190766
2013-09-15 17:46:46 +00:00
Hal Finkel
c45bfe85cc Revert r190764: PPC: Don't restrict lvsl generation to after type legalization
This is causing test-suite failures.

Original commit message:

The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.

llvm-svn: 190765
2013-09-15 15:41:11 +00:00
Hal Finkel
ae7feec56e PPC: Don't restrict lvsl generation to after type legalization
The PPC backend uses a target-specific DAG combine to turn unaligned Altivec
loads into a permutation-based sequence when possible. Unfortunately, the
target-specific DAG combine is not always called on all loads of interest
(sometimes the routines in DAGCombine call CombineTo such that the new node and
users are not added to the worklist); allowing the combine to trigger early
(before type legalization) mitigates this problem. Because the autovectorizers
only create legal vector types, I don't expect a lot of cases where this
optimization is enabled by type legalization in practice.

llvm-svn: 190764
2013-09-15 15:20:54 +00:00
Hal Finkel
fc7b3598ec Prevent assert in CombinerGlobalAA with null values
DAGCombiner::isAlias can be called with SrcValue1 or SrcValue2 null, and we
can't use AA in this case (if we try, then the casting code in AA will assert).

llvm-svn: 190763
2013-09-15 02:19:49 +00:00
Reed Kotler
0d8133f6fe Expand the mask capability for deciding which functions are mips16 and mips32
so it can be better used for general interoperability testing between mips32
and mips16.

llvm-svn: 190762
2013-09-15 02:09:08 +00:00
Benjamin Kramer
7df265741c Remove unused StringRef that no compiler warned about, I wonder why.
llvm-svn: 190759
2013-09-14 22:55:54 +00:00
Ben Langmuir
1023593e49 Add the remaining Intel SHA instructions
Also assembly/disassembly tests, and for sha256rnds2, aliases with an explicit
xmm0 dependency.

llvm-svn: 190754
2013-09-14 15:03:21 +00:00
Robert Wilhelm
0ba533b69c Fix spelling.
llvm-svn: 190750
2013-09-14 09:34:59 +00:00
Robert Wilhelm
78f2958d70 Fix spelling.
llvm-svn: 190749
2013-09-14 09:34:24 +00:00
Chandler Carruth
d47d52e219 Remove the long, long defunct IR block placement pass.
This pass was based on the previous (essentially unused) profiling
infrastructure and the assumption that by ordering the basic blocks at
the IR level in a particular way, the correct layout would happen in the
end. This sometimes worked, and mostly didn't. It also was a really
naive implementation of the classical paper that dates from when branch
predictors were primarily directional and when loop structure wasn't
commonly available. It also didn't factor into the equation
non-fallthrough branches and other machine level details.

Anyways, for all of these reasons and more, I wrote
MachineBlockPlacement, which completely supercedes this pass. It both
uses modern profile information infrastructure, and actually works. =]

llvm-svn: 190748
2013-09-14 09:28:14 +00:00
Zoran Jovanovic
2922e5d9d9 Fixed bug when generating Load Upper Immediate microMIPS instruction.
llvm-svn: 190746
2013-09-14 07:35:41 +00:00
Zoran Jovanovic
8aeb0d5244 Support for microMIPS DIV instructions.
llvm-svn: 190745
2013-09-14 07:15:21 +00:00
Zoran Jovanovic
579084a489 Support for misc microMIPS instructions.
llvm-svn: 190744
2013-09-14 06:49:25 +00:00
Filip Pizlo
9a441082e4 Make PrettyStackTraceEntry use ManagedStatic for its ThreadLocal.
This was somewhat tricky because ~PrettyStackTraceEntry() may run after 
llvm_shutdown() has been called. This is rare and only happens for a common idiom 
used in the main() functions of command-line tools. This works around the idiom by 
skipping the stack clean-up if the PrettyStackTraceHead ManagedStatic is not 
constructed (i.e. llvm_shutdown() has been called).

llvm-svn: 190730
2013-09-13 22:59:47 +00:00
Hal Finkel
ac7ef3f74f Add missing break statement in PPCISelLowering
As it turns out, not a problem in practice, but it should be there.

llvm-svn: 190720
2013-09-13 20:09:02 +00:00
Preston Gurd
0411803c14 Adds support for Atom Silvermont (SLM) - -march=slm
Implements Instruction scheduler latencies for Silvermont,
using latencies from the Intel Silvermont Optimization Guide.

Auto detects SLM.

Turns on post RA scheduler when generating code for SLM.

llvm-svn: 190717
2013-09-13 19:23:28 +00:00
Quentin Colombet
4b46915824 [Peephole] Rewrite copies to avoid cross register banks copies.
By definition copies across register banks are not coalescable. Still, it may be
possible to get rid of such a copy when the value is available in another
register of the same register file.
Consider the following example, where capital and lower letters denote different
register file:
b = copy A <-- cross-bank copy
...
C = copy b <-- cross-bank copy

This could have been optimized this way:
b = copy A  <-- cross-bank copy
...
C = copy A <-- same-bank copy

Note: b and C's definitions may be in different basic blocks.

This patch adds a peephole optimization that looks through a chain of copies
leading to a cross-bank copy and reuses a source that is on the same register
file if available.

This solution could also be used to get rid of some copies (e.g., A could have
been used instead of C). However, we do not do so because:
- It may over constrain the coloring of the source register for coalescing.
- The register allocator may not be able to find a nice split point for the
  longer live-range, leading to more spill.

<rdar://problem/14742333>

llvm-svn: 190713
2013-09-13 18:26:31 +00:00
Joey Gouly
0af412fe63 [ARMv8] Change hasV8Fp to hasFPARMv8, and other command line options
to be more consistent.

llvm-svn: 190692
2013-09-13 13:46:57 +00:00
Evgeniy Stepanov
47ad94a685 [msan] Add source file:line to stack origin reports.
Compiler part.

llvm-svn: 190689
2013-09-13 12:54:49 +00:00
Joey Gouly
2b0127dd73 [ARMv8] Emit the proper .fpu directive.
Patch by Bradley Smith!

llvm-svn: 190683
2013-09-13 11:51:52 +00:00
Zoran Jovanovic
be019657bd Test commit to verify that commit access works.
llvm-svn: 190676
2013-09-13 10:08:05 +00:00
Richard Sandiford
c0e0e27a84 [SystemZ] Use getTarget{Insert,Extract}Subreg rather than getMachineNode
Just a clean-up, no behavioral change intended.

llvm-svn: 190673
2013-09-13 09:12:44 +00:00
Richard Sandiford
30374b51cb [SystemZ] Try to fold shifts into TMxx
E.g. "SRL %r2, 2; TMLL %r2, 1" => "TMLL %r2, 4".

llvm-svn: 190672
2013-09-13 09:09:50 +00:00
Duncan Sands
5dbc902c8f Avoid a compiler warning about Found not being used when assertions are
disabled.

llvm-svn: 190668
2013-09-13 08:16:06 +00:00
Tim Northover
5eefec9e6d AArch64: use RegisterOperand for NEON registers.
Previously we modelled VPR128 and VPR64 as essentially identical
register-classes containing V0-V31 (which had Q0-Q31 as "sub_alias"
sub-registers). This model is starting to cause significant problems
for code generation, particularly writing EXTRACT/INSERT_SUBREG
patterns for converting between the two.

The change here switches to classifying VPR64 & VPR128 as
RegisterOperands, which are essentially aliases for RegisterClasses
with different parsing and printing behaviour. This fits almost
exactly with their real status (VPR128 == FPR128 printed strangely,
VPR64 == FPR64 printed strangely).

llvm-svn: 190665
2013-09-13 07:26:52 +00:00
Craig Topper
5c570ba6ec Move operator to end of previous line to match coding standards.
llvm-svn: 190659
2013-09-13 04:41:06 +00:00
Eric Christopher
89bc57b9bc Add initial support for handling gnu style pubnames accepted by some
versions of gold. This support is designed to allow gold to produce
gdb_index sections similar to the accelerator tables and consumable
by gdb.

llvm-svn: 190649
2013-09-13 00:35:05 +00:00
Eric Christopher
5ac23f0f83 Reformat and hoist section grabbing to top level.
llvm-svn: 190648
2013-09-13 00:34:58 +00:00
Vincent Lejeune
a130649ec4 R600: Move clamp handling code to R600IselLowering.cpp
llvm-svn: 190645
2013-09-12 23:45:00 +00:00
Vincent Lejeune
439c29a29d R600: Move code handling literal folding into R600ISelLowering.
llvm-svn: 190644
2013-09-12 23:44:53 +00:00
Vincent Lejeune
82c06999cd R600: Move fabs/fneg/sel folding logic into PostProcessIsel
This move makes possible to correctly handle multiples instructions
from a single pattern.

llvm-svn: 190643
2013-09-12 23:44:44 +00:00
Chandler Carruth
e62d2f2be7 Remove an unused variable, fixing -Werror build with latest Clang.
llvm-svn: 190640
2013-09-12 23:30:48 +00:00
Hal Finkel
4b3cfb4727 Fix PPC ABI for ByVal structs with vector members
When a structure is passed by value, and that structure contains a vector
member, according to the PPC ABI, the structure will receive enhanced alignment
(so that the vector within the structure will always be aligned).

This should resolve PR16641.

llvm-svn: 190636
2013-09-12 23:20:06 +00:00
Joe Abbey
6cc6a1b98f Patch provide by Tom Roeder!
Reviewed by Joe Abbey and Tobias Grosser

Here is a patch that fixes decoding of CE_SELECT in BitcodeReader,
along with a simple test case. The problem in the current code is that
it generates but doesn't accept bitcode that uses vectors for the
first element of a select in this context.

llvm-svn: 190634
2013-09-12 22:02:31 +00:00
Krzysztof Parzyszek
d3b8515190 In AliasSetTracker, do not change the alias set to "mod/ref" when adding
a volatile load, or a volatile store.

llvm-svn: 190631
2013-09-12 20:15:50 +00:00
Hal Finkel
47bfa9a072 Make the PPC fast-math sqrt expansion safe at 0
In fast-math mode sqrt(x) is calculated using the fast expansion of the
reciprocal of the reciprocal sqrt expansion. The reciprocal and reciprocal
sqrt expansions use the associated estimate instructions along with some Newton
iterations. Unfortunately, as a result, sqrt(0) was being calculated as NaN,
which is not correct. Now we explicitly return a result of zero if the input is
zero.

llvm-svn: 190624
2013-09-12 19:04:12 +00:00
Roman Divacky
582df28090 Implement asm support for a few PowerPC bookIII that are needed for assembling
FreeBSD kernel.

llvm-svn: 190618
2013-09-12 17:50:54 +00:00
Filip Pizlo
fc5fe0d12b This switches CrashRecoveryContext to using ManagedStatic for its global Mutex and
global ThreadLocals, thereby getting rid of the load-time initialization of those 
objects and also getting rid of their destruction unless the LLVM client calls 
llvm_shutdown.

llvm-svn: 190617
2013-09-12 17:46:57 +00:00
Ben Langmuir
9981cd7cfe Partial support for Intel SHA Extensions (sha1rnds4)
Add basic assembly/disassembly support for the first Intel SHA
instruction 'sha1rnds4'. Also includes feature flag, and test cases.

Support for the remaining instructions will follow in a separate patch.

llvm-svn: 190611
2013-09-12 15:51:31 +00:00
Hal Finkel
5ddc53b3ef Mark PPC MFTB and DST (and friends) as deprecated
Use the new instruction deprecation feature to mark mftb (now replaced with
mfspr) and dst (along with the other Altivec cache control instructions) as
deprecated when targeting cores supporting at least ISA v2.03.

llvm-svn: 190605
2013-09-12 14:40:06 +00:00
Elena Demikhovsky
f5a643e2c5 LLVM Interpreter: implementation of "insertvalue" and "extractvalue";
undef constatnt for structure and test for these functions.

done by Yuri Veselov (mailto:Yuri.Veselov@intel.com)

llvm-svn: 190599
2013-09-12 10:48:23 +00:00
Joey Gouly
fccb3bcae3 Add an instruction deprecation feature to TableGen.
The 'Deprecated' class allows you to specify a SubtargetFeature that the
instruction is deprecated on.

The 'ComplexDeprecationPredicate' class allows you to define a custom
predicate that is called to check for deprecation.
For example:
  ComplexDeprecationPredicate<"MCR">

would mean you would have to define the following function:
  bool getMCRDeprecationInfo(MCInst &MI, MCSubtargetInfo &STI,
                             std::string &Info)

Which returns 'false' for not deprecated, and 'true' for deprecated
and store the warning message in 'Info'.

The MCTargetAsmParser constructor was chaned to take an extra argument of
the MCInstrInfo class, so out-of-tree targets will need to be changed.

llvm-svn: 190598
2013-09-12 10:28:05 +00:00
Elena Demikhovsky
139f25ed2c AVX-512: implemented extractelement with variable index.
Added parsing of mask register and "zeroing" semantic, like {%k1} {z}.

llvm-svn: 190595
2013-09-12 08:55:00 +00:00
Hal Finkel
6164109851 PPC: Enable aggressive anti-dependency breaking
Aggressive anti-dependency breaking is enabled by default for all PPC cores.
This provides a general speedup on the P7 and other platforms (among other
factors, the instruction group formation for the non-embedded PPC cores is done
during post-RA scheduling). In order to do this safely, the incompatibility
between uses of the MFOCRF instruction and anti-dependency breaking are
resolved by marking MFOCRF with hasExtraSrcRegAllocReq. As noted in the removed
FIXME, the problem was that MFOCRF's output is sensitive to the identify of the
source register, and always paired with a shift to undo this effect. Because
anti-dependency breaking is unaware of this hidden dependency of the shift
amount on the source register of the MFOCRF instruction, changing that register
must be inhibited.

Two test cases were adjusted: The SjLj test was made more insensitive to
register choices and scheduling; the saveCR test disabled anti-dependency
breaking because part of what it is testing is proper register reuse.

llvm-svn: 190587
2013-09-12 05:24:49 +00:00
Hal Finkel
c56d8f90fd Fix crash in AggressiveAntiDepBreaker with empty CriticalPathSet
If no register classes are added to CriticalPathRCs, then the CriticalPathSet
bitmask will be empty. In that case, ExcludeRegs must remain NULL or else this
line will cause a segfault:

  } else if ((ExcludeRegs != NULL) && ExcludeRegs->test(AntiDepReg)) {

I have no in-tree test case.

llvm-svn: 190584
2013-09-12 04:22:31 +00:00
Tom Stellard
6a507da088 R600/SI: expose TBUFFER_STORE_FORMAT_* for OpenGL transform feedback
For _XYZ, the type of VDATA is v4i32, because v3i32 doesn't exist.

The ADDR64 bit is not exposed. A simpler intrinsic that doesn't take
a resource descriptor might be nicer.

The maximum number of input SGPRs is bumped to 17.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 190575
2013-09-12 02:55:14 +00:00
Tom Stellard
43a8b95d18 R600: Don't use trans slot for instructions that read LDS source registers
This fixes some regressions in the piglit local memory store tests
introduced by recent commits which made the scheduler aware of the trans
slot.

It's not possible to test this using lit, because there is no way to
determine from the assembly dumps whether or not an instruction is in
the trans slot.

Even if this were possible, the test would be highly sensitive to
changes in the scheduler and might generate confusing false negatives.

Reviewed-by: Vincent Lejeune<vljn at ovi.com>
llvm-svn: 190574
2013-09-12 02:55:06 +00:00
Matt Arsenault
cee048defc Move variable under condition where it is used
llvm-svn: 190567
2013-09-12 01:07:58 +00:00
Matt Arsenault
9acedee809 Remove pointless assertion after r190376
llvm-svn: 190565
2013-09-12 01:07:49 +00:00
Hal Finkel
2f2965c149 Greatly simplify the PPC A2 scheduling itinerary
As Andy pointed out to me a long time ago, there are no structural hazards in
the later pipeline stages of the A2, and so modeling them is useless. Also,
modeling the top pre-dispatch stages is deceiving because, when multiple
hardware threads are active, those resources are shared among the threads. The
bypass definitions were mostly wrong, and so those have been removed. The
resulting itinerary is much simpler, and more accurate.

llvm-svn: 190562
2013-09-11 23:25:21 +00:00
Hal Finkel
bad4087f4a Enable MI scheduling (and CodeGen AA) by default for embedded PPC cores
For embedded PPC cores (especially the A2 core), using the MI scheduler with AA
is far superior to the other scheduling options.

llvm-svn: 190558
2013-09-11 23:05:25 +00:00
Bill Wendling
11579cf46b Use the appropriate return type for the compact unwind encoding.
llvm-svn: 190551
2013-09-11 21:47:57 +00:00
Hal Finkel
87afacb632 Implement TTI getUnrollingPreferences for PowerPC
The PowerPC A2 core greatly benefits from aggressive concatenation unrolling;
use the new getUnrollingPreferences to enable this by default when targeting
the PPC A2 core.

llvm-svn: 190549
2013-09-11 21:20:40 +00:00
Bill Wendling
8bc1f7844e Move into an anonymous namespace and closer to where it's used.
llvm-svn: 190547
2013-09-11 20:38:09 +00:00
Manman Ren
1bfa6478ab Debug info: add more comments.
llvm-svn: 190544
2013-09-11 19:40:28 +00:00
Hal Finkel
fe9daed60a Add getUnrollingPreferences to TTI
Allow targets to customize the default behavior of the generic loop unrolling
transformation. This will be used by the PowerPC backend when targeting the A2
core (which is in-order with a deep pipeline), and using more aggressive
defaults is important.

llvm-svn: 190542
2013-09-11 19:25:43 +00:00
Manman Ren
c203a91e2e Debug Info: move class definition of DIRef.
Definition of DIRef used to require the full definition of DIType because
of usage of DIType::isType in DIRef::resolve. We now use DIDescriptor::isType
instead to remove the requirement and move definition of DIRef before DIType.

With this, we can move the definition of DIType::getContext to the header
file.

llvm-svn: 190540
2013-09-11 18:55:55 +00:00
Benjamin Kramer
23f676e21d Revert "Give internal classes hidden visibility."
It works with clang, but GCC has different rules so we can't make all of those
hidden. This reverts commit r190534.

llvm-svn: 190536
2013-09-11 18:05:11 +00:00
Benjamin Kramer
386bd314a1 Give internal classes hidden visibility.
Worth 100k on a linux/x86_64 Release+Asserts clang.

llvm-svn: 190534
2013-09-11 17:42:27 +00:00
Benjamin Kramer
d2041b1a7c Don't expose symbols of lle_ functions.
+ formatting fixes.

llvm-svn: 190523
2013-09-11 12:42:39 +00:00
Daniel Sanders
e3f2e5de18 [mips][msa] Added support for matching mulv, nlzc, sll, sra, srl, and subv from normal IR (i.e. not intrinsics)
llvm-svn: 190518
2013-09-11 11:58:30 +00:00
Daniel Sanders
96466ff8b4 [mips][msa] Added support for matching fadd, fdiv, flog2, fmul, frint, fsqrt, and fsub from normal IR (i.e. not intrinsics)
llvm-svn: 190512
2013-09-11 10:51:30 +00:00
Benjamin Kramer
10412fd112 Path: Add an in-place version of path::native.
This reflects the common use case of nativizing a prepared path. The existing
version invokes undefined behavior if input = output, add an assert to catch
that case.

llvm-svn: 190510
2013-09-11 10:45:21 +00:00