1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

17156 Commits

Author SHA1 Message Date
Jim Grosbach
2f665f1cd7 AArch64: Constant fold converting vector setcc results to float.
Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the code is generated as:
  fcmeq.4s  v0, v0, v1
  movi.4s v1, #0x1        // Integer splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  scvtf.4s  v0, v0        // Convert each lane to f32.
  ret

After, the code is improved to:
  fcmeq.4s  v0, v0, v1
  fmov.4s v1, #1.00000000 // f32 splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  ret

The svvtf.4s has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via fmov.4s.

Rather than do the folding manually in the target code, teach getNode()
in the generic SelectionDAG to handle folding constant operands of
vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do
additional constant folding there as well, but I don't have test cases
for those operations, so leaving them for another time when it becomes
appropriate.

rdar://17693791

llvm-svn: 213341
2014-07-18 00:40:52 +00:00
Michael J. Spencer
ff351b2bdb Revert "[x86] Fold extract_vector_elt of a load into the Load's address computation."
There's a bug where this can create cycles in the DAG. It will take a bit
to fix, so I'm backing it out for now.

llvm-svn: 213339
2014-07-18 00:15:50 +00:00
Tim Northover
21a41cb9a1 CodeGen: generate single libcall for fptrunc -> f16 operations.
Previously we asserted on this code. Currently compiler-rt doesn't
actually implement any of these new libcalls, but external help is
pretty much the only viable option for LLVM.

I've followed the much more generic "__truncST2" naming, as opposed to
the odd name for f32 -> f16 truncation. This can obviously be changed
later, or overridden by any targets that need to.

llvm-svn: 213252
2014-07-17 11:12:12 +00:00
Tim Northover
eae1f1c8cc CodeGen: extend f16 conversions to permit types > float.
This makes the two intrinsics @llvm.convert.from.f16 and
@llvm.convert.to.f16 accept types other than simple "float". This is
only strictly needed for the truncate operation, since otherwise
double rounding occurs and there's no way to represent the strict IEEE
conversion. However, for symmetry we allow larger types in the extend
too.

During legalization, we can expand an "fp16_to_double" operation into
two extends for convenience, but abort when the truncate isn't legal. A new
libcall is probably needed here.

Even after this commit, various target tweaks are needed to actually use the
extended intrinsics. I've put these into separate commits for clarity, so there
are no actual tests of f64 conversion here.

llvm-svn: 213248
2014-07-17 10:51:23 +00:00
Sanjay Patel
5537765676 Fixed formatting, removed bug reference, renamed testcase
Thanks to Duncan Exon Smith for reviewing and cleanup suggestions. 

llvm-svn: 213205
2014-07-16 22:40:28 +00:00
Juergen Ributzka
63d2af65d4 [FastISel] Local values shouldn't be alive across an inline asm call with side effects.
This fixes an issue where a local value is defined before and used after an
inline asm call with side effects.

This fix simply flushes the local value map, which updates the insertion point
for the inline asm call to be above any previously defined local values.

This fixes <rdar://problem/17694203>

llvm-svn: 213203
2014-07-16 22:20:51 +00:00
Sanjay Patel
8befe236c4 trivial fix for PR20314
Make sure that the AddrInst is an Instruction.

llvm-svn: 213197
2014-07-16 21:08:10 +00:00
Chris Bieneman
8124ab9a24 [RegisterCoalescer] Moving the RegisterCoalescer subtarget hook onto the TargetRegisterInfo instead of the TargetSubtargetInfo.
llvm-svn: 213188
2014-07-16 20:13:31 +00:00
Tim Northover
c6c02a43ba CodeGen: don't form illegail EXTLOAD operations.
It turns out that in most cases (the main exception being i1-related
types) once these operations are formed we cannot separate them and
the targets end up having to deal with them whether they want to or
not.

This is not a good situation, and a more reasonable default can be
formed by ackowledging this and having targets leave them as Legal.
Only x86 seems to be affected (other targets don't even try marking
the operation Expand).

Mostly there's no visible change here yet, but it will be useful to
have truly expanded EXTLOADS for MVT::f16 softening support.

llvm-svn: 213162
2014-07-16 15:37:24 +00:00
Juergen Ributzka
51acdaca09 Remove TLI from isInTailCallPosition's arguments. NFC.
There is no need to pass on TLI separately to the function. As Eric pointed out
the Target Machine already provides everything we need.

llvm-svn: 213108
2014-07-16 00:01:22 +00:00
Sanjay Patel
2f0f025b2b Move Post RA Scheduling flag bit into SchedMachineModel
Refactoring; no functional changes intended

    Removed PostRAScheduler bits from subtargets (X86, ARM).
    Added PostRAScheduler bit to MCSchedModel class.
    This bit is set by a CPU's scheduling model (if it exists).
    Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses.
    Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!).
    Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling.
    Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values.
    Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: 
       a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. 
       b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. 
       c. PPC overrides the CPU's postRA settings by enabling postRA for everything. 
       d. X86 is the only target that actually has postRA specified via sched model info.

Differential Revision: http://reviews.llvm.org/D4217

llvm-svn: 213101
2014-07-15 22:39:58 +00:00
Chris Bieneman
d1b660f0a6 [RegisterCoalescer] Add new subtarget hook allowing targets to opt-out of coalescing.
The coalescer is very aggressive at propagating constraints on the register classes, and the register allocator doesn’t know how to split sub-registers later to recover. This patch provides an escape valve for targets that encounter this problem to limit coalescing.

This patch also implements such for ARM to lower register pressure when using lots of large register classes. This works around PR18825.

llvm-svn: 213078
2014-07-15 17:18:41 +00:00
Andrea Di Biagio
454620d57b [DAGCombiner] Add more rules to fold shuffles.
This patch adds two new rules to the DAGCombiner:
 1.  shuffle (shuffle A, Undef, M0), B, M1 -> shuffle A, B, M2
 2.  shuffle (shuffle A, Undef, M0), A, M1 -> shuffle A, Undef, M2

We only do this if the combined shuffle is legal for the target.

Example:
;;
define <4 x float> @test(<4 x float> %a, <4 x float> %b) {
  %1 = shufflevector <4 x float> %a, <4 x float> undef, <4 x i32><i32 6, i32 0, i32 1, i32 7>
  %2 = shufflevector <4 x float> %1, <4 x float> %b, <4 x i32><i32 1, i32 2, i32 4, i32 5>
  ret <4 x i32> %2
}
;;

(using llc -mcpu=corei7 -march=x86-64)
Before, the x86 backend generated:
  pshufd $120, %xmm0, %xmm0
  shufps $-108, %xmm0, %xmm1
  movaps %xmm1, %xmm0

Now the x86 backend generates:
  movsd %xmm1, %xmm0

llvm-svn: 213069
2014-07-15 13:26:28 +00:00
Juergen Ributzka
bf808a1bcc [FastISel] Insert patchpoint instruction before the target generated call instruction.
The patchpoint instruction should have been inserted before the target
generated call instruction to be inside the ADJSTACKDOWN/ADJSTACKUP call
sequence window.

llvm-svn: 213034
2014-07-15 02:22:46 +00:00
Juergen Ributzka
23901e5103 [FastISel] Fix patchpoint lowering to set the result register.
Always update the value map with the result register (if there is one), for the
patchpoint instruction we created to replace the target-specific call
instruction.

llvm-svn: 213033
2014-07-15 02:22:43 +00:00
Andrea Di Biagio
167e00fc99 [DAGCombiner] Avoid calling method 'isShuffleMaskLegal' on illegal vector types.
This patch fixes a crasher in method 'DAGCombiner::visitOR' due to an invalid
call to method 'isShuffleMaskLegal'. On x86, method 'isShuffleMaskLegal'
always expects a legal vector value type in input.

With this patch, we immediately check if the input OR dag node has a legal
vector type; we only try to fold a OR dag node into a single shufflevector
if we know that the resulting shuffle will have a legal type.
This is to avoid calling method 'isShuffleMaskLegal' on a potentially
illegal vector value type.

Added a new test-case to file 'CodeGen/X86/combine-or.ll' to verify that
DAGCombiner doesn't crash in the attempt to check/combine an OR between shuffles
with illegal types.

llvm-svn: 213020
2014-07-15 00:02:32 +00:00
David Majnemer
94c981273e CodeGen: Stick constant pool entries in COMDAT sections for WinCOFF
COFF lacks a feature that other object file formats support: mergeable
sections.

To work around this, MSVC sticks constant pool entries in special COMDAT
sections so that each constant is in it's own section.  This permits
unused constants to be dropped and it also allows duplicate constants in
different translation units to get merged together.

This fixes PR20262.

Differential Revision: http://reviews.llvm.org/D4482

llvm-svn: 213006
2014-07-14 22:57:27 +00:00
Andrea Di Biagio
1b83284869 [DAGCombiner] Add more rules to combine shuffle vector dag nodes.
This patch teaches the DAGCombiner how to fold a pair of shuffles
according to rules:
  1.  shuffle(shuffle A, B, M0), B, M1) -> shuffle(A, B, M2)
  2.  shuffle(shuffle A, B, M0), A, M1) -> shuffle(A, B, M3)

The new rules would only trigger if the resulting shuffle has legal type and
legal mask.

Added test 'combine-vec-shuffle-3.ll' to verify that DAGCombiner correctly
folds shuffles on x86 when the resulting mask is legal. Also added some negative
cases to verify that we avoid introducing illegal shuffles.

llvm-svn: 213001
2014-07-14 22:46:26 +00:00
David Majnemer
8551d97f34 CodeGen: Add a getSectionKind method to MachineConstantPoolEntry
This is just a helper routine, no functionality has changed.

llvm-svn: 212993
2014-07-14 22:06:29 +00:00
Bill Wendling
1a86df990e Unify the lowering of arguments during SjLj prepare.
The 'select true, %arg, undef' instruction can be used for both aggregate and
non-aggregate arguments.

llvm-svn: 212967
2014-07-14 18:21:11 +00:00
Sanjay Patel
2433bcf766 fixed typo
llvm-svn: 212966
2014-07-14 18:21:07 +00:00
Saleem Abdulrasool
55055040ec CodeGen: add missing include
Found during windows unwinding work.  This header is indirectly included through
a chain leading through Support/Win64EH.h.  Explicitly include the header.  NFC.

llvm-svn: 212955
2014-07-14 16:28:09 +00:00
Bill Wendling
93c9860cb7 Support lowering of empty aggregates.
This crash was pretty common while compiling Rust for iOS (armv7). Reason -
SjLj preparation step was lowering aggregate arguments as ExtractValue +
InsertValue. ExtractValue has assertion which checks that there is some data in
value, which is not true in case of empty (no fields) structures. Rust uses
them quite extensively so this patch uses a 'select true, %val, undef'
instruction to lower the argument.

Patch by Valerii Hiora.

llvm-svn: 212922
2014-07-14 06:22:36 +00:00
Andrea Di Biagio
4242d12adc [DAGCombiner] Fix a crash caused by a missing check for legal type when trying to fold shuffles.
Verify that DAGCombiner does not crash when trying to fold a pair of shuffles
according to rule (added at r212539):
  (shuffle (shuffle A, Undef, M0), Undef, M1) -> (shuffle A, Undef, M2)

The DAGCombiner avoids folding shuffles if the resulting shuffle dag node
is not legal for the target. That means, the resulting shuffle must have
legal type and legal mask.

Before, the DAGCombiner only called method
'TargetLowering::isShuffleMaskLegal' to check if it was "safe" to fold according
to the above-mentioned rule. However, this caused a crash in the x86 backend
since method 'isShuffleMaskLegal' always expects to be called on a
legal vector type.

llvm-svn: 212915
2014-07-13 21:02:14 +00:00
Matt Arsenault
3d717281d0 Templatify DominanceFrontier.
Theoretically this should now work for MachineBasicBlocks.

llvm-svn: 212885
2014-07-12 21:59:52 +00:00
Reid Kleckner
aea3b689b7 Avoid a warning from MSVC on "*/" in this code by inserting a space
llvm-svn: 212862
2014-07-12 00:06:46 +00:00
Juergen Ributzka
a073ea35a3 [FastISel] Add target-independent patchpoint intrinsic support. WIP.
This implements the target-independent lowering for the patchpoint
intrinsic. Targets have to implement the FastLowerCall
hook to support this intrinsic.

Related to <rdar://problem/17427052>

llvm-svn: 212849
2014-07-11 22:19:02 +00:00
Juergen Ributzka
d2becf0ab9 [FastISel] Add basic infrastructure to support a target-independent call lowering hook in FastISel. WIP
The infrastructure mimics the call lowering we have already in place for
SelectionDAG, but with limitations. For example structure return demotion and
non-simple types are not supported (yet).

Currently every backend has its own implementation and duplicated code for call
lowering. There is also no specified interface that could be called from
target-independent code. The target-hook is opt-in and doesn't affect current
implementations.

llvm-svn: 212848
2014-07-11 22:01:42 +00:00
Juergen Ributzka
17e8cfdcf1 [FastISel] Make isInTailCallPosition independent of SelectionDAG.
Break out the arguemnts required from SelectionDAG, so that this function can
also be used by FastISel.

llvm-svn: 212844
2014-07-11 20:50:47 +00:00
Juergen Ributzka
1e7aababed [FastISel] Breakout intrinsic lowering into a separate function and add a target-hook.
Create a separate helper function for target-independent intrinsic lowering. Also
add an target-hook that allows to directly call into a target-sepcific intrinsic
lowering method. Currently the implementation is opt-in and doesn't affect
existing target implementations.

llvm-svn: 212843
2014-07-11 20:42:12 +00:00
Oliver Stannard
6090374181 ARM: Allow __fp16 as a function arg or return type for AArch64
ACLE 2.0 allows __fp16 to be used as a function argument or return
type. This enables this for AArch64.

llvm-svn: 212812
2014-07-11 13:33:46 +00:00
David Blaikie
cc8306ee32 Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.""
This reverts commit r212776.

Nope, still seems to be failing on the sanitizer bots... but hey, not
the msan self-host anymore, it's failing in asan now. I'll start looking
there next.

llvm-svn: 212793
2014-07-11 02:42:57 +00:00
David Blaikie
eef9def8a0 Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."
Committed in r212205 and reverted in r212226 due to msan self-hosting
failure, I believe I've got that fixed by r212761 to Clang.

Original commit message:

"Originally committed in r211723, reverted in r211724 due to failure
cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065),
committed again in r212085 and reverted again in r212089 after fixing
some other cases, such as debug info subprogram lists not keeping track
of the function they represent (r212128) and then short-circuiting
things like LiveDebugVariables that build LexicalScopes for functions
that might not have full debug info.

And again, I believe the invariant actually holds for some reasonable
amount of code (but I'll keep an eye on the buildbots and see what
happens... ).

Original commit message:

PR20038: DebugInfo: Inlined call sites where the caller has debug info
but the call itself has no debug location.

This situation does bad things when inlined, so I've fixed Clang not to
produce inlinable call sites without locations when the caller has debug
info (in the one case where I could find that this occurred). This
updates the PR20038 test case to be what clang now produces, and readds
the assertion that had to be removed due to this bug.

I've also beefed up the debug info verifier to help diagnose these
issues in the future, and I hope to add checks to the inliner to just
assert-fail if it encounters this situation. If, in the future, we
decide we have to cope with this situation, the right thing to do is
probably to just remove all the DebugLocs from the inlined
instructions."

llvm-svn: 212776
2014-07-10 22:59:39 +00:00
Jan Vesely
4866999f98 SelectionDAG: Factor FP_TO_SINT lower code out of DAGLegalizer
Move the code to a helper function to allow calls from TypeLegalizer.

No functionality change intended

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>
Reviewed-by: Owen Anderson <resistor@mac.com>
llvm-svn: 212772
2014-07-10 22:40:18 +00:00
Matt Arsenault
49a604853d Revert "Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine.""
Don't try to convert the select condition type.

llvm-svn: 212750
2014-07-10 18:21:04 +00:00
Andrea Di Biagio
29af13b774 [DAG] Further improve the logic in DAGCombiner that folds a pair of shuffles into a single shuffle if the resulting mask is legal.
This patch teaches the DAGCombiner how to fold shuffles according to the
following new rules:
  1. shuffle(shuffle(x, y), undef) -> x
  2. shuffle(shuffle(x, y), undef) -> y
  3. shuffle(shuffle(x, y), undef) -> shuffle(x, undef)
  4. shuffle(shuffle(x, y), undef) -> shuffle(y, undef)

The backend avoids to combine shuffles according to rules 3. and 4. if
the resulting shuffle does not have a legal mask. This is to avoid introducing
illegal shuffles that are potentially expanded into a sub-optimal sequence of
target specific dag nodes during vector legalization.

Added test case combine-vec-shuffle-2.ll to verify that we correctly triggers
the new rules when combining shuffles.

llvm-svn: 212748
2014-07-10 18:04:55 +00:00
Chandler Carruth
b55565f54f [x86,SDAG] Introduce any- and sign-extend-vector-inreg nodes analogous
to the zero-extend-vector-inreg node introduced previously for the same
purpose: manage the type legalization of widened extend operations,
especially to support the experimental widening mode for x86.

I'm adding both because sign-extend is expanded in terms of any-extend
with shifts to propagate the sign bit. This removes the last
fundamental scalarization from vec_cast2.ll (a test case that hit many
really bad edge cases for widening legalization), although the trunc
tests in that file still appear scalarized because the the shuffle
legalization is scalarizing. Funny thing, I've been working on that.

Some initial experiments with this and SSE2 scenarios is showing
moderately good behavior already for sign extension. Still some work to
do on the shuffle combining on X86 before we're generating optimal
sequences, but avoiding scalarization is a huge step forward.

llvm-svn: 212714
2014-07-10 12:32:32 +00:00
NAKAMURA Takumi
ea850a0edc Revert r212640, "Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine."
This caused miscompilation on, at least, x86-64. SExt(i1 cond) confused other optimizations.

llvm-svn: 212708
2014-07-10 11:37:28 +00:00
Daniel Sanders
a59d10a761 Make it possible for ints/floats to return different values from getBooleanContents()
Summary:
On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer
comparisons return 0 or 1.

Updated the various uses of getBooleanContents. Two simplifications had to be
disabled when float and int boolean contents differ:
- ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially
  discoverable (i.e. when the condition of the VSELECT is a SETCC node).
- visitVSELECT (select C, 0, 1) -> (xor C, 1).
  Come to think of it, this one could test for the common case of 'C'
  being a SETCC too.

Preserved existing behaviour for all other targets and updated the affected
MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low'
variable was counting in the wrong direction because it thought it could simply
add the result of the comparison.

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D4389

llvm-svn: 212697
2014-07-10 10:18:12 +00:00
Hao Liu
cb23c5a35e [AArch64]Fix an assertion failure in DAG Combiner about concating 2 build_vector.
llvm-svn: 212677
2014-07-10 03:41:50 +00:00
Chandler Carruth
86e8e81be5 [SDAG] Make the new zext-vector-inreg node default to expand so targets
don't need to set it manually.

This is based on feedback from Tom who pointed out that if every target
needs to handle this we need to reach out to those maintainers. In fact,
it doesn't make sense to duplicate everything when anything other than
expand seems unlikely at this stage.

llvm-svn: 212661
2014-07-09 22:53:04 +00:00
David Blaikie
4464e33224 Recommit r212203: Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information.
Reverted by Eric Christopher (Thanks!) in r212203 after Bob Wilson
reported LTO issues. Duncan Exon Smith and Aditya Nandakumar helped
provide a reduced reproduction, though the failure wasn't too hard to
guess, and even easier with the example to confirm.

The assertion that the subprogram metadata associated with an
llvm::Function matches the scope data referenced by the DbgLocs on the
instructions in that function is not valid under LTO. In LTO, a C++
inline function might exist in multiple CUs and the subprogram metadata
nodes will refer to the same llvm::Function. In this case, depending on
the order of the CUs, the first intance of the subprogram metadata may
not be the one referenced by the instructions in that function and the
assertion will fail.

A test case (test/DebugInfo/cross-cu-linkonce-distinct.ll) is added, the
assertion removed and a comment added to explain this situation.

Original commit message:

If a function isn't actually in a CU's subprogram list in the debug info
metadata, ignore all the DebugLocs and don't try to build scopes, track
variables, etc.

While this is possibly a minor optimization, it's also a correctness fix
for an incoming patch that will add assertions to LexicalScopes and the
debug info verifier to ensure that all scope chains lead to debug info
for the current function.

Fix up a few test cases that had broken/incomplete debug info that could
violate this constraint.

Add a test case where this occurs by design (inlining a
debug-info-having function in an attribute nodebug function - we want
this to work because /if/ the nodebug function is then inlined into a
debug-info-having function, it should be fine (and will work fine - we
just stitch the scopes up as usual), but should the inlining not happen
we need to not assert fail either).

llvm-svn: 212649
2014-07-09 21:02:41 +00:00
Matt Arsenault
479a1f90e1 Add trunc (select c, a, b) -> select c (trunc a), (trunc b) combine.
Do this if the truncate is free and the select is legal.

llvm-svn: 212640
2014-07-09 19:12:07 +00:00
Chandler Carruth
646382f7ef [x86] Fix a bug in my new zext-vector-inreg DAG trickery where we were
not widening the input type to the node sufficiently to let the ext take
place in a register.

This would in turn result in a mysterious bitcast assertion failure
downstream. First change here is to add back the helpful assert I had in
an earlier version of the code to catch this immediately.

Next change is to add support to the type legalization to detect when we
have widened the operand either too little or too much (for whatever
reason) and find a size-matched legal vector type to convert it to
first. This can also fail so we get a new fallback path, but that seems
OK.

With this, we no longer crash on vec_cast2.ll when using widening. I've
also added the CHECK lines for the zero-extend cases here. We still need
to support sign-extend and trunc (or something) to get plausible code
for the other two thirds of this test which is one of the regression
tests that showed the most scalarization when widening was
force-enabled. Slowly closing in on widening being a viable legalization
strategy without it resorting to scalarization at every turn. =]

llvm-svn: 212614
2014-07-09 12:36:54 +00:00
Chandler Carruth
c03f949cd5 Sink two variables only used in an assert into the assert itself. Should
fix the release builds with Werror.

llvm-svn: 212612
2014-07-09 11:13:16 +00:00
Chandler Carruth
eb00ef26a3 [x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening
vector types to be legal and a ZERO_EXTEND node is encountered.

When we use widening to legalize vector types, extend nodes are a real
challenge. Either the input or output is likely to be legal, but in many
cases not both. As a consequence, we don't really have any way to
represent this situation and the prior code in the widening legalization
framework would just scalarize the extend operation completely.

This patch introduces a new DAG node to represent doing a zero extend of
a vector "in register". The core of the idea is to allow legal but
different vector types in the input and output. The output vector must
have fewer lanes but wider elements. The operation is defined to zero
extend the low elements of the input to the size of the output elements,
and drop all of the high elements which don't have a corresponding lane
in the output vector.

It also includes generic expansion of this node in terms of blending
a zero vector into the high elements of the vector and bitcasting
across. This in turn yields extremely nice code for x86 SSE2 when we use
the new widening legalization logic in conjunction with the new shuffle
lowering logic.

There is still more to do here. We need to support sign extension, any
extension, and potentially int-to-float conversions. My current plan is
to continue using similar synthetic nodes to model each of these
transitions with generic lowering code for each one.

However, with this patch LLVM already reaches performance parity with
GCC for the core C loops of the x264 code (assuming you disable the
hand-written assembly versions) when compiling for SSE2 and SSE3
architectures and enabling the new widening and lowering logic for
vectors.

Differential Revision: http://reviews.llvm.org/D4405

llvm-svn: 212610
2014-07-09 10:58:18 +00:00
Chandler Carruth
14de70ee01 [SDAG] At the suggestion of Hal, switch to an output parameter that
tracks which elements of the build vector are in fact undef.

This should make actually inpsecting them (likely in my next patch)
reasonably pretty. Also makes the output parameter optional as it is
clear now that *most* users are happy with undefs in their splats.

llvm-svn: 212581
2014-07-09 00:41:34 +00:00
Andrea Di Biagio
ac6b7b5b82 [DAG] Teach how to combine a pair of shuffles into a single shuffle if the resulting mask is legal.
This patch teaches how to fold a shuffle according to rule:
  shuffle (shuffle (x, undef, M0), undef, M1) -> shuffle(x, undef, M2)

We do this only if the resulting mask M2 is legal; this is to avoid introducing
illegal shuffles that are potentially expanded into a sub-optimal sequence
of target specific dag nodes.

This patch has the advantage of being target independent, since it works on ISD
nodes. Therefore, all targets (not only x86) can take advantage of this rule.
The idea behind this patch is that most shuffle pairs can be safely combined
before we run the legalizer on vector operations. This allows us to
combine/simplify dag nodes earlier in the process and not only immediately
before instruction selection stage.

That said. This patch is not meant to replace any existing target specific
combine rules; backends might still introduce new shuffles during legalization
stage. Also, this rule is very simple and avoids to aggressively optimize
shuffles.

llvm-svn: 212539
2014-07-08 15:22:29 +00:00
Benjamin Kramer
0ad234df3b Fix some Twine locals.
Two of those are use after frees. Found by clang-tidy, fixed by me.

llvm-svn: 212537
2014-07-08 14:55:06 +00:00
Chandler Carruth
be294c3274 [x86,SDAG] Sink the logic for folding shuffles of splats more
aggressively from the x86 shuffle lowering to the generic SDAG vector
shuffle formation code.

This code already tried to fold away shuffles of splats! It just had
lots of bugs and couldn't handle the case my new x86 shuffle lowering
needed.

First, it failed to correctly compute whether N2 was undef because it
pre-computed this, then did transformations which could *make* N2 undef,
then failed to ever re-consider the precomputed state.

Second, it didn't look through bitcasts at all, even in the safe cases
where they are just element-type bitcasts with no change to the number
of elements.

Third, it didn't handle all-zero bit casts nicely the way my code in the
x86 side of things did, which is essential to getting good zext-shuffle
lowerings.

But all of these are generic. I just ported the code down to this layer
and fixed the surrounding bugs. Tests exercising this in the x86 backend
still pass and some silly code in widen_cast-6.ll gets better. I updated
that test to be a bit more precise but it's still pretty unclear what
the value of the test is in this day and age.

llvm-svn: 212517
2014-07-08 08:45:38 +00:00
Chandler Carruth
b2bc863a82 [SDAG] Actually check for a non-constant splat and clarify comments
around the handling of UNDEF lanes in boolean vector content analysis.

The code before my changes here also failed to check for non-constant
splats in a buildvector. I have no idea how to trigger this, I just
spotted by inspection when trying to understand the code. It seems
extremely unlikely to be worth the trouble to teach the only caller of
this code (DAG combining setcc patterns) how to cleverly handle undef
lanes, so I've just commented more thoroughly that we're giving up
there.

llvm-svn: 212515
2014-07-08 07:44:15 +00:00
Chandler Carruth
0764373bcc [SDAG] Build up a more rich set of APIs for querying build-vector SDAG
nodes about whether they are splats. This is factored out and improved
from r212324 which got reverted as it was far too aggressive. The new
API should help more conservatively handle buildvectors that are
a mixture of splatted and undef values.

No functionality change at this point. The hope is to slowly
re-introduce the undef-tolerant optimization of splats, but each time
being forced to make a concious decision about how to handle the undefs
in a way that doesn't lead to contradicting assumptions about the
collapsed value.

Hal has pointed out in discussions that this may not end up being the
desired API and instead it may be more convenient to get a mask of the
undef elements or something similar. I'm starting simple and will expand
the API as I adapt actual callers and see exactly what they need.

llvm-svn: 212514
2014-07-08 07:19:55 +00:00
Chandler Carruth
9fd1035842 [x86] Revert r212324 which was too aggressive w.r.t. allowing undef
lanes in vector splats.

The core problem here is that undef lanes can't *unilaterally* be
considered to contribute to splats. Their handling needs to be more
cautious. There is also a reported failure of the nightly testers
(thanks Tobias!) that may well stem from the same core issue. I'm going
to fix this theoretical issue, factor the APIs a bit better, and then
verify that I don't see anything bad with Tobias's reduction from the
test suite before recommitting.

Original commit message for r212324:
  [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for
  any constant, constant FP, or undef splat and to tolerate any undef
  lanes in a splat, then replace all uses of isSplatVector in X86's
  lowering with it.

  This fixes issues where undef lanes in an otherwise splat vector would
  prevent the splat logic from firing. It is a touch more awkward to use
  this interface, but it is much more accurate. Suggestions for better
  interface structuring welcome.

  With this fix, the code generated with the widening legalization
  strategy for widen_cast-4.ll is *dramatically* improved as the special
  lowering strategies for a v16i8 SRA kick in even though the high lanes
  are undef.

  We also get a slightly different choice for broadcasting an aligned
  memory location, and use vpshufd instead of vbroadcastss. This looks
  like a minor win for pipelining and domain crossing, but a minor loss
  for the number of micro-ops. I suspect its a wash, but folks can
  easily tweak the lowering if they want.

llvm-svn: 212475
2014-07-07 19:03:32 +00:00
Benjamin Kramer
195f0552f0 Make helper functions static.
llvm-svn: 212460
2014-07-07 14:47:51 +00:00
Tim Northover
96e2bce418 CodeGen: it turns out that NAND is not the same thing as BIC. At all.
We've been performing the wrong operation on ARM for "atomicrmw nand" for
years, since "a NAND b" is "~(a & b)" rather than ARM's very tempting "a & ~b".
This bled over into the generic expansion pass.

So I assume no-one has ever actually tried to do an atomic nand in the real
world. Oh well.

llvm-svn: 212443
2014-07-07 09:06:35 +00:00
Chandler Carruth
ab7167839a [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for
any constant, constant FP, or undef splat and to tolerate any undef
lanes in a splat, then replace all uses of isSplatVector in X86's
lowering with it.

This fixes issues where undef lanes in an otherwise splat vector would
prevent the splat logic from firing. It is a touch more awkward to use
this interface, but it is much more accurate. Suggestions for better
interface structuring welcome.

With this fix, the code generated with the widening legalization
strategy for widen_cast-4.ll is *dramatically* improved as the special
lowering strategies for a v16i8 SRA kick in even though the high lanes
are undef.

We also get a slightly different choice for broadcasting an aligned
memory location, and use vpshufd instead of vbroadcastss. This looks
like a minor win for pipelining and domain crossing, but a minor loss
for the number of micro-ops. I suspect its a wash, but folks can easily
tweak the lowering if they want.

llvm-svn: 212324
2014-07-04 08:11:49 +00:00
Eric Christopher
6861e06059 Move function dependent resetting of a subtarget variable out of the
subtarget. This involved having the movt predicate take the current
function - since we care about size in instruction selection for
whether or not to use movw/movt take the function so we can check
the attributes. This required adding the current MachineFunction to
FastISel and propagating through.

llvm-svn: 212309
2014-07-04 01:55:26 +00:00
Eric Christopher
33b8321c4c Temporarily revert "Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information." as it appears to be breaking some LTO constructs.
This reverts commit r212203.

llvm-svn: 212298
2014-07-03 22:24:54 +00:00
Sanjay Patel
bbb5460c77 bug fix for PR20020: anti-dependency-breaker causes miscompilation
This patch sets the 'KeepReg' bit for any tied and live registers during the PrescanInstruction() phase of the dependency breaking algorithm. It then checks those 'KeepReg' bits during the ScanInstruction() phase to avoid changing any tied registers. For more details, please see comments in:
http://llvm.org/bugs/show_bug.cgi?id=20020

I added two FIXME comments for code that I think can be removed by using register iterators that include self. I don't want to include those code changes with this patch, however, to keep things as small as possible.

The test case is larger than I'd like, but I don't know how to reduce it further and still produce the failing asm.

Differential Revision: http://reviews.llvm.org/D4351

llvm-svn: 212275
2014-07-03 15:19:40 +00:00
Ulrich Weigand
7f6ccb0182 Fix ppcf128 component access on little-endian systems
The PowerPC 128-bit long double data type (ppcf128 in LLVM) is in fact a
pair of two doubles, where one is considered the "high" or
more-significant part, and the other is considered the "low" or
less-significant part.  When a ppcf128 value is stored in memory or a
register pair, the high part always comes first, i.e. at the lower
memory address or in the lower-numbered register, and the low part
always comes second.  This is true both on big-endian and little-endian
PowerPC systems.  (Similar to how with a complex number, the real part
always comes first and the imaginary part second, no matter the byte
order of the system.)

This was implemented incorrectly for little-endian systems in LLVM.
This commit fixes three related issues:

- When printing an immediate ppcf128 constant to assembler output
  in emitGlobalConstantFP, emit the high part first on both big-
  and little-endian systems.

- When lowering a ppcf128 type to a pair of f64 types in SelectionDAG
  (which is used e.g. when generating code to load an argument into a
  register pair), use correct low/high part ordering on little-endian
  systems.

- In a related issue, because lowering ppcf128 into a pair of f64 must
  operate differently from lowering an int128 into a pair of i64,
  bitcasts between ppcf128 and int128 must not be optimized away by the
  DAG combiner on little-endian systems, but must effect a word-swap.

Reviewed by Hal Finkel.

llvm-svn: 212274
2014-07-03 15:06:47 +00:00
Chandler Carruth
d8bdeffa9e [x86] Fix the completely broken vector widening legalization of bswap.
This operation was classified as a binary operation in the widening
logic for some reason (clearly, untested). It is in fact a unary
operation. Add a RUN line to a test to exercise this for x86.

Note that again the vector widening strategy doesn't regress anything
and in one case removes a totally unecessary instruction that we
couldn't avoid when promoting the element type.

llvm-svn: 212257
2014-07-03 07:04:38 +00:00
Chandler Carruth
fc0fe5064b [codegen,aarch64] Add a target hook to the code generator to control
vector type legalization strategies in a more fine grained manner, and
change the legalization of several v1iN types and v1f32 to be widening
rather than scalarization on AArch64.

This fixes an assertion failure caused by scalarizing nodes like "v1i32
trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32.

This also provides a foundation for other targets to have more granular
control over how vector types are legalized.

Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow
some work to start taking place on top of this patch as it adds some
really important hooks to the backend that I'd like to immediately start
using. =]

http://reviews.llvm.org/D4322

llvm-svn: 212242
2014-07-03 00:23:43 +00:00
David Blaikie
a46a55b3e8 Revert "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."
This reverts commit r212205.

Reverting this again, still seeing crashes when building compiler-rt...
Sorry for the continued noise, not sure why I'm failing to reproduce
this locally.

llvm-svn: 212226
2014-07-02 21:42:28 +00:00
David Blaikie
0cd27a9633 DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.
Originally committed in r211723, reverted in r211724 due to failure
cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065),
committed again in r212085 and reverted again in r212089 after fixing
some other cases, such as debug info subprogram lists not keeping track
of the function they represent (r212128) and then short-circuiting
things like LiveDebugVariables that build LexicalScopes for functions
that might not have full debug info.

And again, I believe the invariant actually holds for some reasonable
amount of code (but I'll keep an eye on the buildbots and see what
happens... ).

Original commit message:

PR20038: DebugInfo: Inlined call sites where the caller has debug info
but the call itself has no debug location.

This situation does bad things when inlined, so I've fixed Clang not to
produce inlinable call sites without locations when the caller has debug
info (in the one case where I could find that this occurred). This
updates the PR20038 test case to be what clang now produces, and readds
the assertion that had to be removed due to this bug.

I've also beefed up the debug info verifier to help diagnose these
issues in the future, and I hope to add checks to the inliner to just
assert-fail if it encounters this situation. If, in the future, we
decide we have to cope with this situation, the right thing to do is
probably to just remove all the DebugLocs from the inlined instructions.

llvm-svn: 212205
2014-07-02 18:32:05 +00:00
Quentin Colombet
fcd6cc7327 [RegAllocGreedy] Provide a subtarget hook to disable the local reassignment
heuristic.
By default, no functionality change.
This is a follow-up of r212099.

This hook provides a finer grain to control the optimization.

<rdar://problem/17444599>

llvm-svn: 212204
2014-07-02 18:32:04 +00:00
David Blaikie
f63c5eb709 Don't try to construct debug LexicalScopes hierarchy for functions that do not have top level debug information.
If a function isn't actually in a CU's subprogram list in the debug info
metadata, ignore all the DebugLocs and don't try to build scopes, track
variables, etc.

While this is possibly a minor optimization, it's also a correctness fix
for an incoming patch that will add assertions to LexicalScopes and the
debug info verifier to ensure that all scope chains lead to debug info
for the current function.

Fix up a few test cases that had broken/incomplete debug info that could
violate this constraint.

Add a test case where this occurs by design (inlining a
debug-info-having function in an attribute nodebug function - we want
this to work because /if/ the nodebug function is then inlined into a
debug-info-having function, it should be fine (and will work fine - we
just stitch the scopes up as usual), but should the inlining not happen
we need to not assert fail either).

llvm-svn: 212203
2014-07-02 18:31:35 +00:00
Chad Rosier
f00a3b6176 Revert "Revert "MachineScheduler: better book-keeping for asserts.""
This reverts commit r212109, which reverted r212088.

However, disable the assert as it's not necessary for correctness.  There are
several corner cases that the assert needed to handle better for in-order
scheduling, but none of them are incorrect scheduler behavior. The assert is
mainly there to collect good unit tests like this and ensure that the
target-independent scheduler is working as expected with the various machine
models.

llvm-svn: 212187
2014-07-02 16:46:08 +00:00
Matt Arsenault
4dc8768d57 Fix missing const
llvm-svn: 212168
2014-07-02 06:45:26 +00:00
Chandler Carruth
d3bf1762bd [cleanup] Hoist an if-else chain on ISD opcodes (really designed for
switches) into a switch, and sink them into a dispatch function that can
return the result rather than awkward variable setting with breaks.

llvm-svn: 212166
2014-07-02 06:23:34 +00:00
Chandler Carruth
456bcaba88 [cleanup] Remove dead 'break;' statements that I meant to nuke in
r212158 but missed.

Thanks to Craig for spotting the goof!

llvm-svn: 212159
2014-07-02 04:39:34 +00:00
Chandler Carruth
f80e626928 [cleanup] Hoist the promotion dispatch logic into the promote function
so that we can use return to express it more cleanly and avoid so many
nested switch statements.

llvm-svn: 212158
2014-07-02 03:07:15 +00:00
Chandler Carruth
fbcb8e0cd1 [cleanup] Nuke the 'VectorOp' bit of the promote method names.
This doesn't add any information for methods in the VectorLegalizer
class that clearly take SDAG operations to legalize.

llvm-svn: 212157
2014-07-02 03:07:11 +00:00
Chandler Carruth
233ad52d55 [x86] Clean up and modernize the doxygen and API comments for the vector
operation legalization code.

llvm-svn: 212155
2014-07-02 02:16:57 +00:00
Juergen Ributzka
ce04dda8eb [FastISel] Factor out stackmap intrinsic selection code into a dedicated helper method. NFCI.
llvm-svn: 212140
2014-07-01 22:25:49 +00:00
Juergen Ributzka
241f08ad4e [DAG] Pass the argument list to the CallLoweringInfo via move semantics. NFCI.
The argument list vector is never used after it has been passed to the
CallLoweringInfo and moving it to the CallLoweringInfo is cleaner and
pretty much as cheap as keeping a pointer to it.

llvm-svn: 212135
2014-07-01 22:01:54 +00:00
Alp Toker
62b822cb61 Move remaining LLVM_ENABLE_DUMP conditionals out of the headers
This macro is sometimes defined manually but isn't (and doesn't need to be) in
llvm-config.h so shouldn't appear in the headers, likewise NDEBUG.

Instead switch them over to LLVM_DUMP_METHOD on the definitions.

llvm-svn: 212130
2014-07-01 21:19:13 +00:00
Chad Rosier
92846e46b5 Revert "MachineScheduler: better book-keeping for asserts."
This reverts commit r212088, which is causing a number of spec
failures.  Will provide reduced test cases shortly.
PR20057

llvm-svn: 212109
2014-07-01 17:23:11 +00:00
Quentin Colombet
f03a637a3d [PeepholeOptimzer] Fix a typo in a comment.
Spotted by Amara Emerson.

llvm-svn: 212106
2014-07-01 16:23:44 +00:00
Quentin Colombet
f8c6acefcd [PeepholeOptimizer] Advanced rewriting of copies to avoid cross register banks
copies.

This patch extends the peephole optimization introduced in r190713 to produce
register-coalescer friendly copies when possible.

This extension taught the existing cross-bank copy optimization how to deal
with the instructions that generate cross-bank copies, i.e., insert_subreg,
extract_subreg, reg_sequence, and subreg_to_reg.
E.g.
b = insert_subreg e, A, sub0 <-- cross-bank copy
...
C = copy b.sub0 <-- cross-bank copy

Would produce the following code:
b = insert_subreg e, A, sub0 <-- cross-bank copy
...
C = copy A <-- same-bank copy

This patch also introduces a new helper class for that: ValueTracker.
This class implements the logic to look through the copy related instructions
and get the related source.

For now, the advanced rewriting is disabled by default as we are lacking the
semantic on target specific instructions to catch the motivating examples.

Related to <rdar://problem/12702965>.

llvm-svn: 212100
2014-07-01 14:33:36 +00:00
Quentin Colombet
d98d96fa61 [RegAllocGreedy] Provide a flag to disable the local reassignment heuristic.
By default, no functionality change.

Before evicting a local variable, this heuristic tries to find another (set of)
local(s) that can be reassigned to a free color.

In some extreme cases (large basic blocks with tons of local variables), the
compilation time is dominated by the local interference checks that this
heuristic must perform, with no code gen gain.
E.g., the motivating example takes 4 minutes to compile with this heuristic, 12
seconds without.

Improving the situation will likely require to make drastic changes to the
register allocator and/or the interference check framework.

For now, provide this flag to better understand the impact of that heuristic.

<rdar://problem/17444599>

llvm-svn: 212099
2014-07-01 14:08:37 +00:00
David Blaikie
d26237a1a8 Revert "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."
This reverts commit r212085.

This breaks the sanitizer bot... & I thought I'd tried pretty hard not
to do that. Guess I need to try harder.

llvm-svn: 212089
2014-07-01 04:11:45 +00:00
Andrew Trick
449b2dbbfb MachineScheduler: better book-keeping for asserts.
Fixes another test case under PR20057.

llvm-svn: 212088
2014-07-01 03:23:13 +00:00
David Blaikie
0a2228ab5d DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.
Originally committed in r211723, reverted in r211724 due to failure
cases found and fixed (ArgumentPromotion: r211872, Inlining: r212065),
and I now believe the invariant actually holds for some reasonable
amount of code (but I'll keep an eye on the buildbots and see what
happens... ).

Original commit message:

PR20038: DebugInfo: Inlined call sites where the caller has debug info
but the call itself has no debug location.

This situation does bad things when inlined, so I've fixed Clang not to
produce inlinable call sites without locations when the caller has debug
info (in the one case where I could find that this occurred). This
updates the PR20038 test case to be what clang now produces, and readds
the assertion that had to be removed due to this bug.

I've also beefed up the debug info verifier to help diagnose these
issues in the future, and I hope to add checks to the inliner to just
assert-fail if it encounters this situation. If, in the future, we
decide we have to cope with this situation, the right thing to do is
probably to just remove all the DebugLocs from the inlined instructions.

llvm-svn: 212085
2014-07-01 03:11:59 +00:00
Alp Toker
000fb20af5 Fix 'platform-specific' hyphenations
llvm-svn: 212056
2014-06-30 18:57:16 +00:00
Saleem Abdulrasool
499f3c1213 CodeGen: rename Win64 ExceptionHandling to WinEH
This exception format is not specific to Windows x64.  A similar approach is
taken on nearly all architectures.  Generalise the name to reflect reality.
This will eventually be used for Windows on ARM data emission as well.

Switch the enum and namespace into an enum class.

llvm-svn: 212000
2014-06-29 21:43:47 +00:00
Saleem Abdulrasool
d45105fbb4 MC: rename EmitWin64EH routines
Rename the routines to reflect the reality that they are more related to call
frame information than to Win64 EH. Although EH is implemented in an intertwined
manner by augmenting with an exception handler and an associated parameter, the
majority of these routines emit information required to unwind the frames. This
also helps identify that these routines are generic for most windows platforms
(they apply equally to nearly all architectures except x86) although the
encoding of the information is architecture dependent.

Unwinding data is emitted via EmitWinCFI* and exception handling information via
EmitWinEH*.

llvm-svn: 211994
2014-06-29 01:52:01 +00:00
Craig Topper
4c15d35f50 Add ops() method to SDNode that returns an ArrayRef<SDUse>. Use it to simplify some code.
llvm-svn: 211993
2014-06-29 00:40:57 +00:00
Chad Rosier
98a58b0e56 [AArch64] Fix memset ICE when memset value is f128.
llvm-svn: 211960
2014-06-27 21:05:09 +00:00
David Majnemer
abf7854d05 IR: Add COMDATs to the IR
This new IR facility allows us to represent the object-file semantic of
a COMDAT group.

COMDATs allow us to tie together sections and make the inclusion of one
dependent on another. This is required to implement features like MS
ABI VFTables and optimizing away certain kinds of initialization in C++.

This functionality is only representable in COFF and ELF, Mach-O has no
similar mechanism.

Differential Revision: http://reviews.llvm.org/D4178

llvm-svn: 211920
2014-06-27 18:19:56 +00:00
David Blaikie
d6c9082735 Revert "Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."""
Reverting this again, didn't mean to commit it - while r211872 fixes one
of the issues here, there are still others to figure out and address.

This reverts commit r211871.

llvm-svn: 211873
2014-06-27 05:34:05 +00:00
David Blaikie
a4e2150a25 Revert "Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.""
This reverts commit r211724.

llvm-svn: 211871
2014-06-27 05:31:49 +00:00
Andrew Trick
424d4c474f Left out the NDEBUG in the previous checkin.
llvm-svn: 211867
2014-06-27 05:09:36 +00:00
Andrew Trick
a8cb3163c0 MachineScheduler: add some book-keeping to fix an assert.
Fixe for Bug 20057 - Assertion failied in llvm::SUnit* llvm::SchedBoundary::pickOnlyChoice(): Assertion `i <= (HazardRec->getMaxLookAhead() + MaxObservedStall) && "permanent hazard"'

Thanks to Chad for the test case.

llvm-svn: 211865
2014-06-27 04:57:05 +00:00
Juergen Ributzka
4e8c3d809f [StackMaps] Enable patchpoint liveness analysis per default.
llvm-svn: 211817
2014-06-26 23:39:52 +00:00
Juergen Ributzka
f24160ef4c [Stackmaps] Remove the liveness calculation for stackmap intrinsics.
There is no need to calculate the liveness information for stackmaps. The
liveness information is still available for the patchpoint intrinsic and
that is also the intended usage model.

Related to <rdar://problem/17473725>

llvm-svn: 211816
2014-06-26 23:39:44 +00:00
Alp Toker
97022b0c1f Revert "Introduce a string_ostream string builder facilty"
Temporarily back out commits r211749, r211752 and r211754.

llvm-svn: 211814
2014-06-26 22:52:05 +00:00
Alp Toker
fd9ead3b6f Introduce a string_ostream string builder facilty
string_ostream is a safe and efficient string builder that combines opaque
stack storage with a built-in ostream interface.

small_string_ostream<bytes> additionally permits an explicit stack storage size
other than the default 128 bytes to be provided. Beyond that, storage is
transferred to the heap.

This convenient class can be used in most places an
std::string+raw_string_ostream pair or SmallString<>+raw_svector_ostream pair
would previously have been used, in order to guarantee consistent access
without byte truncation.

The patch also converts much of LLVM to use the new facility. These changes
include several probable bug fixes for truncated output, a programming error
that's no longer possible with the new interface.

llvm-svn: 211749
2014-06-26 00:00:48 +00:00
Eric Christopher
ce17f76f37 The includes were sorted. Revert r210578.
llvm-svn: 211737
2014-06-25 22:36:37 +00:00
David Blaikie
ac824331fd Revert "PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location."
This reverts commit r211723.

Breaks the ASan/compiler-rt build... guess I didn't test very far at all
:/.

llvm-svn: 211724
2014-06-25 18:20:54 +00:00
David Blaikie
59345c0fdc PR20038: DebugInfo: Inlined call sites where the caller has debug info but the call itself has no debug location.
This situation does bad things when inlined, so I've fixed Clang not to
produce inlinable call sites without locations when the caller has debug
info (in the one case where I could find that this occurred). This
updates the PR20038 test case to be what clang now produces, and readds
the assertion that had to be removed due to this bug.

I've also beefed up the debug info verifier to help diagnose these
issues in the future, and I hope to add checks to the inliner to just
assert-fail if it encounters this situation. If, in the future, we
decide we have to cope with this situation, the right thing to do is
probably to just remove all the DebugLocs from the inlined instructions.

llvm-svn: 211723
2014-06-25 18:03:10 +00:00
NAKAMURA Takumi
c5a2c81f7e Re-apply r211399, "Generate native unwind info on Win64" with a fix to ignore SEH pseudo ops in X86 JIT emitter.
--
This patch enables LLVM to emit Win64-native unwind info rather than
DWARF CFI.  It handles all corner cases (I hope), including stack
realignment.

Because the unwind info is not flexible enough to describe stack frames
with a gap of unknown size in the middle, such as the one caused by
stack realignment, I modified register spilling code to place all spills
into the fixed frame slots, so that they can be accessed relative to the
frame pointer.

Patch by Vadim Chugunov!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4081

llvm-svn: 211691
2014-06-25 12:41:52 +00:00
NAKAMURA Takumi
35a44c8eda Reformat.
llvm-svn: 211689
2014-06-25 12:40:56 +00:00
Rafael Espindola
c0fea93ce8 Print a=b as an assignment.
In assembly the expression a=b is parsed as an assignment, so it should be
printed as one.

This remove a truly horrible hack for producing a label with "a=.". It would
be used by codegen but would never be reached by the asm parser. Sorry I
missed this when it was first committed.

llvm-svn: 211639
2014-06-24 22:45:16 +00:00
Sanjay Patel
def1964051 fixed a few typos in comments
llvm-svn: 211634
2014-06-24 21:11:51 +00:00
David Majnemer
02a115bee2 CodeGen: Avoid multiple strlen calls
Use a StringRef to hold our section prefix.  This avoids multiple calls
to strlen.

llvm-svn: 211602
2014-06-24 16:01:53 +00:00
Kevin Qin
22b89e0ae0 [AArch64] Fix a build_vector pattern match fail
caused by defect in isBuildVectorAllZeros().

llvm-svn: 211567
2014-06-24 05:37:27 +00:00
Rafael Espindola
1c8d56c257 Remove a temporary hack.
Amusingly this survived a lot longer than the CFI transition. We don't even
support non-cfi assemblers any more.

llvm-svn: 211498
2014-06-23 14:22:55 +00:00
NAKAMURA Takumi
eca89a0522 Revert r211399, "Generate native unwind info on Win64"
It broke Legacy JIT Tests on x86_64-{mingw32|msvc}, aka Windows x64.

llvm-svn: 211480
2014-06-22 22:00:56 +00:00
Benjamin Kramer
8edc3ff6a3 Legalizer: Add support for splitting insert_subvectors.
We handle this by spilling the whole thing to the stack and doing the
insertion as a store.

PR19492. This happens in real code because the vectorizer creates v2i128 when AVX is enabled.

llvm-svn: 211435
2014-06-21 12:56:42 +00:00
Richard Trieu
b7d5af56cb Add back functionality removed in r210497.
Instead of asserting, output a message stating that a null pointer was found.

llvm-svn: 211430
2014-06-21 02:43:02 +00:00
Reid Kleckner
40d9c6f936 Generate native unwind info on Win64
This patch enables LLVM to emit Win64-native unwind info rather than
DWARF CFI.  It handles all corner cases (I hope), including stack
realignment.

Because the unwind info is not flexible enough to describe stack frames
with a gap of unknown size in the middle, such as the one caused by
stack realignment, I modified register spilling code to place all spills
into the fixed frame slots, so that they can be accessed relative to the
frame pointer.

Patch by Vadim Chugunov!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D4081

llvm-svn: 211399
2014-06-20 20:35:47 +00:00
Rafael Espindola
cc59644b5c Allow a target to create a null streamer.
Targets can assume that a target streamer is present, so they have to be able
to construct a null streamer in order to set the target streamer in it to.

Fixes a crash when using the null streamer with arm.

llvm-svn: 211358
2014-06-20 13:11:28 +00:00
Yaron Keren
e1172cbc66 The count() function for STL datatypes returns unsigned, even where it's
only 1/0 result like std::set. Some of the LLVM ADT already return unsigned
count(), while others still return bool count().

In continuation to r197879, this patch modifies DenseMap, DenseSet, 
ScopedHashTable, ValueMap:: count() to return size_type instead of bool,
1 instead of true and 0 instead of false.

size_type is typedef-ed locally within each class to size_t.

http://reviews.llvm.org/D4018

Reviewed by dblaikie.

llvm-svn: 211350
2014-06-20 10:26:56 +00:00
Karthik Bhat
fb8456f1af Add Support to Recognize and Vectorize NON SIMD instructions in SLPVectorizer.
This patch adds support to recognize patterns such as fadd,fsub,fadd,fsub.../add,sub,add,sub... and
vectorizes them as vector shuffles if they are profitable.
These patterns of vector shuffle can later be converted to instructions such as addsubpd etc on X86.
Thanks to Arnold and Hal for the reviews. http://reviews.llvm.org/D4015 

llvm-svn: 211339
2014-06-20 04:32:48 +00:00
Eric Christopher
3dac9feafc Add a new subtarget hook for whether or not we'd like to enable
the atomic load linked expander pass to run for a particular
subtarget. This requires a check of the subtarget and so save
the TargetMachine rather than only TargetLoweringInfo and update
all callers.

llvm-svn: 211314
2014-06-19 21:03:04 +00:00
David Blaikie
c27f5e5414 DebugInfo: Fission: Ensure the address pool entries for location lists are emitted.
The address pool was being emitted before location lists. The latter
could add more entries to the pool which would be lost/never emitted.

llvm-svn: 211284
2014-06-19 17:59:14 +00:00
Jingyue Wu
52b8eafe4c [ValueTracking] Extend range metadata to call/invoke
Summary:
With this patch, range metadata can be added to call/invoke including
IntrinsicInst. Previously, it could only be added to load.

Rename computeKnownBitsLoad to computeKnownBitsFromRangeMetadata because
range metadata is not only used by load.

Update the language reference to reflect this change.

Test Plan:
Add several tests in range-2.ll to confirm the verifier is happy with
having range metadata on call/invoke.

Add two tests in AddOverFlow.ll to confirm annotating range metadata to
call/invoke can benefit InstCombine.

Reviewers: meheff, nlewycky, reames, hfinkel, eliben

Reviewed By: eliben

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4187

llvm-svn: 211281
2014-06-19 16:50:16 +00:00
Oliver Stannard
0f096f23f1 Emit DWARF3 call frame information when DWARF3+ debug info is requested
Currently, llvm always emits a DWARF CIE with a version of 1, even when emitting
DWARF 3 or 4, which both support CIE version 3. This patch makes it emit the
newer CIE version when we are emitting DWARF 3 or 4. This will not reduce
compatibility, as we already emit other DWARF3/4 features, and is worth doing as
the DWARF3 spec removed some ambiguities in the interpretation of call frame
information.

It also fixes a minor bug where the "return address" field of the CIE was
encoded as a ULEB128, which is only valid when the CIE version is 3. There are
no test changes for this, because (as far as I can tell) none of the platforms
that we test have a return address register with a DWARF register number >127.

llvm-svn: 211272
2014-06-19 15:39:33 +00:00
Eric Christopher
c246cb10ff Move -dwarf-version to an MC level command line option so it's
used by all of the MC level tools and codegen. Fix up all uses
in the compiler to use this and set it on the context accordingly.

llvm-svn: 211257
2014-06-19 06:22:08 +00:00
Eric Christopher
4be7eb299a Remove unnecessary include.
llvm-svn: 211256
2014-06-19 06:22:05 +00:00
Tim Northover
ee6cb87672 DAG: move sret demotion into most basic LowerCallTo implementation.
It looks like there are two versions of LowerCallTo here: the
SelectionDAGBuilder one is designed to operate on LLVM IR, and the
TargetLowering one in the case where everything is at DAG level.

Previously, only the SelectionDAGBuilder variant could handle demoting
an impossible return to sret semantics (before delegating to the
TargetLowering version), but this functionality is also useful for
certain libcalls (e.g. 128-bit operations on 32-bit x86).  So this
commit moves the sret handling down a level.

rdar://problem/17242889

llvm-svn: 211155
2014-06-18 11:52:44 +00:00
Tom Stellard
6987d06184 SelectionDAG: Expand i64 = FP_TO_SINT i32
llvm-svn: 211108
2014-06-17 16:53:07 +00:00
David Blaikie
1043ced2ca PR20038: DebugInfo missing DIEs for some concrete variables.
I haven't nailed this down entirely, but this is about as small of a
test case as I can seem to construct and adequately demonstrates the
crasher. I'll continue investigating the root cause/fix(es).

llvm-svn: 210993
2014-06-15 19:34:26 +00:00
Tim Northover
7dd495fd0e LegalizeDAG: make sure cast is unsigned before using FP_TO_UINT.
It's valid to use FP_TO_SINT when asking for a smaller type (e.g. all
"unsigned int16" values fit into a "signed int32"), but the reverse
isn't true.

Unfortunately, I'm not actually aware of any architecture with
asymmetric FP_TO_SINT and FP_TO_UINT handling and the logic happens to
work in the symmetric case, so I can't actually write a test for this.

llvm-svn: 210986
2014-06-15 09:27:20 +00:00
David Blaikie
53324d9a53 DebugInfo: Remove some extra handling of abstract variables and instead rely solely on the delayed handling introduced in r210946
Now that we handle finding abstract variables at the end of the module,
remove the upfront handling and just ensure the abstract variable is
built when necessary.

In theory we could have a split implementation, where inlined variables
are immediately constructed referencing the abstract definition, and
concrete variables are delayed - but let's go with one solution for now
unless there's a reason not to.

llvm-svn: 210961
2014-06-13 23:52:55 +00:00
Jiangning Liu
c7a7a14ca1 Move GlobalMerge from Transform to CodeGen.
This patch is to move GlobalMerge pass from Transform/Scalar                                                           
to CodeGen, because GlobalMerge depends on TargetMachine.
In the mean time, the macro INITIALIZE_TM_PASS is also moved
to CodeGen/Passes.h. With this fix we can avoid making
libScalarOpts depend on libCodeGen.

llvm-svn: 210951
2014-06-13 22:57:59 +00:00
Eric Christopher
2ea2870f36 The hazard recognizer only needs a subtarget, not a target machine
so make it take one. Fix up all users accordingly.

llvm-svn: 210948
2014-06-13 22:38:52 +00:00
David Blaikie
a6798763a0 DebugInfo: Reference abstract definitions from variables in concrete definitions that preceed their first inline definition.
Rather than relying on abstract variables looked up at the time the
concrete variable is created, look them up at the end of the module to
ensure they're referenced even if they're created after the concrete
definition. This completes/matches the work done in r209677 to handle
this for the subprograms themselves.

llvm-svn: 210946
2014-06-13 22:35:44 +00:00
David Blaikie
da5335e0bc DwarfDebug::getExistingAbstractVariable: constify an existing reference parameter that didn't need to be mutated.
llvm-svn: 210944
2014-06-13 22:29:31 +00:00
David Blaikie
33f6d6743b DebugInfo: Following up to r209677, refactor local variable emission to delay the choice between emitting the definition attributes or using DW_AT_abstract_definition
This doesn't fix the abstract variable handling yet, but it introduces a
similar delay mechanism as was added for subprograms, causing
DW_AT_location to be reordered to the beginning of the attribute list
for local variables, and fixes all the test fallout for that.

A subsequent commit will remove the abstract variable handling in
DbgVariable and just do the abstract variable lookup at module end to
ensure that abstract variables introduced after their concrete
counterparts are appropriately referenced by the concrete variable.

llvm-svn: 210943
2014-06-13 22:18:23 +00:00
Tim Northover
ad404d23ea Atomics: make use of the "cmpxchg weak" instruction.
This also simplifies the IR we create slightly: instead of working out
where success & failure should go manually, it turns out we can just
always jump to a success/failure block created for the purpose. Later
phases will sort out the mess without much difficulty.

llvm-svn: 210917
2014-06-13 16:45:52 +00:00
Tim Northover
6aacd97b06 Atomics: switch direction of cmpxchg comparison
This has two benefits: it makes the result more suitable for direct
insertaion into the struct to emulate the new cmpxchg, and it means
the name we give the instruction matches its actual effect better.

llvm-svn: 210916
2014-06-13 16:45:36 +00:00
Tim Northover
b9ec29d7c5 IR: add "cmpxchg weak" variant to support permitted failure.
This commit adds a weak variant of the cmpxchg operation, as described
in C++11. A cmpxchg instruction with this modifier is permitted to
fail to store, even if the comparison indicated it should.

As a result, cmpxchg instructions must return a flag indicating
success in addition to their original iN value loaded. Thus, for
uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The
second flag is 1 when the store succeeded.

At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
added as the natural representation for the new cmpxchg instructions.
It is a strong cmpxchg.

By default this gets Expanded to the existing ATOMIC_CMP_SWAP during
Legalization, so existing backends should see no change in behaviour.
If they wish to deal with the enhanced node instead, they can call
setOperationAction on it. Beware: as a node with 2 results, it cannot
be selected from TableGen.

Currently, no use is made of the extra information provided in this
patch. Test updates are almost entirely adapting the input IR to the
new scheme.

Summary for out of tree users:
------------------------------

+ Legacy Bitcode files are upgraded during read.
+ Legacy assembly IR files will be invalid.
+ Front-ends must adapt to different type for "cmpxchg".
+ Backends should be unaffected by default.

llvm-svn: 210903
2014-06-13 14:24:07 +00:00
Juergen Ributzka
be48c6b01a [FastISel][X86] - Add branch weights
Add branch weights to branch instructions, so that the following passes can
optimize based on it (i.e. basic block ordering).

llvm-svn: 210863
2014-06-13 00:45:11 +00:00
Juergen Ributzka
9dda2c5782 [FastISel][X86] Add MachineMemOperand to load/store instructions.
This commit adds MachineMemOperands to load and store instructions. This allows
the peephole optimizer to fold load instructions. Unfortunatelly the peephole
optimizer currently doesn't run at -O0.

llvm-svn: 210858
2014-06-12 23:27:57 +00:00
Andrew Trick
e967f1bca1 Fix the scheduler's MaxObservedStall computation.
WenHan Gu pointed out this bug that results in an assert
not being effective in some cases.

llvm-svn: 210846
2014-06-12 22:36:28 +00:00
Tom Stellard
5f887c9493 Revert "SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, y)) for vectors"
This reverts commit r210540, adds a testcase for the regression it
caused, and marks the R600 test it was supposed to fix as XFAIL.

llvm-svn: 210792
2014-06-12 16:04:47 +00:00
Juergen Ributzka
3c469924f6 [FastISel] Add support for the stackmap intrinsic.
This implements target-independent FastISel lowering for the stackmap intrinsic.

llvm-svn: 210742
2014-06-12 03:29:26 +00:00
Eric Christopher
67a4d6642a Revert r210613 to conform to coding standards.
Thanks Duncan for noticing.

llvm-svn: 210662
2014-06-11 16:59:33 +00:00
Jiangning Liu
a490614c0a Create macro INITIALIZE_TM_PASS.
Pass initialization requires to initialize TargetMachine for back-end
specific passes. This commit creates a new macro INITIALIZE_TM_PASS to
simplify this kind of initialization.

llvm-svn: 210641
2014-06-11 07:04:37 +00:00
Saleem Abdulrasool
9563db9981 CodeGen: refactor DwarfException
DwarfException served as a base class for exception handling directive emission.
However, this is also used by other exception models (e.g. Win64EH).  Rename
this class to EHStreamer and split it out of DwarfException.h.  NFC.

Use the opportunity to fix up some of the documentation comments to match
current LLVM style.  Also rename some functions to conform better with current
LLVM coding style.

llvm-svn: 210622
2014-06-11 01:19:03 +00:00
Eric Christopher
c3e6a0952e Sort includes.
llvm-svn: 210613
2014-06-11 00:25:16 +00:00
Eric Christopher
47d13b572b Have isInTailCallPosition take the DAG so that we can use the
version of TargetLowering/Machine from there on the way to avoiding
TargetMachine in TargetLowering.

llvm-svn: 210579
2014-06-10 20:39:38 +00:00
Eric Christopher
cdc9402ed1 Reorder includes to be sorted.
llvm-svn: 210578
2014-06-10 20:39:35 +00:00
Eric Christopher
56f9d2e5d3 Fix typos.
llvm-svn: 210571
2014-06-10 20:07:29 +00:00
Juergen Ributzka
de3f16cdd3 [FastISel] Collect statistics about failing intrinsic calls.
Add more instruction-specific statistics about failing intrinsic calls during
FastISel.

llvm-svn: 210556
2014-06-10 18:17:00 +00:00
Tom Stellard
ad2d29f10e SelectionDAG: Don't use MVT::Other to determine legality of ISD::SELECT_CC
The SelectionDAG bad a special case for ISD::SELECT_CC, where it would
allow targets to specify:

setOperationAction(ISD::SELECT_CC, MVT::Other, Expand);

to indicate that they wanted to expand ISD::SELECT_CC for all types.
This wasn't applied correctly everywhere, and it makes writing new
DAG patterns with ISD::SELECT_CC difficult.

llvm-svn: 210541
2014-06-10 16:01:29 +00:00
Tom Stellard
ba1210e7ac SelectionDAG: Enable (and (setcc x), (setcc y)) -> (setcc (and x, y)) for vectors
This prevents a future commit from regressing:

test/CodeGen/R600/setcc-equivalent.ll

llvm-svn: 210540
2014-06-10 16:01:25 +00:00
Tom Stellard
aab1db4cd9 SelectionDAG: Expand SELECT_CC to SELECT + SETCC
This consolidates code from the Hexagon, R600, and XCore targets.

No functionality change intended.

llvm-svn: 210539
2014-06-10 16:01:22 +00:00
Richard Trieu
8c7b353cd7 Removing an "if (!this)" check from two print methods. The condition will
never be true in a well-defined context.  The checking for null pointers
has been moved into the caller logic so it does not rely on undefined behavior.

llvm-svn: 210497
2014-06-09 22:53:16 +00:00
Alexey Samsonov
e52f04f5e6 Generate better location ranges for some register-described variables.
Don't terminate location ranges for register-described variables
at the end of machine basic block if this register is never modified
in the function body, except for the prologue and epilogue. Prologue
location is guessed by FrameSetup flags on MachineInstructions, while
epilogue location is deduced from debug locations of instructions
in the basic blocks ending with return instructions.

This patch is mostly targeted to fix non-trivial debug locations for
variables addressed via stack and frame pointers.

It is not really a generic fix. We can still produce poor debug info
for register-described variables if this register *is* modified somewhere
in the function, but in unrelated places. This might be the case for the debug
info in optimized binaries (e.g. for local variables in inlined functions).
LiveDebugVariables pass in CodeGen attempts to fix this problem by adjusting
DBG_VALUE instructions, but this pass is tied to greedy register allocator,
which is used in optimized builds only. Proper fix would likely involve
generalizing LiveDebugVariables to all register allocators. See more discussion
in http://reviews.llvm.org/D3933 review thread.

I'm proceeding with this patch to fix immediate severe problems and
important cases, e.g. fix completely broken debug info with AddressSanitizer
and fix PR19307 (missing debug info for by-value std::string arguments).

llvm-svn: 210492
2014-06-09 21:53:47 +00:00
Andrea Di Biagio
23548cb631 [X86] Add target combine rules for horizontal add/sub.
This patch adds new target specific combine rules to identify horizontal
add/sub idioms from BUILD_VECTOR dag nodes.

This patch also teaches the DAGCombiner how to canonicalize sequences of
insert_vector_elt dag nodes according to the following rule:

  (insert_vector_elt (insert_vector_elt A, I0), I1) ->
    (insert_vecto_elt (insert_vector_elt A, I1), I0)

This new canonicalization rule only triggers if the inner insert_vector
dag node has exactly one use; also, both indices must be known constants,
and I1 < I0.
This last rule made it possible to write a simpler algorithm to identify
horizontal add/sub patterns because now we don't have to worry about the
ordering of insert_vector_elt dag nodes.

llvm-svn: 210477
2014-06-09 16:54:41 +00:00
Andrea Di Biagio
a2490b5eaa [DAG] Expose NoSignedWrap, NoUnsignedWrap and Exact flags to SelectionDAG.
This patch modifies SelectionDAGBuilder to construct SDNodes with associated
NoSignedWrap, NoUnsignedWrap and Exact flags coming from IR BinaryOperator
instructions.

Added a new SDNode type called 'BinaryWithFlagsSDNode' to allow accessing
nsw/nuw/exact flags during codegen.

Patch by Marcello Maggioni.

llvm-svn: 210467
2014-06-09 12:32:53 +00:00
Craig Topper
b00824c629 [C++11] Use 'nullptr'.
llvm-svn: 210442
2014-06-08 22:29:17 +00:00
Alp Toker
62946e907c Fix typos
llvm-svn: 210401
2014-06-07 21:23:09 +00:00
Andrew Trick
bdb0601a15 Fix the MachineScheduler's logic for updating ready times for in-order.
Now the scheduler updates a node's ready time as soon as it is
scheduled, before releasing dependent nodes. There was a reason I
didn't do this initially but it no longer applies.

A53 is in-order and was running into an issue where nodes where added
to the readyQ too early. That's now fixed.

This also makes it easier for custom scheduling strategies to build
heuristics based on the actual cycles that the node was scheduled at.

The only impact on OOO (sandybridge/cyclone) is that ready times will
be slightly more accurate. I didn't measure any significant regressions.

llvm-svn: 210390
2014-06-07 01:48:43 +00:00
David Blaikie
77b762514c DebugInfo: Use the scope of the function declaration, if any, to name a function in DWARF pubnames
This ensures that member functions, for example, are entered into
pubnames with their fully qualified name, rather than inside the global
namespace.

llvm-svn: 210379
2014-06-06 22:29:05 +00:00
David Blaikie
2ad9dfbfdb DebugInfo: pubnames: include file-local (static or anonymous namespace) variables and anonymous namespaces themselves.
Still some issues with name qualification, FIXMEs added to test cases
and fixes will come next.

llvm-svn: 210378
2014-06-06 22:16:56 +00:00
Rafael Espindola
34f7950ebf Fix a few issues with comdat handling on COFF.
* Section association cannot use just the section name as many
sections can have the same name. With this patch, the comdat symbol in
an assoc section is interpreted to mean a symbol in the associated
section and the mapping is discovered from it.

* Comdat symbols were not being set correctly. Instead we were getting
whatever was output first for that section.

A consequence is that associative sections now must use .section to
set the association. Using .linkonce would not work since it is not
possible to change a sections comdat symbol (it is used to decide if
we should create a new section or reuse an existing one).

This includes r210298, which was reverted because it was asserting
on an associated section having the same comdat as the associated
section.

llvm-svn: 210367
2014-06-06 19:26:12 +00:00
Eric Christopher
db8e2ecde5 Have TargetSelectionDAGInfo take a DataLayout initializer rather than
a TargetMachine since the only thing it wants is DataLayout.

llvm-svn: 210366
2014-06-06 19:04:48 +00:00
Alexey Samsonov
c576391962 Fix null dereference with -debug-only=dwarfdebug
llvm-svn: 210299
2014-06-05 23:10:19 +00:00
Tom Roeder
740d86dc79 Add a new attribute called 'jumptable' that creates jump-instruction tables for functions marked with this attribute.
It includes a pass that rewrites all indirect calls to jumptable functions to pass through these tables.

This also adds backend support for generating the jump-instruction tables on ARM and X86.
Note that since the jumptable attribute creates a second function pointer for a
function, any function marked with jumptable must also be marked with unnamed_addr.

llvm-svn: 210280
2014-06-05 19:29:43 +00:00
Sasa Stankovic
b9ef62e8e9 Prevent hoisting the instruction whose def might be clobbered by the terminator.
llvm-svn: 210261
2014-06-05 13:42:48 +00:00
David Blaikie
a5a226f1fa Revert r210221 again, due to a crash Richard Smith has provided involving self-hosting LLVM with libc++.
Test case coming, once I reduce it.

llvm-svn: 210236
2014-06-05 02:04:59 +00:00
David Blaikie
c3347b5d63 DebugInfo: Reuse existing LexicalScope to retrieve the scope's MDNode, rather than looking it up through the DebugLoc.
No functional change intended, just streamlines the abstract variable
lookup/construction to use a common entry point.

llvm-svn: 210234
2014-06-05 01:30:50 +00:00
David Blaikie
144ccfbf8c DebugInfo: Roll argument insertion into variable insertion to ensure arguments are correctly handled in all cases.
No functional change intended.

llvm-svn: 210233
2014-06-05 01:04:20 +00:00
David Blaikie
c7bf52b36f PR19388: DebugInfo: Emit dead arguments in their originally declared order.
Unused arguments were not being added to the argument list, but instead
treated as arbitrary scope variables. This meant they weren't carefully
added in the original argument order.

In this particular example, though, it turns out the argument is only
/mostly/ unused (well, actually it's entirely used, but in a specific
way). It's a struct that, due to ABI reasons, is decomposed into chunks
(exactly one chunk, since it has one member) and then passed. Since only
one of those chunks is used (SROA, etc, kill the original reconstitution
code) we don't have a location to describe the whole variable.

In this particular case, since the struct consists of just the one int,
once we have partial location information, this should have a location
that describes the entire variable (since the piece is the entirety of
the object).

And at some point we'll need to describe the location of even /entirely/
unused arguments so that they can at least be printed on function entry.

llvm-svn: 210231
2014-06-05 00:51:35 +00:00
David Blaikie
2f413e8bd2 DebugInfo: Add comments/assert description to r209674 based on Eric Christopher's post-commit review feedback.
llvm-svn: 210228
2014-06-05 00:25:26 +00:00
David Blaikie
06ac426ec0 DebugInfo: Reapply r209984 (reverted in r210143), asserting that abstract DbgVariables have DIEs.
Abstract variables within abstract scopes that are entirely optimized
away in their first inlining are omitted because their scope is not
present so the variable is never created. Instead, we should ensure the
scope is created so the variable can be added, even if it's been
optimized away in its first inlining.

This fixes the incorrect debug info in missing-abstract-variable.ll
(added in r210143) and passes an asserts self-hosting build, so
hopefully there's not more of these issues left behind... *fingers
crossed*.

llvm-svn: 210221
2014-06-04 23:50:52 +00:00
Hans Wennborg
7e14d811bc Don't emit structors for available_externally globals (PR19933)
We would previously assert here when trying to figure out the section
for the global.

This makes us handle the situation more gracefully since the IR isn't
malformed.

Differential Revision: http://reviews.llvm.org/D4022

llvm-svn: 210215
2014-06-04 21:04:54 +00:00
Andrew Trick
2cfbbba814 Add a subtarget hook: enablePostMachineScheduler.
As requested by AArch64 subtargets.

Note that this will have no effect until the
AArch64 target actually enables the pass like this:
substitutePass(&PostRASchedulerID, &PostMachineSchedulerID);

As soon as armv7 switches over, PostMachineScheduler will become the
default postRA scheduler, so this won't be necessary any more.
Targets using the old postRA schedule would then do:
substitutePass(&PostMachineSchedulerID, &PostRASchedulerID);

llvm-svn: 210167
2014-06-04 07:06:27 +00:00
Andrew Trick
ba65eed5cb Move GenericScheduler and PostGenericScheduler into a header.
These were not exposed previously because I didn't want out-of-tree
targets to be too dependent on their internals. They can be reused for
a very wide variety of processors with casual scheduling needs without
exposing the classes by instead using hooks defined in
MachineSchedPolicy (we can add more if needed). When targets are more
aggressively tuned or want to provide custom heuristics, they can
define their own MachineSchedStrategy. I tend to think this is better
once you start customizing heuristics because you can copy over only
what you need. I don't think that layering heuristics generally works
well.

However, Arch64 targets now want to reuse the Generic scheduling logic
but also provide extensions. I don't see much harm in exposing the
Generic scheduling classes with a major caveat: these scheduling
strategies may change in the future without validating performance on
less mainstream processors. If you want to be immune from changes,
just define your own MachineSchedStrategy.

llvm-svn: 210166
2014-06-04 07:06:18 +00:00
David Blaikie
5a54fcae12 DebugInfo: Partial revert r209984 due to more cases where abstract DbgVariables do not have associated DIEs.
Along with a test case to demonstrate that due to inlining order there
are cases where abstract variable DIEs are not constructed since the
abstract subprogram was built due to a previous inlining that optimized
away those variables. This produces incorrect debug info (the 'missing'
abstract variable causes the inlined instance of that variable to be
emitted with a full description (name, line, file) rather than
referencing the abstract origin), but this commit at least ensures that
it doesn't crash...

llvm-svn: 210143
2014-06-04 01:30:59 +00:00
Pete Cooper
4164797aa0 Calculate dead instructions when a live interval is created.
This gets us closer to being able to remove LiveVariables entirely which is where dead instructions are currently tagged as such.

Reviewed by Jakob Olesen

llvm-svn: 210132
2014-06-03 22:42:10 +00:00
Rafael Espindola
87cd774844 Allow alias to point to an arbitrary ConstantExpr.
This  patch changes GlobalAlias to point to an arbitrary ConstantExpr and it is
up to MC (or the system assembler) to decide if that expression is valid or not.

This reduces our ability to diagnose invalid uses and how early we can spot
them, but it also lets us do things like

@test5 = alias inttoptr(i32 sub (i32 ptrtoint (i32* @test2 to i32),
                                 i32 ptrtoint (i32* @bar to i32)) to i32*)

An important implication of this patch is that the notion of aliased global
doesn't exist any more. The alias has to encode the information needed to
access it in its metadata (linkage, visibility, type, etc).

Another consequence to notice is that getSection has to return a "const char *".
It could return a NullTerminatedStringRef if there was such a thing, but when
that was proposed the decision was to just uses "const char*" for that.

llvm-svn: 210062
2014-06-03 02:41:57 +00:00
Eric Christopher
51ec137e6b InitLibcallNames can take a Triple instead of a TargetMachine.
llvm-svn: 210045
2014-06-02 20:51:49 +00:00
David Blaikie
365949d5af DebugInfo: Assert that DbgVariables have associated DIEs
This was previously committed in r209680 and reverted in r209683 after
it caused sanitizer builds to crash.

The issue seems to be that the DebugLoc associated with dbg.value IR
intrinsics isn't necessarily accurate. Instead, we duplicate the
DIVariables and add an InlinedAt field to them to record their
location.

We were using this InlinedAt field to compute the LexicalScope for the
variable, but not using it in the abstract DbgVariable construction and
mapping. This resulted in a formal parameter to the current concrete
function, correctly having no InlinedAt information, but incorrectly
having a DebugLoc that described an inlined location within the
function... thus an abstract DbgVariable was created for the variable,
but its DIE was never constructed (since the LexicalScope had no such
variable). This DbgVariable was silently ignored (by testing for a
non-null DIE on the abstract DbgVariable).

So, fix this by using the right scoping information when constructing
abstract DbgVariables.

In the long run, I suspect we want to undo the work that added this
second kind of location tracking and fix the places where the DebugLoc
propagation on the dbg.value intrinsic fails. This will shrink debug
info (by not duplicating DIVariables), make it more efficient (by not
having to construct new DIVariable metadata nodes to try to map back to
a single variable), and benefit all instructions.

But perhaps there are insurmountable issues with DebugLoc quality that
I'm unaware of... I just don't know how we can't /just keep the DebugLoc
from the dbg.declare to the dbg.values and never get this wrong/.

Some history context:

http://llvm.org/viewvc/llvm-project?view=revision&revision=135629
http://llvm.org/viewvc/llvm-project?view=revision&revision=137253

llvm-svn: 209984
2014-06-01 03:38:13 +00:00
Alp Toker
e8634eb077 Fix typos
llvm-svn: 209982
2014-05-31 21:26:28 +00:00
Adam Nemet
807861a7b6 [SelectionDAG] Force cycle detection in AssignTopologicalOrder before aborting
DAG cycle detection is only enabled with ENABLE_EXPENSIVE_CHECKS.  However we
can run it just before we would crash in order to provide more informative
diagnostics.

Now in addition to the "Overran sorted position" message we also get the Node
printed if a cycle was detected.

Tested by building several configs: Debug+Assert, Debug+Assert+Check (this is
ENABLE_EXPENSIVE_CHECKS), Release+Assert and Release.  Also tried that the
AssignTopologicalOrder assert produces the expected results.

llvm-svn: 209977
2014-05-31 16:23:20 +00:00
Adam Nemet
267d4048ff [SelectionDAG] Pass DAG to checkForCycles
Pass the DAG down to checkForCycles from all callers where we have it.  This
allows target-specific nodes to be printed properly.

Also print some missing newlines.

llvm-svn: 209976
2014-05-31 16:23:17 +00:00
Andrea Di Biagio
3a03708285 [X86] Add two combine rules to simplify dag nodes introduced during type legalization when promoting nodes with illegal vector type.
This patch teaches the backend how to simplify/canonicalize dag node
sequences normally introduced by the backend when promoting certain dag nodes
with illegal vector type.

This patch adds two new combine rules:
1) fold (shuffle (bitcast (BINOP A, B)), Undef, <Mask>) ->
        (shuffle (BINOP (bitcast A), (bitcast B)), Undef, <Mask>)

2) fold (BINOP (shuffle (A, Undef, <Mask>)), (shuffle (B, Undef, <Mask>))) ->
        (shuffle (BINOP A, B), Undef, <Mask>).

Both rules are only triggered on the type-legalized DAG.
In particular, rule 1. is a target specific combine rule that attempts
to sink a bitconvert into the operands of a binary operation.
Rule 2. is a target independet rule that attempts to move a shuffle
immediately after a binary operation.

llvm-svn: 209930
2014-05-30 23:17:53 +00:00
Filipe Cabecinhas
8abf11ea97 Convert a vselect into a concat_vector if possible
Summary:
If both vector args to vselect are concat_vectors and the condition is
constant and picks half a vector from each argument, convert the vselect
into a concat_vectors.

Added a test.

The ConvertSelectToConcatVector is assuming it doesn't get vselects with
arguments of, for example, <undef, undef, true, true>. Those get taken
care of in the checks above its call.

Reviewers: nadav, delena, grosbach, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3916

llvm-svn: 209929
2014-05-30 23:03:11 +00:00
Adrian Prantl
fd0f672222 Roll DbgVariable::setMInsn into the constructor. No functional changes.
llvm-svn: 209920
2014-05-30 21:10:13 +00:00
Logan Chien
79b8446257 Fix MIPS exception personality encoding.
For MIPS, we have to encode the personality routine with
an indirect pointer to absptr; otherwise, some link warning
warning will be raised, and the program might crash in some
early MIPS Android device.

llvm-svn: 209907
2014-05-30 16:48:56 +00:00
Rafael Espindola
d1ec35ff7d [pr19636] Fix known bit computation in urem instruction with power of two.
Patch by Andrey Kuharev.

llvm-svn: 209902
2014-05-30 15:00:45 +00:00
Tim Northover
89515a61ad SelectionDAG: skip barriers for unordered atomic operations
Unordered is strictly weaker than monotonic, so if the latter doesn't have any
barriers then the former certainly shouldn't.

rdar://problem/16548260

llvm-svn: 209901
2014-05-30 14:41:51 +00:00
Tim Northover
3bb84c9bcc ARM & AArch64: make use of common cmpxchg idioms after expansion
The C and C++ semantics for compare_exchange require it to return a bool
indicating success. This gets mapped to LLVM IR which follows each cmpxchg with
an icmp of the value loaded against the desired value.

When lowered to ldxr/stxr loops, this extra comparison is redundant: its
results are implicit in the control-flow of the function.

This commit makes two changes: it replaces that icmp with appropriate PHI
nodes, and then makes sure earlyCSE is called after expansion to actually make
use of the opportunities revealed.

I've also added -{arm,aarch64}-enable-atomic-tidy options, so that
existing fragile tests aren't perturbed too much by the change. Many
of them either rely on undef/unreachable too pervasively to be
restored to something well-defined (particularly while making sure
they test the same obscure assert from many years ago), or depend on a
particular CFG shape, which is disrupted by SimplifyCFG.

rdar://problem/16227836

llvm-svn: 209883
2014-05-30 10:09:59 +00:00
Richard Trieu
2252d26c83 Remove use of comma operator.
llvm-svn: 209871
2014-05-30 03:15:17 +00:00
Adrian Prantl
fdb8f218a3 Debug Info: Remove unused code. The MInsn of an _abstract_ variable is
never used again and updating the abstract variable for each inlined
instance of it was questionable in the first place.

llvm-svn: 209829
2014-05-29 16:56:48 +00:00
Hao Liu
0e99724daa Fix an assertion failure caused by v1i64 in DAGCombiner Shrink.
llvm-svn: 209798
2014-05-29 09:19:07 +00:00
Michael J. Spencer
836291b5ff [x86] Fold extract_vector_elt of a load into the Load's address computation.
An address only use of an extract element of a load can be simplified to a
load. Without this the result of the extract element is spilled to the
stack so that an address is available.

llvm-svn: 209788
2014-05-29 01:42:45 +00:00
Matt Arsenault
8a780443dc Fix wrong setcc result type when legalizing uaddo/usubo
No test because no in-tree targets change the bitwidth of the
setcc type depending on the bitwidth of the compared type.

Patch by Ke Bai

llvm-svn: 209771
2014-05-28 20:51:42 +00:00
Rafael Espindola
acdb307db3 [pr19844] Add thread local mode to aliases.
This matches gcc's behavior. It also seems natural given that aliases
contain other properties that govern how it is accessed (linkage,
visibility, dll storage).

Clang still has to be updated to expose this feature to C.

llvm-svn: 209759
2014-05-28 18:15:43 +00:00
Hal Finkel
530ec71f07 Revert "[DAGCombiner] Split up an indexed load if only the base pointer value is live"
This reverts r208640 (I've just XFAILed the test) because it broke ppc64/Linux
self-hosting. Because nearly every regression test triggers a segfault, I hope
this will be easy to fix.

llvm-svn: 209747
2014-05-28 15:33:19 +00:00
Alexey Samsonov
86c8e35dda Change representation of instruction ranges where variable is accessible.
Use more straightforward way to represent the set of instruction
ranges where the location of a user variable is defined - vector of pairs
of instructions (defining start/end of each range),
instead of a flattened vector of instructions where some instructions
are supposed to start the range, and the rest are supposed to "clobber" it.

Simplify the code which generates actual .debug_loc entries.

No functionality change.

llvm-svn: 209698
2014-05-27 23:09:50 +00:00
Alexey Samsonov
5b35e3c43b Factor out looking for prologue end into a function
llvm-svn: 209697
2014-05-27 22:47:41 +00:00
Alexey Samsonov
7782ecef5d Don't pre-populate the set of keys in the map with variable locations history.
Current implementation of calculateDbgValueHistory already creates the
keys in the expected order (user variables are listed in order of appearance),
and should do so later by contract.

No functionality change.

llvm-svn: 209690
2014-05-27 22:35:00 +00:00
David Blaikie
9e81aac4ee DebugInfo: partially revert cleanup committed in r209680
I'm not sure exactly where/how we end up with an abstract DbgVariable
with a null DIE, but we do... looking into it & will add a test and/or
fix when I figure it out.

Currently shows up in selfhost or compiler-rt builds.

llvm-svn: 209683
2014-05-27 20:20:43 +00:00
David Blaikie
f418e971c9 DebugInfo: Simplify solution to avoid DW_AT_artificial on inlined parameters.
Originally committed in r207717, I clearly didn't look very closely at
the code to understand how existing things were working...

llvm-svn: 209680
2014-05-27 19:34:32 +00:00
David Blaikie
54d2925f43 DebugInfo: Create abstract function definitions even when concrete definitions preceed inline definitions.
After much puppetry, here's the major piece of the work to ensure that
even when a concrete definition preceeds all inline definitions, an
abstract definition is still created and referenced from both concrete
and inline definitions.

Variables are still broken in this case (see comment in
dbg-value-inlined-parameter.ll test case) and will be addressed in
follow up work.

llvm-svn: 209677
2014-05-27 18:37:55 +00:00
David Blaikie
b7b9f0c76e DebugInfo: Avoid an extra map lookup when finding abstract subprogram DIEs.
llvm-svn: 209676
2014-05-27 18:37:51 +00:00
David Blaikie
6003ac79b7 DebugInfo: Lazily construct subprogram definition DIEs.
A further step to correctly emitting concrete out of line definitions
preceeding inlined instances of the same program.

To do this, emission of subprograms must be delayed until required since
we don't know which (abstract only (if there's no out of line
definition), concrete only (if there are no inlined instances), or both)
DIEs are required at the start of the module.

To reduce the test churn in the following commit that actually fixes the
bug, this commit introduces the lazy DIE construction and cleans up test
cases that are impacted by the changes in the resulting DIE ordering.

llvm-svn: 209675
2014-05-27 18:37:48 +00:00
David Blaikie
ae911772eb DebugInfo: Lazily attach definition attributes to definitions.
This is a precursor to fixing inlined debug info where the concrete,
out-of-line definition may preceed any inlined usage. To cope with this,
the attributes that may appear on the concrete definition or the
abstract definition are delayed until the end of the module. Then, if an
abstract definition was created, it is referenced (and no other
attributes are added to the out-of-line definition), otherwise the
attributes are added directly to the out-of-line definition.

In a couple of cases this causes not just reordering of attributes, but
reordering of types. When the creation of the attribute is delayed, if
that creation would create a type (such as for a DW_AT_type attribute)
then other top level DIEs may've been constructed during the delay,
causing the referenced type to be created and added after those
intervening DIEs. In the extreme case, in cross-cu-inlining.ll, this
actually causes the DW_TAG_basic_type for "int" to move from one CU to
another.

llvm-svn: 209674
2014-05-27 18:37:43 +00:00
David Blaikie
59a00b3753 DebugInfo: Separate out the addition of subprogram attribute additions so that they can be added later depending on whether or not the function is inlined.
llvm-svn: 209673
2014-05-27 18:37:38 +00:00
Tim Northover
2172cefdfd ARM: teach AAPCS-VFP to deal with Cortex-M4.
Cortex-M4 only has single-precision floating point support, so any LLVM
"double" type will have been split into 2 i32s by now. Fortunately, the
consecutive-register framework turns out to be precisely what's needed to
reconstruct the double and follow AAPCS-VFP correctly!

rdar://problem/17012966

llvm-svn: 209650
2014-05-27 10:43:38 +00:00
David Blaikie
174b49812d DwarfUnit: Remove some misleading no-op code introduced in r204162.
Post commit review feedback from Manman called this out, but it looks
like it slipped through the cracks.

llvm-svn: 209611
2014-05-26 05:32:21 +00:00
David Blaikie
b1be374c2a DebugInfo: Fix inlining with #file directives a little harder
Seems my previous fix was insufficient - we were still not adding the
inlined function to the abstract scope list. Which meant it wasn't
flagged as inline, didn't have nested lexical scopes in the abstract
definition, and didn't have abstract variables - so the inlined variable
didn't reference an abstract variable, instead being described
completely inline.

llvm-svn: 209602
2014-05-25 18:11:35 +00:00
Benjamin Kramer
a89a51e349 MachineVerifier: Clean up some syntactic weirdness left behind by find&replace.
No functionality change.

llvm-svn: 209581
2014-05-24 13:31:10 +00:00
Benjamin Kramer
bbbf511429 CodeGen: Make MachineBasicBlock::back skip to the beginning of the last bundle.
This makes front/back symmetric with begin/end, avoiding some confusion.
Added instr_front/instr_back for the old behavior, corresponding to
instr_begin/instr_end. Audited all three in-tree users of back(), all
of them look like they don't want to look inside bundles.

Fixes an assertion (PR19815) when generating debug info on mips, where a
delay slot was bundled at the end of a branch.

llvm-svn: 209580
2014-05-24 13:13:17 +00:00
David Blaikie
f42895705f DebugInfo: Put concrete definitions referencing abstract definitions in the same scope as the abstract definition.
This seems like a simple cleanup/improved consistency, but also helps
lay the foundation to fix the bug mentioned in the test case: concrete
definitions preceeding any inlined usage aren't properly split into
concrete + abstract (because they're not known to need it until it's too
late).

Once we start deferring this choice until later, we won't have the
choice to put concrete definitions for inlined subroutines in a
different scope from concrete definitions for non-inlined subroutines
(since we won't know at time-of-construction which one it'll be). This
change brings those two cases into alignment ahead of that future
chaneg/fix.

llvm-svn: 209547
2014-05-23 20:25:15 +00:00
David Blaikie
79c338088c Add FIXME comment based on code review feedback by Hal Finkel on r209338
llvm-svn: 209529
2014-05-23 16:53:14 +00:00
David Blaikie
f5d7f9cafd Rename a couple of variables to be more accurate.
It's not really a "ScopeDIE", as such - it's the abstract function
definition's DIE. And we usually use "SP" for subprograms, rather than
"Sub".

llvm-svn: 209499
2014-05-23 05:03:23 +00:00
David Blaikie
3765b7c1a0 DebugInfo: Fix cross-CU references for scopes (and variables within those scopes) in abstract definitions of cross-CU inlined functions
Found by Adrian Prantl during post-commit review of r209335.

llvm-svn: 209498
2014-05-23 04:23:06 +00:00
Eric Christopher
c878221cbd Return false if we're not going to do anything.
llvm-svn: 209455
2014-05-22 17:49:33 +00:00
Eric Christopher
a1b851ab1e Remove unused variable.
llvm-svn: 209391
2014-05-22 05:33:03 +00:00
David Blaikie
38c892567f DebugInfo: Simplify dead variable collection slightly.
constructSubprogramDIE was already called for every subprogram in every
CU when the module was started - there's no need to call it again at
module finalization.

llvm-svn: 209372
2014-05-22 00:48:36 +00:00
Eli Bendersky
80ec3983a5 Similar to bitcast, treat addrspacecast as a foldable operand.
Added a test sink-addrspacecast.ll to verify this change.

Patch by Jingyue Wu.

llvm-svn: 209343
2014-05-22 00:02:52 +00:00
Eric Christopher
98c81f11ea Fix compilation issues.
llvm-svn: 209342
2014-05-21 23:51:57 +00:00
Eric Christopher
7880d61aac Make early if conversion dependent upon the subtarget and add
a subtarget hook to enable. Unconditionally add to the pass pipeline
for targets that might want to use it. No functional change.

llvm-svn: 209340
2014-05-21 23:40:26 +00:00
David Blaikie
a52d52f136 Revert "DebugInfo: Don't put fission type units in comdat sections."
This reverts commit r208930, r208933, and r208975.

It seems not all fission consumers are ready to handle this behavior.
Reverting until tools are brought up to spec.

llvm-svn: 209338
2014-05-21 23:27:41 +00:00
David Blaikie
f148b7f5aa DebugInfo: Use the SPMap to find the parent CU of inlined functions as they may not be in the current CU
Committed in r209178 then reverted in r209251 due to LTO breakage,
here's a proper fix for the case of the missing subprogram DIE. The DIEs
were there, just in other compile units. Using the SPMap we can find the
right compile unit to search for and produce cross-unit references to
describe this kind of inlining.

One existing test case needed to be updated because it had a function
that wasn't in the CU's subprogram list, so it didn't appear in the
SPMap.

llvm-svn: 209335
2014-05-21 23:14:12 +00:00
David Blaikie
03f45901f9 DebugInfo: Ensure concrete out of line variables from inlined functions reference their abstract origins.
llvm-svn: 209327
2014-05-21 22:41:17 +00:00
David Blaikie
ae30c092e2 DebugInfo: Simplify subprogram declaration creation/references and accidentally refix PR11300.
Also simplifies the linkage name handling a little too.

llvm-svn: 209311
2014-05-21 18:04:33 +00:00
Richard Smith
cb6bc29096 [modules] Add module maps for LLVM. These are not quite ready for prime-time
yet, but only a few more Clang patches need to land. (I have 'ninja check'
passing locally.)

llvm-svn: 209269
2014-05-21 02:46:14 +00:00
Eric Christopher
04464d5263 Move the verbose asm option to be part of the options struct and
set appropriately.

llvm-svn: 209258
2014-05-20 23:59:50 +00:00
David Blaikie
80cb22d7ed Revert "DebugInfo: Assume all subprogram DIEs have been created before any abstract subprograms are constructed."
This reverts commit r209178.

This seems to be asserting in an LTO build on some internal Apple
buildbots. No upstream reproduction (and I don't have an LLVM-aware gold
built right now to reproduce it personally) but it's a small patch & the
failure's semi-plausible so I'm going to revert first while I try to
reproduce this.

llvm-svn: 209251
2014-05-20 22:33:09 +00:00
David Blaikie
d856e7924f Unbreak the sanitizer buildbots after r209226 due to SROA issue described in http://reviews.llvm.org/D3714
Undecided whether this should include a test case - SROA produces bad
dbg.value metadata describing a value for a reference that is actually
the value of the thing the reference refers to. For now, loosening the
assert lets this not assert, but it's still bogus/wrong output...

If someone wants to tell me to add a test, I'm willing/able, just
undecided. Hopefully we'll get SROA fixed soon & we can tighten up this
assertion again.

llvm-svn: 209240
2014-05-20 21:40:13 +00:00
David Blaikie
939d8411a7 Fix test breakage introduced in r209223.
Oops, broke the broken enum constants again.

llvm-svn: 209226
2014-05-20 18:36:35 +00:00
Alexey Samsonov
3a8ff19189 Rewrite calculateDbgValueHistory to make it (hopefully) more transparent.
This change preserves the original algorithm of generating history
for user variables, but makes it more clear.

High-level description of algorithm:
Scan all the machine basic blocks and machine instructions in the order
they are emitted to the object file. Do the following:
1) If we see a DBG_VALUE instruction, add it to the history of the
corresponding user variable. Keep track of all user variables, whose
locations are described by a register.
2) If we see a regular instruction, look at all the registers it clobbers,
and terminate the location range for all variables described by these registers.
3) At the end of the basic block, terminate location ranges for all
user variables described by some register.

Although this change shouldn't be user-visible (the contents of .debug_loc section
should be the same), it changes some internal assumptions about the set
of instructions used to track the variable locations. Watching the bots.

llvm-svn: 209225
2014-05-20 18:34:54 +00:00
David Blaikie
a92dddef5c PR19767: DebugInfo emission of pointer constants.
In refactoring DwarfUnit::isUnsignedDIType I restricted it to only work
on values with signedness (unsigned or signed), asserting on anything
else (which did uncover some bugs). But it turns out that we do need to
emit constants of signless data, such as pointer constants - only null
pointer constants are known to need this so far, but it's conceivable
that there might be non-null pointer constants at some point (hardcoded
address offsets for device drivers?).

This patch just uses 'unsigned' for signless data such as pointer
constants. Arguably we could use signless representations
(DW_FORM_dataN) instead, allowing a trinary result from isUnsignedDIType
(signed, unsigned, signless), but this seems reasonable for now.

llvm-svn: 209223
2014-05-20 18:21:51 +00:00
Eric Christopher
262770bdee Clean up language and grammar.
Based on a patch by jfcaron3@gmail.com!
PR19806

llvm-svn: 209216
2014-05-20 17:11:11 +00:00
Benjamin Kramer
5f1713b9ab Legalizer: Make bswap promotion safe for vectors.
llvm-svn: 209202
2014-05-20 09:42:31 +00:00
David Blaikie
4c5d7eb5e6 DebugInfo: Emit function definitions within their namespace scope.
This workaround (presumably for ancient GDB) doesn't appear to be
required (GDB 7.5 seems to tolerate function definition DIEs in
namespace scope just fine).

llvm-svn: 209189
2014-05-20 03:23:24 +00:00
David Blaikie
c6084e64ed DebugInfo: Assume all subprogram DIEs have been created before any abstract subprograms are constructed.
Since we visit the whole list of subprograms for each CU at module
start, this is clearly true - don't test for the case, just assert it.

A few old test cases seemed to have incomplete subprogram lists, but any
attempt to reproduce them shows full subprogram lists that even include
entities that have been completely inlined and the out of line
definition removed.

llvm-svn: 209178
2014-05-19 23:16:19 +00:00
David Blaikie
aa1faa6281 DebugInfo: Don't include DW_AT_inline on each abstract definition multiple times.
When I refactored this in r208636 I accidentally caused this to be added
multiple times to each abstract subprogram (not accounting for the
deduplicating effect of the InlinedSubprogramDIEs set).

This got better in r208798 when the abstract definitions got the
attribute added to them at construction time, but still had the
redundant copies introduced in r208636.

This commit removes those excess DW_AT_inlines and relies solely on the
insertion in r208798.

llvm-svn: 209166
2014-05-19 22:07:16 +00:00
David Blaikie
5d56623a2c DebugInfo: Fix missing inlined_subroutines caused by r208748.
The check in DwarfDebug::constructScopeDIE was meant to consider inlined
subroutines as any non-top-level scope that was a subprogram. Instead of
checking "not top level scope" it was checking if the /subprogram's/
scope was non-top-level.

Fix this and beef up a test case to demonstrate some of the missing
inlined_subroutines are no longer missing.

In the course of fixing this I also found that r208748 (with this fix)
found one /extra/ inlined_subroutine in concrete_out_of_line.ll due to
two inlined_subroutines having the same inlinedAt location. The previous
implementation was collapsing these into a single inlined subroutine.

I'm not sure what the original code was that created this .ll file so
I'm not sure if this actually happens in practice today. Since we
deliberately include column information to disambiguate two calls on the
same line, that may've addressed this bug in the frontend, but it's good
to know that workaround isn't necessary for this particular case
anymore.

llvm-svn: 209165
2014-05-19 21:54:31 +00:00
Eric Christopher
f08cdc9333 Fix typos.
llvm-svn: 209164
2014-05-19 21:18:47 +00:00
Benjamin Kramer
600e24a1cb SDAG: Legalize vector BSWAP into a shuffle if the shuffle is legal but the bswap not.
- On ARM/ARM64 we get a vrev because the shuffle matching code is really smart. We still unroll anything that's not v4i32 though.
- On X86 we get a pshufb with SSSE3. Required more cleverness in isShuffleMaskLegal.
- On PPC we get a vperm for v8i16 and v4i32. v2i64 is unrolled.

llvm-svn: 209123
2014-05-19 13:12:38 +00:00
Saleem Abdulrasool
501d3b6235 Target: remove old constructors for CallLoweringInfo
This is mostly a mechanical change changing all the call sites to the newer
chained-function construction pattern.  This removes the horrible 15-parameter
constructor for the CallLoweringInfo in favour of setting properties of the call
via chained functions.  No functional change beyond the removal of the old
constructors are intended.

llvm-svn: 209082
2014-05-17 21:50:17 +00:00
Saleem Abdulrasool
4b7b7da0ac Target: change member from reference to pointer
This is a preliminary step to help ease the construction of CallLoweringInfo.
Changing the construction to a chained function pattern requires that the
parameter be nullable.  However, rather than copying the vector, save a pointer
rather than the reference to permit a late binding of the arguments.

llvm-svn: 209080
2014-05-17 21:50:01 +00:00
Rafael Espindola
e809bea68e Delete getAliasedGlobal.
llvm-svn: 209040
2014-05-16 22:37:03 +00:00
David Blaikie
b77a3db035 DebugInfo: Assert rather than conditionalizing when a CU's subprogram list contains declarations.
llvm-svn: 209039
2014-05-16 22:21:45 +00:00
David Blaikie
8a227f7d94 DebugInfo: Handle emitting constants of C++ unicode character type.
Patch by Stephan Tolksdorf! (with some test case stuff by me)

Differential Revision: http://reviews.llvm.org/D3810

llvm-svn: 209037
2014-05-16 21:53:09 +00:00
Reid Kleckner
fd1a7dfe04 Add comdat key field to llvm.global_ctors and llvm.global_dtors
This allows us to put dynamic initializers for weak data into the same
comdat group as the data being initialized.  This is necessary for MSVC
ABI compatibility.  Once we have comdats for guard variables, we can use
the combination to help GlobalOpt fire more often for weak data with
guarded initialization on other platforms.

Reviewers: nlewycky

Differential Revision: http://reviews.llvm.org/D3499

llvm-svn: 209015
2014-05-16 20:39:27 +00:00
David Blaikie
a844851458 DebugInfo: Add an assert regarding the subprogram in the subprogram map matching the abstract subprogram.
I'm not sure this is how it'll be going forward (I'd rather prefer the
definition to be in the main SP mapping, for various reasons) but this
helps me understand how it is today.

llvm-svn: 209009
2014-05-16 19:42:10 +00:00
David Blaikie
5e7d4ec337 DebugInfo: Assume the CU's Subprogram list only contains definitions.
DIBuilder maintains this invariant and the current DwarfDebug code could
end up doing weird things if it contained declarations (such as putting
the definition DIE inside a CU that contained the declaration - this
doesn't seem like a good idea, so rather than adding logic to handle
this case we'll just ban in for now & cross that bridge if we come to
it later).

llvm-svn: 209004
2014-05-16 18:26:53 +00:00
David Blaikie
82dc713fc3 DwarfDebug: Refactor AT_ranges/AT_high_pc+AT_low_pc emission into helper function.
llvm-svn: 208997
2014-05-16 16:42:40 +00:00
Rafael Espindola
6d40091c3c Revert "Implement global merge optimization for global variables."
This reverts commit r208934.

The patch depends on aliases to GEPs with non zero offsets. That is not
supported and fairly broken.

The good news is that GlobalAlias is being redesigned and will have support
for offsets, so this patch should be a nice match for it.

llvm-svn: 208978
2014-05-16 13:02:18 +00:00
Eric Christopher
37970f0edd Remove the Options query functions and just access our Options directly.
llvm-svn: 208937
2014-05-16 00:32:52 +00:00
Jiangning Liu
5366cb42f6 Implement global merge optimization for global variables.
This commit implements two command line switches -global-merge-on-external
and -global-merge-aligned, and both of them are false by default, so this
optimization is disabled by default for all targets.

For ARM64, some back-end behaviors need to be tuned to get this optimization
further enabled.

llvm-svn: 208934
2014-05-15 23:45:42 +00:00
David Blaikie
1f83b7a8f9 DebugInfo: Follow up to r208930, comment usage of 'using' to bring in base class overload.
Code review feedback from Eric Christopher.

llvm-svn: 208933
2014-05-15 23:29:53 +00:00
Eric Christopher
da6a1ff3cc Move more MC options into the MCTargetOptions structure.
No functional change.

llvm-svn: 208932
2014-05-15 23:27:49 +00:00
David Blaikie
bd84bbc97b DebugInfo: Don't put fission type units in comdat sections.
Since type units in the dwo file are handled by a debug aware tool, they
don't need to leverage the ELF comdat grouping to implement
deduplication. Avoid creating all the .group sections for these as a
space optimization.

llvm-svn: 208930
2014-05-15 23:18:15 +00:00
David Blaikie
9fe64b3c3f DebugInfo: Simplify retrieving filename/directory name for line table entry building.
llvm-svn: 208911
2014-05-15 20:18:50 +00:00
Jay Foad
2827803889 Instead of littering asserts throughout the code after every call to
computeKnownBits, consolidate them into one assert at the end of
computeKnownBits itself.

llvm-svn: 208876
2014-05-15 12:12:55 +00:00
Alp Toker
18115693f7 Fix typos
llvm-svn: 208839
2014-05-15 01:52:21 +00:00
David Blaikie
071f1f0679 DwarfDebug: Don't set frame index locations on abstract variables.
Abstract variables should never have/use locations. In this case the
data wasn't used, so no functional change intended here, just
simplification.

llvm-svn: 208820
2014-05-14 22:51:59 +00:00
David Blaikie
ef32c3bf98 DebugInfo: Sure up subprogram variable list handling with more assertions and fewer conditionals.
Many old tests using prior schemas still had some brokenness here (both
indirect arrays and arrays with single bogus elements). Fixed those up
so they don't hit the new assertions.

Also reduced nesting in some places, etc.

llvm-svn: 208817
2014-05-14 21:52:46 +00:00
David Blaikie
03965d6089 DebugInfo: Assert that a CU's subprogram list contains only subprograms.
llvm-svn: 208816
2014-05-14 21:52:37 +00:00
Jay Foad
e0eac700cb Rename ComputeMaskedBits to computeKnownBits. "Masked" has been
inappropriate since it lost its Mask parameter in r154011.

llvm-svn: 208811
2014-05-14 21:14:37 +00:00
David Blaikie
4e2a4915ae DebugInfo: Do not delay attaching DW_AT_inline attribute to abstract definitions.
This is just unneccessary - we only create abstract definitions when
we're inlining anyway, so there's no reason to delay this to see if
we're going to inline anything.

llvm-svn: 208798
2014-05-14 17:58:53 +00:00
Logan Chien
fb407ff21d Fix ARM EHABI when function has landingpad and nounwind.
If the function has the landingpad instruction, then the
handlerdata should be emitted even if the function has
nouwnind attribute.  Otherwise, following code will not
work:

    void test1() noexcept {
      try {
        throw_exception();
      } catch (...) {
        log_unexpected_exception();
      }
    }

Since the cantunwind was incorrectly emitted and the
LSDA is not available.

llvm-svn: 208791
2014-05-14 16:38:30 +00:00
Jay Foad
df682f6c8b Update the comments for ComputeMaskedBits, which lost its Mask parameter
in r154011.

llvm-svn: 208757
2014-05-14 08:00:07 +00:00
David Blaikie
fac608f99b Recommit r208506: DebugInfo: Include lexical scopes in inlined subroutines.
This was reverted in r208642 due to regressions surrounding file changes
within lexical scopes causing inlining information to be lost.

The issue was in LexicalScopes::getOrCreateInlinedScope, where I was
previously testing "isLexicalBlock" which is false for
"DILexicalBlockFile" (a scope used to represent changes in the current
file name) and assuming it was then a function (breaking out of the
inlined scope path and reaching for the parent non-inlined scopes). By
inverting the condition and testing for "isSubprogram" the correct
behavior is attained.

(also found some weirdness in Clang, see r208742 when reducing this test
case - the resulting test case doesn't apply with the Clang fix, but
I've added a more realistic test case to inline-scopes.ll which does
reproduce the issue and demonstrate the fix)

llvm-svn: 208748
2014-05-14 01:08:28 +00:00
Louis Gerbarg
c59e699789 Add missing line breaks to debug output in CodeGenPrepare
llvm-svn: 208731
2014-05-13 21:54:22 +00:00
Rafael Espindola
e6465aeb2d Split GlobalValue into GlobalValue and GlobalObject.
This allows code to statically accept a Function or a GlobalVariable, but
not an alias. This is already a cleanup by itself IMHO, but the main
reason for it is that it gives a lot more confidence that the refactoring to fix
the design of GlobalAlias is correct. That will be a followup patch.

llvm-svn: 208716
2014-05-13 18:45:48 +00:00
Joey Gouly
c0f94cb136 [CGP] r205941 changed the logic, so that a cast happens *before* 'Result' is
compared to 'AddrMode.BaseReg'. In the case that 'AddrMode.BaseReg' is
nullptr, 'Result' will also be nullptr, so the cast causes an assertion. We
should use dyn_cast_or_null here to check 'Result' is not null and it is an
instruction.

Bug found by Mats Petersson, and I reduced his IR to get a test case.

llvm-svn: 208705
2014-05-13 15:42:45 +00:00
David Blaikie
1203463603 Revert "DebugInfo: Include lexical scopes in inlined subroutines."
This reverts commit r208506.

Some inlined subroutine scopes appear to be missing with this change.
Reverting while I investigate.

llvm-svn: 208642
2014-05-12 23:53:03 +00:00
Pete Cooper
801ae0ce03 Use a logical not when inverting SetCC. This unfortunately doesn't fire on any targets so I couldn't find a test case to trigger it.
The problem occurs when a non-i1 setcc is inverted.  For example 'i8 = setcc' will get 'xor 0xff' to invert this.   This is clearly wrong when the boolean contents are ZeroOrOne.

This patch introduces getLogicalNOT and updates SetCC legalisation to use it.

Reviewed by Hal Finkel.

llvm-svn: 208641
2014-05-12 23:26:58 +00:00
Adam Nemet
78e81b5109 [DAGCombiner] Split up an indexed load if only the base pointer value is live
Right now the load may not get DCE'd because of the side-effect of updating
the base pointer.

This can happen if we lower a read-modify-write of an illegal larger type
(e.g. i48) such that the modification only affects one of the subparts (the
lower i32 part but not the higher i16 part).  See the testcase.

In order to spot the dead load we need to revisit it when SimplifyDemandedBits
decided that the value of the load is masked off.  This is the
CommitTargetLoweringOpt piece.

I checked compile time with ARM64 by sending SPEC bitcode files through llc.
No measurable change.

Fixes <rdar://problem/16031651>

llvm-svn: 208640
2014-05-12 23:00:03 +00:00
David Blaikie
32cd8a7919 DebugInfo: Attach DW_AT_inline to inlined subprograms at DIE-construction time rather than as a post-processing step.
llvm-svn: 208636
2014-05-12 21:50:44 +00:00
David Blaikie
5ca904146d DwarfDebug: Avoid an extra map lookup while constructing abstract scope DIEs and reduce nesting/conditionals.
One test case had to be updated as it still had the extra indirection
for the variable list - removing the extra indirection got it back to
passing.

llvm-svn: 208608
2014-05-12 18:23:35 +00:00
Matt Arsenault
43171f4aad Make SimplifyDemandedBits understand BUILD_PAIR
llvm-svn: 208598
2014-05-12 17:14:48 +00:00
Saleem Abdulrasool
5f98d2b767 CodeGen: add parenthesis around complex expression
Add missing parenthesis suggested by GCC.  NFC.

llvm-svn: 208519
2014-05-12 06:08:18 +00:00
Hal Finkel
5b038e4cbc Pass the value type to TLI::getRegisterByName
We must validate the value type in TLI::getRegisterByName, because if we
don't and the wrong type was used with the IR intrinsic, then we'll assert
(because we won't be able to find a valid register class with which to
construct the requested copy operation). For PPC64, additionally, the type
information is necessary to decide between the 64-bit register and the 32-bit
subregister.

No functionality change.

llvm-svn: 208508
2014-05-11 19:29:07 +00:00
David Blaikie
a56eed1551 DebugInfo: Include lexical scopes in inlined subroutines.
llvm-svn: 208506
2014-05-11 18:12:17 +00:00
David Blaikie
c11c7cade1 DwarfUnit: Make explicit a limitation/bug in enumeration constant emission.
Filed as PR19712, LLVM fails to detect the right type of an enum
constant when a frontend does not provide an underlying type for the
enumeration type.

llvm-svn: 208502
2014-05-11 17:04:05 +00:00
David Blaikie
31bd82e5d9 DwarfUnit: Pick a winner between isTypeSigned and isUnsignedDIType.
And the winner by a nose is isUnsignedDIType, for no particular reason.

These two functions were just complements of each other and used in very
related code, so refactor callers to just use one of them.

llvm-svn: 208500
2014-05-11 16:08:41 +00:00
David Blaikie
5c0fff0a62 DwarfUnit: Factor out calling isUnsignedDIType into a utility function so each caller of emitConstantValue doesn't have to call it separately.
llvm-svn: 208496
2014-05-11 15:56:59 +00:00
David Blaikie
32bfd4a974 DwarfUnit: Share common constant value emission between APInts of small (<= 64 bit) and MCOperand immediates.
Doesn't seem a good reason to duplicate this code (it was more literally
duplicated prior to r208494, and while the dataN code /does/ actually
fire in this case, it doesn't seem necessary (and the DWARF standard
recommends using udata/sdata pervasively instead of dataN, so as to
indicate signedness of the values))

llvm-svn: 208495
2014-05-11 15:47:39 +00:00
David Blaikie
1f94b7156a DebugInfo: Simplify constant value emission.
This code looks to have become dead at some time in the past. I tried to
reproduce cases where LLVM would emit constants with dataN, but could
not. Upon inspection it seems the code doesn't do that anymore - the
only time a size is provided by isTypeSigned is when the type is signed,
and in those cases we use sdata. dataN is only used for unsigned types
and isTypeSigned doesn't provide a value for sizeInBits in that case.

Remove the dead cases/size plumbing.

llvm-svn: 208494
2014-05-11 15:06:20 +00:00
Oliver Stannard
2b1166b162 ARM: HFAs must be passed in consecutive registers
When using the ARM AAPCS, HFAs (Homogeneous Floating-point Aggregates) must
be passed in a block of consecutive floating-point registers, or on the stack.
This means that unused floating-point registers cannot be back-filled with
part of an HFA, however this can currently happen. This patch, along with the
corresponding clang patch (http://reviews.llvm.org/D3083) prevents this.

llvm-svn: 208413
2014-05-09 14:01:47 +00:00
Quentin Colombet
112f8aafc2 [TargetInstrInfo] Fix the implementation of commuteInstruction to match the
comment of the API.

Relaxes the behavior of TargetInstrInfo::commuteInstruction when
TargetInstrInfo::findCommutedOpIndices returns false.

Previously TargetInstrInfo triggered a fatal error in such situation whereas based
on the comment in the API it should just return nullptr. Indeed the only
precondition that should be ensured is that the instruction must be commutable.

llvm-svn: 208371
2014-05-08 23:12:27 +00:00
David Blaikie
f61e94a80f Reapply r207876 (Try simplifying LexicalScopes ownership again) including a workaround for an MSVC2012 bug regarding forward_as_tuple
(r207876 was reverted in r208131 after seeing some consistent buildbot
failure for MSVC 2012. The original commits were in r207724-r207726)

Takumi was nice enough to dig into this and locate this Microsoft
Connect issue:
http://connect.microsoft.com/VisualStudio/feedback/details/814899/forward-as-tuple-debug-implementation-error
describing a bug in MSVC2012's forward_as_tuple implementation.

Since the parameters in this instance are trivial/small, pass them by
value (using make_tuple) instead of perfectly-forwarded tuple of rvalue
references (involving the broken forward_as_tuple). Hopefully this will
satisfy MSVC2012.

llvm-svn: 208364
2014-05-08 22:24:51 +00:00
Hal Finkel
faaba5686b Fix a spelling error
llvm-svn: 208314
2014-05-08 13:42:57 +00:00
Hal Finkel
c52e65b830 Move late partial-unrolling thresholds into the processor definitions
The old method used by X86TTI to determine partial-unrolling thresholds was
messy (because it worked by testing target features), and also would not
correctly identify the target CPU if certain target features were disabled.
After some discussions on IRC with Chandler et al., it was decided that the
processor scheduling models were the right containers for this information
(because it is often tied to special uop dispatch-buffer sizes).

This does represent a small functionality change:
 - For generic x86-64 (which uses the SB model and, thus, will get some
   unrolling).
 - For AMD cores (because they still currently use the SB scheduling model)
 - For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump
   the default threshold to 50; we're working on a test case for this).
Otherwise, nothing has changed for any other targets. The logic, however, has
been moved into BasicTTI, so other targets may now also opt-in to this
functionality simply by setting LoopMicroOpBufferSize in their processor
model definitions.

llvm-svn: 208289
2014-05-08 09:14:44 +00:00
Matt Arsenault
903ece3700 Fix using wrong result type for setcc.
When reducing the bitwidth of a comparison against a constant, the
original setcc's result type was used, which was incorrect.

No test since I don't think any other in tree targets change the
bitwidth of the setcc type depending on the bitwidth of the compared
type.

llvm-svn: 208236
2014-05-07 18:26:58 +00:00
Rafael Espindola
765e5e78cf Remove the UseCFI option from createAsmStreamer.
We were already always passing true, this just removes the option.

llvm-svn: 208205
2014-05-07 13:00:43 +00:00
Zinovy Nis
ce225593e1 [BUG][REFACTOR]
1) Fix for printing debug locations for absolute paths.
2) Location printing is moved into public method DebugLoc::print() to avoid re-inventing the wheel.

Differential Revision: http://reviews.llvm.org/D3513

llvm-svn: 208177
2014-05-07 09:51:22 +00:00
David Blaikie
7acf842266 Revert "Try simplifying LexicalScopes ownership again."
Speculatively reverting due to a suspicious failure on a Windows
buildbot.

This reverts commit 10c37a012ea11596d44cd9059fe09c959caf30c8.

llvm-svn: 208131
2014-05-06 21:07:17 +00:00
Benjamin Kramer
593859517f TTI: Estimate @llvm.fmuladd cost as fmul + fadd when FMA's aren't legal on the target.
llvm-svn: 208115
2014-05-06 18:36:23 +00:00
Renato Golin
8a9a382ab2 Implememting named register intrinsics
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).

So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.

llvm-svn: 208104
2014-05-06 16:51:25 +00:00
David Blaikie
5f72874fd2 Try simplifying LexicalScopes ownership again.
Committed initially in r207724-r207726 and reverted due to compiler-rt
crashes in r207732.

Instead, fix this harder with unordered_map and store the LexicalScopes
by value in the map. This did necessitate moving the definition of
LexicalScope above the definition of LexicalScopes.

Let's see how the buildbots/compilers tolerate unordered_map::emplace +
std::piecewise_construct + std::forward_as_tuple...

llvm-svn: 207876
2014-05-02 22:21:05 +00:00
Benjamin Kramer
bc327c6bd3 Satisfy GCC's urgent need for parentheses around ‘&&’ within ‘||’.
llvm-svn: 207871
2014-05-02 21:28:49 +00:00
Tim Northover
4aa1f54c61 DAGCombine: prevent formation of illegal ConstantFP nodes.
llvm-svn: 207850
2014-05-02 17:25:02 +00:00
Benjamin Kramer
96dab04f0f Allow SelectionDAG::FoldConstantArithmetic to work when it's called with a vector VT but scalar values.
llvm-svn: 207835
2014-05-02 12:35:22 +00:00
Juergen Ributzka
b036be4ecc [Stackmaps] Pacify windows buildbot.
llvm-svn: 207807
2014-05-01 22:39:26 +00:00
Juergen Ributzka
6bed4e0fc7 [Stackmaps] Add command line option to specify the stackmap version.
llvm-svn: 207805
2014-05-01 22:21:30 +00:00
Juergen Ributzka
60208fb4c5 [Stackmaps] Refactor serialization code. No functional change intended.
llvm-svn: 207804
2014-05-01 22:21:27 +00:00
Juergen Ributzka
08694158e1 [Stackmaps] Replace the custom ConstantPool class with a MapVector.
llvm-svn: 207803
2014-05-01 22:21:24 +00:00