mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

19461 Commits

Author SHA1 Message Date
Hao Liu
4a62b32731 Fix PR15950 A bug in DAG Combiner about undef mask
llvm-svn: 181682
2013-05-13 02:07:05 +00:00
Rafael Espindola
777f36d415 XFAIL this test for mingw too.
llvm-svn: 181678
2013-05-13 00:18:24 +00:00
Nadav Rotem
434624fd18 SLPVectorizer: Fix a bug in the code that generates extracts for values with multiple users.
The external user does not have to be in lane #0. We have to save the lane for each scalar so that we know which vector lane to extract.

llvm-svn: 181674
2013-05-12 22:58:45 +00:00
David Majnemer
f01cfb5523 InstCombine: Flip the order of two urem transforms
There are two transforms in visitUrem that conflict with each other.

*) One, if a divisor is a power of two, subtracts one from the divisor
   and turns it into a bitwise-and.
*) The other unwraps both operands if they are surrounded by zext
   instructions.

Flipping the order allows the subtraction to go beneath the zero
extension.

llvm-svn: 181668
2013-05-12 00:07:05 +00:00
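The first transform described in r181668 above relies on a standard identity: for an unsigned value, the remainder by a power of two equals a bitwise-and with that power of two minus one. The sketch below illustrates only the arithmetic; the function names `is_power_of_two` and `urem_fold` are hypothetical and are not taken from InstCombine.

```c
#include <stdint.h>

/* Returns nonzero when d is a nonzero power of two. */
int is_power_of_two(uint32_t d) {
    return d != 0 && (d & (d - 1)) == 0;
}

/* Unsigned remainder that takes the mask form when the divisor is a
 * power of two. The divisor must be nonzero. */
uint32_t urem_fold(uint32_t x, uint32_t d) {
    if (is_power_of_two(d))
        return x & (d - 1);   /* x urem 2^k == x & (2^k - 1) */
    return x % d;             /* general case */
}
```

For example, `urem_fold(13, 8)` takes the mask path and computes `13 & 7`, which matches `13 % 8`.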
Arnold Schwaighofer
c532e0f494 LoopVectorize: Use the widest induction variable type
Use the widest induction type encountered for the canonical induction variable.

We used to turn the following loop into an empty loop because we used i8 as
induction variable type and truncated 1024 to 0 as trip count.

int a[1024];
void fail() {
  int reverse_induction = 1023;
  unsigned char forward_induction = 0;
  while ((reverse_induction) >= 0) {
    forward_induction++;
    a[reverse_induction] = forward_induction;
    --reverse_induction;
  }
}

radar://13862901

llvm-svn: 181667
2013-05-11 23:04:28 +00:00
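The failure mode described in r181667 above comes down to plain truncation: a trip count of 1024 does not fit in 8 bits, so it becomes 0 when viewed through an i8 induction variable. A minimal sketch of that arithmetic follows; `trip_count_in_width` is a hypothetical helper, not a function from the vectorizer.

```c
#include <stdint.h>

/* Trip count as seen through an induction variable that is `bits` wide.
 * Widths of 32 or more leave the 32-bit count unchanged. */
uint32_t trip_count_in_width(uint32_t trip_count, unsigned bits) {
    if (bits >= 32)
        return trip_count;
    return trip_count & ((UINT32_C(1) << bits) - 1);
}
```

With an 8-bit width the count 1024 wraps to 0, which is why the loop looked empty; with the widest induction type (32 bits here) it is preserved.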
David Majnemer
6f959e00ef InstCombine: Turn urem to bitwise-and more often
Use isKnownToBeAPowerOfTwo in visitUrem so that we may more aggressively
fold away urem instructions.

llvm-svn: 181661
2013-05-11 09:01:28 +00:00
Reed Kotler
1f27950895 Add -mtriple=mipsel-linux-gnu to the test so that the compiler does
not think it can support small data sections.

llvm-svn: 181654
2013-05-11 01:02:20 +00:00
Nadav Rotem
062be30d6a SLPVectorizer: Add support for trees with external users.
For example:
bar() {
  int a = A[i];
  int b = A[i+1];
  B[i] = a;
  B[i+1] = b;
  foo(a);  <--- a is used outside the vectorized expression.
}

llvm-svn: 181648
2013-05-10 22:59:33 +00:00
Nadav Rotem
b2b70d65a8 Add an additional testcase for PR15882.
llvm-svn: 181646
2013-05-10 22:55:44 +00:00
Reed Kotler
88fbecdc6f Check-in of the first of several patches to finish the implementation of
mips16/mips32 floating point interoperability.

This patch fixes returns from mips16 functions so that if the function
was in fact called by a mips32 hard float routine, then values
that would have been returned in floating point registers are so returned.

Mips16 mode has no floating point instructions so there is no way to
load values into floating point registers.

This is needed when returning float, double, single complex, double complex
in the Mips ABI.

Helper functions in libc for mips16 are available to do this.

For efficiency purposes, these helper functions have a different calling
convention from normal Mips calls.

Registers v0,v1,a0,a1 are used to pass parameters instead of
a0,a1,a2,a3.

This is because v0,v1,a0,a1 are the natural registers used to return
floating point values in soft float. These values can then be moved
to the appropriate floating point registers with no extra cost.

The only register that is modified is ra in this call.

The helper functions make sure that the return values are in the floating
point registers that they would be in if soft float was not in effect
(which it is for mips16, though the soft float is implemented using a mips32
library that uses hard float).
 

llvm-svn: 181641
2013-05-10 22:25:39 +00:00
David Blaikie
cda1715add Give the test from r181632 a target triple.
llvm-svn: 181637
2013-05-10 22:14:39 +00:00
David Blaikie
685380da68 PR14492: Debug Info: Support for values of non-integer non-type template parameters.
This is only tested for global variables at the moment (& includes tests
for the unnamed parameter case, since apparently this entire function
was completely untested previously)

llvm-svn: 181632
2013-05-10 21:52:07 +00:00
Jyotsna Verma
2dfc0b2d13 Hexagon: Fix switch cases in HexagonVLIWPacketizer.cpp.
llvm-svn: 181624
2013-05-10 20:27:34 +00:00
Chad Rosier
7da7292b4e [ms-inline asm] Fix a crasher when we fail on a direct match.
The issue was that the MatchingInlineAsm and VariantID args to the
MatchInstructionImpl function weren't being set properly.  Specifically, when
parsing intel syntax, the parser thought it was parsing inline assembly in the
at&t dialect; that will never be the case.  

The crash was caused when the emitter tried to emit the instruction, but the
operands weren't set.  When parsing inline assembly we only set the opcode, not
the operands, which is used to lookup the instruction descriptor.
rdar://13854391 and PR15945

Also, this commit reverts r176036.  Now that we're correctly parsing the intel
syntax the pushad/popad don't match properly.  I've reimplemented that fix using
a MnemonicAlias.

llvm-svn: 181620
2013-05-10 18:24:17 +00:00
Benjamin Kramer
c8a8544b79 InstCombine: Don't claim to be able to evaluate any shl in a zexted type.
The shift amount may be larger than the type leading to undefined behavior.
Limit the transform to constant shift amounts. While there update the bits to
clear in the result which may enable additional optimizations.

PR15959.

llvm-svn: 181604
2013-05-10 16:26:37 +00:00
Logan Chien
d5b8ea6c58 Implement AsmParser for ARM unwind directives.
This commit implements the AsmParser for fnstart, fnend,
cantunwind, personality, handlerdata, pad, setfp, save, and
vsave directives.

This commit fixes some minor issue in the ARMELFStreamer:

* The switch back to corresponding section after the .fnend
  directive.

* Emit the unwind opcode while processing .fnend directive
  if there is no .handlerdata directive.

* Emit the unwind opcode to .ARM.extab while processing
  .handlerdata even if .personality directive does not exist.

llvm-svn: 181603
2013-05-10 16:17:24 +00:00
Aaron Ballman
3a151dd415 XFAILing this test on Win32 to unbreak the build bots.
llvm-svn: 181600
2013-05-10 14:42:16 +00:00
Benjamin Kramer
9205d91a11 DAGCombiner: Generate a correct constant for vector types when folding (xor (and)) into (and (not)).
PR15948.

llvm-svn: 181597
2013-05-10 14:09:52 +00:00
Benjamin Kramer
bb162fb77a InstCombine: Verify the type before transforming uitofp into select.
PR15952.

llvm-svn: 181586
2013-05-10 09:16:52 +00:00
Tom Stellard
7edf38bf1f R600: Remove AMDILPeepholeOptimizer and replace optimizations with tablegen patterns
The BFE optimization was the only one we were actually using, and it was
emitting an intrinsic that we don't support.

https://bugs.freedesktop.org/show_bug.cgi?id=64201

Reviewed-by: Christian König <christian.koenig@amd.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181580
2013-05-10 02:09:45 +00:00
Tom Stellard
ed363c57b2 R600: Expand SUB for v2i32/v4i32
Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181579
2013-05-10 02:09:39 +00:00
Tom Stellard
3ca3d250c6 R600: Expand MUL for v4i32/v2i32
Fixes piglit test for OpenCL builtin mul24, and allows mad24 to run.

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181578
2013-05-10 02:09:34 +00:00
Tom Stellard
56fef8261c R600: Expand SRA for v4i32/v2i32
v2: Add v4i32 test

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181577
2013-05-10 02:09:29 +00:00
Tom Stellard
5d4a5a0d37 R600: Expand vselect for v4i32 and v2i32
v2: Add vselect v4i32 test

Patch by: Aaron Watry

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Aaron Watry <awatry@gmail.com>

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 181576
2013-05-10 02:09:24 +00:00
Chad Rosier
1c5336508d [x86AsmParser] It's valid to stop parsing an operand at an immediate.
rdar://13854369 and PR15944

llvm-svn: 181564
2013-05-09 23:48:53 +00:00
Owen Anderson
e34b034b7d Teach SelectionDAG to constant fold all-constant FMA nodes the same way that it constant folds FADD, FMUL, etc.
llvm-svn: 181555
2013-05-09 22:27:13 +00:00
Bill Wendling
59bb428cc9 Generate a compact unwind encoding in the face of a stack alignment push.
We generate a `push' of a random register (%rax) if the stack needs to be
aligned by the size of that register. However, this could mess up compact unwind
generation. In particular, we want to still generate compact unwind in the
presence of this monstrosity.

Check if the push is of the %rax/%eax register. If it is and it's marked with
the `FrameSetup' flag, then we can generate a compact unwind encoding for the
function only if the push is the last FrameSetup instruction.

llvm-svn: 181540
2013-05-09 20:10:38 +00:00
Jyotsna Verma
15d448ba0c Hexagon: Use relation map for getMatchingCondBranchOpcode() and
getInvertedPredicatedOpcode() functions instead of switch cases.

llvm-svn: 181530
2013-05-09 18:25:44 +00:00
Rafael Espindola
8e3074a3a4 Don't replace an alias in llvm.used with its target.
When we replace an internal alias with its target, be careful not to
replace the entry in llvm.used (and llvm.compiler_used).

llvm-svn: 181524
2013-05-09 17:22:59 +00:00
Richard Osborne
7504cb9f47 [XCore] Fix handling of functions where only the LR is spilled.
Previously we only checked if the LR required saving if the frame size was
non zero. However because the caller reserves 1 word for the callee to use
that doesn't count towards our frame size it is possible for the LR to need
saving and for the frame size to be 0.

We didn't hit this when the LR needed saving because of a function call, since
the 1 word of stack we must allocate for our callee means the frame size
is always non zero in this case. However we can hit this case if the LR is
clobbered in inline asm.

llvm-svn: 181520
2013-05-09 16:43:42 +00:00
Benjamin Kramer
54122b8028 InstCombine: Don't just copy known bits from the first operand of an srem.
That's obviously wrong. Conservatively restrict it to the sign bit, which
matches the original intention of this analysis. Fixes PR15940.

llvm-svn: 181518
2013-05-09 16:32:32 +00:00
Rafael Espindola
56fa3ce519 Change getRelocationAdditionalInfo to be ELF only.
It was only implemented for ELF where it collected the Addend, so this
patch also renames it to getRelocationAddend.

llvm-svn: 181502
2013-05-09 03:39:05 +00:00
Eric Christopher
b48ba83359 Revert "Make sure debug info contains linkage names (DW_AT_MIPS_linkage_name)"
temporarily while investigating gdb.cp/templates.exp.

This reverts commit r181471.

llvm-svn: 181496
2013-05-09 00:42:33 +00:00
Arnold Schwaighofer
374ad2d113 LoopVectorizer: Don't assert on the absence of induction variables
A computable loop exit count does not imply the presence of an induction
variable. Scalar evolution can return a value for an infinite loop.

Fixes PR15926.

llvm-svn: 181495
2013-05-09 00:32:18 +00:00
Daniel Malea
79735dd740 Revert 181475 as the DebugIR tests are breaking (automake) buildbots that re-use build dirs
- the temporary "-debug.ll" files generated by the DebugIR pass are considered tests, even though they are not

llvm-svn: 181476
2013-05-08 21:55:31 +00:00
Eric Christopher
dfe2e69b49 Make sure debug info contains linkage names (DW_AT_MIPS_linkage_name)
for constructors and destructors since the original declaration given
by the AT_specification both won't and can't.

Patch by Yacine Belkadi, I've cleaned up the testcases.

llvm-svn: 181471
2013-05-08 21:23:22 +00:00
Daniel Malea
b69ddddcde DebugIR tests -- lit tests for the line number transform
- simple one-function case
- function-calling case
- external function calling case
- exception throwing case
- vector case

Note: these tests are somewhat coupled to the current format of debug metadata.
llvm-svn: 181469
2013-05-08 21:03:00 +00:00
Akira Hatanaka
a1d814e7b8 [mips] Add instruction selection pattern for (seteq $LHS, 0).
llvm-svn: 181459
2013-05-08 19:38:04 +00:00
Ulrich Weigand
65402d4ff3 [PowerPC] Add ELF relocation tests
This patch extends test/MC/PowerPC/ppc64-fixups.s to not only check for
the correct fixup type in the --show-encoding output, but also runs the
generated object file through llvm-readobj -r and verifies that the
correct ELF relocation records were generated.

llvm-svn: 181453
2013-05-08 17:51:44 +00:00
Bill Schmidt
7f1a2b5212 Fix handling of anonymous aggregate parameters for powerpc*-apple-darwin8.
This fixes bug 15821 similarly to the powerpc64-linux fix for bug 14779.

Patch by David Fang.

llvm-svn: 181449
2013-05-08 17:22:33 +00:00
Michel Danzer
e95e9672c2 R600/SI: Add lit tests for llvm.SI.imageload and llvm.SI.resinfo intrinsics
Adapted from the llvm.SI.sample test.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 181425
2013-05-08 13:07:29 +00:00
Hal Finkel
1636de8210 PPCInstrInfo::optimizeCompareInstr should not optimize FP compares
The floating-point record forms on PPC don't set the condition register bits
based on a comparison with zero (like the integer record forms do), but rather
based on the exception status bits.

llvm-svn: 181423
2013-05-08 12:16:14 +00:00
Mihai Popa
6775df74f5 This patch fixes two tests marked as XFAIL among the ARM assembler tests.
The reference encoding is correct, but written in the wrong byte order (these are Thumb tests, while the reference is in ARM byte order).

llvm-svn: 181420
2013-05-08 09:41:12 +00:00
Nick Lewycky
a88ff03516 Fix a bug in codegenprep where it was losing track of values in OptimizeMemoryInst
by switching to a ValueMap. Patch by Andrea DiBiagio!

llvm-svn: 181397
2013-05-08 09:00:10 +00:00
David Majnemer
fdba035711 DAGCombiner: Simplify inverted bit tests
Fold (xor (and x, y), y) -> (and (not x), y)

This removes an opportunity for a constant to appear twice.

llvm-svn: 181395
2013-05-08 06:44:42 +00:00
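The fold in r181395 above, (xor (and x, y), y) -> (and (not x), y), can be checked bit by bit: where y is 0 both sides are 0, and where y is 1 both sides reduce to the inverse of x. The sketch below states both forms in C and checks them over all byte values; the function names are hypothetical.

```c
#include <stdint.h>

/* Original form: (x & y) ^ y. The value y appears twice. */
uint32_t xor_and_before(uint32_t x, uint32_t y) { return (x & y) ^ y; }

/* Folded form: ~x & y. The value y appears once. */
uint32_t xor_and_after(uint32_t x, uint32_t y) { return ~x & y; }

/* Returns 1 when both forms agree for every pair of 8-bit inputs. */
int xor_and_fold_holds_for_bytes(void) {
    for (uint32_t x = 0; x < 256; ++x)
        for (uint32_t y = 0; y < 256; ++y)
            if (xor_and_before(x, y) != xor_and_after(x, y))
                return 0;
    return 1;
}
```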
David Blaikie
60e6a4e680 Debug Info: Support DW_TAG_imported_declaration
This provides basic functionality for imported declarations. For
subprograms and types some amount of lazy construction is supported (so
the definition of a function can precede the using declaration), but it
still doesn't handle declared-but-not-defined functions (since we don't
generally emit function declarations).

Variable support is really rudimentary at the moment - simply looking up
the existing definition with no support for out of order (declaration,
imported_module, then definition).

llvm-svn: 181392
2013-05-08 06:01:41 +00:00
Arnold Schwaighofer
b00b8ac69a LoopVectorizer: Improve reduction variable identification
The two nested loops were confusing and also conservative in identifying
reduction variables. This patch replaces them by a worklist based approach.

llvm-svn: 181369
2013-05-07 21:55:37 +00:00
Kevin Enderby
4562d15159 Fix a bug in the MC asm parser evaluating expressions. It was treating:
A = 9
B = 3 * A - 2 * A + 1 as  B = 3 * A - (2 * A + 1)

rdar://13816516

llvm-svn: 181366
2013-05-07 21:40:58 +00:00
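To make the difference in r181366 above concrete: with A = 9, the left-associative reading gives 27 - 18 + 1, while the buggy grouping gives 27 - (18 + 1). The sketch below simply evaluates both groupings in C; the function names are hypothetical and the code is not the MC expression parser.

```c
/* Correct, left-associative reading: (3 * a - 2 * a) + 1. */
long eval_correct(long a) { return 3 * a - 2 * a + 1; }

/* Grouping the parser used before the fix: 3 * a - (2 * a + 1). */
long eval_misparsed(long a) { return 3 * a - (2 * a + 1); }
```

For A = 9 the two groupings differ by 2, so any symbol defined this way would have received the wrong value.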
Jyotsna Verma
37863260ff Hexagon: Fix Small Data support to handle -G 0 correctly.
llvm-svn: 181344
2013-05-07 19:53:00 +00:00
David Blaikie
22d24b013e Debug Info: Fix for break due to r181271
Apparently we didn't keep an association of Compile Unit metadata nodes
to DIEs so looking up that parental context failed & thus caused no
DW_TAG_imported_modules to be emitted at the CU scope. Fix this by
adding the mapping & shore up the test case to verify this.

llvm-svn: 181339
2013-05-07 17:57:13 +00:00
Jyotsna Verma
5307666fe8 Reverting r181331.
Missing file, HexagonSplitConst32AndConst64.cpp, from lib/Target/Hexagon/CMakeLists.txt.

llvm-svn: 181334
2013-05-07 17:12:35 +00:00
Jyotsna Verma
af0c734e1b Hexagon: Fix Small Data support to handle -G 0 correctly.
llvm-svn: 181331
2013-05-07 16:42:15 +00:00
Arnold Schwaighofer
f95f087afb LoopVectorize: getConsecutiveVector must respect signed arithmetic
We were passing an i32 to ConstantInt::get where an i64 was needed and we must
also pass the sign if we pass negative numbers. The start index passed to
getConsecutiveVector must also be signed.

Should fix PR15882.

llvm-svn: 181286
2013-05-07 04:37:05 +00:00
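The signedness issue in r181286 above can be illustrated with plain integer widening: a negative 32-bit value widened to 64 bits without sign extension becomes a large positive number. This is only a sketch of the underlying arithmetic, with hypothetical function names; it does not use or model the ConstantInt API.

```c
#include <stdint.h>

/* Widen to 64 bits, treating the 32-bit pattern as unsigned. */
uint64_t widen_zero_extended(int32_t v) { return (uint64_t)(uint32_t)v; }

/* Widen to 64 bits, preserving the sign. */
uint64_t widen_sign_extended(int32_t v) { return (uint64_t)(int64_t)v; }
```

The two agree for non-negative inputs and diverge for negative ones, which is the case a reverse (negative step) index hits.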
David Blaikie
b40dfb39c7 DebugInfo: Support imported modules in lexical blocks
llvm-svn: 181271
2013-05-06 23:33:07 +00:00
David Majnemer
68574fa9e6 InstCombine: (X ^ signbit) + C -> X + (signbit ^ C)
llvm-svn: 181249
2013-05-06 21:21:31 +00:00
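The rewrite in r181249 above works because flipping the sign bit is the same as adding the sign bit modulo 2^n, so the constant can absorb it. A minimal 32-bit sketch, with hypothetical names and unsigned wraparound standing in for the IR's modular add:

```c
#include <stdint.h>

#define SIGNBIT UINT32_C(0x80000000)

/* Original form: (X ^ signbit) + C. */
uint32_t signbit_add_before(uint32_t x, uint32_t c) { return (x ^ SIGNBIT) + c; }

/* Folded form: X + (signbit ^ C); with constant C the xor folds at compile time. */
uint32_t signbit_add_after(uint32_t x, uint32_t c) { return x + (SIGNBIT ^ c); }
```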
Bill Wendling
0c1e625af5 Reduce attributes.
llvm-svn: 181245
2013-05-06 20:57:23 +00:00
Rafael Espindola
d896db3c7d Split Alignment out of the Section Characteristics.
The alignment is just a byte in the middle of Characteristics, not an
independent flag. Making it an independent field in the yaml
representation makes it more yamlio friendly.

llvm-svn: 181243
2013-05-06 20:11:21 +00:00
Jean-Luc Duprat
bf543366fd Test results verified using FileCheck rather than grep | count
llvm-svn: 181234
2013-05-06 18:45:16 +00:00
Andrew Trick
5d13ab6ea6 Rotate multi-exit loops even if the latch was simplified.
Test case by Michele Scandale!

Fixes PR10293: Load not hoisted out of loop with multiple exits.

There are a few regressions with this patch, now tracked by
rdar:13817079, and a roughly equal number of improvements. The
regressions are almost certainly bad luck because LoopRotate has very
little idea of whether rotation is profitable. Doing better requires a
more comprehensive solution.

This checkin is a quick fix that lacks generality (PR10293 has
a counter-example). But it trivially fixes the case in PR10293 without
interfering with other cases, and it does satisfy the criteria that
LoopRotate is a loop canonicalization pass that should avoid
heuristics and special cases.

I can think of two approaches that would probably be better in
the long run. Ultimately they may both make sense.

(1) LoopRotate should check that the current header would make a good
loop guard, and that the loop does not already have a sufficient
guard. The artificial SimplifiedLoopLatch check would be unnecessary,
and the design would be more general and canonical. Two difficulties:

- We need a strong guarantee that we won't endlessly rotate, so the
  analysis would need to be precise in order to avoid the
  SimplifiedLoopLatch precondition.

- Analyses like this are usually based on SCEV, which we don't want to
  rely on.

(2) Rotate on-demand in late loop passes. This could even be done by
shoving the loop back on the queue after the optimization that needs
it. This could work well when we find LICM opportunities in
multi-branch loops. This requires some work, and it doesn't really
solve the problem of SCEV wanting a loop guard before the analysis.

llvm-svn: 181230
2013-05-06 17:58:18 +00:00
Tom Stellard
fb8e73f3af R600: Emit config values in register / value pairs
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181228
2013-05-06 17:50:51 +00:00
Tom Stellard
6c3f6e1b02 R600: Stop emitting the instruction type byte before each instruction
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181225
2013-05-06 17:50:44 +00:00
Tom Stellard
ebe049fd75 R600: Emit ISA for CALL_FS_* instructions
Reviewed-by: Vincent Lejeune <vljn@ovi.com>
Tested-By: Aaron Watry <awatry@gmail.com>
llvm-svn: 181223
2013-05-06 17:50:26 +00:00
Ulrich Weigand
2d456969d5 [SystemZ] Update non-pic DWARF encodings
As pointed out by Rafael Espindola, we should match the DWARF encodings
produced by GCC in both pic and non-pic modes.  This was not the case
for the non-pic case.

This patch changes all DWARF encodings to DW_EH_PE_absptr for the
non-pic case, just like GCC does.  The test case is updated to check
for both variants.

llvm-svn: 181222
2013-05-06 17:28:30 +00:00
Adhemerval Zanella
3b2874423e PowerPC: Fix unimplemented relocation on ppc64
This patch handles the R_PPC64_REL64 relocation type for powerpc64
for mcjit.

llvm-svn: 181220
2013-05-06 17:21:23 +00:00
Jean-Luc Duprat
8911ab2fa9 Fix add4.ll test cmdline so that it passes
llvm-svn: 181219
2013-05-06 17:18:47 +00:00
Jean-Luc Duprat
5607a72e21 Provide InstCombines for the following 3 cases:
A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A

These come up in code that has been hand-optimized from a select to a linear blend, 
on platforms where that may have mattered. We want to undo such changes 
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, B, A

llvm-svn: 181216
2013-05-06 16:55:50 +00:00
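The blend pattern from r181216 above can be written out in C to see what it selects: when the condition is true the factor on A becomes 0 and the factor on B becomes 1, so the result is B; when it is false the result is A. This is a sketch with hypothetical function names. It assumes finite inputs, since with an infinity or NaN operand the multiply by zero yields NaN and the two forms no longer agree.

```c
/* Hand-optimized linear blend; cond is converted to 0.0f or 1.0f,
 * as uitofp on an i1 would do. */
float blend(int cond, float a, float b) {
    float c = (float)(cond != 0);
    return a * (1.0f - c) + b * c;
}

/* Equivalent select for finite a and b. */
float select_form(int cond, float a, float b) {
    return cond ? b : a;
}
```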
Tim Northover
eb518c7918 AArch64: use MCJIT by default and enable related tests.
This just enables some testing I'd missed after implementing MCJIT
support.

llvm-svn: 181215
2013-05-06 16:51:08 +00:00
Ulrich Weigand
dbdc6e0306 [SystemZ] Set up JIT/MCJIT test cases
This patch adds the necessary configuration bits and #ifdef's to set up
the JIT/MCJIT test cases for SystemZ.  Like other recent targets, we do
fully support MCJIT, but do not support the old JIT at all.  Set up the
lit config files accordingly, and disable old-JIT unit tests.

Patch by Richard Sandiford.

llvm-svn: 181207
2013-05-06 16:21:50 +00:00
Ulrich Weigand
45233a4e05 [SystemZ] Add MC test cases
This adds all MC tests for the SystemZ target.

Patch by Richard Sandiford.

llvm-svn: 181206
2013-05-06 16:20:58 +00:00
Ulrich Weigand
a63b3fe2f6 [SystemZ] Add DebugInfo test cases
This adds all DebugInfo tests for the SystemZ target.

This version of the patch incorporates feedback from reviews by
Eric Christopher and Rafael Espindola.  Thanks to all reviewers!

Patch by Richard Sandiford.

llvm-svn: 181205
2013-05-06 16:18:29 +00:00
Ulrich Weigand
1431b3c2f5 [SystemZ] Add CodeGen test cases
This adds all CodeGen tests for the SystemZ target.

This version of the patch incorporates feedback from a review by
Sean Silva.  Thanks to all reviewers!

Patch by Richard Sandiford.

llvm-svn: 181204
2013-05-06 16:17:29 +00:00
Rafael Espindola
b038125da3 Free the exception object. Should fix the vg bots.
llvm-svn: 181195
2013-05-06 13:30:52 +00:00
Michael Kuperstein
74685eff73 Fix slightly too aggressive concat_vector optimization.
(Would sometimes optimize away concats used to extend a vector with undef values)

llvm-svn: 181186
2013-05-06 08:06:13 +00:00
Bill Wendling
454b468436 Add a testcase that checks that we generate functions with frame
pointers or not depending upon the function attributes.

llvm-svn: 181180
2013-05-06 05:45:57 +00:00
Rafael Espindola
723fad6f83 XFAIL for cygwin.
Looks like symbol resolution is not working on cygwin, the test fails
because __gxx_personality_v0 is not found.

llvm-svn: 181179
2013-05-06 03:35:56 +00:00
Nadav Rotem
8564ccca8b Revert r164763 because it introduces new shuffles.
Thanks Nick Lewycky for pointing this out.

llvm-svn: 181177
2013-05-06 02:39:09 +00:00
Matt Arsenault
34e1805cd0 Fix unchecked uses of DominatorTree in MemoryDependenceAnalysis.
Use unknown results for places where it would be needed

llvm-svn: 181176
2013-05-06 02:07:24 +00:00
Rafael Espindola
15a39ed0e8 Fix const merging when an alias of a const is llvm.used.
We used to disable constant merging not only if a constant is llvm.used, but
also if an alias of a constant is llvm.used. This change fixes that.

llvm-svn: 181175
2013-05-06 01:48:55 +00:00
Rafael Espindola
f67dd4ce8a This should also fail on ARM.
We currently have no way to register new eh frames on ARM.

llvm-svn: 181172
2013-05-05 22:42:34 +00:00
Rafael Espindola
ef39d9678b Fix XFAIL line.
llvm-svn: 181171
2013-05-05 21:30:10 +00:00
Rafael Espindola
6a1e273872 XFAIL this on ppc64.
It looks like eh uses an unimplemented relocation on ppc64.

llvm-svn: 181169
2013-05-05 21:04:18 +00:00
Rafael Espindola
297c5a1e10 Add EH support to the MCJIT.
This gets exception handling working on ELF and MachO (x86-64 at least).
Other than the EH frame registration, this patch also implements support
for GOT relocations which are used to locate the personality function on
MachO.

llvm-svn: 181167
2013-05-05 20:43:10 +00:00
Evan Cheng
2fcdb53946 Test case for r181160 and r181161. rdar://13782395
llvm-svn: 181162
2013-05-05 18:07:15 +00:00
Richard Osborne
816f899c45 [XCore] Add LDAPB instructions.
With this change the disassembler now supports the XCore ISA in its
entirety.

llvm-svn: 181155
2013-05-05 13:36:53 +00:00
Richard Osborne
6600501755 [XCore] Add BLRB instructions.
llvm-svn: 181152
2013-05-05 13:24:16 +00:00
Stepan Dyatkovskiy
c06cd03f6e For ARM backend, fixed "byval" attribute support.
Now even the small structures could be passed within byval (small enough
to be stored in GPRs).
In regression tests next function prototypes are checked:

PR15293:
  %artz = type { i32 }
  define void @foo(%artz* byval %s)
  define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2)
foo: "s" stored in R0
foo2: "s" stored in R0, "s2" stored in R2.

Next AAPCS rules are checked:
5.5 Parameters Passing, C.4 and C.5,
"ParamSize" is parameter size in 32bit words:
-- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4.
   Parameter should be sent to the stack; NCRN := R4.
-- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4.
   Parameter stored in GPRs; NCRN += ParamSize.

llvm-svn: 181148
2013-05-05 07:48:36 +00:00
David Majnemer
d5ba4da281 Remove a recently redundant transform from X86ISelLowering.
X86ISelLowering has support to treat:
(icmp ne (and (xor %flags, -1), (shl 1, flag)), 0)

as if it were actually:
(icmp eq (and %flags, (shl 1, flag)), 0)

However, r179386 has code at the InstCombine level to handle this.

llvm-svn: 181145
2013-05-05 02:00:10 +00:00
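The two forms quoted in r181145 above both ask whether a given flag bit is clear: inverting the flags and testing for a nonzero bit is the same as testing the original flags for a zero bit. A small C sketch of the equivalence, with hypothetical names and `bit` assumed to be in the range 0..31:

```c
#include <stdint.h>

/* (icmp ne (and (xor flags, -1), (shl 1, bit)), 0) */
int bit_clear_via_not(uint32_t flags, unsigned bit) {
    return ((flags ^ UINT32_C(0xFFFFFFFF)) & (UINT32_C(1) << bit)) != 0;
}

/* (icmp eq (and flags, (shl 1, bit)), 0) */
int bit_clear_direct(uint32_t flags, unsigned bit) {
    return (flags & (UINT32_C(1) << bit)) == 0;
}
```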
Arnold Schwaighofer
0a0d84c7b7 LoopVectorize: Add support for floating point min/max reductions
Add support for min/max reductions when "no-nans-float-math" is enabled. This
allows us to assume we have ordered floating point math and treat ordered and
unordered predicates equally.

radar://13723044

llvm-svn: 181144
2013-05-05 01:54:48 +00:00
Arnold Schwaighofer
0158aa58f2 LoopVectorize: We don't need an identity element for min/max reductions
We can just use the initial element that feeds the reduction.

  max(max(x, y), z) == max(max(x,y), max(x,z))

radar://13723044

llvm-svn: 181141
2013-05-05 01:54:42 +00:00
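The identity quoted in r181141 above says that folding the initial element x into more than one partial result does not change a max reduction, because max is idempotent. The sketch below checks this on integers for simplicity; the function names are hypothetical, and the vectorizer's per-lane seeding is only paraphrased here, not reproduced.

```c
/* Larger of two ints. */
int max_int(int a, int b) { return a > b ? a : b; }

/* Sequential reduction: max(max(x, y), z). */
int reduce_nested(int x, int y, int z) {
    return max_int(max_int(x, y), z);
}

/* Reduction where both partial results start from x:
 * max(max(x, y), max(x, z)). */
int reduce_seeded(int x, int y, int z) {
    return max_int(max_int(x, y), max_int(x, z));
}
```

A sum reduction would not tolerate this, since x would be counted twice; that is why sums need an identity element (0) and min/max reductions do not.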
Tim Northover
390335941f AArch64: enable MCJIT and tests now that everything passes.
This removes dire warnings about AArch64 being unsupported and enables
the tests when appropriate on this platform.

llvm-svn: 181135
2013-05-04 20:14:22 +00:00
Tim Northover
d4f2cac7b6 AArch64: support literal pool access in large memory model.
llvm-svn: 181120
2013-05-04 16:54:07 +00:00
Tim Northover
4ef2500d01 AArch64: support large code model for jump-tables
llvm-svn: 181119
2013-05-04 16:54:00 +00:00
Tim Northover
ece66eacb2 AArch64: implement support for blockaddress in large code model
llvm-svn: 181118
2013-05-04 16:53:53 +00:00
Tim Northover
87645e02c0 AArch64: implement large code model access to global variables.
The MOVZ/MOVK instruction sequence may not be the most efficient (a
literal-pool load could be better) but adding that would require
reinstating the ConstantIslands pass.

For now the sequence is correct, and that's enough. Beware: as of
this commit, GNU ld does not appear to support the relocations needed for
this. Its primary purpose (for now) will be to support JITed code,
since in that case there is no guarantee of where your code will end
up in memory relative to external symbols it references.

llvm-svn: 181117
2013-05-04 16:53:46 +00:00
Tim Northover
f5d9c9ba13 Allow host triple to be correctly overridden in CMake builds
The intended semantics mirror autoconf, where the user is able to
specify a host triple, but if it's left to the build system then
"config.guess" is invoked for the default.

This also renames the LLVM_HOSTTRIPLE define to LLVM_HOST_TRIPLE to
fit in with the style of the surrounding defines.

llvm-svn: 181112
2013-05-04 07:36:23 +00:00
Amara Emerson
036eb4649d Revert r181009.
llvm-svn: 181079
2013-05-03 23:57:17 +00:00
Reed Kotler
b89d9a0181 Remove some unneeded pseudos in the presence of the naked function attribute.
llvm-svn: 181072
2013-05-03 23:17:24 +00:00
Amara Emerson
a0c67cd288 Delete test instead.
llvm-svn: 181066
2013-05-03 22:39:03 +00:00
Amara Emerson
63cfb8d51f Temporarily disable failing test.
llvm-svn: 181062
2013-05-03 22:27:48 +00:00
Ulrich Weigand
86c774797b [PowerPC] Parse platform-specific variant kinds in AsmParser
This patch adds support for PowerPC platform-specific variant
kinds in MCSymbolRefExpr::getVariantKindForName, and also
adds a test case to verify they are translated to the appropriate
fixup type.

llvm-svn: 181053
2013-05-03 19:52:35 +00:00
Ulrich Weigand
c7ad3c20c4 [PowerPC] Add some Book II instructions to AsmParser
This patch adds a couple of Book II instructions (isync, icbi) to the
PowerPC assembler parser.  These are needed when bootstrapping clang
with the integrated assembler forced on, because they are used in
inline asm statements in the code base.

The test case adds the full list of Book II storage control instructions,
including associated extended mnemonics.  Again, those that are not yet
supported are marked as FIXME.

llvm-svn: 181052
2013-05-03 19:51:09 +00:00
Ulrich Weigand
4b44c2d06f [PowerPC] Support extended mnemonics in AsmParser
This patch adds infrastructure to support extended mnemonics in the
PowerPC assembler parser.  It adds support specifically for those
extended mnemonics that LLVM will itself generate.

The test case lists *all* extended mnemonics according to the
PowerPC ISA v2.06 Book I, but marks those not yet supported
as FIXME.

llvm-svn: 181051
2013-05-03 19:50:27 +00:00
Ulrich Weigand
d9b4cff835 [PowerPC] Add assembler parser
This adds assembler parser support to the PowerPC back end.

The parser will run for any powerpc-*-* and powerpc64-*-* triples,
but was tested only on 64-bit Linux.  The supported syntax is
intended to be compatible with the GNU assembler.

The parser does not yet support all PowerPC instructions, but
it does support anything that is generated by LLVM itself.
There is no support for testing restricted instruction sets yet,
i.e. the parser will always accept any instructions it knows,
no matter what feature flags are given.

Instruction operands will be checked for validity and errors
generated.  (Error handling in general could still be improved.)

The patch adds a number of test cases to verify instruction
and operand encodings.  The tests currently cover all instructions
from the following PowerPC ISA v2.06 Book I facilities:
Branch, Fixed-point, Floating-Point, and Vector. 
Note that a number of these instructions are not yet supported
by the back end; they are marked with FIXME.

A number of follow-on check-ins will add extra features.  When
they are all included, LLVM passes all tests (including bootstrap)
when using clang -cc1as as the system assembler.

llvm-svn: 181050
2013-05-03 19:49:39 +00:00
Akira Hatanaka
5f295bccfc [mips] Split the DSP control register and define one register for each of
its fields.

This removes false dependencies between DSP instructions which access different
fields of the control register. Implicit register operands are added to
instructions RDDSP and WRDSP after instruction selection, depending on the
value of the mask operand.

llvm-svn: 181041
2013-05-03 18:37:49 +00:00
Nadav Rotem
97fe0281b4 LoopVectorizer: Add support for if-conversion of PHINodes with 3+ incoming values.
By supporting the vectorization of PHINodes with more than two incoming values we can increase the complexity of nested if statements.

We can now vectorize this loop:

int foo(int *A, int *B, int n) {
  for (int i=0; i < n; i++) {
    int x = 9;
    if (A[i] > B[i]) {
      if (A[i] > 19) {
        x = 3;
      } else if (B[i] < 4 ) {
        x = 4;
      } else {
        x = 5;
      }
    }
    A[i] = x;
  }
}
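
As a rough illustration (not taken from the patch), the nested conditionals in the loop body above can be flattened into a chain of selects, which is the shape if-conversion aims for before the loop is widened. The helper name select_x is hypothetical.

```c
/* Hypothetical scalar helper: the value stored to A[i] in the loop above,
 * written as nested selects instead of branches. */
int select_x(int a, int b) {
    /* Inner three-way choice; only used when a > b. */
    int inner = (a > 19) ? 3 : ((b < 4) ? 4 : 5);
    /* Outer condition picks between the default (9) and the inner result. */
    return (a > b) ? inner : 9;
}
```

Each select corresponds to one merge point in the original control flow, so a merge with three incoming values shows up as two chained selects.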

llvm-svn: 181037
2013-05-03 17:42:55 +00:00
Tom Stellard
2165728987 R600: Expand vector or, shl, srl, and xor nodes
llvm-svn: 181035
2013-05-03 17:21:31 +00:00
Tom Stellard
f2fd0109a0 R600: Add pattern for SHA-256 Ma function
This can be optimized using the BFI_INT instruction.

llvm-svn: 181033
2013-05-03 17:21:20 +00:00
Tobias Grosser
fb2da25967 RegionInfo: Do not crash if unreachable block is found
llvm-svn: 181025
2013-05-03 15:48:34 +00:00
Amara Emerson
863672f436 Add support for reading ARM ELF build attributes.
Build attribute sections can now be read if they exist via ELFObjectFile, and
the llvm-readobj tool has been extended with an option to dump this information
if requested. Regression tests are also included which exercise these features.

Also update the docs with a fixed ARM ABI link and a new link to the Addenda
which provides the build attributes specification.

llvm-svn: 181009
2013-05-03 11:36:35 +00:00
Akira Hatanaka
ab6ee99fe0 [mips] Handle reading, writing or copying of ccond field of DSP control
register.

- Define pseudo instructions which store or load ccond field of the DSP
  control register.
- Emit the pseudos in MipsSEInstrInfo::storeRegToStack and loadRegFromStack.
- Expand the pseudos before the callee-saved scan.
- Emit instructions RDDSP or WRDSP to copy between ccond field and GPRs. 

llvm-svn: 180969
2013-05-02 23:07:05 +00:00
Vincent Lejeune
5d7b2a4aea R600: Signed literals are 64bits wide
llvm-svn: 180960
2013-05-02 21:53:03 +00:00
Vincent Lejeune
3ff31b75b3 R600: If previous bundle is dot4, PV valid chan is always X
llvm-svn: 180959
2013-05-02 21:52:55 +00:00
Vincent Lejeune
97fb65a788 R600: Add a test to check that use_kill is emitted
llvm-svn: 180958
2013-05-02 21:52:46 +00:00
Vincent Lejeune
62da1453e1 R600: Prettier asmPrint of Alu
llvm-svn: 180956
2013-05-02 21:52:30 +00:00
Pranav Bhandarkar
520d26e773 Hexagon - Add peephole optimizations for zero extends.
* lib/Target/Hexagon/HexagonInstrInfo.td: Add patterns to combine a
	sequence of a pair of i32->i64 extensions followed by a "bitwise or"
	into COMBINE_rr.
	* lib/Target/Hexagon/HexagonPeephole.cpp: Copy propagate Rx in the
	instruction Rp = COMBINE_Ir_V4(0, Rx) to the uses of Rp:subreg_loreg.
	* test/CodeGen/Hexagon/union-1.ll: New test.
	* test/CodeGen/Hexagon/combine_ir.ll: Fix test.

llvm-svn: 180946
2013-05-02 20:22:51 +00:00
Manman Ren
0e2c14381d TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.

llvm-svn: 180935
2013-05-02 18:11:35 +00:00
Michael Liao
ca62375d96 Rewrite X86 codegen regression test with FileCheck
llvm-svn: 180910
2013-05-02 06:20:42 +00:00
David Majnemer
7f5357e6c4 Add a test for the foldSelectICmpAndOr fix committed in r180779.
This tests a case where C1 and C2 were the same but X and Y were different
widths.

llvm-svn: 180907
2013-05-02 02:44:23 +00:00
Michael Liao
ec28235c2a Avoid generating tempfile(s) never used
As DejaGNU is deprecated, it seems the pipe-jam issue no longer exists.

llvm-svn: 180892
2013-05-01 22:46:50 +00:00
Bill Wendling
218b457a2f Revert r180737. The companion patch was reverted, and this is not relevant right now.
llvm-svn: 180889
2013-05-01 22:32:08 +00:00
Nadav Rotem
c0309431a1 SROA: Generate selects instead of shuffles when blending values because this is the canonical form.
Shuffles are more difficult to lower and we usually don't touch them, while we do optimize selects more often.

llvm-svn: 180875
2013-05-01 19:53:30 +00:00
Nadav Rotem
d62966b79d Optimize away nop CONCAT_VECTOR nodes.
Optimize CONCAT_VECTOR nodes that merge EXTRACT_SUBVECTOR values that extract from the same vector.

rdar://13402653
PR15866

llvm-svn: 180871
2013-05-01 19:18:51 +00:00
Rafael Espindola
056abc9e26 Put VMOVPQIto64rr in the VRPDI class.
Patch by Joshua Magee.

llvm-svn: 180842
2013-05-01 13:00:16 +00:00
Michael Liao
cff6c527b8 Forgot to remove the tempfile argument
llvm-svn: 180838
2013-05-01 05:45:57 +00:00
Michael Liao
349e772e4f More rewrites of x86 codegen regression tests with FileCheck
llvm-svn: 180837
2013-05-01 05:34:30 +00:00
Jim Grosbach
5002c3f17d Revert "InstCombine: Fold more shuffles of shuffles."
This reverts commit r180802

There's ongoing discussion about whether this is the right place to make
this transformation. Reverting for now while we figure it out.

llvm-svn: 180834
2013-05-01 00:25:27 +00:00
Akira Hatanaka
f5c940dea8 [mips] Fix handling of instructions which copy to/from accumulator registers.
Expand copy instructions between two accumulator registers before callee-saved
scan is done. Handle copies between integer GPR and hi/lo registers in
MipsSEInstrInfo::copyPhysReg. Delete pseudo-copy instructions that are not
needed.

llvm-svn: 180827
2013-04-30 23:22:09 +00:00
Stephen Lin
84b2d4dbd4 Only pass 'returned' to target-specific lowering code when the value of entire register is guaranteed to be preserved.
llvm-svn: 180825
2013-04-30 22:49:28 +00:00
Akira Hatanaka
0bca7f3584 [mips] Instruction selection patterns for DSP-ASE vector select and compare
instructions.

llvm-svn: 180820
2013-04-30 22:37:26 +00:00
Adrian Prantl
7482401c1d Temporarily revert "Change the informal convention of DBG_VALUE so that we can express a"
because it breaks some buildbots.

This reverts commit 180816.

llvm-svn: 180819
2013-04-30 22:35:14 +00:00
Adrian Prantl
baf0a98faa Change the informal convention of DBG_VALUE so that we can express a
register-indirect address with an offset of 0.
It used to be that a DBG_VALUE is a register-indirect value if the offset
(operand 1) is nonzero. The new convention is that a DBG_VALUE is
register-indirect if the first operand is a register and the second
operand is an immediate. For plain registers use the combination reg, reg.
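
A minimal sketch of the two conventions, using a simplified operand model that is an assumption made for illustration (the types and function names below are hypothetical and are not LLVM's MachineOperand API):

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a DBG_VALUE's first two operands. */
typedef enum { OPERAND_REG, OPERAND_IMM } OperandKind;

typedef struct {
    OperandKind kind;
    long value; /* register number or immediate, depending on kind */
} Operand;

/* Old convention: register-indirect only when the offset (operand 1)
 * is nonzero, so an offset of 0 could not be expressed. */
bool is_indirect_old(Operand op0, Operand op1) {
    return op0.kind == OPERAND_REG && op1.value != 0;
}

/* New convention: register-indirect when operand 0 is a register and
 * operand 1 is an immediate, whatever the immediate's value. */
bool is_indirect_new(Operand op0, Operand op1) {
    return op0.kind == OPERAND_REG && op1.kind == OPERAND_IMM;
}
```

Under this model, the pair (register, immediate 0) is classified as a plain register by the old rule and as register-indirect by the new one.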

rdar://problem/13658587

llvm-svn: 180816
2013-04-30 22:16:46 +00:00
Akira Hatanaka
326a351350 [mips] Test for r179873.
Patch by Zoran Jovanovic.

llvm-svn: 180804
2013-04-30 20:48:49 +00:00
Jim Grosbach
940f9dc094 InstCombine: Fold more shuffles of shuffles.
Always fold a shuffle-of-shuffle into a single shuffle when there's only one
input vector in the first place. Continue to be more conservative when there's
multiple inputs.

rdar://13402653
PR15866

llvm-svn: 180802
2013-04-30 20:43:52 +00:00
Hal Finkel
2ac40ae0d3 LocalStackSlotAllocation improvements
First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created.

Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway.

Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and a r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll

Jim has okayed this off-list.

llvm-svn: 180799
2013-04-30 20:04:37 +00:00
Manman Ren
0b37dd0efc TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.

llvm-svn: 180796
2013-04-30 17:52:57 +00:00
Adrian Prantl
001a9c20ce Set debug locations for branch instructions created during inlining, even
if the inlined function has multiple returns.

rdar://problem/12415623

llvm-svn: 180793
2013-04-30 17:08:16 +00:00
Rafael Espindola
58fcd9bdde Fix Addend computation for non external relocations on Macho.
llvm-svn: 180790
2013-04-30 15:40:54 +00:00
Vincent Lejeune
25352bd54f R600: fix loop-address.ll test
Texture cache is now used when shader type is not specified

llvm-svn: 180785
2013-04-30 12:47:56 +00:00
Mihai Popa
e489a3a040 This tightens up the encoding description for ARM post-indexed ldr instructions. All instructions in this class have bit 4 cleared. It turns out that there is a test case for this, but it was marked XFAIL.
llvm-svn: 180778
2013-04-30 09:00:12 +00:00
David Majnemer
4b346f9a6e Fix "Combine bit test + conditional or into simple math"
This fixes the optimization introduced in r179748 and reverted in r179750.

While the optimization was sound, it did not properly respect differences in
bit-width.

llvm-svn: 180777
2013-04-30 08:57:58 +00:00
Michael Liao
c67e0fc9ea Rewrite X86 codegen regression test with FileCheck
llvm-svn: 180776
2013-04-30 07:51:08 +00:00
Rafael Espindola
c3bc22082f Collect the Addend for external relocs.
This fixes 2013-04-04-RelocAddend.ll. We don't have a testcase for non external
relocs with an Addend. I will try to write one.

llvm-svn: 180767
2013-04-30 01:29:57 +00:00
Vincent Lejeune
29f24e0ce8 R600: use native for alu
llvm-svn: 180761
2013-04-30 00:14:38 +00:00
Vincent Lejeune
e641cd06c9 R600: Add FetchInst bit to instruction defs to denote vertex/tex instructions
v2[Vincent Lejeune]: Split FetchInst into usesTextureCache/usesVertexCache

llvm-svn: 180755
2013-04-30 00:13:39 +00:00
Michael Liao
3db1a24464 Rewrite test in FileCheck instead of grep in X86 codegen
llvm-svn: 180754
2013-04-30 00:13:38 +00:00
Manman Ren
180923f053 TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.

llvm-svn: 180745
2013-04-29 22:58:55 +00:00
Bill Wendling
e24a89874b Duplicate a testcase.
llvm-svn: 180744
2013-04-29 22:42:47 +00:00
Manman Ren
13b2364d24 TBAA: remove !tbaa from testing cases if not used.
This will make it easier to turn on struct-path aware TBAA since the metadata
format will change.

llvm-svn: 180743
2013-04-29 22:42:01 +00:00
Michael Liao
4e03c7690d Rewrite some tests with FileCheck in X86 codegen
- Revise previous patches of the same purpose by fixing
  *) grep <PA> | not grep <PB> semantically is not the same as
     CHECK: <PA>{{^<PB>.*$}} as the former will check all occurrences of <PA>
     while the latter only checks the first match. As a result, CHECK needs
     to be placed everywhere <PA> occurs.
  *) grep <PA> | count <N> needs a final CHECK-NOT of the same pattern.
     (As 'CHECK-<N>' is proposed for discussion, converting 'grep | count <N>'
      where N > 1 is postponed.)

llvm-svn: 180742
2013-04-29 22:41:29 +00:00
Adrian Prantl
d599fd59f3 Improve documentation.
llvm-svn: 180738
2013-04-29 22:25:52 +00:00
Rafael Espindola
d175d83203 Add getSymbolAlignment to the ObjectFile interface.
For regular object files this is only meaningful for common symbols. An object
file format with direct support for atoms should be able to provide alignment
information for all symbols.

This replaces getCommonSymbolAlignment and fixes
test-common-symbols-alignment.ll on darwin. This also includes a fix to
MachOObjectFile::getSymbolFlags. It was marking undefined symbols as common
(already tested by existing mcjit tests now that it is used).

llvm-svn: 180736
2013-04-29 22:24:22 +00:00
Tom Stellard
33e7a52e1c R600: Use correct CF_END instruction on Northern Island GPUs
llvm-svn: 180735
2013-04-29 22:23:58 +00:00
Tom Stellard
a22d2b47f3 R600: Fix encoding of CF_END_{EG, R600} instructions
The EOP bit was not being encoded.

llvm-svn: 180734
2013-04-29 22:23:54 +00:00
Arnold Schwaighofer
d0c38e0586 SimplifyCFG: If convert single conditional stores
This resurrects r179957, but adds code that makes sure we don't touch
atomic/volatile stores:

This transformation will transform a conditional store with a preceding
unconditional store to the same location:

 a[i] = X
 may-alias with a[i] load
 if (cond)
   a[i] = Y

into an unconditional store.

 a[i] = X
 may-alias with a[i] load
 tmp = cond ? Y : X;
 a[i] = tmp
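
The two shapes can be written out in C as a minimal sketch. The function names are hypothetical, and the intervening may-alias load from the pseudo-code is omitted for brevity.

```c
/* Original shape: an unconditional store followed by a conditional store
 * to the same location. */
void store_branchy(int *a, int i, int cond, int x, int y) {
    a[i] = x;
    if (cond)
        a[i] = y;
}

/* If-converted shape: the branch becomes a select feeding a second,
 * unconditional store. */
void store_selected(int *a, int i, int cond, int x, int y) {
    a[i] = x;
    int tmp = cond ? y : x;
    a[i] = tmp;
}
```

Both functions leave the same value in a[i] for every input; the second trades the branch for an extra store.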

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outweigh the potential case where the branch would be correctly predicted
and the cost of executing the second store would be noticeably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2% performance improvement on an ARM Swift chip.
Other tests in the test-suite+external seem to be mostly uninfluenced in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducible below 100% (by about .4%). Given that hmmer benefits so much I
believe this to be a fair trade off.

llvm-svn: 180731
2013-04-29 21:28:24 +00:00
Rafael Espindola
5189efa685 Disable the MCJIT tests on 32 bit darwin.
I recently enabled them on 32 and 64 bit darwin, but it looks like 32 bit is
still fairly broken.

llvm-svn: 180730
2013-04-29 21:09:32 +00:00
Rafael Espindola
eda9da47d6 Propagate relocation info to resolveRelocation.
This gets most of the MCJITs tests passing with MachO.

llvm-svn: 180716
2013-04-29 17:24:34 +00:00
Michael Gottesman
e93cabd1a0 [objc-arc] Apply the RV optimization to retains next to calls in ObjCARCContract instead of ObjCARCOpts.
Turning retains into retainRV calls disrupts the data flow analysis in
ObjCARCOpts. Thus we move it as late as we can by moving it into
ObjCARCContract.

We leave in the conversion from retainRV -> retain in ObjCARCOpt since
it enables the dataflow analysis.

rdar://10813093

llvm-svn: 180698
2013-04-29 06:53:53 +00:00
Shuxin Yang
cb9d06c59b Fix a XOR reassociation bug.
When the Reassociator optimizes "(x | C1)" ^ "(x & C2)", it may swap the two
subexpressions; however, it forgot to swap the cached constants (C1 and C2)
accordingly.

rdar://13739160

llvm-svn: 180676
2013-04-27 18:02:12 +00:00
Tim Northover
26886c81b9 AArch64: convert MC-layer test to .s file
The CodeGen aspects of this test are already covered by cfi-frame.ll;
making it an assembly file reduces the risk of incidental changes
affecting the test.

llvm-svn: 180671
2013-04-27 11:56:14 +00:00
Michael Gottesman
9bb7d81aae [objc-arc] Test cleanups.
Mainly adding paranoid checks for the closing brace of a function to
help with FileCheck error readability. Also some other minor changes.

No actual CHECK changes.

llvm-svn: 180668
2013-04-27 05:25:54 +00:00
Eric Christopher
6edd6be6af Use the target triple from the target machine rather than the module
to determine whether or not we're on a darwin platform for debug code
emitting.

Solves the problem of a module with no triple on the command line
and no triple in the module using non-gdb ok features on darwin. Fix
up the member-pointers test to check the correct things for cross
platform (DW_FORM_flag is a good prefix).

Unfortunately no testcase because I have no idea how to test something
without a triple and without a triple in the module yet check
precisely on two platforms. Ideas welcome.

llvm-svn: 180660
2013-04-27 01:07:52 +00:00
Eric Christopher
60d871c1e9 Move the XFAIL out of the middle of a comment.
llvm-svn: 180659
2013-04-27 01:07:22 +00:00
Rafael Espindola
ef355ff0c1 Make all darwin ppc stubs local.
This fixes pr15763.
Patch by David Fang.

llvm-svn: 180657
2013-04-27 00:43:16 +00:00
Manman Ren
c576d690b0 Struct-path aware TBAA: change the format of TBAAStructType node.
We switch the order of offset and field type to make TBAAStructType node
(name, parent node, offset) similar to scalar TBAA node (name, parent node).
TypeIsImmutable is added to TBAAStructTag node.

llvm-svn: 180654
2013-04-27 00:26:11 +00:00
Benjamin Kramer
583d3f2591 Make CHECK lines a bit less strict so they also match code generated for win64.
Hopefully brings the windows buildbots back to life.

llvm-svn: 180630
2013-04-26 21:04:21 +00:00
Nadav Rotem
19172bc0f1 Teach the interpreter to handle vector compares and additional vector arithmetic operations.
Patch by Yuri Veselov.

llvm-svn: 180626
2013-04-26 20:19:41 +00:00
Tom Stellard
de2ad0a8f1 R600: Initialize AMDGPUMachineFunction::ShaderType to ShaderType::COMPUTE
We need to initialize this to something and since clang does not set
the shader type attribute and clang is used only for compute shaders,
initializing it to COMPUTE seems like the best choice.

Reviewed-by: Christian König <christian.koenig@amd.com>
llvm-svn: 180620
2013-04-26 18:32:24 +00:00
Adrian Prantl
b8f53580c7 cleanup testcase some more
rdar://problem/13056109

llvm-svn: 180619
2013-04-26 18:10:54 +00:00
Quentin Colombet
7a0d14f876 ARM: Fix encoding of hint instruction for Thumb.
"hint" space for Thumb actually overlaps the encoding space of the CPS
instruction. In actuality, hints can be defined as CPS instructions where imod
and M bits are all nil.

Handle decoding of permitted nop-compatible hints (i.e. nop, yield, wfi, wfe,
sev) in DecodeT2CPSInstruction.

This commit adds a proper diagnostic message for Imm0_4 and updates all tests.

Patch by Mihail Popa <Mihail.Popa@arm.com>.

llvm-svn: 180617
2013-04-26 17:54:54 +00:00
Rafael Espindola
008c3f4ae4 Add missing ':'.
llvm-svn: 180616
2013-04-26 17:54:46 +00:00
Adrian Prantl
1a2b6a92ae Bugfix for the debug intrinsic handling in InstCombiner:
Since we can't guarantee that the original dbg.declare intrinsic
is removed by LowerDbgDeclare(), we need to make sure that we are
not inserting the same dbg.value intrinsic over and over.
This removes tons of redundant DIEs when compiling optimized code.

rdar://problem/13056109

llvm-svn: 180615
2013-04-26 17:48:33 +00:00
Benjamin Kramer
40a2d53c85 ARM/NEON: Pattern match vector integer abs to vabs.
llvm-svn: 180604
2013-04-26 15:00:57 +00:00
Benjamin Kramer
11723aa321 X86: Now that we have a canonical form for vector integer abs, match it into pabs.
llvm-svn: 180600
2013-04-26 12:05:21 +00:00
Benjamin Kramer
7ce75fb032 DAGCombiner: Canonicalize vector integer abs in the same way we do it for scalars.
This already helps SSE2 x86 a lot because it lacks an efficient way to
represent a vector select. The long term goal is to enable the backend to match
a canonicalized pattern into a single instruction (e.g. vabs or pabs).
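
The commit message does not spell out the canonical form. The sketch below assumes it is the usual branch-free idiom xor(add(x, sign), sign), where sign is x arithmetically shifted right by the bit width minus one, and applies it lane by lane. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Branch-free abs for one 32-bit lane: (x + sign) ^ sign, where sign is
 * all ones for negative x and zero otherwise. Unsigned arithmetic keeps
 * the wraparound well defined. INT32_MIN has no positive counterpart and
 * is out of scope for this sketch. */
int32_t abs_lane(int32_t x) {
    uint32_t sign = (x < 0) ? UINT32_MAX : 0u;
    uint32_t ux = (uint32_t)x;
    return (int32_t)((ux + sign) ^ sign);
}

/* Applies the same idiom to every lane, mimicking a vector abs. */
void abs_vector(const int32_t *in, int32_t *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = abs_lane(in[i]);
}
```

Because the idiom needs no select, it suits targets that lack an efficient vector select.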

llvm-svn: 180597
2013-04-26 09:19:19 +00:00
Nadav Rotem
fe6c769d60 LoopVectorizer: Calculate the number of pointers to disambiguate at runtime based on the numbers of reads and writes.
llvm-svn: 180593
2013-04-26 05:08:59 +00:00
Jack Carter
02ade2a6d8 Mips assembler: .set reorder support
Mips has delayslots for certain instructions
like jumps and branches. These are instructions 
that follow the branch or jump and are executed
before the jump or branch is completed.

Early Mips compilers could not cope with delayslots
and left them up to the assembler. The assembler would
fill the delayslots with the appropriate instruction,
usually just a nop to allow correct runtime behavior.

The default behavior for this is set with .set reorder.
To tell the assembler that you don't want it to mess with
the delayslot one used .set noreorder.

For backwards compatibility we need to support
.set reorder and have it be the default behavior in the 
assembler.

Our support for it is to insert a NOP directly after an
instruction with a delayslot when in .set reorder mode.
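
A minimal sketch of that behaviour, using a hypothetical instruction record rather than the real MC-layer types:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction record; a real assembler tracks far more state. */
typedef struct {
    const char *mnemonic;
    bool has_delay_slot;
} Inst;

/* Copies `in` to `out`. In reorder mode a "nop" is placed directly after
 * every instruction that has a delay slot. Returns the number of
 * instructions written, or 0 if `out` is too small. */
size_t emit_with_delay_slots(const Inst *in, size_t n, bool reorder,
                             Inst *out, size_t out_cap) {
    const Inst nop = { "nop", false };
    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        if (written >= out_cap)
            return 0;
        out[written++] = in[i];
        if (reorder && in[i].has_delay_slot) {
            if (written >= out_cap)
                return 0;
            out[written++] = nop;
        }
    }
    return written;
}
```

With reorder set to false the input is copied unchanged, which corresponds to .set noreorder.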

Contributor: Vladimir Medic
llvm-svn: 180584
2013-04-25 23:31:35 +00:00
Michael Liao
4cd6523c9e Remove SMLoc paired with CHECK-NOT patterns. No functionality change.
Pattern has source location by itself. After adding a trivial method to
retrieve it, it's unnecessary to pair a source location for CHECK-NOT patterns.
One thing revised after this is that the diagnostic info is more accurate,
pointing to the start of the CHECK-NOT pattern instead of the end of the
CHECK-NOT pattern. E.g. a diagnostic message that previously looked like

    <stdin>:1:1: error: CHECK-NOT: string occurred!
    test
    ^
    test.txt:1:16: note: CHECK-NOT: pattern specified here
    CHECK-NOT: test
                   ^

is changed to

    <stdin>:1:1: error: CHECK-NOT: string occurred!
    test
    ^
    test.txt:1:12: note: CHECK-NOT: pattern specified here
    CHECK-NOT: test
               ^

llvm-svn: 180578
2013-04-25 21:31:34 +00:00
Arnold Schwaighofer
b1fc314b5f ARM cost model: Integer div and rem is lowered to a function call
Reflect this in the cost model. I observed this in MiBench/consumer-lame.

radar://13354716

llvm-svn: 180576
2013-04-25 21:16:18 +00:00
Preston Gurd
0547d81fdb This patch adds the X86FixupLEAs pass, which will reduce instruction
latency for certain models of the Intel Atom family, by converting
instructions into their equivalent LEA instructions, when it is both
useful and possible to do so.

llvm-svn: 180573
2013-04-25 20:29:37 +00:00
Nadav Rotem
d5eaf768a9 LoopVectorizer: No need to generate pointer disambiguation checks between readonly pointers.
llvm-svn: 180570
2013-04-25 19:55:03 +00:00
Reid Kleckner
9efd3d0aa0 [mc-coff] Forward Linker Option flags into the .drectve section
Summary:
This is modelled on the Mach-O linker options implementation and should
support a Clang implementation of #pragma comment(lib/linker).

Reviewers: rafael

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D724

llvm-svn: 180569
2013-04-25 19:34:41 +00:00
Rafael Espindola
e426e8360c Fix section relocation for SECTIONREL32 with immediate offset.
Patch by Kai Nacke. This matches the gnu as output.

llvm-svn: 180568
2013-04-25 19:27:05 +00:00
Chad Rosier
3030f76a0d [inline asm] Add a test case for r180226. The specific issue is that the inline
assembly is requesting a 64-bit register, which is invalid for i386.
rdar://13731657

llvm-svn: 180445
2013-04-25 17:10:21 +00:00
Rafael Espindola
9541db893e Clarify getRelocationAddress x getRelocationOffset a bit.
getRelocationAddress is for dynamic libraries and executables,
getRelocationOffset for relocatable objects.

Mark the getRelocationAddress of COFF and MachO as not implemented yet. Add a
test of ELF's. llvm-readobj -r now prints the same values as readelf -r.

llvm-svn: 180259
2013-04-25 12:28:45 +00:00
Silviu Baranga
c656dd9bdd Fix constant folding for one lane vector types. Constant folding one lane vector types now returns a vector instead of a scalar.
llvm-svn: 180254
2013-04-25 09:32:33 +00:00
Akira Hatanaka
1eb1207ce9 Test case for r180241.
llvm-svn: 180246
2013-04-25 02:22:07 +00:00
Akira Hatanaka
b42fb2ad92 Test case for r180238.
llvm-svn: 180245
2013-04-25 02:21:09 +00:00
Tom Stellard
48d161332e R600: Use SHT_PROGBITS for the .AMDGPU.config section
The libelf implementation that is distributed here:
http://www.mr511.de/software/english.html
will not parse sections that are marked SHT_NULL.

llvm-svn: 180230
2013-04-24 23:56:14 +00:00
Jack Carter
3fa12e632e Mips assembler: Add 64 bit testing for JAL
Contributor: Vladimir Medic
llvm-svn: 180220
2013-04-24 21:52:42 +00:00
Rafael Espindola
9e7ded9509 Use pointers to iterate over symbols.
While here, don't report a dummy symbol for relocations that don't have symbols.
We used to say such relocations were for the first defined symbol, but now we
return end_symbols(). The llvm-readobj output change agrees with otool.

llvm-svn: 180214
2013-04-24 19:47:55 +00:00
Arnold Schwaighofer
ad591145df LoopVectorize: Scalarize padded types
This patch disables memory-instruction vectorization for types that need padding
bytes, e.g., x86_fp80 has 10 bytes store size with 6 bytes padding in darwin on
x86_64. Because the load/store vectorization is performed by the bit casting to
a packed vector, which has incompatible memory layout due to the lack of padding
bytes, the present vectorizer produces inconsistent results for memory
instructions of those types.
This patch checks an equality of the AllocSize of a scalar type and allocated
size for each vector element, to ensure that there are no padding bytes and the
array can be read/written using vector operations.
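
As a simplified sketch of the idea (not the patch's actual code), the check can be modelled by comparing a type's store size with its allocation size. The struct and function names are hypothetical, and using the store size as the per-element size inside a packed vector is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical size description for a scalar type, in bytes. */
typedef struct {
    size_t store_size; /* bytes that actually carry data */
    size_t alloc_size; /* distance between consecutive array elements */
} TypeSizes;

/* An array of the type can be reinterpreted as a packed vector only if
 * its elements are laid out without padding between them. */
bool can_vectorize_memory_access(TypeSizes t) {
    return t.store_size == t.alloc_size;
}
```

For x86_fp80 on x86_64 darwin the message gives a 10-byte store size plus 6 bytes of padding, so the sizes are {10, 16} and the check fails; an i32 with sizes {4, 4} passes.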

Patch by Daisuke Takahashi!

Fixes PR15758.

llvm-svn: 180196
2013-04-24 16:16:01 +00:00
Arnold Schwaighofer
169f004ff2 LoopVectorizer: Bail out if we don't have datalayout; we need it
llvm-svn: 180195
2013-04-24 16:15:58 +00:00
Andrew Trick
73014520d6 MI Sched: eliminate local vreg copies.
For now, we just reschedule instructions that use the copied vregs and
let regalloc eliminate it. I would really like to eliminate the
copies on-the-fly during scheduling, but we need a complete
implementation of repairIntervalsInRange() first.

The general strategy is for the register coalescer to eliminate as
many global copies as possible and shrink live ranges to be
extended-basic-block local. The coalescer should not have to worry
about resolving local copies (e.g. it shouldn't attempt to reorder
instructions). The scheduler is a much better place to deal with local
interference. The coalescer side of this equation needs work.

llvm-svn: 180193
2013-04-24 15:54:43 +00:00
Adrian Prantl
924bf458fb Cleanup testcase and ensure we actually exercise the inliner.
rdar://problem/12415623

llvm-svn: 180168
2013-04-24 01:44:15 +00:00
Jyotsna Verma
4a5d195942 Hexagon: Use multiclass for combine and STri[bhwd]_shl_V4 instructions.
llvm-svn: 180145
2013-04-23 21:17:40 +00:00
Adrian Prantl
f7c84e0f56 Make sure the instruction right after an inlined function has a
debug location. This solves a problem where the range of an inlined
subroutine is emitted wrongly.
Patch by Manman Ren.

Fixes rdar://problem/12415623

llvm-svn: 180140
2013-04-23 19:56:03 +00:00
Stephen Lin
44a24c9593 Add more tests for r179925 to verify correct handling of signext/zeroext; strengthen condition check to require actual MVT::i32 virtual register types, just in case (no actual functionality change)
llvm-svn: 180138
2013-04-23 19:42:25 +00:00
Rafael Espindola
3d1e86bc2f Fix typo.
llvm-svn: 180137
2013-04-23 19:39:34 +00:00
Jyotsna Verma
624f2e0434 Hexagon: Remove assembler mapped instruction definitions.
llvm-svn: 180133
2013-04-23 19:15:55 +00:00
Vincent Lejeune
3666f07489 R600: Use .AMDGPU.config section to emit stacksize
llvm-svn: 180124
2013-04-23 17:34:12 +00:00
Vincent Lejeune
e5ba5f1b14 R600: Add CF_END
llvm-svn: 180123
2013-04-23 17:34:00 +00:00
Nadav Rotem
1bfb7903e3 LoopVectorizer: Fix 15830. When scalarizing and unrolling stores make sure that the order in which the elements are scalarized is the same as the original order.
This fixes a miscompilation in FreeBSD's regex library.

llvm-svn: 180121
2013-04-23 17:12:42 +00:00
Jyotsna Verma
20903a7aba Hexagon: Remove duplicate instructions to handle global/immediate values
for absolute/absolute-set addressing modes.

llvm-svn: 180120
2013-04-23 17:11:46 +00:00
Pekka Jaaskelainen
d8a8d1d02f Call the potentially costly isAnnotatedParallel() only once.
Made the uniform write test's checks a bit stricter.

llvm-svn: 180119
2013-04-23 16:44:43 +00:00
Rafael Espindola
534f0bf6c6 Write relocations in yaml2obj.
llvm-svn: 180115
2013-04-23 15:53:02 +00:00
Rafael Espindola
f7c86d97a1 Move test from grep to FileCheck.
llvm-svn: 180092
2013-04-23 12:03:27 +00:00
Alexey Samsonov
cb820bd0a4 Use zlib to uncompress debug sections in DWARF parser.
This makes llvm-dwarfdump and llvm-symbolizer understand
debug info sections compressed by ld.gold linker.

llvm-svn: 180088
2013-04-23 10:17:34 +00:00
Pekka Jaaskelainen
1492231ce9 Refuse to (even try to) vectorize loops which have uniform writes,
even if erroneously annotated with the parallel loop metadata.

Fixes Bug 15794: 
"Loop Vectorizer: Crashes with the use of llvm.loop.parallel metadata"

llvm-svn: 180081
2013-04-23 08:08:51 +00:00
Chad Rosier
6d8c79f647 Add test case for PR15779, which has previously been fixed.
llvm-svn: 180058
2013-04-22 22:30:01 +00:00
Anat Shemer
0d94c56dac Changed back (relative to commit 179786) the operations executed when extract(cast) is transformed to cast(extract). It uses the Builder class as before. In addition the result node is added to the Worklist, so all the previous extract users will become the new scalar cast users.
llvm-svn: 180045
2013-04-22 20:51:10 +00:00
Akira Hatanaka
913bf6194a [mips] In performDSPShiftCombine, check that all elements in the vector are
shifted by the same amount and the shift amount is smaller than the element
size.

llvm-svn: 180039
2013-04-22 19:58:23 +00:00
Peter Collingbourne
9c7b994acf COFF: Fix weak external aliases.
Differential Revision: http://llvm-reviews.chandlerc.com/D700

llvm-svn: 180034
2013-04-22 18:48:56 +00:00
Stephen Lin
4e394628e7 Extra paranoid test for r179925 (verify that tail calls are not generated to 'this'-returning constructors of objects with different 'this' pointers than the caller)
llvm-svn: 180032
2013-04-22 17:23:49 +00:00
Rafael Espindola
4a4fbbaf4e Also verify llvm.compiler_used.
llvm-svn: 180020
2013-04-22 15:16:51 +00:00
Rafael Espindola
88a7961c64 Clarify that llvm.used can contain aliases.
Also add a check for llvm.used in the verifier and simplify clients now that
they can assume they have a ConstantArray.

llvm-svn: 180019
2013-04-22 14:58:02 +00:00
Stepan Dyatkovskiy
8adaf54376 Fix for 5.5 Parameter Passing --> Stage C:
-- C.4 and C.5 statements, when NSAA is not equal to SP.
 -- C.1.cp statement for VA functions. Note: There are no VFP CPRCs in a
    variadic procedure.

Before this patch "NSAA != 0" meant "don't use GPRs anymore". But there are
some exceptions in AAPCS.
1. For non VA function: allocate all VFP regs for CPRC. When all VFPs are allocated
   CPRCs would be sent to stack, while non CPRCs may be still allocated in GPRs.
2. Check that for VA functions all params use GPRs and then stack.
   No exceptions, no CPRCs here.

llvm-svn: 180011
2013-04-22 13:06:52 +00:00
Eric Christopher
95e5e9b173 Add .ll as a valid test suffix for Object, this allows .ll -> object
and then dumping as tests.

llvm-svn: 180010
2013-04-22 10:45:06 +00:00
Arnaud A. de Grandmaison
087fe129d8 Cleanup: test source files do not need to be executable
llvm-svn: 180003
2013-04-22 08:02:43 +00:00
David Blaikie
9bfe15c313 Revert "Revert "PR14606: debug info imported_module support""
This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll

I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even
though the debug info was clearly invalid on all of them, but this ought to fix
it.

llvm-svn: 179996
2013-04-22 06:12:31 +00:00
Jim Grosbach
3104dcf2ca Legalize vector truncates by parts rather than just splitting.
Rather than just splitting the input type and hoping for the best, apply
a bit more cleverness. Just splitting the types until the source is
legal often leads to an illegal result type, which is then widened and a
scalarization step is introduced which leads to truly horrible code
generation. With the loop vectorizer, these sorts of operations are much
more common, and so it's worth extra effort to do them well.

Add a legalization hook for the operands of a TRUNCATE node, which will
be encountered after the result type has been legalized, but if the
operand type is still illegal. If simple splitting of both types
ends up with the result type of each half still being legal, just
do that (v16i16 -> v16i8 on ARM, for example). If, however, that would
result in an illegal result type (v8i32 -> v8i8 on ARM, for example),
we can get more clever with power-of-two vectors. Specifically,
split the input type, but also widen the result element size, then
concatenate the halves and truncate again.  For example on ARM,
to perform a "%res = v8i8 trunc v8i32 %in" we transform to:
  %inlo = v4i32 extract_subvector %in, 0
  %inhi = v4i32 extract_subvector %in, 4
  %lo16 = v4i16 trunc v4i32 %inlo
  %hi16 = v4i16 trunc v4i32 %inhi
  %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16
  %res = v8i8 trunc v8i16 %in16

This allows instruction selection to generate three VMOVN instructions
instead of a sequence of moves, stores and loads.
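
The staged truncate described above can be modeled on plain integer lists to check that it gives the same lanes as a direct truncate. This is only an illustrative sketch: the helper names `trunc` and `trunc_v8i32_to_v8i8_by_parts` are hypothetical and are not part of the LLVM legalizer.

```python
def trunc(vec, bits):
    """Keep only the low `bits` bits of every lane (models a vector trunc)."""
    mask = (1 << bits) - 1
    return [x & mask for x in vec]


def trunc_v8i32_to_v8i8_by_parts(vec):
    """Split the input, narrow each half to i16, concatenate, narrow to i8."""
    assert len(vec) == 8
    inlo, inhi = vec[:4], vec[4:]   # extract_subvector at index 0 and 4
    lo16 = trunc(inlo, 16)          # v4i32 -> v4i16
    hi16 = trunc(inhi, 16)          # v4i32 -> v4i16
    in16 = lo16 + hi16              # concat_vectors -> v8i16
    return trunc(in16, 8)           # v8i16 -> v8i8


sample = [0x12345678, 0xFFFFFFFF, 0x000000FF, 0x00000100,
          0xDEADBEEF, 0x80000001, 0x0000ABCD, 0x0000007F]
staged = trunc_v8i32_to_v8i8_by_parts(sample)
direct = trunc(sample, 8)
```

Because truncation only discards high bits, narrowing in two steps yields the same lanes as narrowing once, so `staged` and `direct` compare equal.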

Update the ARMTargetTransformInfo to take this improved legalization
into account.

Consider the simplified IR:

define <16 x i8> @test1(<16 x i32>* %ap) {
  %a = load <16 x i32>* %ap
  %tmp = trunc <16 x i32> %a to <16 x i8>
  ret <16 x i8> %tmp
}

define <8 x i8> @test2(<8 x i32>* %ap) {
  %a = load <8 x i32>* %ap
  %tmp = trunc <8 x i32> %a to <8 x i8>
  ret <8 x i8> %tmp
}

Previously, we would generate the truly hideous:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #20
	bic	sp, sp, #7
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d24, d25}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	vld1.64	{d18, d19}, [r2:128]
	add	r1, r0, #16
	vmovn.i32	d22, q8
	vld1.64	{d16, d17}, [r1:128]
	vmovn.i32	d20, q9
	vmovn.i32	d18, q12
	vmov.u16	r0, d22[3]
	strb	r0, [sp, #15]
	vmov.u16	r0, d22[2]
	strb	r0, [sp, #14]
	vmov.u16	r0, d22[1]
	strb	r0, [sp, #13]
	vmov.u16	r0, d22[0]
	vmovn.i32	d16, q8
	strb	r0, [sp, #12]
	vmov.u16	r0, d20[3]
	strb	r0, [sp, #11]
	vmov.u16	r0, d20[2]
	strb	r0, [sp, #10]
	vmov.u16	r0, d20[1]
	strb	r0, [sp, #9]
	vmov.u16	r0, d20[0]
	strb	r0, [sp, #8]
	vmov.u16	r0, d18[3]
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	vldmia	sp, {d16, d17}
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	mov	sp, r7
	pop	{r7}
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	push	{r7}
	mov	r7, sp
	sub	sp, sp, #12
	bic	sp, sp, #7
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d20, d21}, [r0:128]
	vmovn.i32	d18, q8
	vmov.u16	r0, d18[3]
	vmovn.i32	d16, q10
	strb	r0, [sp, #3]
	vmov.u16	r0, d18[2]
	strb	r0, [sp, #2]
	vmov.u16	r0, d18[1]
	strb	r0, [sp, #1]
	vmov.u16	r0, d18[0]
	strb	r0, [sp]
	vmov.u16	r0, d16[3]
	strb	r0, [sp, #7]
	vmov.u16	r0, d16[2]
	strb	r0, [sp, #6]
	vmov.u16	r0, d16[1]
	strb	r0, [sp, #5]
	vmov.u16	r0, d16[0]
	strb	r0, [sp, #4]
	ldm	sp, {r0, r1}
	mov	sp, r7
	pop	{r7}
	bx	lr

Now, however, we generate the much more straightforward:
	.syntax unified
	.section	__TEXT,__text,regular,pure_instructions
	.globl	_test1
	.align	2
_test1:                                 @ @test1
@ BB#0:
	add	r1, r0, #48
	add	r2, r0, #32
	vld1.64	{d20, d21}, [r0:128]
	vld1.64	{d16, d17}, [r1:128]
	add	r1, r0, #16
	vld1.64	{d18, d19}, [r2:128]
	vld1.64	{d22, d23}, [r1:128]
	vmovn.i32	d17, q8
	vmovn.i32	d16, q9
	vmovn.i32	d18, q10
	vmovn.i32	d19, q11
	vmovn.i16	d17, q8
	vmovn.i16	d16, q9
	vmov	r0, r1, d16
	vmov	r2, r3, d17
	bx	lr

	.globl	_test2
	.align	2
_test2:                                 @ @test2
@ BB#0:
	vld1.64	{d16, d17}, [r0:128]
	add	r0, r0, #16
	vld1.64	{d18, d19}, [r0:128]
	vmovn.i32	d16, q8
	vmovn.i32	d17, q9
	vmovn.i16	d16, q8
	vmov	r0, r1, d16
	bx	lr

llvm-svn: 179989
2013-04-21 23:47:41 +00:00
Jim Grosbach
2582e2e539 ARM: Split out cost model vcvt testcases.
They had a separate RUN line already, so may as well be in a separate file.

llvm-svn: 179988
2013-04-21 23:47:37 +00:00
Jakob Stoklund Olesen
c9f30e9065 Passing arguments to varargs functions under the SPARC v9 ABI.
Arguments after the fixed arguments never use the floating point
registers.

llvm-svn: 179987
2013-04-21 21:36:49 +00:00
Jakob Stoklund Olesen
d8a2b84611 Fix the SETHIimm pattern for 64-bit code.
Don't ignore the high 32 bits of the immediate.
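
As a rough illustration of why the high bits matter, the sketch below checks whether a 64-bit immediate could be produced by a single sethi. It assumes the SPARC V9 behavior in which sethi writes a 22-bit field into bits 31:10 and zeroes the remaining bits of the register. The function names are hypothetical and do not correspond to the actual TableGen pattern.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
SETHI_BITS = 0xFFFFFC00  # bits 31:10, the only bits sethi can set


def fits_sethi(imm):
    """True if the 64-bit immediate has no bits outside bits 31:10."""
    imm &= MASK64
    return (imm & SETHI_BITS) == imm


def fits_sethi_low_bits_only(imm):
    """Incomplete check that only inspects the low 10 bits (for contrast)."""
    return (imm & 0x3FF) == 0


ok = fits_sethi(0x12345400)
rejected = fits_sethi(0x100000400)
wrongly_accepted = fits_sethi_low_bits_only(0x100000400)
```

The value 0x100000400 has bit 32 set, so the full check rejects it while the low-bits-only check would accept it.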

llvm-svn: 179985
2013-04-21 21:18:03 +00:00
Benjamin Kramer
47f18d3da1 SROA: Don't crash on a select with two identical operands.
This is an edge case that can happen if we modify a chain of multiple selects.
Update all operands in that case and remove the assert. PR15805.

llvm-svn: 179982
2013-04-21 17:48:39 +00:00
Arnold Schwaighofer
76cf4c753d Revert "SimplifyCFG: If convert single conditional stores"
There is the temptation to make this transform dependent on target information as
it is not going to be beneficial on all (sub)targets. Therefore, we should
probably do this in MI Early-Ifconversion.

This reverts commit r179957. Original commit message:

"SimplifyCFG: If convert single conditional stores

This transformation converts a conditional store with a preceding
unconditional store to the same location:

a[i] = X
may-alias with a[i] load
if (cond)
    a[i] = Y
into an unconditional store.

a[i] = X
may-alias with a[i] load
tmp = cond ? Y : X;
a[i] = tmp

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outweigh the potential case where the branch would be correctly predicted
and the cost of executing the second store would be noticeably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2% performance improvement on an ARM Swift chip.
Other tests in the test-suite+external seem to be mostly unaffected in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducibly below 100% (by about 0.4%). Given that hmmer benefits so much I
believe this to be a fair trade-off.

I am going to watch performance numbers across the buildbots and will revert
this if anything unexpected comes up."

llvm-svn: 179980
2013-04-21 13:09:04 +00:00
Tim Northover
593f76e08e ARM: fix part of test which actually needed an asserts build
This should fix a buildbot failure that occurred after r179977.

llvm-svn: 179978
2013-04-21 12:20:19 +00:00
Tim Northover
943f2a9234 ARM: Use ldrd/strd to spill 64-bit pairs when available.
This allows common sp-offsets to be part of the instruction and is
probably faster on modern CPUs too.

llvm-svn: 179977
2013-04-21 11:57:07 +00:00
Nadav Rotem
e567845da4 SLPVectorize: Add support for vectorization of casts.
llvm-svn: 179975
2013-04-21 08:05:59 +00:00
Michael Gottesman
7577b1c190 [objc-arc] Cleaned up tail-call-invariant-enforcement.ll.
Specifically:

1. Added checks that unwind is being properly added to various instructions.
2. Fixed the declaration/calling of objc_release to have a return type of void.
3. Moved all checks to precede the functions and added checks to ensure that the
checks would only match inside the specific function that we are attempting to
check.

llvm-svn: 179973
2013-04-21 02:59:44 +00:00
Michael Gottesman
3af6adbc8d [objc-arc] Check that objc-arc-expand properly handles all strictly forwarding calls and does not touch calls which are not strictly forwarding (i.e. objc_retainBlock).
llvm-svn: 179972
2013-04-21 01:57:46 +00:00
Michael Gottesman
03f1fec178 [objc-arc] Renamed the test file clang-arc-used-intrinsic-removed-if-isolated.ll -> intrinsic-use-isolated.ll to match the other test file intrinsic-use.ll.
llvm-svn: 179971
2013-04-21 01:42:24 +00:00
Bill Wendling
61eb6957c5 Remove tbaa metadata.
llvm-svn: 179970
2013-04-21 01:38:25 +00:00
Jakob Stoklund Olesen
cbfbef04da Compile varargs functions for SPARCv9.
With a little help from the frontend, it looks like the standard va_*
intrinsics can do the job.

Also clean up an old bitcast hack in LowerVAARG that dealt with
unaligned double loads. Load SDNodes can specify an alignment now.

Still missing: Calling varargs functions with float arguments.

llvm-svn: 179961
2013-04-20 22:49:16 +00:00
Nadav Rotem
069e6d9a7f Fix PR15800. Do not try to vectorize vectors and structs.
llvm-svn: 179960
2013-04-20 22:29:43 +00:00
Arnold Schwaighofer
a5ec409858 SimplifyCFG: If convert single conditional stores
This transformation converts a conditional store with a preceding
unconditional store to the same location:

 a[i] = X
 may-alias with a[i] load
 if (cond)
   a[i] = Y

into an unconditional store.

 a[i] = X
 may-alias with a[i] load
 tmp = cond ? Y : X;
 a[i] = tmp
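
The before and after shapes can be compared with a small model. This is a sketch on a Python list rather than on LLVM IR, and the function names `branchy` and `if_converted` are made up for illustration.

```python
def branchy(a, i, x, y, cond):
    """Original shape: unconditional store followed by a conditional store."""
    a[i] = x
    if cond:
        a[i] = y
    return a


def if_converted(a, i, x, y, cond):
    """Converted shape: a select feeds a second unconditional store."""
    a[i] = x
    tmp = y if cond else x
    a[i] = tmp
    return a


results = [
    (branchy([0, 0, 0], 1, 10, 20, c), if_converted([0, 0, 0], 1, 10, 20, c))
    for c in (False, True)
]
```

Both versions leave the same value in a[i] for either value of cond. The converted form always executes the second store but has no branch.
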

We assume that on average the cost of a mispredicted branch is going to be
higher than the cost of a second store to the same location, and that the
secondary benefits of creating a bigger basic block for other optimizations to
work on outweigh the potential case where the branch would be correctly predicted
and the cost of executing the second store would be noticeably reflected in
performance.

hmmer's execution time improves by 30% on an imac12,2 on ref data sets. With
this change we are on par with gcc's performance (gcc also performs this
transformation). There was a 1.2% performance improvement on an ARM Swift chip.
Other tests in the test-suite+external seem to be mostly unaffected in my
experiments:
This optimization was triggered on 41 tests such that the executable was
different before/after the patch. Only 1 out of the 40 tests (dealII) was
reproducibly below 100% (by about 0.4%). Given that hmmer benefits so much I
believe this to be a fair trade-off.

I am going to watch performance numbers across the buildbots and will revert
this if anything unexpected comes up.

llvm-svn: 179957
2013-04-20 21:42:09 +00:00
Tim Northover
de5285eb6f ARM: don't add FrameIndex offset for LDMIA (has no immediate)
Previously, when spilling 64-bit paired registers, an LDMIA with both
a FrameIndex and an offset was produced. This kind of instruction
shouldn't exist, and the extra operand was being confused with the
predicate, causing aborts later on.

This removes the invalid 0-offset from the instruction being
produced.

llvm-svn: 179956
2013-04-20 19:31:00 +00:00
Nuno Lopes
af3a791be9 recommit tests
llvm-svn: 179955
2013-04-20 17:39:52 +00:00
Stephen Lin
98df7358cd Minor renaming of tests (for consistency with an in-development patch)
llvm-svn: 179954
2013-04-20 16:21:26 +00:00
Benjamin Kramer
a7e8f887fe Don't litter .s files in test directory.
llvm-svn: 179937
2013-04-20 10:43:40 +00:00
Nadav Rotem
ed63de18d3 SLPVectorizer: Improve the cost model for loop invariant broadcast values.
llvm-svn: 179930
2013-04-20 06:13:47 +00:00
Stephen Lin
9d99ba2071 Add CodeGen support for functions that always return arguments via a new parameter attribute 'returned', which is taken advantage of in target-independent tail call opportunity detection and in ARM call lowering (when placed on an integral first parameter).
llvm-svn: 179925
2013-04-20 05:14:40 +00:00
Stephen Lin
65c1101eba Allow tail call opportunity detection through nested and/or multiple iterations of extractelement/insertelement indirection
llvm-svn: 179924
2013-04-20 04:27:51 +00:00
Akira Hatanaka
11b4211d68 [mips] Instruction selection patterns for DSP-ASE vector shifts.
llvm-svn: 179906
2013-04-19 23:21:32 +00:00
Benjamin Kramer
c79888d90d MergeFunc: Make pointer and integer types generate the same hash.
The logic that actually compares the types considers pointers and integers the
same if they are of the same size. This created a strange mismatch between hash
and reality and made the test case for this fail on some platforms (yay,
test cases).
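
The invariant being restored is that two types the comparison treats as equal must also hash equally. A minimal sketch follows, assuming a toy type representation of `(kind, bits)` tuples and a 64-bit pointer size. None of these names come from MergeFunctions itself.

```python
POINTER_BITS = 64  # assumption: pointers are 64 bits on this target


def normalize(ty):
    """Map a pointer type to the integer type of the same size."""
    kind, bits = ty
    if kind == "ptr":
        return ("int", POINTER_BITS)
    return (kind, bits)


def types_equal(a, b):
    """Comparison that treats same-sized pointers and integers as equal."""
    return normalize(a) == normalize(b)


def type_hash(ty):
    """Hash the normalized form so equal types always hash the same."""
    return hash(normalize(ty))


ptr_ty = ("ptr", None)
i64_ty = ("int", 64)
i32_ty = ("int", 32)
```

Hashing the normalized form rather than the raw type keeps the hash consistent with `types_equal`, which is the property the commit message describes.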

llvm-svn: 179905
2013-04-19 23:06:44 +00:00
Bill Wendling
7b61d2bb18 Make variable match any name.
llvm-svn: 179903
2013-04-19 22:30:43 +00:00
Hal Finkel
9e44a50443 Fix PPC optimizeCompareInstr swapped-sub argument handling
When matching a compare with a subtract where the arguments of the compare are
swapped w.r.t. the arguments of the subtract, we need to negate the predicates
(or CR bit indices) of the users. This, however, is not the same as inverting
the predicate (negating LT -> GT, but inverting LT -> GE, for example). The ARM
backend seems to do this correctly, but when I adapted the code for the PPC
backend, I introduced an error in this logic.
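
The difference between the two operations can be checked exhaustively on small integers. The tables below are an illustrative sketch with generic predicate names. They are not the PPC backend's predicate encoding.

```python
import operator

OPS = {"LT": operator.lt, "LE": operator.le, "GT": operator.gt,
       "GE": operator.ge, "EQ": operator.eq, "NE": operator.ne}

# Predicate to use when the two operands are swapped: a < b  <=>  b > a.
SWAPPED = {"LT": "GT", "GT": "LT", "LE": "GE", "GE": "LE",
           "EQ": "EQ", "NE": "NE"}

# Logical negation of a predicate: not (a < b)  <=>  a >= b.
INVERTED = {"LT": "GE", "GE": "LT", "GT": "LE", "LE": "GT",
            "EQ": "NE", "NE": "EQ"}


def holds(pred, a, b):
    return OPS[pred](a, b)


values = range(-2, 3)
swap_ok = all(holds(p, a, b) == holds(SWAPPED[p], b, a)
              for p in OPS for a in values for b in values)
invert_ok = all(holds(p, a, b) == (not holds(INVERTED[p], a, b))
                for p in OPS for a in values for b in values)
# Using the inverted predicate for swapped operands breaks on equal inputs.
mixup_differs = holds("LT", 1, 1) != holds(INVERTED["LT"], 1, 1)
```

With equal operands, LT is false but GE is true, which is the kind of mismatch that arises when inversion is used where a swap was needed.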

Comparison optimization is now enabled again by default.

llvm-svn: 179899
2013-04-19 22:08:38 +00:00
Bill Wendling
d79d1a22a6 Try explicitly setting the target triple to see if this gets it to pass on ARM.
llvm-svn: 179890
2013-04-19 21:24:51 +00:00
Anton Korobeynikov
f95220dd8b Do not apply MS-style mangling to globals with the magic \001 in the name.
Based on the patch by David Nadlinger!

llvm-svn: 179889
2013-04-19 21:20:56 +00:00
Bill Wendling
e8c6d1cb09 Make test slightly more readable.
llvm-svn: 179888
2013-04-19 21:14:59 +00:00
Bill Wendling
7256108f6f Add a testcase to make sure we generate the proper compact unwind section for a function that cannot produce a compact unwind encoding.
llvm-svn: 179887
2013-04-19 21:07:11 +00:00