1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

15085 Commits

Author SHA1 Message Date
Nadav Rotem
f8e096f4ee X86: PerformOrCombine introduced a vselect node with a wrong order of operands. This bug was introduced when a dedicated blend sdnode was replaced with the vselect node (in 139479).
llvm-svn: 145488
2011-11-30 10:13:37 +00:00
Andrew Trick
8da55f9048 Better test case found in duplicate PR10570.
llvm-svn: 145484
2011-11-30 06:26:42 +00:00
Andrew Trick
247f749767 LSR: handle the expansion of phi operands that use postinc forms of the IV.
Fixes PR11431: SCEVExpander::expandAddRecExprLiterally(const llvm::SCEVAddRecExpr*): Assertion `(!isa<Instruction>(Result) || SE.DT->dominates(cast<Instruction>(Result), Builder.GetInsertPoint())) && "postinc expansion does not dominate use"' failed.

llvm-svn: 145482
2011-11-30 06:07:54 +00:00
Chad Rosier
c5fa9f413a Add support for sqrt, sqrtl, and sqrtf in TargetLibraryInfo. Disable
(fptrunc (sqrt (fpext x))) -> (sqrtf x) transformation if -fno-builtin is 
specified.
rdar://10466410

llvm-svn: 145460
2011-11-29 23:57:10 +00:00
Jakob Stoklund Olesen
f227c539d4 FileCheckize.
llvm-svn: 145452
2011-11-29 23:09:16 +00:00
Akira Hatanaka
fd548b69f5 Change names for MIPS "generic" processors defined in Mips.td to match what GNU
tools use. Patch by Simon Atanasyan.

"mips32r1" => "mips32"
"4ke" => mips32r2"
"mips64r1" => "mips64"

llvm-svn: 145451
2011-11-29 23:08:41 +00:00
Jim Grosbach
538759efa7 ARM assembly parsing and encoding for four-register VST1.
llvm-svn: 145450
2011-11-29 22:58:48 +00:00
Evan Cheng
5c1efd630b Add another missing pattern. llvm-gcc likes f64 but clang likes i64 so it was generating poor code for some SSE builtins.
llvm-svn: 145448
2011-11-29 22:48:34 +00:00
Jim Grosbach
fc7e76b194 Enable some VST1 tests and add a few more.
llvm-svn: 145443
2011-11-29 22:40:32 +00:00
Jakob Stoklund Olesen
5d6a4584d9 Make X86::FsFLD0SS / FsFLD0SD real pseudo-instructions.
Like V_SET0, these instructions are expanded by ExpandPostRA to xorps /
vxorps so they can participate in execution domain swizzling.

This also makes the AVX variants redundant.

llvm-svn: 145440
2011-11-29 22:27:25 +00:00
Chad Rosier
0ff2f46d12 If fast-isel fails, remove dead instructions generated during the failed
attempt.  

llvm-svn: 145425
2011-11-29 19:40:47 +00:00
Duncan Sands
97cc6da56c Fix a theoretical problem (not seen in the wild): if different instances of a
weak variable are compiled by different compilers, such as GCC and LLVM, while
LLVM may increase the alignment to the preferred alignment there is no reason to
think that GCC will use anything more than the ABI alignment.  Since it is the
GCC version that might end up in the final program (as the linkage is weak), it
is wrong to increase the alignment of loads from the global up to the preferred
alignment as the alignment might only be the ABI alignment.

Increasing alignment up to the ABI alignment might be OK, but I'm not totally
convinced that it is.  It seems better to just leave the alignment of weak
globals alone.

llvm-svn: 145413
2011-11-29 18:26:38 +00:00
Michael J. Spencer
5fade79478 MC/X86/COFF: Allow quotes in names when targeting MS/Windows,
as MC is the only assembler we support.

This splits MS/Windows and GNU/Windows ASM infos into two seperate classes.
While there is currently only one difference, full MS C++ ABI support will
require many more.

llvm-svn: 145409
2011-11-29 18:00:06 +00:00
Danil Malyshev
5ce4e1a9d3 Fixed ObjectFile functions:
- getSymbolOffset() renamed as getSymbolFileOffset()
- getSymbolFileOffset(), getSymbolAddress(), getRelocationAddress() returns same result for ELFObjectFile, MachOObjectFile and COFFObjectFile.
- added getRelocationOffset()
- fixed MachOObjectFile::getSymbolSize()
- fixed MachOObjectFile::getSymbolSection()
- fixed MachOObjectFile::getSymbolOffset() for symbols without section data.

llvm-svn: 145408
2011-11-29 17:40:10 +00:00
Elena Demikhovsky
735cff1fa8 Fixed vsqrt.ss intrinsic usage - order of input operands was wrong.
Added a test.
Thanks Bruno for reviewing the patch.

llvm-svn: 145403
2011-11-29 15:00:45 +00:00
Craig Topper
637e60afdd Fix shuffle decoding for memory forms for (V)SHUFPS/D.
llvm-svn: 145392
2011-11-29 07:58:09 +00:00
Craig Topper
4550fc2649 Fix issues in shuffle decoding around VPERM* instructions. Fix shuffle decoding for VSHUFPS/D for 256-bit types. Add pattern matching for memory forms of VPERMILPS/VPERMILPD.
llvm-svn: 145390
2011-11-29 07:49:05 +00:00
Craig Topper
aca91b9f14 Fix VINSERTF128/VEXTRACTF128 to be marked as FP instructions. Allow execution dependency fix pass to convert them to their integer equivalents when AVX2 is enabled.
llvm-svn: 145376
2011-11-29 05:37:58 +00:00
Craig Topper
a6c1d25798 Correctly mark VPERM2F128 as being an FP instruction and add execution domain fixing support to convert it to VPERM2I128 for AVX2.
llvm-svn: 145370
2011-11-29 03:57:34 +00:00
Andrew Trick
63f81b112e SCEV fix. In general, Add/Mul expressions should not inherit NSW/NUW.
This reverts r139450, fixes r139453, and adds much needed comments and a
unit test.

llvm-svn: 145367
2011-11-29 02:16:38 +00:00
Andrew Trick
a033165cf9 Filecheckize.
llvm-svn: 145363
2011-11-29 02:05:23 +00:00
Andrew Trick
4c2bb1ade3 Reenable this IndVars unit test.
SCEV can't optimize undef in all cases, which is a separate issue from this test case.

llvm-svn: 145343
2011-11-29 00:52:04 +00:00
Eli Friedman
bc47555417 Add a missing safety check to ProcessUGT_ADDCST_ADD. Fixes PR11438.
llvm-svn: 145316
2011-11-28 23:32:19 +00:00
Eli Friedman
473a76a0df Make SelectionDAG::InferPtrAlignment use llvm::ComputeMaskedBits instead of duplicating the logic for globals. Make llvm::ComputeMaskedBits handle GlobalVariables slightly more aggressively, to match what InferPtrAlignment knew how to do.
llvm-svn: 145304
2011-11-28 22:48:22 +00:00
Evan Cheng
1435aa5fdc Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead.
Conservatively returns zero when the GV does not specify an alignment nor is it
initialized. Previously it returns ABI alignment for type of the GV. However, if
the type is a "packed" type, then the under-specified alignments is attached to
the load / store instructions. In that case, the alignment of the type cannot be
trusted.
rdar://10464621

llvm-svn: 145300
2011-11-28 22:37:34 +00:00
Evan Cheng
567aa3dfb3 DAG combine should not increase alignment of loads / stores with alignment less
than ABI alignment. These are loads / stores from / to "packed" data structures.
Their alignments are intentionally under-specified.

rdar://10301431

llvm-svn: 145273
2011-11-28 20:42:56 +00:00
Craig Topper
6f5a0bc4e3 Add X86 instruction selection for VPERM2I128 when AVX2 is enabled. Merge VPERMILPS/VPERMILPD detection since they are pretty similar.
llvm-svn: 145238
2011-11-28 10:14:51 +00:00
NAKAMURA Takumi
85ad69c8df test/lit.cfg: Enable the feature 'asserts' to check output of llc -version.
llc knows whether he is compiled with -DNDEBUG.
|  Optimized build with assertions.

llvm-svn: 145230
2011-11-28 05:09:15 +00:00
Chris Lattner
c050288d89 remove a test that is using old-style llvm.dbg intrinsics, apparently only
fails on ppc and arm hosts.

llvm-svn: 145188
2011-11-27 18:13:47 +00:00
Chandler Carruth
bb4c250613 Take two on rotating the block ordering of loops. My previous attempt
was centered around the premise of laying out a loop in a chain, and
then rotating that chain. This is good for preserving contiguous layout,
but bad for actually making sane rotations. In order to keep it safe,
I had to essentially make it impossible to rotate deeply nested loops.
The information needed to correctly reason about a deeply nested loop is
actually available -- *before* we layout the loop. We know the inner
loops are already fused into chains, etc. We lose information the moment
we actually lay out the loop.

The solution was the other alternative for this algorithm I discussed
with Benjamin and some others: rather than rotating the loop
after-the-fact, try to pick a profitable starting block for the loop's
layout, and then use our existing layout logic. I was worried about the
complexity of this "pick" step, but it turns out such complexity is
needed to handle all the important cases I keep teasing out of benchmarks.

This is, I'm afraid, a bit of a work-in-progress. It is still
misbehaving on some likely important cases I'm investigating in Olden.
It also isn't really tested. I'm going to try to craft some interesting
nested-loop test cases, but it's likely to be extremely time consuming
and I don't want to go there until I'm sure I'm testing the correct
behavior. Sadly I can't come up with a way of getting simple, fine
grained test cases for this logic. We need complex loop structures to
even trigger much of it.

llvm-svn: 145183
2011-11-27 13:34:33 +00:00
Chandler Carruth
e6374e6953 Rework a bit of the implementation of loop block rotation to not rely so
heavily on AnalyzeBranch. That routine doesn't behave as we want given
that rotation occurs mid-way through re-ordering the function. Instead
merely check that there are not unanalyzable branching constructs
present, and then reason about the CFG via successor lists. This
actually simplifies my mental model for all of this as well.

The concrete result is that we now will rotate more loop chains. I've
added a test case from Olden highlighting the effect. There is still
a bit more to do here though in order to regain all of the performance
in Olden.

llvm-svn: 145179
2011-11-27 09:22:53 +00:00
Chris Lattner
84bf52737a remove autoupgrade support for old forms of llvm.prefetch and the old
trampoline forms.  Both of these were correct in LLVM 3.0, and we don't
need to support LLVM 2.9 and earlier in mainline.

llvm-svn: 145174
2011-11-27 07:42:04 +00:00
Chris Lattner
9d1e8420ff Upgrade syntax of tests using volatile instructions to use 'load volatile' instead of 'volatile load', which is archaic.
llvm-svn: 145171
2011-11-27 06:54:59 +00:00
Chris Lattner
011a5bf0aa remove autoupgrade support for really old-style debug info intrinsics.
I think this is the last of autoupgrade that can be removed in 3.1.
Can the atomic upgrade stuff also go?

llvm-svn: 145169
2011-11-27 06:18:33 +00:00
Chris Lattner
8067661775 remove some old autoupgrade logic
llvm-svn: 145167
2011-11-27 06:10:54 +00:00
Chris Lattner
03be469732 remove support for reading llvm 2.9 .bc files. LLVM 3.1 is only compatible back to 3.0
llvm-svn: 145164
2011-11-27 05:48:27 +00:00
Wesley Peck
13edec82a8 Add several new instructions supported by the latest MicroBlaze.
These instructions are not generated by the backend yet, this will come in a later commit.

llvm-svn: 145161
2011-11-27 05:16:58 +00:00
Chandler Carruth
0d073febb6 Introduce a loop block rotation optimization to the new block placement
pass. This is designed to achieve one of the important optimizations
that the old code placement pass did, but more simply.

This is a somewhat rough and *very* conservative version of the
transform. We could get a lot fancier here if there are profitable cases
to do so. In particular, this only looks for a single pattern, it
insists that the loop backedge being rotated away is the last backedge
in the chain, and it doesn't provide any means of doing better in-loop
placement due to the rotation. However, it appears that it will handle
the important loops I am finding in the LLVM test suite.

llvm-svn: 145158
2011-11-27 00:38:03 +00:00
Chandler Carruth
2bf0dccc04 FileCheck-ize this test and make it more precise. This is in preparation
for adding other tests.

llvm-svn: 145143
2011-11-26 08:24:25 +00:00
Eli Friedman
448be745f6 Fix APFloat::convert so that it handles narrowing conversions correctly; it
was returning incorrect values in rare cases, and incorrectly marking
exact conversions as inexact in some more common cases. Fixes PR11406, and a
missed optimization in test/CodeGen/X86/fp-stack-O0.ll.

llvm-svn: 145141
2011-11-26 03:38:02 +00:00
Bruno Cardoso Lopes
626d04cc6f This patch contains support for encoding FMA4 instructions and
tablegen patterns for scalar FMA4 operations and intrinsic. Also
add tests for vfmaddsd.

Patch by Jan Sjodin

llvm-svn: 145133
2011-11-25 19:33:42 +00:00
Craig Topper
e761f42368 Remove 256-bit specific node types for UNPCKHPS/D and instead use the 128-bit versions and let the operand type disinquish. Also fix the load form of the v8i32 patterns for these to realize that the load would be promoted to v4i64.
llvm-svn: 145126
2011-11-24 22:57:10 +00:00
Benjamin Kramer
d03fc374bd X86: alias cqo to cqto.
llvm-svn: 145121
2011-11-24 12:02:46 +00:00
Chandler Carruth
f6e96b54f8 Fix a silly use-after-free issue. A much earlier version of this code
need lots of fanciness around retaining a reference to a Chain's slot in
the BlockToChain map, but that's all gone now. We can just go directly
to allocating the new chain (which will update the mapping for us) and
using it.

Somewhat gross mechanically generated test case replicates the issue
Duncan spotted when actually testing this out.

llvm-svn: 145120
2011-11-24 11:23:15 +00:00
Chandler Carruth
1d3f68ffd0 When adding blocks to the list of those which no longer have any CFG
conflicts, we should only be adding the first block of the chain to the
list, lest we try to merge into the middle of that chain. Most of the
places we were doing this we already happened to be looking at the first
block, but there is no reason to assume that, and in some cases it was
clearly wrong.

I've added a couple of tests here. One already worked, but I like having
an explicit test for it. The other is reduced from a test case Duncan
reduced for me and used to crash. Now it is handled correctly.

llvm-svn: 145119
2011-11-24 08:46:04 +00:00
Richard Smith
d647537b9c Correctly byte-swap APInts with bit-widths greater than 64.
llvm-svn: 145111
2011-11-23 21:33:37 +00:00
Duncan Sands
3c1878ef53 Fix a crash in which a multiplication was being reported as being both negative
and positive: positive, because it could be directly computed to be positive;
negative, because the nsw flags means it is either negative or undefined (the
multiplication always overflowed).

llvm-svn: 145104
2011-11-23 16:26:47 +00:00
Benjamin Kramer
825892c47f X86: Use btq for bit tests if the immediate can't be encoded in 32 bits.
Before:
	movabsq	$4294967296, %rax       ## encoding: [0x48,0xb8,0x00,0x00,0x00,0x00,0x01,0x00,0x00,0x00]
	testq	%rax, %rdi              ## encoding: [0x48,0x85,0xf8]
	jne	LBB0_2                  ## encoding: [0x75,A]

After:
	btq	$32, %rdi               ## encoding: [0x48,0x0f,0xba,0xe7,0x20]
	jb	LBB0_2                  ## encoding: [0x72,A]

btq is usually slower than testq because it doesn't fuse with the jump, but here we're better off
saving one register and a giant movabsq.

llvm-svn: 145103
2011-11-23 13:54:17 +00:00
NAKAMURA Takumi
364c7c80ec test/CodeGen/X86/block-placement.ll: Add explicit -mtriple=i686-linux. X86 Win32 CodeGen does not support EH yet.
llvm-svn: 145101
2011-11-23 12:18:22 +00:00
Chandler Carruth
dcf04fc35a Relax an invariant that block placement was trying to assert a bit
further. This invariant just wasn't going to work in the face of
unanalyzable branches; we need to be resillient to the phenomenon of
chains poking into a loop and poking out of a loop. In fact, we already
were, we just needed to not assert on it.

This was found during a bootstrap with block placement turned on.

llvm-svn: 145100
2011-11-23 10:35:36 +00:00