mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
28635 Commits

Author SHA1 Message Date
Matt Arsenault
670643e5d9 R600/SI: Add missing offset operand to buffer bothen
llvm-svn: 229605
2015-02-18 02:04:38 +00:00
Matt Arsenault
3d3d606dd0 R600/SI: Add missing soffset operand to global atomics
llvm-svn: 229604
2015-02-18 02:04:35 +00:00
Sanjoy Das
f5d762cf78 Generalize getExtendAddRecStart to work with both sign and zero
extensions.

This change also removes `DEBUG(dbgs() << "SCEV: untested prestart
overflow check\n");` because that case has a unit test now.

Differential Revision: http://reviews.llvm.org/D7645

llvm-svn: 229600
2015-02-18 01:47:07 +00:00
Andrea Di Biagio
016c12ee8d [X86][FastIsel] Teach how to select scalar integer to float/double conversions.
This patch teaches fast-isel how to select a (V)CVTSI2SSrr for an integer to 
float conversion, and how to select a (V)CVTSI2SDrr for an integer to double
conversion.

Added test 'fast-isel-int-float-conversion.ll'.

Differential Revision: http://reviews.llvm.org/D7698

llvm-svn: 229589
2015-02-17 23:40:58 +00:00
Rafael Espindola
be6855ec2a Add r228939 back with a fix.
The problem in the original patch was not switching back to .text after printing
an eh table.

Original message:

On ELF, put PIC jump tables in a non executable section.

Fixes PR22558.

llvm-svn: 229586
2015-02-17 23:34:51 +00:00
Rafael Espindola
686783175f Add a test showing the problem in r228939.
If an EH table is printed in between the function and the jump table we would
fail to switch back to the text section to print the jump table.

llvm-svn: 229580
2015-02-17 23:21:46 +00:00
Simon Pilgrim
6da8e17d64 [X86][SSE] Generalised unpckl/unpckh shuffle matching
Added commuted unpckl/unpckh shuffle matching patterns as many cases containing undefined lanes fail to commute by themselves.

Differential Revision: http://reviews.llvm.org/D7564

llvm-svn: 229571
2015-02-17 22:24:32 +00:00
Sanjay Patel
f2c498a0c4 use a triple instead of a cpu; less buildbot sadness
llvm-svn: 229563
2015-02-17 21:59:54 +00:00
Kevin Enderby
5109879fd6 Add code to llvm-objdump so the -section option with -macho will dump literal pointer sections
with the Mach-O S_LITERAL_POINTERS section type.

Also fix the printing of the leading addresses for literal sections to be consistent and
not print the 0x prefix.  Updated test cases to match.

llvm-svn: 229548
2015-02-17 21:35:48 +00:00
Rafael Espindola
6d441224de Add testcases I missed in r229541.
llvm-svn: 229542
2015-02-17 20:50:39 +00:00
Sanjay Patel
ec471315ab make basic block label matching more flexible for less sad buildbots
llvm-svn: 229535
2015-02-17 20:29:31 +00:00
Tom Stellard
225ebd9101 R600/SI: Fix asan errors in SIFoldOperands
We were trying to fold into implicit uses, which led to out of bounds
access of the MCInstrDesc::OpInfo array.

llvm-svn: 229533
2015-02-17 20:11:54 +00:00
Sanjay Patel
5abe0e38c5 prevent folding a scalar FP load into a packed logical FP instruction (PR22371)
Change the memory operands in sse12_fp_packed_scalar_logical_alias from scalars to vectors. 
That's what the hardware packed logical FP instructions define: 128-bit memory operands.
There are no scalar versions of these instructions...because this is x86.

Generating the wrong code (folding a scalar load into a 128-bit load) is still possible
using the peephole optimization pass and the load folding tables. We won't completely
solve this bug until we either fix the lowering in fabs/fneg/fcopysign and any other
places where scalar FP logic is created or fix the load folding in foldMemoryOperandImpl()
to make sure it isn't changing the size of the load.

Differential Revision: http://reviews.llvm.org/D7474

llvm-svn: 229531
2015-02-17 20:08:21 +00:00
Simon Atanasyan
3c81b245de [Object] Support reading 64-bit MIPS ELF archives
The 64-bit MIPS ELF archive file format is used by MIPS64 targets.
The main difference from a regular archive file is the symbol table format:
1. ar_name is equal to "/SYM64/"
2. number of symbols and offsets are 64-bit integers

http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.pdf
Page 96

The patch allows reading of such archive files by llvm-nm, llvm-objdump
and other tools. But it does not support archive files in which the number of
symbols and/or offsets exceeds 2^32. I think that is a rather rare case that
requires more significant modification of the `Archive` class code.
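
As a rough illustration of the two differences listed above, the sketch below checks the member name and decodes a 64-bit value from the symbol table. It assumes the integers are stored big-endian, as in the regular GNU-style archive symbol table, and that `ar_name` is the usual 16-byte space-padded field. The function names are hypothetical and this is not the `Archive` class code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: ar_name is a 16-byte field padded with spaces. */
int is_sym64_member(const char ar_name[16]) {
    static const char tag[] = "/SYM64/";
    size_t n = sizeof(tag) - 1;
    if (memcmp(ar_name, tag, n) != 0)
        return 0;
    for (size_t i = n; i < 16; ++i)
        if (ar_name[i] != ' ')
            return 0;
    return 1;
}

/* Assumption: symbol count and offsets are 64-bit big-endian integers. */
uint64_t read_be64(const unsigned char *p) {
    assert(p != NULL);
    uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v = (v << 8) | p[i];
    return v;
}

/* Values outside the 32-bit range are the case the patch leaves unsupported. */
int fits_in_32_bits(uint64_t v) {
    return v <= UINT32_MAX;
}
```

A reader built this way would decode each count or offset with `read_be64` and give up on the archive when `fits_in_32_bits` returns 0.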

http://reviews.llvm.org/D7546

llvm-svn: 229520
2015-02-17 18:54:22 +00:00
Sanjay Patel
0117b06b3f Canonicalize splats as build_vectors (PR22283)
This is a follow-on patch to:
http://reviews.llvm.org/D7093

That patch canonicalized constant splats as build_vectors, 
and this patch removes the constant check so we can canonicalize
all splats as build_vectors.

This fixes the 2nd test case in PR22283:
http://llvm.org/bugs/show_bug.cgi?id=22283

The unfortunate code duplication between SelectionDAG and DAGCombiner
is discussed in the earlier patch review. At least this patch is just
removing code...

This improves an existing x86 AVX test and changes codegen in an ARM test.

Differential Revision: http://reviews.llvm.org/D7389

llvm-svn: 229511
2015-02-17 16:54:32 +00:00
Tom Stellard
521960e294 R600/SI: Extend private extload pattern to include zext loads
llvm-svn: 229507
2015-02-17 16:36:00 +00:00
Elena Demikhovsky
b1743d071c Fixed a bug in store sinking.
The problem was in store-sink barrier check.

Store sink barrier should be checked for ModRef (read-write) mode.

http://llvm.org/bugs/show_bug.cgi?id=22613

llvm-svn: 229495
2015-02-17 13:10:05 +00:00
Andrea Di Biagio
c38ea9f435 [X86][FastISel] Add missing flag -fast-isel-abort to run lines in test fast-isel-fptrunc-fpext.ll.
Flag -fast-isel-abort is required in order to verify that X86FastISel
never fails to select FPExt (float-to-double) and FPTrunc (double-to-float).
No Functional change intended.

llvm-svn: 229489
2015-02-17 12:25:49 +00:00
Elena Demikhovsky
30ee20b16b AVX-512: changes in intel_ocl_bi calling conventions
- added mask types v8i1 and v16i1 to possible function parameters
- enabled passing 512-bit vectors in standard CC
- added a test for KNL intel_ocl_bi conventions

llvm-svn: 229482
2015-02-17 09:20:12 +00:00
Michael Kuperstein
812c46b9de [X86] Combine vector anyext + and into a vector zext
Vector zext tends to get legalized into a vector anyext, represented as a vector shuffle with an undef vector + a bitcast, that gets ANDed with a mask that zeroes the undef elements.
Combine this into an explicit shuffle with a zero vector instead. This allows shuffle lowering to match it as a zext, instead of matching it as an anyext and emitting an explicit AND.
This combine only covers a subset of the cases, but it's a start.
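
The equivalence this combine relies on can be sketched for a single lane in C. This is only an illustration, not the X86 lowering code: `filler` stands in for the undef shuffle operand, and both function names are hypothetical.

```c
#include <stdint.h>

/* anyext + AND: the high byte starts out as arbitrary filler (the undef
   lanes of the shuffle) and is then cleared by the mask. */
uint16_t anyext_then_and(uint8_t v, uint8_t filler) {
    uint16_t any = (uint16_t)(((unsigned)filler << 8) | v);
    return (uint16_t)(any & 0x00FFu);
}

/* zext: the high byte is zero by construction, so no AND is needed. */
uint16_t zext(uint8_t v) {
    return (uint16_t)v;
}
```

Whatever value `filler` takes, both functions return the same result, which is why the shuffle with a zero vector can replace the anyext and the explicit AND.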

Differential Revision: http://reviews.llvm.org/D7666

llvm-svn: 229480
2015-02-17 08:22:51 +00:00
Eric Christopher
49ad15fa29 Move ABI handling and 64-bitness to the PowerPC target machine.
This required changing how the computation of the ABI is handled
and how some of the checks for ABI/target are done.

llvm-svn: 229471
2015-02-17 06:45:15 +00:00
Chandler Carruth
b51ccf2731 [x86] Teach the unpack lowering to try wider element unpacks.
This allows it to match still more places where previously we would have
to fall back on floating point shuffles or other more complex lowering
strategies.

I'm hoping to replace some of the hand-rolled unpack matching with this
routine as it gets more and more clever.

llvm-svn: 229463
2015-02-17 02:12:24 +00:00
Hal Finkel
c9890f4fe1 [BDCE] Add a bit-tracking DCE pass
BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the
"aggressive DCE" pass), with the added capability to track dead bits of integer
valued instructions and remove those instructions when all of the bits are
dead.

Currently, it does not actually do this all-bits-dead removal, but rather
replaces the instruction's uses with a constant zero, and lets instcombine (and
the later run of ADCE) do the rest. Because we essentially get a run of ADCE
"for free" while tracking the dead bits, we also do what ADCE does and remove
actually-dead instructions as well (this includes instructions newly trivially
dead because all bits were dead, but not all such instructions can be removed).

The motivation for this is a case like:

int __attribute__((const)) foo(int i);
int bar(int x) {
  x |= (4 & foo(5));
  x |= (8 & foo(3));
  x |= (16 & foo(2));
  x |= (32 & foo(1));
  x |= (64 & foo(0));
  x |= (128& foo(4));
  return x >> 4;
}

As it turns out, if you order the bit-field insertions so that all of the dead
ones come last, then instcombine will remove them. However, if you pick some
other order (such as the one above), the fact that some of the calls to foo()
are useless is not locally obvious, and we don't remove them (without this
pass).
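
The dead-bit reasoning behind this example can be sketched in a few lines of C. This is only an illustration of the idea, not the BDCE implementation: the helper names are hypothetical, and the shift is modelled on an unsigned 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: which input bits of a right shift can still affect
   the result bits that the user actually reads? Result bit i comes from
   input bit i + shift. */
uint32_t demanded_through_lshr(uint32_t demanded_out, unsigned shift) {
    assert(shift < 32);
    return demanded_out << shift;
}

/* Hypothetical helper: a term of the form (mask & value) that is OR'ed
   into x contributes nothing if none of its mask bits are demanded. */
int or_operand_is_dead(uint32_t mask, uint32_t demanded) {
    return (mask & demanded) == 0;
}
```

With every bit of `x >> 4` demanded, the demanded mask for `x` is `0xFFFFFFF0`, so the `4 &` and `8 &` terms are dead while the terms with masks 16 and above stay live.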

I did a quick compile-time overhead check using sqlite from the test suite
(Release+Asserts). BDCE took ~0.4% of the compilation time (making it about
twice as expensive as ADCE).

I've not looked at why yet, but we eliminate instructions due to having
all-dead bits in:
External/SPEC/CFP2006/447.dealII/447.dealII
External/SPEC/CINT2006/400.perlbench/400.perlbench
External/SPEC/CINT2006/403.gcc/403.gcc
MultiSource/Applications/ClamAV/clamscan
MultiSource/Benchmarks/7zip/7zip-benchmark

llvm-svn: 229462
2015-02-17 01:36:59 +00:00
Hal Finkel
0e48ce380c Specify arch in test/CodeGen/X86/float-conv-elim.ll
This test was failing on non-x86 hosts because it specified a cpu of x86_64,
but not an architecture. x86_64 is obviously not a valid cpu on all
architectures.

llvm-svn: 229460
2015-02-17 00:11:19 +00:00
Hal Finkel
a9011331c4 [PowerPC] Support non-direct-sub/superclass VSX copies
Our register allocation has become better recently, it seems, and is now
starting to generate cross-block copies into inflated register classes. These
copies are not transformed into subregister insertions/extractions by the
PPCVSXCopy class, and so need to be handled directly by
PPCInstrInfo::copyPhysReg. The code to do this was *almost* there, but not
quite (it was unnecessarily restricting itself to only the direct
sub/super-register-class case, not copying between, for example, something in
VRRC and the lower-half of VSRC which are super-registers of F8RC).

Triggering this behavior manually is difficult; I'm including two
bugpoint-reduced test cases from the test suite.

llvm-svn: 229457
2015-02-16 23:46:30 +00:00
Cameron McInally
857d405820 [AVX512] Make 512b vector floating point rounds legal on AVX512.
llvm-svn: 229445
2015-02-16 22:15:42 +00:00
Simon Pilgrim
e9778d75a1 [X86][SSE] Add SSE MOVQ instructions to SSEPackedInt domain
Patch to explicitly add the SSE MOVQ (rr,mr,rm) instructions to SSEPackedInt domain - prevents a number of costly domain switches.

Differential Revision: http://reviews.llvm.org/D7600

llvm-svn: 229439
2015-02-16 21:50:56 +00:00
Mehdi Amini
d44ae390cb SelectionDAG: fold (fp_to_u/sint (s/uint_to_fp)) here too
Update SPARC tests to match.

From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 229438
2015-02-16 21:47:58 +00:00
Mehdi Amini
ce11b626e7 InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val))
Fixes radar 15486701.
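
Whether such a round trip can be folded away depends on the integer surviving the conversion unchanged. The sketch below illustrates that point in C and is not the InstCombine logic: it assumes IEEE-754 `float` (24-bit significand) and `double` (53-bit significand), and the function names are hypothetical.

```c
#include <stdint.h>

/* int -> float -> int. The volatile store forces a real single-precision
   value, so no excess precision hides the rounding. */
int32_t roundtrip_via_float(int32_t v) {
    volatile float f = (float)v;
    return (int32_t)f;
}

/* int -> double -> int. Every 32-bit integer fits in a 53-bit significand. */
int32_t roundtrip_via_double(int32_t v) {
    volatile double d = (double)v;
    return (int32_t)d;
}
```

Under those assumptions, 16777216 (2^24) survives the trip through `float` but 16777217 does not, while both survive the trip through `double`. A fold that drops the pair of conversions is only safe in the second kind of case.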

From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 229437
2015-02-16 21:47:54 +00:00
Mehdi Amini
1468b597ea Tests: reformat sitofp.ll and use FileCheck
From: Fiona Glaser <fglaser@apple.com>
llvm-svn: 229436
2015-02-16 21:47:50 +00:00
Craig Topper
b6e168f770 [X86] Remove the multiply by 8 that goes into the shift constant for X86ISD::VSHLDQ and X86ISD::VSRLDQ. This simplifies the pattern matching in isel and allows these nodes to become the patterns embedded in the instruction.
llvm-svn: 229431
2015-02-16 20:52:07 +00:00
David Majnemer
4da38e22ad ConstantFold: Properly fold GEP indices wider than i64
llvm-svn: 229420
2015-02-16 19:10:02 +00:00
Andrew Trick
e7964c82c7 AArch64: Safely handle the incoming sret call argument.
This adds a safe interface to the machine independent InputArg struct
for accessing the index of the original (IR-level) argument. When a
non-native return type is lowered, we generate the hidden
machine-level sret argument on-the-fly. Before this fix, we were
representing this argument as OrigArgIndex == 0, which is an outright
lie. In particular this crashed in the AArch64 backend where we
actually try to access the type of the original argument.

Now we use a sentinel value for machine arguments that have no
original argument index. AArch64, ARM, Mips, and PPC now check for this
case before accessing the original argument.

Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering

llvm-svn: 229413
2015-02-16 18:10:47 +00:00
James Molloy
317bb7473f [LoopReroll] Relax some assumptions a little.
We won't find a root with index zero in any loop that we are able to reroll.
However, we may find one in a non-rerollable loop, so bail gracefully instead
of failing hard.

llvm-svn: 229406
2015-02-16 17:02:00 +00:00
James Molloy
382c1caece [LoopReroll] Don't crash on dead code
If a PHI has no users, don't crash; bail gracefully. This shouldn't
happen often, but we can make no guarantees that previous passes didn't leave
dead code around.

llvm-svn: 229405
2015-02-16 17:01:52 +00:00
Chandler Carruth
d44ede78e3 [x86] Add a generic unpack-targeted lowering technique. This can be used
to generically lower blends and is particularly nice because it is
available from SSE2 onward. This removes a lot of the remaining domain
crossing blends in SSE2 code.

I'm hoping to replace some of the "interleaved" lowering hacks with
something closer to this which should be more principled. First, this
needs to learn how to detect and use other interleavings besides that of
the natural type provided. That will be a follow-up patch though.

llvm-svn: 229378
2015-02-16 12:28:18 +00:00
Chandler Carruth
3816521c2e [x86] Switch this test to use checks generated by my update script. NFC
llvm-svn: 229377
2015-02-16 12:23:22 +00:00
Michael Kuperstein
70d8ac0a8e Fix quoting of #pragma comment for MS compat, LLVM part.
For #pragma comment(linker, ...) MSVC expects the comment string to be quoted, but for #pragma comment(lib, ...) the compiler itself quotes the library name.
Since this distinction disappears by the time the directive reaches the backend, move quoting for the "lib" version to the frontend.

Differential Revision: http://reviews.llvm.org/D7652

llvm-svn: 229375
2015-02-16 11:57:17 +00:00
Chandler Carruth
358c1db65e [x86] Add initial basic support for forming blends of v16i8 vectors.
This blend instruction is ... really lame. The register usage is insane.
As a consequence this is probably only *barely* better than 2 pshufbs
followed by a por, and that mostly because it only has to read from
a single memory location.

However, this doesn't fix as much as I kind of expected, so more to go.
Pretty sure that the ordering and delegation of v16i8 is just really,
really bad.

llvm-svn: 229373
2015-02-16 10:58:23 +00:00
Chandler Carruth
db7a8ca276 [x86] Add some more test cases for i8 vector blends.
llvm-svn: 229372
2015-02-16 10:51:49 +00:00
David Majnemer
64cd3dbda5 IR: SrcTy == DstTy doesn't imply that a cast is valid
Cast validity depends on the cast's kind, not just its types.
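
A toy model in C makes the point concrete. It describes integer types only by bit width and follows the LLVM IR rules that `trunc` needs a strictly narrower destination and `zext`/`sext` a strictly wider one. The enum and function names are hypothetical, and this is not the verifier code.

```c
#include <assert.h>

enum cast_kind { CAST_TRUNC, CAST_ZEXT, CAST_SEXT, CAST_BITCAST };

/* Toy check: validity is decided by the kind of cast, not by whether the
   source and destination types happen to match. */
int int_cast_is_valid(enum cast_kind kind, unsigned src_bits, unsigned dst_bits) {
    assert(src_bits > 0 && dst_bits > 0);
    switch (kind) {
    case CAST_TRUNC:
        return src_bits > dst_bits;
    case CAST_ZEXT:
    case CAST_SEXT:
        return src_bits < dst_bits;
    case CAST_BITCAST:
        return src_bits == dst_bits;
    }
    return 0;
}
```

In this model an i32 to i32 `bitcast` is accepted, but an i32 to i32 `trunc` or `zext` is rejected even though the source and destination types are equal.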

llvm-svn: 229366
2015-02-16 09:37:35 +00:00
David Majnemer
4f5d97ee4f AsmParser: extractvalue requires at least one index operand
llvm-svn: 229365
2015-02-16 09:18:13 +00:00
David Majnemer
7f40c08dca AsmParser: Make sure GlobalVariables have sane types
llvm-svn: 229364
2015-02-16 08:41:08 +00:00
David Majnemer
b5464fbff9 AsmParser: Reject alloca with function type
llvm-svn: 229363
2015-02-16 08:38:03 +00:00
David Majnemer
9580d6a824 Verifier: Diagnose module flags which have null ID operands
llvm-svn: 229361
2015-02-16 08:14:22 +00:00
Craig Topper
b3a29e8067 [X86] Add support for lowering shuffles to 256-bit PALIGNR instruction.
llvm-svn: 229359
2015-02-16 06:29:06 +00:00
Craig Topper
988e9c859c [X86] Remove some hard tab characters from tests.
llvm-svn: 229358
2015-02-16 06:29:02 +00:00
David Majnemer
3ae73b8e74 DebugInfo: Don't crash if 'Debug Info Version' has a strange value
llvm-svn: 229356
2015-02-16 06:04:53 +00:00
David Majnemer
271992a42e DataLayout: Validate that the pref alignment is at least the ABI align
llvm-svn: 229355
2015-02-16 05:41:55 +00:00
David Majnemer
18f3685387 DataLayout: Report when the datalayout type alignment/width is too large
llvm-svn: 229354
2015-02-16 05:41:53 +00:00