1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

86575 Commits

Author SHA1 Message Date
Matt Arsenault
97a3b39dcb AMDGPU: Restore AMDGPU prefixed rsq intrinsic for now
Also move into backend intrinsics to discourage use of the old name.

llvm-svn: 258783
2016-01-26 04:14:16 +00:00
Dan Gohman
bf8e0c60a6 [WebAssembly] Optimize memcpy/memmove/memcpy calls.
These calls return their first argument, but because LLVM uses an intrinsic
with a void return type, they can't use the returned attribute. Generalize
the store results pass to optimize these calls too.

llvm-svn: 258781
2016-01-26 04:01:11 +00:00
Dan Gohman
82741c1205 [WebAssembly] Remove a completed entry from the README.txt.
llvm-svn: 258780
2016-01-26 03:43:48 +00:00
Dan Gohman
e694253126 [WebAssembly] Implement unaligned loads and stores.
Differential Revision: http://reviews.llvm.org/D16534

llvm-svn: 258779
2016-01-26 03:39:31 +00:00
Haicheng Wu
5302d65f58 [LIR] Add support for structs and hand unrolled loops
This is a recommit of r258620 which causes PR26293.

The original message:

Now LIR can turn following codes into memset:

typedef struct foo {
  int a;
  int b;
} foo_t;

void bar(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; ++i) {
    f[i].a = 0;
    f[i].b = 0;
  }
}

void test(foo_t *f, unsigned n) {
  for (unsigned i = 0; i < n; i += 2) {
    f[i] = 0;
    f[i+1] = 0;
  }
}

llvm-svn: 258777
2016-01-26 02:27:47 +00:00
Reid Kleckner
0081990bfb Use binary search for intrinsic ID lookups
This improves compile time of Function.cpp from 57s to 37s for me
locally.  Intrinsic IDs are cached on the Function object, so this
shouldn't regress performance.

llvm-svn: 258774
2016-01-26 02:06:41 +00:00
Matthias Braun
036b32dcef LiveIntervalAnalysis: Improve some comments
As recommended by Justin.

llvm-svn: 258771
2016-01-26 01:40:48 +00:00
Reid Kleckner
b834dd994c Sort intrinsics by LLVM intrinsic name, rather than tablegen def name
Step one towards using a simple binary search to lookup intrinsic IDs
instead of our crazy table generated switch+memcmp+startswith code that
makes Function.cpp take about a minute to compile.  See PR24785 and
PR11951 for why we should do this.

The X86 backend contains tables that need to be sorted on intrinsic ID,
so reorder those.

llvm-svn: 258757
2016-01-26 00:55:00 +00:00
Matthias Braun
7d3caf5306 LiveIntervalAnalysis: Cleanup handleMove{Down|Up}() functions, NFC
These two functions are hard to reason about. This commit makes the code
more comprehensible:

- Use four distinct variables (OldIdxIn, OldIdxOut, NewIdxIn, NewIdxOut)
  with a fixed value instead of a changing iterator I that points to
  different things during the function.
- Remove the early explanation before the function in favor of more
  detailed comments inside the function. Should have more/clearer comments now
  stating which conditions are tested and which invariants hold at
  different points in the functions.

The behaviour of the code was not changed.

I hope that this will make it easier to review the changes in
http://reviews.llvm.org/D9067 which I will adapt next.

Differential Revision: http://reviews.llvm.org/D16379

llvm-svn: 258756
2016-01-26 00:43:50 +00:00
Dan Gohman
a72e83c26e [MC] Use .p2align instead of .align
For historic reasons, the behavior of .align differs between targets.
Fortunately, there are alternatives, .p2align and .balign, which make the
interpretation of the parameter explicit, and which behave consistently across
targets.

This patch teaches MC to use .p2align instead of .align, so that people reading
code for multiple architectures don't have to remember which way each platform
does its .align directive.

Differential Revision: http://reviews.llvm.org/D16549

llvm-svn: 258750
2016-01-26 00:03:25 +00:00
Philip Reames
75dc59c9a3 [GVN] Rearrange code to make local vs non-local cases more obvious [NFCI]
llvm-svn: 258747
2016-01-25 23:37:53 +00:00
Evgeniy Stepanov
258db6665b [cfi] Cross-DSO CFI diagnostic mode (LLVM part).
* __cfi_check gets a 3rd argument: ubsan handler data
* Instead of trapping on failure, call __cfi_check_fail which must be
  present in the module (generated in the frontend).

llvm-svn: 258746
2016-01-25 23:35:03 +00:00
Philip Reames
9298d3408c [GVN] Factor out common code [NFCI]
We had the same code duplicated for each type of Def.  We also have the entire block duplicated between the local and non-local case, but let's start with local cleanup.

llvm-svn: 258740
2016-01-25 23:19:12 +00:00
Matthias Braun
69950e68ba X86ISelLowering: Fix cmov(cmov) special lowering bug
There's a special case in EmitLoweredSelect() that produces an improved
lowering for cmov(cmov) patterns. However this special lowering is
currently broken if the inner cmov has multiple users so this patch
stops using it in this case.

If you wonder why this wasn't fixed by continuing to use the special
lowering and inserting a 2nd PHI for the inner cmov: I believe this
would incur additional copies/register pressure so the special lowering
does not improve upon the normal one anymore in this case.

This fixes http://llvm.org/PR26256 (= rdar://24329747)

llvm-svn: 258729
2016-01-25 22:08:25 +00:00
Teresa Johnson
a52cfbb4e3 [ThinLTO] Find all needed metadata when linking metadata as postpass
For metadata postpass linking, after importing all functions, we need
to recursively walk through any nodes reached via imported functions to
locate needed subprogram metadata. Some might only be reached indirectly
via the variable list for an inlined function.

llvm-svn: 258728
2016-01-25 22:04:56 +00:00
Simon Pilgrim
a63717e4c0 [X86][AVX] Add commutation support for VPERM2X128 instructions
Its main use is to allow memory folding of the 1st operand

Differential Revision: http://reviews.llvm.org/D16521

llvm-svn: 258726
2016-01-25 21:51:34 +00:00
Teresa Johnson
8b2ed4fc5e [ThinLTO] Handle DISubprogram reached indirectly from DIImportedEntity
Extend fix for PR26037 to identify DISubprogram reached from a
DIImportedEntity via a DILexicalBlock.

llvm-svn: 258722
2016-01-25 21:29:55 +00:00
Lawrence Hu
baafd4c214 Enable loopreroll to rerool loop with pointer induction variable.
Example:

while (buf !=end ) {
   S += buf[0];
   S += buf[1];
   buf +=2;
};

Differential Revision: http://reviews.llvm.org/D13151

llvm-svn: 258709
2016-01-25 19:43:45 +00:00
Lawrence Hu
0572a631ee Undo commit 258700 due to missing commit message
llvm-svn: 258708
2016-01-25 19:36:30 +00:00
Matthew Simpson
d9e4b63bf8 Reapply commit r25804 with fix
We were hitting an assertion because we were computing smaller type sizes for
instructions that cannot be demoted. The fix first determines the instructions
that will be demoted, and then applies the smaller type size to only those
instructions.

This should fix PR26239.

llvm-svn: 258705
2016-01-25 19:24:29 +00:00
Quentin Colombet
06230e1d45 Speculatively revert r258620 as it is the likely culprid of PR26293.
llvm-svn: 258703
2016-01-25 19:12:49 +00:00
Ivan Krasin
09873095d3 Temporary disable broken fuzzer/timeout tests.
Reviewers: kcc

Differential Revision: http://reviews.llvm.org/D16543

llvm-svn: 258702
2016-01-25 19:05:45 +00:00
Lawrence Hu
1cf7c9fba6 Differential Revision: http://reviews.llvm.org/D13151
llvm-svn: 258700
2016-01-25 18:53:39 +00:00
Dan Gohman
d5075eb344 [WebAssembly] Fix unbalanced register stack code in the case of late DCE.
Instructions can be DCE'd after the RegStackify pass. If the instruction which
would be the pop for what would be a push is removed, don't use a push.

llvm-svn: 258694
2016-01-25 16:48:44 +00:00
Dan Gohman
6aa3a18666 [WebAssembly] Minor code formatting cleanups. NFC.
llvm-svn: 258692
2016-01-25 15:12:05 +00:00
Dan Gohman
105451ecf0 [SelectionDAG] Use the correct return type for memcpy, memmove, and memset.
When generating calls to memcpy, memmove, and memset, use void* as the return
type rather than void, to match the standard signatures for these functions.

This has no practical effect for most targets, since the return values of
these calls aren't being used anyway, and most calling conventions tolerate
this kind of mismatch. However, this change will help support future
optimizations to utilize the return value to avoid holding the argument
value live across a call.

llvm-svn: 258691
2016-01-25 15:05:56 +00:00
James Molloy
fe66a6200d [DemandedBits] Fix computation of demanded bits for ICmps
The computation of ICmp demanded bits is independent of the individual operand being evaluated. We simply return a mask consisting of the minimum leading zeroes of both operands.

We were incorrectly passing "I" to ComputeKnownBits - this should be "UserI->getOperand(0)". In cases where we were evaluating the 1th operand, we were taking the minimum leading zeroes of it and itself.

This should fix PR26266.

llvm-svn: 258690
2016-01-25 14:49:36 +00:00
Michael Zuckerman
847379aa25 [AVX512] Adding PTESTNMB/D/W/Q instruction
Differential Revision: http://reviews.llvm.org/D16520

llvm-svn: 258688
2016-01-25 14:43:23 +00:00
Michael Zuckerman
5131ab0907 [AVX512] Adding PTESTMB/W/D/Q instruction
Differential Revision: http://reviews.llvm.org/D16519

llvm-svn: 258686
2016-01-25 13:27:32 +00:00
Bradley Smith
849b958836 [ARM] Add DSP build attribute and extension targeting
This patch was originally committed as r257885, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258683
2016-01-25 11:26:11 +00:00
Bradley Smith
28db0fcf02 [ARM] Add new system registers to ARMv8-M Baseline/Mainline
This patch was originally committed as r257884, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258682
2016-01-25 11:25:36 +00:00
Bradley Smith
bb33c3478a [ARM] Add ARMv8-M security extension instructions to ARMv8-M Baseline/Mainline
This patch was originally committed as r257883, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.

llvm-svn: 258681
2016-01-25 11:24:47 +00:00
Asaf Badouh
ec3729528a [X86][IFMA] adding intrinsics and encoding for multiply and add of unsigned 52bit integer
VPMADD52LUQ - Packed Multiply of Unsigned 52-bit Integers and Add the Low 52-bit Products to Qword Accumulators
 VPMADD52HUQ - Packed Multiply of Unsigned 52-bit Unsigned Integers and Add High 52-bit Products to 64-bit Accumulators

Differential Revision: http://reviews.llvm.org/D16407

llvm-svn: 258680
2016-01-25 11:14:24 +00:00
Oliver Stannard
43a6ec7452 [ARM] Add ARMv8.2-A FP16 scalar instructions
This was originally committed as r255762, but reverted as it broke windows
bots. Re-commitiing the exact same patch, as the underlying cause was fixed by
r258677.

ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

The assembly for these instructions uses S registers (AArch32 does not
have H registers), but the instructions have ".f16" type specifiers
rather than ".f32" or ".f64". The top 16 bits of each source register
are ignored, and the top 16 bits of the destination register are set to
zero.

These instructions are mostly the same as the 32- and 64-bit versions,
but they use coprocessor 9 rather than 10 and 11.

Two new instructions, VMOVX and VINS, have been added to allow packing
and extracting two 16-bit floats stored in the top and bottom halves of
an S register.

New fixup kinds have been added for the PC-relative load and store
instructions, but no ELF relocations have been added as they have a
range of 512 bytes.

Differential Revision: http://reviews.llvm.org/D15038

llvm-svn: 258678
2016-01-25 10:26:26 +00:00
Oliver Stannard
4ff6a9f924 [TableGen] Fix sort order of asm operand classes
This is a fix for https://llvm.org/bugs/show_bug.cgi?id=22796.

The previous implementation of ClassInfo::operator< allowed cycles of classes
such that x < y < z < x, meaning that a list of them cannot be correctly
sorted, and the sort order could differ with different standard libraries.

The original implementation sorted classes by ValueName if they were otherwise
equal. This isn't strictly necessary, but some backends seem to accidentally
rely on it. If I reverse this comparison I get 8 test failures spread across
the AArch64, Mips and X86 backends, so I have left it in until those backends
can be fixed.

There was one case in the X86 backend where the observable behaviour of the
assembler is changed by this patch. This was because some of the memory asm
operands were not marked as children of X86MemAsmOperand.

Differential Revision: http://reviews.llvm.org/D16141

llvm-svn: 258677
2016-01-25 10:20:19 +00:00
Junmo Park
2adffa1342 Silence a -Wparentheses warning; NFC.
llvm-svn: 258676
2016-01-25 10:17:17 +00:00
Igor Breger
66fa90c341 AVX1 : Enable vector masked_load/store to AVX1.
Use AVX1 FP instructions (vmaskmovps/pd) in place of the AVX2 int instructions (vpmaskmovd/q).

Differential Revision: http://reviews.llvm.org/D16528

llvm-svn: 258675
2016-01-25 10:17:11 +00:00
Michael Zuckerman
e13e338f45 [AVX512] [CMPPS ][ CMPPD ] Adding full Comparison Predicate names
X86AsmParser.cpp is missing full comparison predicate names for CMPPD and CMPPS Instructions.
X86AsmParser.cpp defines only the short names of the Comparison predicate that you can find in the following pdf:
https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf
Page 5-61 table 5-3

Differential Revision: http://reviews.llvm.org/D16518

llvm-svn: 258671
2016-01-25 08:43:26 +00:00
Lang Hames
19769657b9 [Object][COFF] Revert r258665 - It doesn't do what I had intended.
I'm discussing the right approach for tracking visibility for COFF symbols on
the llvm-dev list.

llvm-svn: 258666
2016-01-25 01:21:45 +00:00
Lang Hames
84361d0d8a [Object][COFF] Set the generic SF_Exported flag on COFF exported symbols.
The ORC ObjectLinkingLayer uses this flag during symbol lookup. Failure to set
it causes all symbols to behave as if they were non-exported, which has caused
failures in the kaleidoscope tutorials on Windows. Raising the flag should
un-break the tutorials.

No test case yet - none of the existing command line tools for printing symbol
tables (llvm-nm, llvm-objdump) show the status of this flag, and I don't want to
change the format from these tools without consulting their owners. I'll send an
email to the dev-list to figure out the right way forward.

llvm-svn: 258665
2016-01-24 21:56:40 +00:00
David Majnemer
ea5702b618 [COFF] Simplify SetSectionName
Consolidate the code which handles string table offsets less than 999999
with the code for offsets less than 9999999.  While we are here,
simplify the code by not using sprintf to generate the string.

llvm-svn: 258664
2016-01-24 20:46:11 +00:00
David Majnemer
51c9237bd6 [LoopSimplify] Reuse changeToUnreachable
Use existing functionality provided in changeToUnreachable instead of
reinventing it in LoopSimplify.

No functionality change is intended.

llvm-svn: 258663
2016-01-24 19:32:52 +00:00
David Majnemer
58f71414f2 Fix build bot breakage
llvm-svn: 258661
2016-01-24 16:46:53 +00:00
Elena Demikhovsky
832e2d5858 Added Skylake client to X86 targets and features
Changes in X86.td:

I set features of Intel processors in incremental form: IVB = SNB + X HSW = IVB + X ..
I added Skylake client processor and defined it's features
FeatureADX was missing on KNL
Added some new features to appropriate processors SMAP, IFMA, PREFETCHWT1, VMFUNC and others

Differential Revision: http://reviews.llvm.org/D16357

llvm-svn: 258659
2016-01-24 10:41:28 +00:00
Amjad Aboud
01175f32ea Fixed few comments.
llvm-svn: 258658
2016-01-24 08:18:55 +00:00
Igor Breger
f91b2666bb AVX512: VMOVDQU8/16/32/64 (load) intrinsic implementation.
Differential Revision: http://reviews.llvm.org/D16137

llvm-svn: 258657
2016-01-24 08:04:33 +00:00
David Majnemer
5ab451ac92 Fix buildbot failures
llvm-svn: 258655
2016-01-24 06:40:37 +00:00
David Majnemer
bfc3671cd7 [SCCP] Remove duplicate code
SCCP has code identical to changeToUnreachable's behavior, switch it
over to just call changeToUnreachable.

No functionality change intended.

llvm-svn: 258654
2016-01-24 06:26:47 +00:00
David Majnemer
0fce247968 [InstCombine, SCCP] Consolidate code used to remove instructions
InstCombine and SCCP both want to remove dead code in a very particular
way but using identical means to do so.  Share the code between the two.

No functionality change is intended.

llvm-svn: 258653
2016-01-24 05:26:18 +00:00
David Majnemer
1fad37bd8d [WinEH] Don't miscompile cleanups which conditionally unwind to caller
A cleanup can have paths which unwind or end up in unreachable.
If there is an unreachable path *and* a path which unwinds to caller,
we would mistakenly inject an unwind path to a catchswitch on the
unreachable path.  This results in a verifier assertion firing because
the cleanup unwinds to two different places: to the caller and to the
catchswitch.

This occured because we used getCleanupRetUnwindDest to determine if the
cleanuppad had no cleanuprets.
This is incorrect, getCleanupRetUnwindDest returns null for cleanuprets
which unwind to caller.

llvm-svn: 258651
2016-01-23 23:54:33 +00:00