1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
Commit Graph

87834 Commits

Author SHA1 Message Date
Valery Pykhtin
78abad3379 [AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler.
Engages code from r262804.

Differential Revision: http://reviews.llvm.org/D17151

llvm-svn: 262808
2016-03-06 20:25:36 +00:00
Valery Pykhtin
cc0f8f9567 fix sanitizer-ppc64be-linux failure for r262804
error: moving a local object in a return statement prevents copy elision [-Werror,-Wpessimizing-move]

http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/930

llvm-svn: 262805
2016-03-06 15:13:54 +00:00
Valery Pykhtin
7df6fd953a [AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields
Differential Revision: http://reviews.llvm.org/D17150

llvm-svn: 262804
2016-03-06 13:27:13 +00:00
Igor Breger
c376e5b7a2 AVX512BW: Support llvm intrinsic masked vector load/store for i8/i16 element types on SKX
Differential Revision: http://reviews.llvm.org/D17913

llvm-svn: 262803
2016-03-06 12:38:58 +00:00
Valery Pykhtin
a9fc8dac69 [AMDGPU] SOPxx instructions operand naming fixed in td files.
dst -> sdst
ssrcN -> srcN

Differential Revision: http://reviews.llvm.org/D17646

llvm-svn: 262801
2016-03-06 10:31:44 +00:00
Craig Topper
1ff9694fdd [X86] Use high bits of return value from getEncoding instead of predicate functions to populate the REX and VEX prefix bits that extend register encodings. NFC
llvm-svn: 262800
2016-03-06 08:12:47 +00:00
Craig Topper
83f6a31d8f [X86] Remove unnecessary masking. The assert above it already guaranteed it. NFC
llvm-svn: 262799
2016-03-06 08:12:44 +00:00
Craig Topper
30785a5dd9 [X86] Use uint8_t instead of unsigned char as it shortens the code and more explicitly reflects the desired size.
llvm-svn: 262798
2016-03-06 08:12:42 +00:00
Igor Breger
d0d6119cbd AVX512: Remove VSHRI kmask patterns from TD file. It is incorrect to use kshiftw to implement VSHRI v4i1 , bits 15-4 is undef so the upper bits of v4i1 may not be zeroed. v4i1 should be zero_extend to v16i1 ( or any natively supported vector).
Differential Revision: http://reviews.llvm.org/D17763

llvm-svn: 262797
2016-03-06 07:46:03 +00:00
Simon Pilgrim
0c991e17ab [X86][AVX] Improved VPERMILPS variable shuffle mask decoding.
Added support for decoding VPERMILPS variable shuffle masks that aren't in the constant pool.

Added target shuffle mask decoding for SCALAR_TO_VECTOR+VZEXT_MOVL cases - these can happen for v2i64 constant re-materialization

Followup to D17681

llvm-svn: 262784
2016-03-05 22:53:31 +00:00
Simon Pilgrim
203d86217a [X86] AMD Bobcat CPU (btver1) doesn't support XSAVE
btver1 is a SSSE3/SSE4a only CPU - it doesn't have AVX and doesn't support XSAVE.

Differential Revision: http://reviews.llvm.org/D17683

llvm-svn: 262782
2016-03-05 22:00:50 +00:00
Saleem Abdulrasool
6159f0c20b Support: catch invalid accesses
It is possible to invoke these methods on an invalid input resulting in an
invalid substring construction.  It seems that we do not have unit tests for
these methods.  Tests to ensure that the invalid call is caught to follow in
clang.

Resolves PR26839.

llvm-svn: 262778
2016-03-05 20:00:44 +00:00
Saleem Abdulrasool
24d4acb669 ExecutionEngine: tweak debug log
Add a newline to separate the log message.  NFC.

llvm-svn: 262777
2016-03-05 20:00:41 +00:00
Krzysztof Parzyszek
c3a6777f22 Add DAG mutation interface to the post-RA scheduler
Differential Revision: http://reviews.llvm.org/D17868

llvm-svn: 262774
2016-03-05 15:45:23 +00:00
Matthias Braun
d7e6a2dcfd RegisterCoalescer: Remap subregister lanemasks before exchanging operands
Rematerializing and merging into a bigger register class at the same
time, requires the subregister range lanemasks getting remapped to the
new register class.

This fixes http://llvm.org/PR26805

llvm-svn: 262768
2016-03-05 04:36:13 +00:00
Matthias Braun
2662d0d0fe RegisterCoalescer: Need to check DstReg+SrcReg for missing undef flags
copy coalescing with enabled subregister liveness can reveal undef uses,
previously this was only checked for the SrcReg in updateRegDefsUses()
but we need to check DstReg as well.

llvm-svn: 262767
2016-03-05 04:36:10 +00:00
Matthias Braun
0e737a768d RegisterPressure: Small cleanup
llvm-svn: 262766
2016-03-05 04:36:08 +00:00
Quentin Colombet
32c807a283 [X86] Fix the lowering of setjmp intrinsic on i386.
When the lowering of the setjmp intrinsic requires
a global base pointer to be set, make sure such pointer
gets defined by the CGBR pass.

This fixes PR26742.

llvm-svn: 262762
2016-03-05 00:31:04 +00:00
Quentin Colombet
248c4c35f1 [X86] Do not use cmpxchgXXb when we need the base pointer (RBX).
cmpxchgXXb uses RBX as one of its implicit argument. I.e., when
we use that instruction we need to clobber RBX. This is generally
fine, expect when RBX is a reserved register because in that case,
the register allocator will not track its value and will not
save and restore it when interferences occur.

rdar://problem/24851412

llvm-svn: 262759
2016-03-04 23:29:39 +00:00
Mike Aizatsky
a4ce43c759 [libfuzzer] adding std:string to allowed adaptable argument.
llvm-svn: 262757
2016-03-04 23:18:01 +00:00
David Majnemer
caaf1ef4e5 Fix build breakage
llvm-svn: 262756
2016-03-04 23:02:15 +00:00
David Majnemer
0db3c7acce [X86] Support cleaning more than 2**16 bytes of stack
The x86 ret instruction has a 16 bit immediate indicating how many bytes
to pop off of the stack beyond the return address.

There is a problem when extremely large structs are passed by value: we
might not be able to fit the number of bytes to pop into the return
instruction.

To fix this, expand RET_FLAG a little later and use a special sequence
to clean the stack:

pop  %ecx     ; return address is now in %ecx
add  $n, %esp ; clean the stack
push %ecx     ; bring the return address back on the stack
ret           ; pop the return address and jmp to it's value

llvm-svn: 262755
2016-03-04 22:56:17 +00:00
Kostya Serebryany
c6f46ed530 [libFuzzer] log less when re-loading files; fix a silly bug: when running single files actually run all of them, not just the first one
llvm-svn: 262754
2016-03-04 22:35:40 +00:00
Philip Reames
7a7bffd3c2 [LVI] Fix a bug which prevented use of !range metadata within a query
The diff is relatively large since I took a chance to rearrange the code I had to touch in a more obvious way, but the key bit is merely using the !range metadata when we can't analyze the instruction further.  The previous !range metadata code was essentially just dead since no binary operator or cast will have !range metadata (per Verifier) and it was otherwise dropped on the floor.

llvm-svn: 262751
2016-03-04 22:27:39 +00:00
Rong Xu
1103a979e7 [PGO] Add a commandline option to control number of the VP annotation metadata.
llvm-svn: 262750
2016-03-04 22:08:44 +00:00
Michael Kuperstein
f8caea219d [DAGCombine] Fix divrem combine not to assume div/rem type is simple.
The divrem combine assumed the type of the div/rem is simple, which isn't
necessarily true. This probably worked fine until r250825, since it only
saw legal types, but now breaks when it runs as a pre-type-legalization 
combine.

This fixes PR26835.

Differential Revision: http://reviews.llvm.org/D17878

llvm-svn: 262746
2016-03-04 21:23:29 +00:00
Dan Gohman
6d636cd56e [WebAssembly] Add another possible code-size optimization to README.txt
llvm-svn: 262740
2016-03-04 20:09:57 +00:00
Renato Golin
52bc44295a [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262738
2016-03-04 19:19:36 +00:00
Tom Stellard
c656bfbad2 AMDGPU/SI: Add support for spiling SGPRs to scratch buffer
Summary:
This is necessary for when we run out of VGPRs and can no
longer use v_{read,write}_lane for spilling SGPRs.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17592

llvm-svn: 262732
2016-03-04 18:31:18 +00:00
Tom Stellard
eaca07fc18 AMDGPU/SI: Enable frame index scavenging during PrologEpilogueInserter
Summary:
This allows us to use virtual registers when we need extra registers
for inserting spill instructions in SIRegisterInfo:eliminateFrameIndex().

Once all the frame indices have been eliminated, the
PrologEpilogueInserter does an extra pass over the program to replace
all virtual registers with physical ones.

This allows us to make more efficient use of our emergency spill slots,
so we only need to create one.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17591

llvm-svn: 262728
2016-03-04 18:02:01 +00:00
Krzysztof Parzyszek
0eb70ee94a [Hexagon] Fix lowering of calls with the return type of i1
This fixes an assertion in test/CodeGen/Hexagon/ifcvt-edge-weight.ll
when run with -debug-only=isel

llvm-svn: 262726
2016-03-04 17:38:05 +00:00
Zoran Jovanovic
ed3c27bd7d [mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for microMIPS is generated.
Author: milena.vujosevic.janicic
Reviewers: dsanders
Differential Revision: http://reviews.llvm.org/D17373

llvm-svn: 262725
2016-03-04 17:34:31 +00:00
Teresa Johnson
31ee1fa4e9 Change split code gen to use ThreadPool
Part of D15390.

llvm-svn: 262719
2016-03-04 15:39:13 +00:00
Sam Kolton
5660f4b2b1 Test commit access
llvm-svn: 262714
2016-03-04 12:29:14 +00:00
Valery Pykhtin
c7ff55dc02 test commit
llvm-svn: 262709
2016-03-04 10:59:50 +00:00
Benjamin Kramer
585cc07d12 Make headers self-contained again.
llvm-svn: 262702
2016-03-04 10:49:30 +00:00
Nikolay Haustov
3529c0cbe0 AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics
These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the
GL_ARB_shader_image_load_store extension.

Initial change by Nicolai H.hnle

Differential Revision: http://reviews.llvm.org/D17401

llvm-svn: 262701
2016-03-04 10:39:50 +00:00
Easwaran Raman
a916b2e62f Fix a memory leak.
llvm-svn: 262682
2016-03-04 01:18:40 +00:00
Easwaran Raman
587391856c Fix a use-after-free bug introduced in r262636
llvm-svn: 262679
2016-03-04 00:44:01 +00:00
Mike Aizatsky
f9401b724f [libfuzzer] arbitrary function adapter.
The adapter automates converting sequence of bytes into arbitrary
arguments.

Differential Revision: http://reviews.llvm.org/D17829

llvm-svn: 262673
2016-03-03 23:45:29 +00:00
Guozhi Wei
a7fc8e012e [InstCombine] Combine A->B->A BitCast
This patch enhances InstCombine to handle following case:

        A  ->  B    bitcast
        PHI
        B  ->  A    bitcast

llvm-svn: 262670
2016-03-03 23:21:38 +00:00
Kostya Serebryany
fc674d4015 [libFuzzer] when interrupted, call _Exit() instead of exit()
llvm-svn: 262667
2016-03-03 22:36:37 +00:00
Simon Pilgrim
c831916cbc [X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test.
PSHUFB decoder was assuming that input was 128 or 256-bit vector only.

llvm-svn: 262661
2016-03-03 21:55:01 +00:00
Lang Hames
c3c1858604 [RuntimeDyld] Fix '_' stripping in RTDyldMemoryManager::getSymbolAddressInProcess.
The RTDyldMemoryManager::getSymbolAddressInProcess method accepts a
linker-mangled symbol name, but it calls through to dlsym to do the lookup (via
DynamicLibrary::SearchForAddressOfSymbol), and dlsym expects an unmangled
symbol name.

Historically we've attempted to "demangle" by removing leading '_'s on all
platforms, and fallen back to an extra search if that failed. That's broken, as
it can cause symbols to resolve incorrectly on platforms that don't do mangling
if you query '_foo' and the process also happens to contain a 'foo'.

Fix this by demangling conditionally based on the host platform. That's safe
here because this function is specifically for symbols in the host process, so
the usual cross-process JIT looking concerns don't apply.

M    unittests/ExecutionEngine/ExecutionEngineTest.cpp
M    lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp

llvm-svn: 262657
2016-03-03 21:23:15 +00:00
Philip Reames
b31e6a2515 [ValueTracking] "constant fold" an experimental hidden option
llvm-svn: 262648
2016-03-03 19:50:32 +00:00
Philip Reames
f86417771d [ValueTracking] Remove dead code from an old experiment
This experiment was originally about trying to use facts implied dominating conditions to infer more precise known bits.  While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default.  Given this, it's time to abandon the experiment.  

Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments.  Given how easy the patch is to apply, there's no reason to leave the code in tree.  

For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads.  In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach.  

llvm-svn: 262646
2016-03-03 19:44:06 +00:00
Sanjay Patel
fe4b2c4bc6 [InstCombine] transform bitcasted bitwise logic ops with constants (PR26702)
Given that we're not actually reducing the instruction count in the included
regression tests, I think we would call this a canonicalization step.

The motivation comes from the example in PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702

If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable
example of:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
  %bc = bitcast <4 x i32> %not to <2 x i64>
  %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
  %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
  ret <4 x i32> %bc2
}

Simplifies to the expected:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  ret <4 x i32> %lobit
}

Differential Revision: http://reviews.llvm.org/D17583

llvm-svn: 262645
2016-03-03 19:19:04 +00:00
Easwaran Raman
c5bba219e7 Fix breakage caused by r262636.
Use LLVM_ATTRIBUTE_UNUSED instead of __attribute_((unused))

llvm-svn: 262643
2016-03-03 18:53:20 +00:00
Sanjoy Das
c51e182cd8 [SCEV] Prove no-overflow via constant ranges
Exploit ScalarEvolution::getRange's newly acquired smartness (since
r262438) by using that to infer nsw and nuw when possible.

llvm-svn: 262639
2016-03-03 18:31:29 +00:00
Sanjoy Das
7b29c5b2d5 [SCEV] Be less eager about demoting zexts to sexts
After r262438 we can have provably positive NSW SCEV expressions whose
zero extensions cannot be simplified (since r262438 makes SCEV better at
computing constant ranges).  This means demoting sexts of positive add
recurrences eagerly can result in an unsimplified zero extension where
we could have had a simplified sign extension.  This change fixes the
issue by teaching SCEV to demote sext of a positive SCEV expression to a
zext only if the sext could not be simplified.

llvm-svn: 262638
2016-03-03 18:31:23 +00:00