1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 20:51:52 +01:00

140993 Commits

Author SHA1 Message Date
Yaxun Liu
7bae0ef103 Fix known zero bits for addrspacecast.
Currently LLVM assumes that a pointer addrspacecasted to a different addr space is equivalent to trunc or zext bitwise, which is not true. For example, in amdgcn target, when a null pointer is addrspacecasted from addr space 4 to 0, its value is changed from i64 0 to i32 -1.

This patch teaches LLVM not to assume known bits of addrspacecast instruction to its operand.

Differential Revision: https://reviews.llvm.org/D26803

llvm-svn: 287545
2016-11-21 15:42:31 +00:00
Simon Pilgrim
fa36f21d80 [X86][SSE] Add SSE reciprocal estimate tests
llvm-svn: 287543
2016-11-21 15:28:21 +00:00
Simon Pilgrim
5d49644469 [SelectionDAG] Add ComputeNumSignBits support for CONCAT_VECTORS opcode
llvm-svn: 287541
2016-11-21 14:36:19 +00:00
Alex Lorenz
11e1f2366f [llvm-cov] Avoid 0% when reporting something that's 0/0
This commit makes llvm-cov avoid showing 0% (0/0) coverage for things
like file function coverage, etc. in reports and HTML output. This can happen
for files like headers that have macros but no functions. This commit makes
llvm-cov report - (0/0) instead.

rdar://29246480

Differential Revision: https://reviews.llvm.org/D26615

llvm-svn: 287539
2016-11-21 14:00:04 +00:00
Benjamin Kramer
7f8d50a7e2 Adjust arm64-irtranslator.ll test to changes from r287368
The test is currently broken, and this CL should fix it.

Patch by Adrian Kuegel!

Differential Revision: https://reviews.llvm.org/D26910

llvm-svn: 287536
2016-11-21 13:15:38 +00:00
Simon Pilgrim
ccffba1888 [X86][SSE] Allow PACKSS to be used to truncate any type of all/none sign bits input
At the moment we only use truncateVectorCompareWithPACKSS with direct vector comparison results (just one example of a known all/none signbits input).

This change relaxes the direct matching of a SETCC opcode by moving the logic up into SelectionDAG::ComputeNumSignBits and accepting any input with a known splatted signbit.

llvm-svn: 287535
2016-11-21 12:05:49 +00:00
Marcin Koscielnicki
84d15a6c72 [InstrProfiling] Mark __llvm_profile_instrument_target last parameter as i32 zeroext if appropriate.
On some architectures (s390x, ppc64, sparc64, mips), C-level int is passed
as i32 signext instead of plain i32.  Likewise, unsigned int may be passed
as i32, i32 signext, or i32 zeroext depending on the platform.  Mark
__llvm_profile_instrument_target properly (its last parameter is unsigned
int).

This (together with the clang change) makes compiler-rt profile testsuite pass
on s390x.

Differential Revision: http://reviews.llvm.org/D21736

llvm-svn: 287534
2016-11-21 11:57:19 +00:00
Marcin Koscielnicki
3aa3dc33a3 [TLI] Add functions determining if int parameters/returns should be zeroext/signext.
On some architectures (s390x, ppc64, sparc64, mips), C-level int is passed
as i32 signext instead of plain i32.  Likewise, unsigned int may be passed
as i32, i32 signext, or i32 zeroext depending on the platform.  Add this
information to TargetLibraryInfo, to be used whenever some LLVM pass
inserts a compiler-rt call to a function involving int parameters
or returns.

Differential Revision: http://reviews.llvm.org/D21739

llvm-svn: 287533
2016-11-21 11:57:11 +00:00
Michael Zuckerman
5b15b8b9d0 Fixing a small typo (A->U).
This seem to fixes PR30992.

-         HasAVX512 ? X86::VMOVAPSZ128rm_NOVLX 
+         HasAVX512 ? X86::VMOVUPSZ128rm_NOVLX 

llvm-svn: 287532
2016-11-21 11:52:11 +00:00
Jacob Baungard Hansen
bc964a0654 [Sparc] Use target name instead of namespace as prefix for MCRegisterClasses array
Summary:
For Sparc the namespace (SP) is different from the target name (Sparc),
which causes the name of the array in this declaration to differ from
the name used in the definition.

Patch by Daniel Cederman.

Reviewers: jyknight

Subscribers: llvm-commits, jyknight

Differential Revision: https://reviews.llvm.org/D23650

llvm-svn: 287528
2016-11-21 09:33:05 +00:00
Craig Topper
514d5c696a [AVX-512] Add EVEX form of VMOVZPQILo2PQIZrm to load folding tables to match SSE and AVX.
llvm-svn: 287523
2016-11-21 07:51:31 +00:00
Alexei Starovoitov
6da6d0e98d [bpf] attempt to fix big-endian bots
attempt to fix big-endian bots failing on new dwarfdump test

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 287522
2016-11-21 07:26:23 +00:00
Alexei Starovoitov
5e8323860d [bpf] fix dwarf elf relocs and line numbers
- teach RelocVisitor to recognize bpf relocations
- fix AsmInfo->PointerSize to make sure dwarf is emitted correctly
- add a test for the above

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 287521
2016-11-21 06:21:23 +00:00
Craig Topper
c5ad05d406 [TableGen][ISel] Do a better job of factoring ScopeMatchers created during creation of SwitchTypeMatcher.
Previously we were factoring when the ScopeMatcher was initially created, but it might get more Matchers added to it later. Delay factoring until we have fully created/populated the ScopeMatchers.

This reduces X86 isel tables by 154 bytes.

llvm-svn: 287520
2016-11-21 04:07:58 +00:00
Craig Topper
e4cf482936 [X86] Remove duplicate instructions for (v)movq and replace with patterns on other instructions. NFC
llvm-svn: 287519
2016-11-21 04:07:56 +00:00
Dean Michael Berris
aeea4f9adc [XRay][AArch64] Implemented a test for the compile-time sleds emitted, and fixed a bug in the jump instruction
This patch adds a test for the assembly code emitted with XRay
instrumentation. It also fixes a bug where the operand of a jump
instruction must be not the number of bytes to jump over, but rather the
number of 4-byte instructions.

Author: rSerge

Reviewers: dberris, rengolin

Differential Revision: https://reviews.llvm.org/D26805

llvm-svn: 287516
2016-11-21 03:01:43 +00:00
Davide Italiano
651649aedf [GlobalSplit] Port to the new pass manager.
llvm-svn: 287511
2016-11-21 00:28:23 +00:00
Simon Dardis
29c3915faa [mips] Restrict tail call optimization
The tail call optimization was being used without proper consideration of
ABI requirements for saving and restoring the GP. This patch restricts tail
call optimization to functions within the same translation unit.

Reviewers: vkalintiris

Differential Revision: https://reviews.llvm.org/D24763

llvm-svn: 287505
2016-11-20 21:23:08 +00:00
Simon Pilgrim
a97617aadc [X86][SSE] Add some initial combine tests that could (should?) use PACKSS
llvm-svn: 287504
2016-11-20 21:12:49 +00:00
Craig Topper
e16c3bda04 [AVX-512] Add tests for masked palignr/valignd/valignq shuffles, many of which show failures to fold the masking into the operation.
Many of these problems are because shuffle lowering widens element size and reduces element count when possible. This causes the shuffle to become separated from the select by a bitcast. Future patches will work to improve these cases by rewriting the shuffle back to a narrow element type if we think it can result in folding the mask.

llvm-svn: 287503
2016-11-20 19:50:32 +00:00
Coby Tayree
c8112b9e1c The 'vpmultishiftqb' instruction was implemented falsely, this patch amend it.
More specifically - (MS dialect) broadcasting variants were implemented falsely.

Differential Revision: https://reviews.llvm.org/D26257

llvm-svn: 287501
2016-11-20 17:19:55 +00:00
Coby Tayree
dfe03a8851 Some instructions were missing, other implemented falsely. this patch aims at amending those issues. full list:
vcvtps2pd
vcvtudq2pd
vcvtps2qq
vcvttps2qq
vcvtps2uqq
vcvttps2uqq

variants are:

[Dst]XMM(zero-masked/merge-masked/unmasked)
[Src]Mem64

Differential Revision: https://reviews.llvm.org/D26799

llvm-svn: 287500
2016-11-20 17:09:56 +00:00
Simon Pilgrim
0f773325da [X86][AVX512] Combine unary + zero target shuffles to VPERMV3 with a zero vector where possible
llvm-svn: 287497
2016-11-20 16:11:36 +00:00
Simon Pilgrim
7ee727735c [X86][AVX512] Add support for VBMI VPERMV3 target shuffle combines
llvm-svn: 287496
2016-11-20 15:24:38 +00:00
Simon Pilgrim
11a92eecb6 [X86][AVX512] Add support for VBMI VPERMV target shuffle combines
llvm-svn: 287495
2016-11-20 15:05:45 +00:00
Simon Pilgrim
a67d90472e [X86][AVX512] Add some initial VBMI target shuffle combine tests
llvm-svn: 287494
2016-11-20 14:45:46 +00:00
Simon Pilgrim
477296e35a [X86][AVX512VL] Removed duplicate operation action
Basic AVX512F already declared uint_to_fp v4i32 as legal

llvm-svn: 287493
2016-11-20 14:19:29 +00:00
Simon Pilgrim
d4d881b5ab Strip trailing whitespace
llvm-svn: 287492
2016-11-20 14:05:23 +00:00
Simon Pilgrim
415ba9bf65 [X86][AVX512F] Add support for uint_to_fp v2i32 to v2f64 on AVX512F-only targets
Use 512-bit instructions (we already do something similar for uint_to_fp v4i32 to v4f64)

llvm-svn: 287491
2016-11-20 14:03:23 +00:00
Simon Pilgrim
beecd7c52e Fix comment typos. NFC.
Identified by Pedro Giffuni in PR27636.

llvm-svn: 287490
2016-11-20 13:47:59 +00:00
Simon Pilgrim
467ec60244 Fix spelling mistakes in Tools/Tests comments. NFC.
Identified by Pedro Giffuni in PR27636.

llvm-svn: 287489
2016-11-20 13:31:13 +00:00
Simon Pilgrim
e116136251 Fix spelling mistakes in Transforms comments. NFC.
Identified by Pedro Giffuni in PR27636.

llvm-svn: 287488
2016-11-20 13:19:49 +00:00
Simon Pilgrim
148a658151 Fix spelling mistakes in SelectionDAG comments. NFC.
Identified by Pedro Giffuni in PR27636.

llvm-svn: 287487
2016-11-20 13:14:57 +00:00
Simon Pilgrim
070f334aac Fix comment typos. NFC.
Identified by Pedro Giffuni in PR27636.

llvm-svn: 287486
2016-11-20 13:10:51 +00:00
Oren Ben Simhon
bce4fd5213 [X86] RegCall - Handling long double arguments
The change is part of RegCall calling convention support for LLVM.
Long double (f80) requires special treatment as the first f80 parameter is saved in FP0 (floating point stack).
This review present the change and the corresponding tests.

Differential Revision: https://reviews.llvm.org/D26151

llvm-svn: 287485
2016-11-20 11:06:07 +00:00
Coby Tayree
c2ed22c0ae [X86][InlineAsm]Test commit.
Fixing a wrong comment on X86AsmParser.cpp::ParseZ: "true" --> "false"

Differential Revision: https://reviews.llvm.org/D26797

llvm-svn: 287484
2016-11-20 09:31:11 +00:00
Serge Pavlov
6f8fcd92a7 Fix file name resolution in nested response files
If a response file in construct `@file` was specified by relative name,
constructs `@file` nested within it were resolved incorrectly if the
flag RelativeNames in call to ExpandResponseFile was set to true.
This feature is used in configuration files, tests for it are in
respective change (see D24933).

llvm-svn: 287482
2016-11-20 06:25:07 +00:00
Saleem Abdulrasool
b58305138d ExceptionDemo: remove some undefined behaviour
The casting based reading of the LSDA could attempt to read unsuitably aligned
data.  Avoid that case by explicitly using a memcpy.  A similar approach is used
in libc++abi to address the same UB.

llvm-svn: 287479
2016-11-20 02:36:38 +00:00
Saleem Abdulrasool
6d168b5903 ExceptionDemo: prefer headers over redeclarations
Rather than redeclaring the interfaces for exceptions, prefer using the
`unwind.h` header.  This is vended by at least gcc and clang, and can also be
found by an external unwinding library (e.g. libunwind).  Doing this simplifies
the example to the exception handling itself.  Minor tweaks are the result of
_Unwind_Context_t not being defined, which is just a typedef for struct
_Unwind_Context *.  NFC.

llvm-svn: 287478
2016-11-20 02:36:36 +00:00
Alexei Starovoitov
79a4851ba7 [bpf] add BPF disassembler
add BPF disassembler, so tools like llvm-objdump can be used:
$ llvm-objdump -d -no-show-raw-insn ./sockex1_kern.o

./sockex1_kern.o:	file format ELF64-BPF

Disassembly of section socket1:
bpf_prog1:
       0:	r6 = r1
       8:	r0 = *(u8 *)skb[23]
      10:	*(u32 *)(r10 - 4) = r0
      18:	r1 = *(u32 *)(r6 + 4)
      20:	if r1 != 4 goto 8
      28:	r2 = r10
      30:	r2 += -4

ld_imm64 (the only 16-byte insn) and special ld_abs/ld_ind instructions
had to be treated in a special way. The decoders for the rest of the insns
are automatically generated.

Add tests to cover new functionality.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 287477
2016-11-20 02:25:00 +00:00
Rui Ueyama
146760b33e Attempt to fix big-endian buildbots.
llvm-svn: 287476
2016-11-20 01:41:28 +00:00
Rui Ueyama
f5edbb07b3 Style fix. NFC.
llvm-svn: 287475
2016-11-20 01:15:56 +00:00
Rui Ueyama
4a5010a15e Fix buildbot.
llvm-svn: 287474
2016-11-20 01:13:22 +00:00
Rui Ueyama
4ca3b2a6d2 SHA1: unroll loop in hashBlock.
This code is taken from public domain.
https://github.com/jsonn/src/blob/trunk/common/lib/libc/hash/sha1/sha1.c

I wrote a sha1 command and ran it on my Xeon E5-2680 v2 2.80GHz machine.
Here is a result. The new hash function is 37% faster than before.

 Performance counter stats for './llvm-sha1-old /ssd/build/bin/lld' (10 runs):

       6640.503687 task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.03% )
                54 context-switches          #    0.008 K/sec                    ( +-  5.03% )
                 5 cpu-migrations            #    0.001 K/sec                    ( +- 31.73% )
           183,803 page-faults               #    0.028 M/sec                    ( +-  0.00% )
    18,527,954,113 cycles                    #    2.790 GHz                      ( +-  0.03% )
     4,993,237,485 stalled-cycles-frontend   #   26.95% frontend cycles idle     ( +-  0.11% )
   <not supported> stalled-cycles-backend
    50,217,149,423 instructions              #    2.71  insns per cycle
                                             #    0.10  stalled cycles per insn  ( +-  0.00% )
     6,094,322,337 branches                  #  917.750 M/sec                    ( +-  0.00% )
        11,778,239 branch-misses             #    0.19% of all branches          ( +-  0.01% )

       6.634017401 seconds time elapsed                                          ( +-  0.03% )

 Performance counter stats for './llvm-sha1-new /ssd/build/bin/lld' (10 runs):

       4167.062720 task-clock (msec)         #    1.001 CPUs utilized            ( +-  0.02% )
                52 context-switches          #    0.012 K/sec                    ( +- 16.45% )
                 7 cpu-migrations            #    0.002 K/sec                    ( +- 32.20% )
           183,804 page-faults               #    0.044 M/sec                    ( +-  0.00% )
    11,626,611,958 cycles                    #    2.790 GHz                      ( +-  0.02% )
     4,491,897,976 stalled-cycles-frontend   #   38.63% frontend cycles idle     ( +-  0.05% )
   <not supported> stalled-cycles-backend
    24,320,180,617 instructions              #    2.09  insns per cycle
                                             #    0.18  stalled cycles per insn  ( +-  0.00% )
     1,574,674,576 branches                  #  377.886 M/sec                    ( +-  0.00% )
        11,769,693 branch-misses             #    0.75% of all branches          ( +-  0.00% )

       4.163251552 seconds time elapsed                                          ( +-  0.02% )

Differential Revision: https://reviews.llvm.org/D26890

llvm-svn: 287473
2016-11-20 01:03:22 +00:00
Saleem Abdulrasool
b1c3bc90b5 Demangle: remove references to allocator for default allocator
The demangler had stopped using a custom allocator but had not been updated to
remove the use of the explicit allocator passing.  This removes that as we do
not need to do anything special here anymore.  This just makes the code more
compact.  NFC.

llvm-svn: 287472
2016-11-20 00:20:27 +00:00
Saleem Abdulrasool
7526310875 Demangle: remove unnecessary typedef for std::vector
We could create a local typedef for std::vector called Vector.  Inline the use
of std::vector rather than use the typedef.  NFC.

llvm-svn: 287471
2016-11-20 00:20:25 +00:00
Saleem Abdulrasool
2550fd9efb Demangle: replace custom typedef for std::string with std::string
We created a local typedef for `std::basic_string<char, std::char_traits<char>>`
which is just `std::string`.  Remove the local typedef and propagate the type
information through the rest of the demangler.  NFC.

llvm-svn: 287470
2016-11-20 00:20:23 +00:00
Saleem Abdulrasool
e1c07c2b23 Demangle: use direct member initialization (NFC)
Prefer direct member initialization over the explicit out-of-line initialization
for the construction of the local type.  NFC.

llvm-svn: 287469
2016-11-20 00:20:20 +00:00
Benjamin Kramer
7cad84ff09 Give some helper classes/functions internal linkage. NFC.
llvm-svn: 287462
2016-11-19 20:44:26 +00:00
Simon Pilgrim
04cf2ff75c [X86][SSE] Improve PSHUFB lowering from either input
Canonicalization may leave the zeroable vector in the first input.

llvm-svn: 287461
2016-11-19 20:41:48 +00:00