1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

50717 Commits

Author SHA1 Message Date
Mandeep Singh Grang
553513010b [AArch64] Fix unused variable [NFC]
llvm-svn: 352940
2019-02-01 23:42:34 +00:00
Yonghong Song
edc0ffa9fb [BPF] [BTF] Process FileName with absolute path correctly
In IR, sometimes the following attributes for DIFile may be
generated:
  filename: /home/yhs/test.c
  directory: /tmp
The /tmp may represent the working directory of the compilation
process.

In such cases, since filename is with absolute path,
the directory should be ignored by BTF. The filename alone is
enough to get the source.

Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Yonghong Song <yhs@fb.com>
llvm-svn: 352939
2019-02-01 23:23:17 +00:00
Dan Gohman
d1a9210e21 [WebAssembly] Add codegen support for the import_field attribute
This adds the LLVM side of https://reviews.llvm.org/D57602 -- the
import_field attribute. See that patch for details.

Differential Revision: https://reviews.llvm.org/D57603

llvm-svn: 352931
2019-02-01 22:27:34 +00:00
Mandeep Singh Grang
9cf3ca95c2 [COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects
Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present.

Reviewers: rnk, efriedma, ssijaric, TomTan

Reviewed By: rnk, efriedma

Subscribers: javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D57183

llvm-svn: 352923
2019-02-01 21:41:33 +00:00
Simon Pilgrim
6a4d664a21 [X86][AVX] Add VMOVDDUP-VPBROADCASTQ execution domain mapping
Noticed in D57514.

Differential Revision: https://reviews.llvm.org/D57519

llvm-svn: 352922
2019-02-01 21:41:30 +00:00
James Y Knight
d34c1cbe9e [opaque pointer types] Pass value type to GetElementPtr creation.
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57173

llvm-svn: 352913
2019-02-01 20:44:47 +00:00
James Y Knight
c8b30de05f [opaque pointer types] Pass value type to LoadInst creation.
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57172

llvm-svn: 352911
2019-02-01 20:44:24 +00:00
James Y Knight
31a2057127 [opaque pointer types] Pass function types to CallInst creation.
This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57170

llvm-svn: 352909
2019-02-01 20:43:25 +00:00
Roland Froese
02a8e4b973 test commit (add blank line) NFC
llvm-svn: 352897
2019-02-01 18:55:43 +00:00
Tim Corringham
4837270633 [AMDGPU] Fix for vector element insertion
Summary:
Incorrect code was generated when lowering insertelement operations
for vectors with 8 or 16 bit elements.  The value being inserted was
not adjusted for the position of the element within the 32 bit word
and so only the low element within each 32 bit word could receive
the intended value.

Fixed by simply replicating the value to each element of a
congruent vector before the mask and or operation used to
update the intended element.

A number of affected LIT tests have been updated appropriately.

before the mask & or into the intended

Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: llvm-commits, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57588

llvm-svn: 352885
2019-02-01 16:51:09 +00:00
Simon Pilgrim
b6f67a1dea [X86][SSE] Use PSLLDQ/PSRLDQ to mask out zeroable ends of a shuffle
As suggested on PR40318, this patch uses PSLLDQ/PSRLDQ to lower shuffles to zero out the ends of a vector, leaving a sequential inner section.

For pre-SSSE3 we do this for shuffles with zeros at either end (requiring up to 3 shifts), but once PSHUFB is available I've limited this to shuffles with a single zeroable end (2 shifts).

Differential Revision: https://reviews.llvm.org/D56784

llvm-svn: 352883
2019-02-01 16:02:12 +00:00
Simon Pilgrim
de708f7472 [X86][AVX] Combine INSERT_SUBVECTOR(SRC0, BITCAST(SHUFFLE(EXTRACT_SUBVECTOR(SRC1)))
Enable peeking through one use bitcasts to the subvector shuffle.

This still depends on the subvector being the same scalar-size but D57514 has already helped with the more tricky patterns

llvm-svn: 352879
2019-02-01 15:31:01 +00:00
Adhemerval Zanella
36b7b3c0fa [AArch64] Optimize floating point materialization
This patch changes isFPImmLegal to return if the value can be enconded
as the immediate operand of a logical instruction besides checking if
for immediate field for fmov.

This optimizes some floating point materization, inclusive values
used on isinf lowering.

Reviewed By: rengolin, efriedma, evandro

Differential Revision: https://reviews.llvm.org/D57044

llvm-svn: 352866
2019-02-01 12:26:06 +00:00
Roman Lebedev
8909e74bb8 [X86][BdVer2] Transfer delays from the integer to the floating point unit.
Summary:
I'm unable to find this number in the "AMD SOG for family 15h".
llvm-exegesis measures the latencies of these instructions as `2`,
which matches the latencies specified in "AMD SOG for family 15h".

However if we look at Agner, Microarchitecture, "AMD Bulldozer, Piledriver,
Steamroller and Excavator pipeline", "Data delay between different execution
domains", the int->ivec transfer is listed as `8`..`10`cy of additional latency.

Also, Agner's "Instruction tables", for Piledriver, lists their latencies as `12`,
which is consistent with `2cy` from exegesis / AMD SOG + `10cy` transfer delay.

Additional data point comes from the fact that Agner's "Instruction tables",
for Jaguar, lists their latencies as `8`; and "AMD SOG for family 16h" does
state the `+6cy` int->ivec delay, which is consistent with instr latency of `1` or `2`.

Reviewers: andreadb, RKSimon, craig.topper

Reviewed By: andreadb

Subscribers: gbedwell, courbet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57300

llvm-svn: 352861
2019-02-01 11:15:13 +00:00
Yevgeny Rouban
fb40fa7cec Provide reason messages for unviable inlining
InlineCost's isInlineViable() is changed to return InlineResult
instead of bool. This provides messages for failure reasons and
allows to get more specific messages for cases where callsites
are not viable for inlining.

Reviewed By: xbolva00, anemet

Differential Revision: https://reviews.llvm.org/D57089

llvm-svn: 352849
2019-02-01 10:44:43 +00:00
Alex Bradbury
04689d88a7 [RISCV] Implement RV64D codegen
This patch:
* Adds necessary RV64D codegen patterns
* Modifies CC_RISCV so it will properly handle f64 types (with soft float ABI)

Note that in general there is no reason to try to select fcvt.w[u].d rather than fcvt.l[u].d for i32 conversions because fptosi/fptoui produce poison if the input won't fit into the target type.

Differential Revision: https://reviews.llvm.org/D53237

llvm-svn: 352833
2019-02-01 03:53:30 +00:00
James Y Knight
846be29e5e [opaque pointer types] Add a FunctionCallee wrapper type, and use it.
Recommit r352791 after tweaking DerivedTypes.h slightly, so that gcc
doesn't choke on it, hopefully.

Original Message:
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.

Then:
- update the CallInst/InvokeInst instruction creation functions to
  take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.

One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.

However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)

Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.

Differential Revision: https://reviews.llvm.org/D57315

llvm-svn: 352827
2019-02-01 02:28:03 +00:00
Thomas Lively
7beb8d9be8 [WebAssembly] Fix a regression selecting negative build_vector lanes
Summary:
The custom lowering introduced in rL352592 creates build_vector nodes
with negative i32 operands, but these operands did not meet the value
range constraints necessary to match build_vector nodes. This CL fixes
the issue by removing the unnecessary constraints.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish

Differential Revision: https://reviews.llvm.org/D57481

llvm-svn: 352813
2019-01-31 23:22:39 +00:00
Alex Bradbury
c16c6fb506 [RISCV] Add RV64F codegen support
This requires a little extra work due tothe fact i32 is not a legal type. When
call lowering happens post-legalisation (e.g. when an intrinsic was inserted
during legalisation). A bitcast from f32 to i32 can't be introduced. This is
similar to the challenges with RV32D. To handle this, we introduce
target-specific DAG nodes that perform bitcast+anyext for f32->i64 and
trunc+bitcast for i64->f32.

Differential Revision: https://reviews.llvm.org/D53235

llvm-svn: 352807
2019-01-31 22:48:38 +00:00
Richard Trieu
3622e4a6fe [Hexagon] Rename textually included file from .h to .inc
llvm-svn: 352802
2019-01-31 21:58:42 +00:00
James Y Knight
06da6dcca4 Revert "[opaque pointer types] Add a FunctionCallee wrapper type, and use it."
This reverts commit f47d6b38c7a61d50db4566b02719de05492dcef1 (r352791).

Seems to run into compilation failures with GCC (but not clang, where
I tested it). Reverting while I investigate.

llvm-svn: 352800
2019-01-31 21:51:58 +00:00
Thomas Lively
630700f768 [WebAssembly] Add bulk memory target feature
Summary: Also clean up some preexisting target feature code.

Reviewers: aheejin

Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, jfb

Differential Revision: https://reviews.llvm.org/D57495

llvm-svn: 352793
2019-01-31 21:02:19 +00:00
James Y Knight
fa51e33345 [opaque pointer types] Add a FunctionCallee wrapper type, and use it.
The FunctionCallee type is effectively a {FunctionType*,Value*} pair,
and is a useful convenience to enable code to continue passing the
result of getOrInsertFunction() through to EmitCall, even once pointer
types lose their pointee-type.

Then:
- update the CallInst/InvokeInst instruction creation functions to
  take a Callee,
- modify getOrInsertFunction to return FunctionCallee, and
- update all callers appropriately.

One area of particular note is the change to the sanitizer
code. Previously, they had been casting the result of
`getOrInsertFunction` to a `Function*` via
`checkSanitizerInterfaceFunction`, and storing that. That would report
an error if someone had already inserted a function declaraction with
a mismatching signature.

However, in general, LLVM allows for such mismatches, as
`getOrInsertFunction` will automatically insert a bitcast if
needed. As part of this cleanup, cause the sanitizer code to do the
same. (It will call its functions using the expected signature,
however they may have been declared.)

Finally, in a small number of locations, callers of
`getOrInsertFunction` actually were expecting/requiring that a brand
new function was being created. In such cases, I've switched them to
Function::Create instead.

Differential Revision: https://reviews.llvm.org/D57315

llvm-svn: 352791
2019-01-31 20:35:56 +00:00
Nirav Dave
ec1c381c57 [DAG][SystemZ] Define unwrapAddress for PCREL_WRAPPER.
Summary:
Like with X86, this allows better DAG-level alias analysis and
alignment inference for wrapped addresses.

Reviewers: jonpa, uweigand

Reviewed By: uweigand

Subscribers: hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D57407

llvm-svn: 352786
2019-01-31 19:58:34 +00:00
Craig Topper
9394787051 Revert "[X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7."
This is causing a failure in chromium

llvm-svn: 352782
2019-01-31 19:05:22 +00:00
Simon Pilgrim
7f74c780bd Trim trailing whitespace. NFCI.
llvm-svn: 352775
2019-01-31 17:49:25 +00:00
Simon Pilgrim
d530a336e7 [X86][AVX] Fold concat(broadcast(x),broadcast(x)) -> broadcast(x)
Differential Revision: https://reviews.llvm.org/D57514

llvm-svn: 352774
2019-01-31 17:48:35 +00:00
Simon Pilgrim
1a3ae748a1 [X86][AVX] insert_subvector(bitcast(v), bitcast(s), c1) -> bitcast(insert_subvector(v,s,c2))
Similar to what we already do in DAGCombiner, but this version also handles bitcasts from types with different scalar sizes, which x86 is better at handling.

Differential Revision: https://reviews.llvm.org/D57514

llvm-svn: 352773
2019-01-31 17:38:10 +00:00
Simon Pilgrim
c8d3ba7d8f [X86][AVX] Fold broadcast(bitcast(src)) -> bitcast(broadcast(src))
llvm-svn: 352751
2019-01-31 14:04:07 +00:00
Simon Pilgrim
8d677e1ebc [X86] combineExtractWithShuffle - more aggressively peek through bitcasts
Fixes regression introduced by rL352743

llvm-svn: 352745
2019-01-31 11:55:30 +00:00
Simon Pilgrim
2fd0f78147 [X86][AVX] Enable AVX1 broadcasts in shuffle combining
Enables 32/64-bit scalar load broadcasts on AVX1 targets

The extractelement-load.ll regression will be fixed shortly in a followup commit.

llvm-svn: 352743
2019-01-31 11:41:10 +00:00
Simon Pilgrim
4fa00a57a9 [X86][AVX] Fold vt1 concat_vectors(vt2 undef, vt2 broadcast(x)) --> vt1 broadcast(x)
If we're not inserting the broadcast into the lowest subvector then we can avoid the insertion by just performing a larger broadcast.

Avoids a regression when we enable AVX1 broadcasts in shuffle combining

llvm-svn: 352742
2019-01-31 11:15:05 +00:00
Sjoerd Meijer
2ef8508947 [ARM] Thumb2: ConstantMaterializationCost
Constants can also be materialised using the negated value and a MVN, and this
case seem to have been missed for Thumb2. To check the constant materialisation
costs, we now call getT2SOImmVal twice, once for the original constant and then
also for its negated value, and this function checks if the constant can both
be splatted or rotated.

This was revealed by a test that optimises for minsize: instead of a LDR
literal pool load and having a literal pool entry, just a MVN with an immediate
is smaller (and also faster).

Differential Revision: https://reviews.llvm.org/D57327

llvm-svn: 352737
2019-01-31 08:38:06 +00:00
Sjoerd Meijer
068d715728 [SelectionDAG] Codesize: don't expand SHIFT to SHIFT_PARTS
And instead just generate a libcall. My motivating example on ARM was a simple:
  
  shl i64 %A, %B

for which the code bloat is quite significant. For other targets that also
accept __int128/i128 such as AArch64 and X86, it is also beneficial for these
cases to generate a libcall when optimising for minsize. On these 64-bit targets,
the 64-bits shifts are of course unaffected because the SHIFT/SHIFT_PARTS
lowering operation action is not set to custom/expand.

Differential Revision: https://reviews.llvm.org/D57386

llvm-svn: 352736
2019-01-31 08:07:30 +00:00
Matt Arsenault
d57348e6c2 GlobalISel: Handle odd splits in fewerElementsVector for load/store
llvm-svn: 352720
2019-01-31 02:46:05 +00:00
Matt Arsenault
1f33c5710d GlobalISel: Implement narrowScalar for bswap
llvm-svn: 352719
2019-01-31 02:34:03 +00:00
Matt Arsenault
51c3d3146e GlobalISel: Allow bitcount ops to have different result type
For AMDGPU the result is always 32-bit for 64-bit inputs.

llvm-svn: 352717
2019-01-31 02:09:57 +00:00
Matt Arsenault
1067bf29d3 GlobalISel: Fix creating MMOs with align 0
llvm-svn: 352712
2019-01-31 01:38:47 +00:00
Craig Topper
74077920f5 [X86] Remove handling of ISD::INTRINSIC_WO_CHAIN in ReplaceNodeResults.
I believe this was there to handle avx512bw intrinsics that returned i64 type in 32-bit mode. But all those intrinsics have since been changed to v64i1 results or replaced with generic IR.

llvm-svn: 352698
2019-01-31 00:04:46 +00:00
Jessica Paquette
156cec86e5 [GlobalISel][AArch64] Select G_FEXP
This teaches the legalizer to handle G_FEXP in AArch64. As a result, it also
allows us to select G_FEXP.

It...

- Updates the legalizer-info tests
- Adds a test for legalizing exp
- Updates the existing fp tests to show that we can now select G_FEXP

https://reviews.llvm.org/D57483

llvm-svn: 352692
2019-01-30 23:46:15 +00:00
Chen Zheng
238fb80db4 [PowerPC] delete no more needed workaround for readsRegister() in PowerPC
Differential Revision: https://reviews.llvm.org/D57439

llvm-svn: 352689
2019-01-30 23:18:38 +00:00
Jessica Paquette
d46e179ece [GlobalISel][AArch64] Select G_FABS
This adds instruction selection support for G_FABS in AArch64. It also updates
the existing basic FP tests, adds a selection test for G_FABS.

https://reviews.llvm.org/D57418

llvm-svn: 352684
2019-01-30 22:54:21 +00:00
Heejin Ahn
39b0555c03 [WebAssembly] Restore stack pointer right after catch instruction
Summary:
After the staack is unwound due to a thrown exxception,
`__stack_pointer` global can point to an invalid address. So
a `global.set` to restore `__stack_pointer` should be inserted right
after `catch` instruction.

But after r352598 the `global.set` instruction is inserted not right
after `catch` but after `block` - `br-on-exn` - `end_block` -
`extract_exception` sequence. This CL fixes it.

While doing that, we can actually move ReplacePhysRegs pass after
LateEHPrepare and merge EHRestoreStackPointer pass into LateEHPrepare,
and now placing `global.set` to `__stack_pointer` right after `catch` is
much easier. Otherwise it is hard to guarantee that `global.set` is
still right after `catch` and not touched with other transformations, in
which case we have to do something to hoist it.

Reviewers: dschuff

Subscribers: mgorny, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D57421

llvm-svn: 352681
2019-01-30 22:44:45 +00:00
Jessica Paquette
d5349f419b [GlobalISel][AArch64] Add instruction selection support for @llvm.log2
This teaches GlobalISel to emit a RTLib call for @llvm.log2 when it encounters
it.

It updates the existing floating point tests to show that we don't fall back on
the intrinsic, and select the correct instructions. It also adds a legalizer
test for G_FLOG2.

https://reviews.llvm.org/D57357

llvm-svn: 352673
2019-01-30 21:16:04 +00:00
Jessica Paquette
d03d1c2ace [GlobalISel][AArch64] Add instruction selection support for @llvm.sqrt
This teaches the legalizer about G_FSQRT in AArch64. Also adds a legalizer
test for G_FSQRT, a selection test for it, and updates existing floating point
tests.

https://reviews.llvm.org/D57361

llvm-svn: 352671
2019-01-30 21:03:52 +00:00
Erik Pilkington
133b958621 Add a 'dynamic' parameter to the objectsize intrinsic
This is meant to be used with clang's __builtin_dynamic_object_size.
When 'true' is passed to this parameter, the intrinsic has the
potential to be folded into instructions that will be evaluated
at run time. When 'false', the objectsize intrinsic behaviour is
unchanged.

rdar://32212419

Differential revision: https://reviews.llvm.org/D56761

llvm-svn: 352664
2019-01-30 20:34:35 +00:00
Craig Topper
18b9e8a676 [X86] Mark EMMS and FEMMS as clobbering MM0-7 and ST0-7.
This fixes the test case in PR35982 by preventing MMX instructions that read MM0-7 from being moved below EMMS/FEMMS by the post RA scheduler.

Though as discussed in bugzilla, this is not a complete fix. There is still the possibility of reordering in IR or by the pre-RA scheduler.

Differential Revision: https://reviews.llvm.org/D57298

llvm-svn: 352660
2019-01-30 19:57:01 +00:00
Matt Arsenault
a49701a9f1 AMDGPU: Stop generating unused intrinsic .inc files
llvm-svn: 352635
2019-01-30 17:25:37 +00:00
Simon Pilgrim
e4faf9f948 [X86][AVX] Prefer to combine shuffle to broadcasts whenever possible
This is the first step towards improving broadcast support on AVX1 targets.

llvm-svn: 352634
2019-01-30 16:19:19 +00:00
Shiva Chen
13ee5ada79 [RISCV] Insert R_RISCV_ALIGN relocation type and Nops for code alignment when linker relaxation enabled
Linker relaxation may change code size. We need to fix up the alignment
of alignment directive in text section by inserting Nops and R_RISCV_ALIGN
relocation type. So then linker could satisfy the alignment by removing Nops.

To do this:

1. Add shouldInsertExtraNopBytesForCodeAlign target hook to calculate
   the Nops we need to insert.

2. Add shouldInsertFixupForCodeAlign target hook to insert
   R_RISCV_ALIGN fixup type.

Differential Revision: https://reviews.llvm.org/D47755

llvm-svn: 352616
2019-01-30 11:16:59 +00:00