1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00
Commit Graph

40679 Commits

Author SHA1 Message Date
Matthias Braun
4741c1e622 ScheduleDAGInstrs: Add condjump deps to addSchedBarrierDeps()
addSchedBarrierDeps() is supposed to add use operands to the ExitSU
node. The current implementation adds uses for calls/barrier instruction
and the MBB live-outs in all other cases. The use
operands of conditional jump instructions were missed.

Also added code to macrofusion to set the latencies between nodes to
zero to avoid problems with the fusing nodes lingering around in the
pending list now.

Differential Revision: https://reviews.llvm.org/D25140

llvm-svn: 286544
2016-11-11 01:34:21 +00:00
Stanislav Mekhanoshin
b36f36e6c4 Revert "[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies"
This reverts commit r286171, it breaks piglit test fs-discard-exit-2

llvm-svn: 286530
2016-11-11 00:22:34 +00:00
Matthias Braun
2a033e8287 ScheduleDAGInstrs: Ignore dependencies of constant physregs
There is no need to track dependencies for constant physregs, as they
don't change their value no matter in what order you read/write to them.

Differential Revision: https://reviews.llvm.org/D26221

llvm-svn: 286526
2016-11-10 23:46:44 +00:00
Simon Pilgrim
152021163f [SelectionDAG] Add support for vector demandedelts in ADD/SUB opcodes
llvm-svn: 286516
2016-11-10 22:41:49 +00:00
Justin Lebar
8980b4a3ca [LSR] Tweak loop-strength-reduce-crash test. Test-only change.
Run opt instead of llc, and update the comment.

llvm-svn: 286515
2016-11-10 22:37:13 +00:00
Peter Collingbourne
fbb7ea5270 IR: Introduce inrange attribute on getelementptr indices.
If the inrange keyword is present before any index, loading from or
storing to any pointer derived from the getelementptr has undefined
behavior if the load or store would access memory outside of the bounds of
the element selected by the index marked as inrange.

This can be used, e.g. for alias analysis or to split globals at element
boundaries where beneficial.

As previously proposed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html

Differential Revision: https://reviews.llvm.org/D22793

llvm-svn: 286514
2016-11-10 22:34:55 +00:00
Simon Pilgrim
cba269ed0b [X86] Updated knownbits vector ADD/SUB test
In preparation for demandedelts support

llvm-svn: 286513
2016-11-10 22:34:12 +00:00
Simon Pilgrim
00dc794ee9 [X86] Add knownbits vector ADD test
llvm-svn: 286511
2016-11-10 22:21:04 +00:00
Simon Pilgrim
f01009c55d [SelectionDAG] Add support for splatted vectors in SUB opcode
llvm-svn: 286509
2016-11-10 21:57:42 +00:00
Simon Pilgrim
1cff91326f [X86] Add knownbits vector SUB test
llvm-svn: 286508
2016-11-10 21:50:23 +00:00
Matthias Braun
7ae7138a52 RegisterCoalescer: Ignore interferences for constant physregs
When copying to/from a constant register interferences can be ignored.

Also update the documentation for isConstantPhysReg() to make it more
obvious that this transformation is valid.

Differential Revision: https://reviews.llvm.org/D26106

llvm-svn: 286503
2016-11-10 21:22:47 +00:00
Yaxun Liu
a4fdadbaa3 AMDGPU: Emit runtime metadata as a note element in .note section
Currently runtime metadata is emitted as an ELF section with name .AMDGPU.runtime_metadata.

However there is a standard way to convey vendor specific information about how to run an ELF binary, which is called vendor-specific note element (http://www.netbsd.org/docs/kernel/elf-notes.html).

This patch lets AMDGPU backend emits runtime metadata as a note element in .note section.

Differential Revision: https://reviews.llvm.org/D25781

llvm-svn: 286502
2016-11-10 21:18:49 +00:00
Adam Nemet
f8a50d283f [OptDiag] Remove non-printable chars from function name
The r283656 did this in the remark arguments.  We also need to do this
in the main function attribute as that is written to YAML as well.

llvm-svn: 286482
2016-11-10 17:47:03 +00:00
Simon Pilgrim
9b52058816 [SelectionDAG] Add support for vector demandedelts in TRUNCATE opcodes
llvm-svn: 286481
2016-11-10 17:43:52 +00:00
Simon Pilgrim
97b59349cb [X86] Add knownbits vector TRUNC test
In preparation for demandedelts support

llvm-svn: 286477
2016-11-10 17:24:33 +00:00
Teresa Johnson
db6785496f Restore part of "[ThinLTO] Prevent exporting of locals used/defined in module level asm"
This restores the part of r286297 that didn't require adding a
dependency from the Analysis to Object library. There are two parts
to the original fix, and this will address the handling for the case
where locals are used in module level asm.

The part that requires functionality in libObject handles local defs
in module level asm, and was reverted because our downstream build
of clang builds lib/Bitcode into a single library, and this new
dependency introduced a cycle there. I am trying to get that fixed
(see D26502), so for now that change isn't being restored

llvm-svn: 286475
2016-11-10 16:57:32 +00:00
Simon Pilgrim
c228feab22 [SelectionDAG] Add support for vector demandedelts in MUL opcodes
llvm-svn: 286471
2016-11-10 16:27:42 +00:00
Asaf Badouh
d3240c52be reproducer for pr29002
https://reviews.llvm.org/D26449

llvm-svn: 286470
2016-11-10 16:27:27 +00:00
Tom Stellard
fca8e2011d AMDGPU: Add VI i16 support
Patch By: Wei Ding

Differential Revision: https://reviews.llvm.org/D18049

llvm-svn: 286464
2016-11-10 16:02:37 +00:00
Simon Pilgrim
d2824aa2f7 [X86] Add knownbits vector MUL test
In preparation for demandedelts support

llvm-svn: 286463
2016-11-10 15:57:33 +00:00
Simon Pilgrim
d9fcf3c063 [SelectionDAG] Add support for vector demandedelts in SRA opcodes
llvm-svn: 286461
2016-11-10 15:05:09 +00:00
Sanjay Patel
d261791644 [InstCombine] auto-generate better checks; NFC
Note that the existing metadata checking was re-added by hand because the 
script doesn't currently know how to generate checks for lines outside of 
functions.

llvm-svn: 286460
2016-11-10 14:58:17 +00:00
Simon Pilgrim
502e3f027d [X86] Add knownbits vector arithmetic shift test
In preparation for demandedelts support

llvm-svn: 286457
2016-11-10 14:46:24 +00:00
Simon Pilgrim
3091f96ce6 [DAGCombiner] Correctly extract the ConstOrConstSplat shift value for SHL nodes
We were failing to extract a constant splat shift value if the shifted value was being masked.

The (shl (and (setcc) N01CV) N1CV) -> (and (setcc) N01CV<<N1CV) combine was unnecessarily preventing this.

llvm-svn: 286454
2016-11-10 14:35:09 +00:00
Chad Rosier
1bd36b2600 Remove unnecessary check prefix directives. NFC.
llvm-svn: 286453
2016-11-10 14:28:44 +00:00
Simon Pilgrim
5a36ec9d38 [DAGCombiner] Show missed opportunity to UNDEF out-of-range SHL
Fails to match constant shift value due to presence of AND mask.

llvm-svn: 286452
2016-11-10 14:19:45 +00:00
Tobias Grosser
68e42a688d [RegionInfo] Add three tests that include infinite loops
These examples are variations that were inspired from a small subgraph taken
from paper.ll which are interesting as they show certain issues with infinite
loops.

llvm-svn: 286450
2016-11-10 13:56:19 +00:00
Simon Pilgrim
7f153a9a3d [SelectionDAG] Add support for vector demandedelts in SHL/SRL opcodes
llvm-svn: 286448
2016-11-10 13:52:42 +00:00
Simon Pilgrim
f00c22daa1 [X86] Add knownbits vector logical shift test
In preparation for demandedelts support

llvm-svn: 286447
2016-11-10 13:34:17 +00:00
Oliver Stannard
e6a1e88b9b [ARM] Thumb2 LDR (literal) should accept PC as the destination
The version of this instruction with the .w suffix already correctly accepts
this, but the alias without the .w did not.

Differential Revision: https://reviews.llvm.org/D26499

llvm-svn: 286446
2016-11-10 13:20:41 +00:00
Craig Topper
8969a9fe8d [AVX-512] Allow legacy cvtpd2dq intrinsics to select EVEX encoded instruction when available.
llvm-svn: 286435
2016-11-10 07:47:17 +00:00
Craig Topper
996de74ecd [AVX-512][X86] Convert avx_cvtt_ps2dq_256 and sse2_cvttps2dq intrinsics to ISD::FP_TO_SINT in the intrinsics table and delete patterns. While nearby also move CVTDQ2PS patterns into their instructions.
This allows these intrinsics to also use EVEX instructons.

llvm-svn: 286434
2016-11-10 07:24:52 +00:00
Craig Topper
6a3572c8b3 [X86] Convert int_x86_avx_cvtt_pd2dq_256 to fp_to_sint using the intrinsics table. Removes extra patterns and allows legacy intrinsic to select EVEX encoded instructions when available.
llvm-svn: 286433
2016-11-10 06:45:39 +00:00
Craig Topper
3bc846bf9f [AVX-512] Add test cases to show missed opportunities for using VALIGND/Q to handle shuffles.
llvm-svn: 286425
2016-11-10 03:39:19 +00:00
Sanjay Patel
3a42ddfbd2 [InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923)
Removing the limitation in visitInsertElementInst() causes several regressions
because we're not prepared to fold sequences of shuffles or inserts and extracts
separated by shuffles. Fixing that appears to be a difficult mission because we
are purposely trying to avoid creating shuffles with arbitrary shuffle masks
because some targets may choke on those.

https://llvm.org/bugs/show_bug.cgi?id=30923

llvm-svn: 286423
2016-11-10 00:15:14 +00:00
Peter Collingbourne
53c709eaaf Re-apply r286384, "X86: Introduce the "relocImm" ComplexPattern, which represents a relocatable immediate.", with a fix for 32-bit x86.
Teach X86InstrInfo::analyzeCompare() not to crash on CMP and SUB instructions
that take a global address operand.

llvm-svn: 286420
2016-11-09 23:53:43 +00:00
Dylan McKay
391ac08575 [AVR] Add a selection of CodeGen tests
Summary: This adds all of the CodeGen tests which currently pass.

Reviewers: arsenm, kparzysz

Subscribers: japaric, wdng

Differential Revision: https://reviews.llvm.org/D26388

llvm-svn: 286418
2016-11-09 23:46:52 +00:00
Dylan McKay
6faa0c3ed4 [AVR] Add all of the machine code test suite
Summary: This adds all of the AVR machine code tests.

Reviewers: arsenm, kparzysz

Subscribers: wdng, japaric

Differential Revision: https://reviews.llvm.org/D26387

llvm-svn: 286417
2016-11-09 23:46:25 +00:00
Tim Northover
bf0daf0392 GlobalISel: translate invoke and landingpad instructions
Pretty bare-bones support for exception handling (no weird MSVC stuff, no SjLj
etc), but it should get things going.

llvm-svn: 286407
2016-11-09 22:39:54 +00:00
Dehao Chen
d5e59904e2 Update vectorization debug info unittest.
Summary:
The change will test the change in r286159.
The idea behind the change: Make the dbg location different between loop header and preheader/exit. Originally, dbg location 21 exists in 3 BBs: preheader, header, critical edge (exit). Update the debug location of inside the loop header from !21 to !22 so that it will reflect the correct location.

Reviewers: probinson

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26428

llvm-svn: 286403
2016-11-09 22:25:19 +00:00
Sanjay Patel
137e15dbcb [InstCombine] regenerate checks; NFC
llvm-svn: 286402
2016-11-09 22:21:58 +00:00
Sanjay Patel
1c56e3588c [InstCombine] regenerate checks; NFC
llvm-svn: 286399
2016-11-09 21:41:34 +00:00
Krzysztof Parzyszek
4e7a3e05a1 [Hexagon] Separate Hexagon subreg indices for different register classes
For pairs of 32-bit registers: isub_lo, isub_hi.
For pairs of vector registers: vsub_lo, vsub_hi.

Add generic subreg indices: ps_sub_lo, ps_sub_hi, and a function
  HexagonRegisterInfo::getHexagonSubRegIndex(RegClass, GenericSubreg)
that returns the appropriate subreg index for RegClass.

llvm-svn: 286377
2016-11-09 16:19:08 +00:00
Krzysztof Parzyszek
b28daffca5 [Hexagon] Eliminate Insert4 pseudo-instruction, use combines instead
llvm-svn: 286368
2016-11-09 14:16:29 +00:00
Alexandros Lamprineas
8a98bf69b0 [ARM] Loop Strength Reduction crashes when targeting ARM or Thumb.
Scalar Evolution asserts when not all the operands of an Add Recurrence
Expression are loop invariants. Loop Strength Reduction should only
create affine Add Recurrences, so that both the start and the step of
the expression are loop invariants.

Differential Revision: https://reviews.llvm.org/D26185

llvm-svn: 286347
2016-11-09 08:53:07 +00:00
Craig Topper
0c4245f530 [AVX-512] Add lowering to cvttpd2udq/cvttps2udq for fptoui v2f64/2f32 to 2i32
This patch adds support for fptoui to 2i32 from both 2f64 and 2f32, building on Simon's change for the signed version in r284459 and using AVX-512 instructions.

If we don't have VLX support we need to use a 512-bit operation for v2f64->v2i32 and extract the result.

It also recognises that cvttpd2udq zeroes the upper 64-bits of the xmm result.

Differential Revision: https://reviews.llvm.org/D26331

llvm-svn: 286345
2016-11-09 07:48:51 +00:00
Craig Topper
3648078183 [X86] Lower AVX512 and SSE intrinsics for CVTTPD2DQ to X86ISD::CVTTPD2DQ.
Summary: This allows the SSE intrinsic to use the EVEX instruction when available. It also fixes EVEX to not use a weird (v4i32 (fp_to_sint v2f64)) node and it merges some isel patterns. This also fixes some cases that weren't combining vzmovl with cvttpd2dq to remove extra moves.

Reviewers: delena, zvi, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26330

llvm-svn: 286344
2016-11-09 07:31:32 +00:00
Craig Topper
8625109c6d [AVX-512] Add more varied alignments to tests for storing the lower 128-bits of a 256 or 512-bit subvector extract.
llvm-svn: 286343
2016-11-09 05:38:47 +00:00
Craig Topper
e377fc59db [AVX-512] Use alignedstore256 in patterns that look for stores of the lower 256-bits of a 512-bit vector to use a 256-bit aligned store.
Previously we were only checking for 16 byte alignment instead of 32 byte alignment. Fixes PR30947.

llvm-svn: 286342
2016-11-09 05:31:57 +00:00
Craig Topper
1832cf469b [AVX-512] Add test cases to demonstrate PR30947. We accidentally use 32 byte aligned store instructions when the original store was only 16 byte aligned if the store is from the lower bits of a subvector extract.
llvm-svn: 286341
2016-11-09 05:31:53 +00:00