mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00
2748 Commits

Author SHA1 Message Date
Ranjeet Singh
e9e99b10c8 [ARM] fpscr read/write intrinsics not aware of each other
The intrinsics __builtin_arm_get_fpscr and __builtin_arm_set_fpscr read and
write to the fpscr (Floating-Point Status and Control Register).

A bug exists in the __builtin_arm_get_fpscr intrinsic definition in llvm, which
marks this intrinsic as IntrNoMem, meaning that it is not a memory access and
has no other side effects. With this property on the intrinsic, various
optimizations can be applied to it, such as common sub-expression elimination
with other reads. This can cause issues if there has been a write to this
register, e.g.

void foo(int *p) {
     p[0] = __builtin_arm_get_fpscr();
     __builtin_arm_set_fpscr(1);
     p[1] = __builtin_arm_get_fpscr();
}

In the above example the second read is currently CSE'd into the first read
because llvm isn't aware that the write done by __builtin_arm_set_fpscr
affects the same register that __builtin_arm_get_fpscr reads from. To fix this
problem I've removed the IntrNoMem property so that __builtin_arm_get_fpscr is
treated as a memory access.

Differential Revision: https://reviews.llvm.org/D30542

llvm-svn: 296865
2017-03-03 11:40:07 +00:00
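
The CSE problem described in e9e99b10c8 can be illustrated with a small toy model. This is only a sketch: FakeFpscr, ToyReader and secondRead are hypothetical names invented for illustration, not the ARM builtins or any LLVM API. The readIsPure flag stands in for the IntrNoMem-like property, under which an earlier read may be reused even after a write.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the FPSCR register (not the real hardware register).
struct FakeFpscr {
  std::uint32_t value = 0;
};

// Toy model of how an optimizer might treat reads of the register.
// If readIsPure is true, a repeated read reuses the first result (CSE),
// and writes do not invalidate that cached result.
class ToyReader {
public:
  ToyReader(FakeFpscr &reg, bool readIsPure)
      : reg_(reg), readIsPure_(readIsPure) {}

  std::uint32_t read() {
    if (readIsPure_ && hasCached_)
      return cached_; // reuse the earlier read, ignoring any write in between
    cached_ = reg_.value;
    hasCached_ = true;
    return cached_;
  }

  void write(std::uint32_t v) {
    reg_.value = v;
    if (!readIsPure_)
      hasCached_ = false; // reads modelled as memory accesses see the write
  }

private:
  FakeFpscr &reg_;
  bool readIsPure_;
  bool hasCached_ = false;
  std::uint32_t cached_ = 0;
};

// Mirrors foo() from the commit message: read, write 1, read again.
// Returns the value that would be stored to p[1].
std::uint32_t secondRead(bool readIsPure) {
  FakeFpscr reg;
  ToyReader reader(reg, readIsPure);
  std::uint32_t first = reader.read();
  assert(first == 0 && "a fresh FakeFpscr starts at zero");
  (void)first;
  reader.write(1);
  return reader.read();
}
```

In this model secondRead(true) returns the stale value 0, while secondRead(false) returns 1, which is the behaviour the fix is after.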
Eugene Zelenko
ec780d0459 [Support] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 296714
2017-03-01 23:59:26 +00:00
Igor Laevsky
194b41ec66 [BasicAA] Take attributes into account when requesting modref info for a call site
Differential Revision: https://reviews.llvm.org/D29989

llvm-svn: 296617
2017-03-01 13:19:51 +00:00
Dehao Chen
356ba78c03 Add function importing info from samplepgo profile to the module summary.
Summary: For SamplePGO, the profile may contain cross-module inline stacks. As we need to make sure the profile annotation happens when all the hot inline stacks are expanded, we need to pass this info to the module importer so that it can import the proper functions if necessary. This patch implements this feature by emitting cross-module targets as part of function entry metadata. In the module-summary phase, the metadata is used to build call edges that point to the functions that need to be imported.

Reviewers: mehdi_amini, tejohnson

Reviewed By: tejohnson

Subscribers: davidxl, llvm-commits

Differential Revision: https://reviews.llvm.org/D30053

llvm-svn: 296498
2017-02-28 18:09:44 +00:00
David Bozier
faeb49676c [Stack Protection] Add diagnostic information for why stack protection was applied to a function
Stack Smash Protection is not completely free, so in hot code its overhead can cause performance issues. By adding diagnostic information about which functions have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function.

This change adds a remark that is reported by the stack protection code when an instruction or attribute is encountered that causes SSP to be applied.

Patch by: James Henderson

Differential Revision: https://reviews.llvm.org/D29023

llvm-svn: 296483
2017-02-28 16:02:37 +00:00
Chandler Carruth
f35bb1abdb [IR] Add range accessors for the indices of a GEP instruction.
These were noticed as missing in a code review. Add them and the boring
unit test to make sure they compile and DTRT.

llvm-svn: 296444
2017-02-28 08:04:20 +00:00
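
As a rough illustration of what a range accessor over GEP indices buys the caller, here is a self-contained toy. ToyGEP, IterRange and sumIndices are hypothetical and use plain integers as operands; the sketch only assumes, as the commit title suggests, that the indices are the operands following the base pointer and that a range accessor wraps the begin/end iterator pair.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Minimal iterator-pair wrapper so the indices can be used in a range-based for.
template <typename It> class IterRange {
  It begin_, end_;

public:
  IterRange(It b, It e) : begin_(b), end_(e) {}
  It begin() const { return begin_; }
  It end() const { return end_; }
};

// Hypothetical toy instruction: operand 0 is the base pointer,
// every following operand is an index.
class ToyGEP {
  std::vector<int> operands_;

public:
  using const_iter = std::vector<int>::const_iterator;

  explicit ToyGEP(std::vector<int> ops) : operands_(std::move(ops)) {
    assert(!operands_.empty() && "a GEP needs at least a base pointer operand");
  }

  const_iter idx_begin() const { return operands_.begin() + 1; }
  const_iter idx_end() const { return operands_.end(); }

  // Range accessor: bundles idx_begin()/idx_end() into one iterable object.
  IterRange<const_iter> indices() const { return {idx_begin(), idx_end()}; }
};

// Example consumer: sums the indices, skipping the base pointer operand.
long sumIndices(const ToyGEP &gep) {
  long total = 0;
  for (int idx : gep.indices())
    total += idx;
  return total;
}
```

With operands {100, 1, 2, 3}, sumIndices visits only 1, 2 and 3 and returns 6.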
Matt Arsenault
3168453e6e AMDGPU: Basic folds for fmed3 intrinsic
Constant fold, canonicalize constants to RHS,
reduce to minnum/maxnum when inputs are nan/undef.

llvm-svn: 296409
2017-02-27 23:08:49 +00:00
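
The constant-folding part of 3168453e6e can be sketched as follows, assuming fmed3 denotes the median of its three inputs, as the name suggests. med3 is a hypothetical helper, not the code from the patch, and the nan/undef reductions to minnum/maxnum mentioned in the message are deliberately not modelled here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Median of three floats, valid for non-NaN inputs only.
// max(min(a, b), min(max(a, b), c)) picks the middle value.
float med3(float a, float b, float c) {
  assert(!std::isnan(a) && !std::isnan(b) && !std::isnan(c) &&
         "NaN handling is outside the scope of this sketch");
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}
```

For any ordering of the inputs 1, 2 and 3, med3 returns 2.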
Jan Vesely
ab021283cf AMDGPU/SI: export s_waitcnt builtin
Differential Revision: https://reviews.llvm.org/D30358

llvm-svn: 296228
2017-02-25 02:13:32 +00:00
Craig Topper
acb62567aa [AVX-512] Remove lzcnt intrinsics and autoupgrade them to generic ctlz intrinsics with select.
Clang has been emitting ctlz intrinsics for a while now.

llvm-svn: 296091
2017-02-24 05:35:04 +00:00
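
The "generic ctlz with select" form mentioned in acb62567aa can be pictured lane by lane. The sketch below is a scalar model with hypothetical names (ctlz32, Vec4, maskedLzcnt) and an assumed four-lane vector; it is not the autoupgrade code itself. Each lane computes a count of leading zeros, and a per-lane mask bit then selects between that result and a pass-through value.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Portable count-leading-zeros for a 32-bit value; returns 32 for zero input.
unsigned ctlz32(std::uint32_t x) {
  unsigned n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

using Vec4 = std::array<std::uint32_t, 4>;

// Models a masked lzcnt as "ctlz on every lane, then select by mask bit".
Vec4 maskedLzcnt(const Vec4 &src, const Vec4 &passthru, std::uint8_t mask) {
  assert((mask >> 4) == 0 && "only four lanes in this sketch");
  Vec4 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    std::uint32_t counted = ctlz32(src[i]);                // generic ctlz
    out[i] = ((mask >> i) & 1u) ? counted : passthru[i];   // select
  }
  return out;
}
```

With mask 0x5, lanes 0 and 2 receive the ctlz result and lanes 1 and 3 keep the pass-through value.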
Adam Nemet
9eb2f45ec5 [OptDiag] Comment about the legacy status of emitOptimizationRemark*
functions

llvm-svn: 296039
2017-02-23 23:11:23 +00:00
Adam Nemet
544af1e150 [OptDiag] Remove hotness parameter from legacy remark ctors
Anything using hotness should be using ORE.

llvm-svn: 296038
2017-02-23 23:11:21 +00:00
Adam Nemet
b4d60a3880 [OptDiag] Hide legacy remark ctors
These are only used when emitting remarks without ORE directly using the free
functions emitOptimizationRemark*.

llvm-svn: 296037
2017-02-23 23:11:11 +00:00
Sanjoy Das
c1d9ef40b5 [IR] Add an Instruction::dropPoisonGeneratingFlags helper
Summary:
The helper will be used in a later change.  This change itself is NFC
since the only user of this new function is its unit test.

Reviewers: majnemer, efriedma

Reviewed By: efriedma

Subscribers: efriedma, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D30184

llvm-svn: 296035
2017-02-23 22:50:52 +00:00
Ahmed Bougacha
c14f2beb64 [ORE] Use const CodeRegions in the remark diagnostics. NFC.
llvm-svn: 296008
2017-02-23 19:17:34 +00:00
Matt Arsenault
4452689e07 AMDGPU: Add replacement bfe intrinsics
llvm-svn: 295899
2017-02-22 23:04:58 +00:00
Daniel Berlin
130730ae28 Add pair conversion functions to BasicBlockEdge.
llvm-svn: 295888
2017-02-22 22:20:53 +00:00
Krzysztof Parzyszek
5a596770d3 [Hexagon] Add intrinsics for masked vector stores
Patch by Harsha Jagasia.

llvm-svn: 295879
2017-02-22 21:23:09 +00:00
Justin Bogner
dbcb2141ed OptDiag: Add const to some interfaces that don't modify anything. NFC
This needed a const_cast for the dominator tree recalculation in
OptimizationRemarkEmitter, but we do that all over the place already
and it's safe.

llvm-svn: 295812
2017-02-22 07:38:17 +00:00
Matt Arsenault
d2e2dba6a0 AMDGPU: Add cvt.pkrtz intrinsic
Convert llvm.SI.packf16 test uses

llvm-svn: 295797
2017-02-22 00:27:34 +00:00
NAKAMURA Takumi
9eb0aec05d Untabify.
llvm-svn: 295599
2017-02-19 06:51:46 +00:00
Craig Topper
109be6a74d Recommit "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."
Clang has now been fixed to not use these intrinsics.

llvm-svn: 295571
2017-02-18 21:50:58 +00:00
Craig Topper
72006e7fa3 Revert "[X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR."
This reverts r295564. I missed that clang was still using the intrinsics despite our half-implemented autoupgrade support.

llvm-svn: 295565
2017-02-18 20:14:20 +00:00
Craig Topper
5086bcdf87 [X86] Remove XOP VPCMOV intrinsics and autoupgrade them to native IR.
It seems we were already upgrading 128-bit VPCMOV, but the intrinsic was still defined and being used in isel patterns. While I was here I also simplified the tablegen multiclasses.

llvm-svn: 295564
2017-02-18 19:51:25 +00:00
Craig Topper
cb8ca39166 [AVX-512] Remove 128/256-bit masked fp max/min intrinsics. Upgrade them to legacy unmasked intrinsics and select instructions.
llvm-svn: 295543
2017-02-18 07:07:50 +00:00
Justin Bogner
eed08f63c9 OptDiag: Allow constructing DiagnosticLocation from DISubprograms
This avoids creating a DILocation just to represent a line number,
since creating Metadata is expensive. Creating a DiagnosticLocation
directly is much cheaper.

llvm-svn: 295531
2017-02-18 02:00:27 +00:00
Justin Bogner
0053d2142a OptDiag: Decouple backend diagnostics from debug info metadata
This creates and uses a DiagnosticLocation type rather than using
DebugLoc for this purpose in the backend diagnostics. This is NFC for
now, but will allow us to create locations for diagnostics without
having to create new metadata nodes when we don't have a DILocation.

llvm-svn: 295519
2017-02-18 00:42:23 +00:00
Justin Bogner
b96f67f67e OptDiag: Rename DiagnosticInfoWithDebugLoc to WithLocation. NFC
This generalizes the name in preparation for decoupling the concept
from DebugLoc.

llvm-svn: 295465
2017-02-17 17:34:37 +00:00
Eugene Zelenko
0deefbc2dc [IR] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 295383
2017-02-17 00:00:09 +00:00
Matt Arsenault
240b1c3d6d AMDGPU: Remove llvm.AMDGPU.cube intrinsic
llvm-svn: 295359
2017-02-16 19:09:04 +00:00
Craig Topper
19ca5c2ad5 [AVX-512] Remove masked packss/packus intrinsics and autoupgrade to unmasked intrinsics with select instructions. For 512-bit add new unmasked intrinsics.
The new 512-bit unmasked intrinsics will make it easy to handle these with the SSE/AVX intrinsics in InstCombine where we currently have a TODO.

llvm-svn: 295290
2017-02-16 06:31:54 +00:00
Ahmed Bougacha
b0c2ac7a60 [OptDiag] Pass const Values/Types to Argument. NFC.
llvm-svn: 295228
2017-02-15 20:38:28 +00:00
Ahmed Bougacha
233dd4cec3 [IR] Accept 'const Type &' in the Type operator<<. NFC.
Type::print is const; there's no reason for the operator not to be.

llvm-svn: 295227
2017-02-15 20:38:22 +00:00
Dehao Chen
ca296c6427 Expose getBaseDiscriminatorFromDiscriminator, getDuplicationFactorFromDiscriminator and getCopyIdentifierFromDiscriminator API so that downstream tools can use them to get the correct encoding.
Summary: Discriminators are now encoded with rich information. This patch exposes the encoding API to downstream tools.

Reviewers: davidxl, hfinkel

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29852

llvm-svn: 295210
2017-02-15 17:54:39 +00:00
Sanjay Patel
71f9fed9f2 fix documentation comments for Argument; NFC
llvm-svn: 295068
2017-02-14 16:43:49 +00:00
Peter Collingbourne
37b11338c0 IR: Type ID summary extensions for WPD; thread summary into WPD pass.
Make the whole thing testable by adding YAML I/O support for the WPD
summary information and adding some negative tests that exercise the
YAML support.

Differential Revision: https://reviews.llvm.org/D29782

llvm-svn: 294981
2017-02-13 19:26:18 +00:00
Sanjay Patel
1cd81a8cb8 fix documentation comments; NFC
llvm-svn: 294964
2017-02-13 16:17:29 +00:00
Peter Collingbourne
fd1115698d Address Mehdi's post-commit review comments on r294795.
llvm-svn: 294822
2017-02-11 03:19:22 +00:00
Krzysztof Parzyszek
5f0b4b1ff0 [Hexagon] Introduce Hexagon V62
llvm-svn: 294805
2017-02-10 23:46:45 +00:00
Peter Collingbourne
33fe886dfb IR: Function summary extensions for whole-program devirtualization pass.
The summary information includes all uses of llvm.type.test and
llvm.type.checked.load intrinsics that can be used to devirtualize calls,
including any constant arguments for virtual constant propagation.

Differential Revision: https://reviews.llvm.org/D29734

llvm-svn: 294795
2017-02-10 22:29:38 +00:00
Dehao Chen
a75059ebaa Encode duplication factor from loop vectorization and loop unrolling to discriminator.
Summary:
This patch starts the implementation as discussed in the following RFC: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106532.html

When an optimization duplicates code in a way that scales down the execution count of a basic block, we record the duplication factor as part of the discriminator so that the offline processing tool can find the duplication factor and collect the accurate execution frequency of the corresponding source code. Two important optimizations that fall into this category are loop vectorization and loop unrolling. This patch records the duplication factor for these two optimizations.

The recording will be guarded by a flag encode-duplication-in-discriminators, which is off by default.

Reviewers: probinson, aprantl, davidxl, hfinkel, echristo

Reviewed By: hfinkel

Subscribers: mehdi_amini, anemet, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D26420

llvm-svn: 294782
2017-02-10 21:09:07 +00:00
David Bozier
1b4cfb5426 Revert: "[Stack Protection] Add diagnostic information for why stack protection was applied to a function"
This reverts revision r294590 as it broke some buildbots.

llvm-svn: 294593
2017-02-09 15:40:14 +00:00
David Bozier
4feda20555 [Stack Protection] Add diagnostic information for why stack protection was applied to a function
Stack Smash Protection is not completely free, so in hot code its overhead can cause performance issues. By adding diagnostic information about which functions have SSP and why, a user can quickly determine what they can do to stop SSP being applied to a specific hot function.

This change adds an SSP-specific DiagnosticInfo class and uses of it to the Stack Protection code. A subsequent change to clang will cause the remarks to be emitted when enabled.

Patch by: James Henderson

Differential Revision: https://reviews.llvm.org/D29023

llvm-svn: 294590
2017-02-09 15:08:40 +00:00
Craig Topper
c2247a32db [X86] Clzero intrinsic and its addition under znver1
This patch does the following.

1. Adds an Intrinsic int_x86_clzero which works with __builtin_ia32_clzero
2. Identifies clzero feature using cpuid info. (Function:8000_0008, Checks if EBX[0]=1)
3. Adds the clzero feature under znver1 architecture.
4. The custom inserter is added in Lowering.
5. A testcase is added to check the intrinsic.
6. The clzero instruction is added to assembler test.

Patch by Ganesh Gopalasubramanian with a couple formatting tweaks, a disassembler test, and using update_llc_test.py from me.

Differential revision: https://reviews.llvm.org/D29385

llvm-svn: 294558
2017-02-09 04:27:34 +00:00
Igor Laevsky
0c2f267058 [InstCombineCalls] Remove zero length atomic memcpy intrinsics
Differential Revision: https://reviews.llvm.org/D28909

llvm-svn: 294452
2017-02-08 14:23:47 +00:00
Daniel Berlin
c93f06ccf1 This patch adds a ssa_copy intrinsic, as part of splitting up D29316.
Summary:
The intrinsic, marked as returning its first argument, has no code
generation effect (though currently not every optimization pass knows
that intrinsics with the returned attribute can be looked through).

It is about to be used by the PredicateInfo pass to attach
predicate information to existing operands, and be able to tell what
the predicate information affects.

We deliberately do not attach any info through a second operand so
that the intrinsics do not need to dominate the comparisons/etc (since
in the case of assume, we may want to push them up the post-dominator
tree).

Reviewers: davide, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29517

llvm-svn: 294341
2017-02-07 19:29:25 +00:00
Chandler Carruth
cefe125ec4 [IR/Analysis] Defend against getting slightly wrong template arguments
passed into CRTP base classes.

This can sometimes happen and not cause an immediate failure when the
derived class is, itself, a template. You can end up essentially calling
methods on the wrong derived type but a type where many things will
appear to "work".

To fail fast and with a clear error message we can use a static_assert,
but we have to stash that static_assert inside a method body or nested
type that won't need to be completed while building the base class. I've
tried to pick a reasonably small number of places that seemed like
reliable places for this to be instantiated.

llvm-svn: 294272
2017-02-07 03:17:30 +00:00
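
The static_assert technique from cefe125ec4 can be sketched in isolation. ToyPassBase, GoodPass and goodPassId are hypothetical names, and the extra template argument merely imitates the "slightly wrong template arguments" case; the real classes in the patch may differ. The assertion sits inside a method body so that DerivedT is a complete type by the time the body is instantiated.

```cpp
#include <type_traits>

// CRTP base with an extra template argument that derived classes must repeat.
template <typename DerivedT, typename ExtraArgT = void> class ToyPassBase {
public:
  int id() {
    // Checked only when id() is instantiated, at which point DerivedT is
    // complete. Fails if DerivedT does not derive from this exact
    // instantiation, e.g. because ExtraArgT was dropped somewhere.
    static_assert(std::is_base_of<ToyPassBase, DerivedT>::value,
                  "DerivedT must derive from this exact ToyPassBase "
                  "instantiation; check the template arguments");
    return static_cast<DerivedT *>(this)->idImpl();
  }
};

// Correct use: the base is named with the same arguments everywhere.
class GoodPass : public ToyPassBase<GoodPass, int> {
public:
  int idImpl() { return 7; }
};

int goodPassId() {
  GoodPass pass;
  return pass.id();
}

// Incorrect use, left commented out because it does not compile:
// GoodPass derives from ToyPassBase<GoodPass, int>, not from
// ToyPassBase<GoodPass, void>, so the static_assert fires.
//
//   ToyPassBase<GoodPass> wrong;
//   wrong.id();
```

Placing the check in id() rather than at class scope matters: at class scope DerivedT would still be incomplete while the base is being built, which is the constraint the commit message describes.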
Chandler Carruth
b07ec0321e Revert r293017 and fix the actual underlying issue.
The patch committed in r293017, as discussed on the list, doesn't really
make sense but was causing an actual issue to go away.

The issue turns out to be that in one place the extra template arguments
were dropped from the OuterAnalysisManagerProxy. This in turn caused the
types used in one set of places to access the key to be completely
different from the types used in another set of places for both Loop and
CGSCC cases where there are extra arguments.

I have literally no idea how anything seemed to work with this bug in
place. It blows my mind. But it did except for mingw64 in a DLL build.

I've added a really handy static assert that helps ensure we don't break
this in the future. It immediately diagnoses the issue with a compile
failure and a very clear error message. Much better than staring at
backtraces on a build bot. =]

llvm-svn: 294267
2017-02-07 01:50:48 +00:00
Dehao Chen
4fb3035c34 Fix the samplepgo indirect call promotion bug: we should not promote a direct call.
Summary: Checking CS.getCalledFunction() == nullptr does not necessarily indicate an indirect call. We also need to check that CS.getCalledValue() is not a constant.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29570

llvm-svn: 294260
2017-02-06 23:33:15 +00:00
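
The distinction drawn in 4fb3035c34 can be modelled with a toy call site. CalleeKind, ToyCallSite and the helper functions are hypothetical and only imitate the two queries named in the summary; the ConstantExpr case assumes a callee such as a constant cast of a function, for which no direct function is reported even though the target is known.

```cpp
// Hypothetical classification of what a call site calls.
enum class CalleeKind {
  Function,     // plain direct call
  ConstantExpr, // constant callee that is not itself a function
  RuntimeValue  // callee only known at run time
};

struct ToyCallSite {
  CalleeKind callee;
};

// Imitates "getCalledFunction() != nullptr".
bool hasDirectFunction(const ToyCallSite &cs) {
  return cs.callee == CalleeKind::Function;
}

// Imitates "getCalledValue() is a constant".
bool calleeIsConstant(const ToyCallSite &cs) {
  return cs.callee != CalleeKind::RuntimeValue;
}

// Old check: treats every call without a direct function as indirect.
bool looksIndirectOld(const ToyCallSite &cs) { return !hasDirectFunction(cs); }

// Fixed check: additionally requires that the callee is not a constant.
bool isIndirectCall(const ToyCallSite &cs) {
  return !hasDirectFunction(cs) && !calleeIsConstant(cs);
}
```

For a ConstantExpr callee the old check reports an indirect call while the fixed check does not, so such a call would no longer be considered for promotion.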
Mehdi Amini
265b7f9709 Revert "[ThinLTO] Add an auto-hide feature"
This reverts commit r293970.

After more discussion, this belongs to the linker side and
there is no added value to do it at this level.

llvm-svn: 293993
2017-02-03 07:41:43 +00:00
Mehdi Amini
45f4286def [ThinLTO] Add an auto-hide feature
When a symbol is not exported outside of the
DSO, it can be hidden. Usually we try to internalize
as much as possible, but it is not always possible; for
instance, a symbol can be referenced outside of the LTO
unit, or there can be cross-module references in ThinLTO.

This is a recommit of r293912 after fixing build failures,
and a recommit of r293918 after fixing LLD tests.

Differential Revision: https://reviews.llvm.org/D28978

llvm-svn: 293970
2017-02-03 00:32:38 +00:00