1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-24 21:42:54 +02:00
Commit Graph

144834 Commits

Author SHA1 Message Date
Brian Cain
23c7af799c Correct a typo, s/hosting/hoisting/
llvm-svn: 295066
2017-02-14 16:41:10 +00:00
Diego Novillo
9a15512934 Remove unused variable.
llvm-svn: 295065
2017-02-14 16:39:54 +00:00
Pavel Labath
39fd775332 [Support] Add formatv support for StringLiteral
Summary:
This is achieved by generalizing the expression selecting the StringRef
format_provider. Now, anything that can be converted to a StringRef will
use it's formatter.

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29898

llvm-svn: 295064
2017-02-14 16:35:56 +00:00
Matthew Simpson
96843032a3 Reapply "[LV] Extend trunc optimization to all IVs with constant integer steps"
This reapplies commit r294967 with a fix for the execution time regressions
caught by the clang-cmake-aarch64-quick bot. We now extend the truncate
optimization to non-primary induction variables only if the truncate isn't
already free.

Differential Revision: https://reviews.llvm.org/D29847

llvm-svn: 295063
2017-02-14 16:28:32 +00:00
Simon Pilgrim
989d5449f9 [X86][SSE] Allow matchVectorShuffleWithUNPCK to recognise UNDEF inputs
Add support for specifying an UNPCK input as UNDEF

llvm-svn: 295061
2017-02-14 16:22:04 +00:00
Igor Laevsky
cf821eac6c [SCEV] Cache results during GetMinTrailingZeros query
Differential Revision: https://reviews.llvm.org/D29759

llvm-svn: 295060
2017-02-14 15:53:12 +00:00
Simon Pilgrim
83c421dbdc [X86][SSE] Add shuffle combine tests showing missed opportunities to use UNPCK
Not correctly using UNDEF or ZERO inputs to combine to UNPCK shuffles

llvm-svn: 295059
2017-02-14 15:49:37 +00:00
Simon Pilgrim
7876f0f007 [X86][SSE] Regenerate intrinsic upgrade tests
Remove excess semicolons

llvm-svn: 295058
2017-02-14 15:29:50 +00:00
Alexey Bataev
eab1c078e4 [SLP] Fix for PR31879: vectorize repeated scalar ops that don't get put
back into a vector

Previously the cost of the existing ExtractElement/ExtractValue
instructions was considered as a dead cost only if it was detected that
they have only one use. But these instructions may be considered
dead also if users of the instructions are also going to be vectorized,
like:
```
%x0 = extractelement <2 x float> %x, i32 0
%x1 = extractelement <2 x float> %x, i32 1
%x0x0 = fmul float %x0, %x0
%x1x1 = fmul float %x1, %x1
%add = fadd float %x0x0, %x1x1
```
This can be transformed to
```
%1 = fmul <2 x float> %x, %x
%2 = extractelement <2 x float> %1, i32 0
%3 = extractelement <2 x float> %1, i32 1
%add = fadd float %2, %3
```
because though `%x0` and `%x1` have 2 users each other, these users are
part of the vectorized tree and we can consider these `extractelement`
instructions as dead.

Differential Revision: https://reviews.llvm.org/D29900

llvm-svn: 295056
2017-02-14 15:20:48 +00:00
Artyom Skrobov
7174382d29 Removing a redundant assignment
llvm-svn: 295055
2017-02-14 14:44:01 +00:00
Alexander Timofeev
ede4a3e0f8 Revert "[AMDGPU] Fix for SIMachineScheduler crash. SI Scheduler should track"
This reverts commit ce06d9cb99298eb844b66e117f5108a06747c907.

llvm-svn: 295054
2017-02-14 14:29:05 +00:00
Simon Pilgrim
84146f081a [X86][SSE] Move unary inputs handling inside matchVectorShuffleWithUNPCK.
llvm-svn: 295053
2017-02-14 13:47:17 +00:00
Simon Pilgrim
e285f80da0 [X86][SSE] Tidyup matchVectorShuffleWithUNPCK helper function call.
Don't bother setting the V1/V2 operands again for unary shuffles.

Don't bother legalizing the value type unless the match succeeds.

llvm-svn: 295051
2017-02-14 12:54:39 +00:00
Alexey Bataev
3d4930d7b1 [SLP] Additional tests for extractelement cost fix.
llvm-svn: 295050
2017-02-14 12:52:05 +00:00
Simon Pilgrim
959be31745 [X86][SSE] Test case showing missed PSHUFB target shuffle constant fold opportunity.
It also shows an unnecessary pshufb/broadcast being used - the original pshufb mask only requested the lowest byte.

llvm-svn: 295046
2017-02-14 11:20:11 +00:00
Karl-Johan Karlsson
9b88013fdb Revert "[LoopVectorize] Added address space check when analysing interleaved accesses"
This reverts r295038. The buildbot clang-with-thin-lto-ubuntu failed.
I'm reverting to investigate.

llvm-svn: 295042
2017-02-14 10:06:16 +00:00
Karl-Johan Karlsson
31668a8c6e [LoopVectorize] Added address space check when analysing interleaved accesses
Prevent memory objects of different address spaces to be part of
the same load/store groups when analysing interleaved accesses.

This is fixing pr31900.

Reviewers: HaoLiu, mssimpso, mkuper

Reviewed By: mssimpso, mkuper

Subscribers: llvm-commits, efriedma, mzolotukhin

Differential Revision: https://reviews.llvm.org/D29717

llvm-svn: 295038
2017-02-14 08:14:06 +00:00
Karl-Johan Karlsson
2923bdc76a Test commit permission
Removing whitespace.

llvm-svn: 295037
2017-02-14 07:31:36 +00:00
Daniel Jasper
edd508438b Add initializer that was missed in r295009.
llvm-svn: 295036
2017-02-14 07:10:03 +00:00
Craig Topper
1b09ad0367 [AVX-512] Add PAVGB/PAVGW to load folding tables.
llvm-svn: 295035
2017-02-14 06:54:57 +00:00
Mikael Holmen
b41e5c8bc7 [LSR] Pointers with different address spaces are considered incompatible.
Summary:
Function isCompatibleIVType is already used as a guard before the call to

 SE.getMinusSCEV(OperExpr, PrevExpr);

in LSRInstance::ChainInstruction. getMinusSCEV requires the expressions
to be of the same type, so we now consider two pointers with different
address spaces to be incompatible, since it is possible that the pointers
in fact have different sizes.

Reviewers: qcolombet, eli.friedman

Reviewed By: qcolombet

Subscribers: nhaehnle, Ka-Ka, llvm-commits, mzolotukhin

Differential Revision: https://reviews.llvm.org/D29885

llvm-svn: 295033
2017-02-14 06:37:42 +00:00
Lang Hames
90897fe570 [Orc][RPC] Remove lanch policies in favor of async handlers.
Launch policies provided a mechanism for running RPC handlers on a background
thread (unblocking the main RPC receiver thread). Async handlers generalize
this by passing the responder function (the function that sends the RPC return
value) as an argument to the handler. The handler can optionally do its work on
a background thread (the same way launch policies do), but can also (a) can
inspect the call arguments before deciding to run the work on a different
thread, or (b) can use the responder in a subsequent RPC call (e.g. in the
handler of a callAsync), allowing the handler to call back to the originator (or
to a 3rd party) without blocking the listener thread, and without launching a
new thread.

llvm-svn: 295030
2017-02-14 05:40:01 +00:00
Alex Bradbury
319f959d2d [RISCV] Fix RV32 datalayout string and ensure initAsmInfo is called
llvm-svn: 295028
2017-02-14 05:20:20 +00:00
Alex Bradbury
9709853f86 [RISCV] Pseudo instructions are isCodeGenOnly, have blank asmstr
llvm-svn: 295027
2017-02-14 05:17:23 +00:00
Alex Bradbury
93a24638b3 [RISCV] Fix unused variable in RISCVMCTargetDesc. NFC
Also, for better uniformity use TargetRegistry::RegisterMCAsmInfo rather than 
RegisterMCAsmInfoFn. Again, no functional change.

llvm-svn: 295026
2017-02-14 05:15:24 +00:00
Peter Collingbourne
8c133228c6 ThinLTOBitcodeWriter: Write available_externally copies of VCP eligible functions to merged module.
Differential Revision: https://reviews.llvm.org/D29701

llvm-svn: 295021
2017-02-14 03:42:38 +00:00
Mehdi Amini
5cb64b18b7 [ThinLTO] Make a copy of buffer identifier in ThinLTOCodeGenerator
We can't assume that the `const char *` provided through libLTO has a
lifetime that expands beyond the codegenerator itself.

llvm-svn: 295018
2017-02-14 02:20:51 +00:00
Philip Reames
88e492b37d [LICM] Make store promotion work in the face of unordered atomics
Extend our store promotion code to deal with unordered atomic accesses. Ordered atomics continue to be unhandled.

Most of the change is straight-forward, the only complicated bit is in the reasoning around mixing of atomic and non-atomic memory access. Rather than trying to reason about the complex semantics in these cases, I simply disallowed promotion when both atomic and non-atomic accesses are present. This is conservatively correct.

It seems really tempting to just promote all access to atomics, but the original accesses might have been conditional. Since we can't lower an arbitrary atomic type, it might not be safe to promote all access to atomic. Consider a loop like the following:
while(b) {
  load i128 ...
  if (can lower i128 atomic)
    store atomic i128 ...
  else
    store i128
}

It could be there's no race on the location and thus the code is perfectly well defined even if we can't lower a i128 atomically. 

It's not clear we need to be this conservative - arguably the program above is brocken since it can't be lowered unless the branch is folded - but I didn't want to have to fix any fallout which might result.

Differential Revision: https://reviews.llvm.org/D15592

llvm-svn: 295015
2017-02-14 01:38:31 +00:00
Reid Kleckner
d1fdba9087 Undef MemoryFence, which is defined to _mm_mfence by winnt.h
llvm-svn: 295014
2017-02-14 01:38:14 +00:00
Reid Kleckner
e420d88d86 Use std::call_once on Windows
Previously we could not use it because std::once_flag's default
constructor was not constexpr. Today, all supported versions of VS
correctly mark it constexpr. I confirmed that MSVC 2015 does not emit
any problematic racy dynamic initialization code, so we should be safe
to use this now.

llvm-svn: 295013
2017-02-14 01:21:39 +00:00
Eugene Zelenko
c7114c5b3a [MC] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
Same changes in files affected by reduced MC headers dependencies.

llvm-svn: 295009
2017-02-14 00:33:36 +00:00
Peter Collingbourne
cf5951c340 FunctionAttrs: Factor out a function for querying memory access of a specific copy of a function. NFC.
This will later be used by ThinLTOBitcodeWriter to add copies of readnone
functions to the regular LTO module.

Differential Revision: https://reviews.llvm.org/D29695

llvm-svn: 295008
2017-02-14 00:28:13 +00:00
Michael Kuperstein
b36c66b0a3 Silence redundant semicolon warnings. NFC.
llvm-svn: 295005
2017-02-13 23:42:27 +00:00
Andrew Kaylor
ab08c57ca4 [X86] Add MXCSR register
This adds MXCSR to the set of recognized registers for X86 targets and updates the instructions that read or write it. I do not intend for all of the various floating point instructions that implicitly use the control bits or update the status bits of this register to ever have that usage modeled by default. However, when constrained floating point modes (such as strict FP exception status modeling or dynamic rounding modes) are enabled, implicit use/def information for MXCSR will be added to those instructions.

Until those additional updates are made this should cause (almost?) no functional changes. Theoretically, this will prevent instructions like LDMXCSR and STMXCSR from being moved past one another, but that should be prevented anyway and I haven't found a case where it is happening now.

Differential Revision: https://reviews.llvm.org/D29903

llvm-svn: 295004
2017-02-13 23:38:52 +00:00
Sanjoy Das
363ce56069 [LangRef] Explicitly allow readnone and reaodnly functions to unwind
Summary:
This change edits the language reference to explicitly allow the
existence of readnone and readonly functions that can throw.  Full
discussion at
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108637.html

Reviewers: dberlin, chandlerc, hfinkel, majnemer

Reviewed By: majnemer

Subscribers: majnemer, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D28740

llvm-svn: 295000
2017-02-13 23:19:07 +00:00
Sanjoy Das
a5086f163f [LangRef] Update the TBAA section
Summary:
Update the TBAA section to mention the struct path TBAA that LLVM
implements today.  This is not a proposal or change in semantics -- it
is intended only to **document** what LLVM already does today.

This is related to https://reviews.llvm.org/D26438 where I've tried to
implement some of the constraints as verifier checks.

Reviewers: anna, reames, rsmith, chandlerc, hfinkel, rjmccall, mehdi_amini, dexonsmith, manmanren

Reviewed By: manmanren

Subscribers: dberlin, dberris, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D26831

llvm-svn: 294999
2017-02-13 23:14:03 +00:00
Sanjay Patel
1e7c803052 [FunctionAttrs] try to extend nonnull-ness of arguments from a callsite back to its parent function
As discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-December/108182.html
...we should be able to propagate 'nonnull' info from a callsite back to its parent.

The original motivation for this patch is our botched optimization of "dyn_cast" (PR28430),
but this won't solve that problem.

The transform is currently disabled by default while we wait for clang to work-around
potential security problems:
http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html

Differential Revision: https://reviews.llvm.org/D27855

llvm-svn: 294998
2017-02-13 23:10:51 +00:00
Amaury Sechet
f72eb8e1e7 Revert autogenerated check result for test/CodeGen/X86/atomic-minmax-i6432.ll as they don't regenerate cleanly.
llvm-svn: 294996
2017-02-13 23:00:23 +00:00
Tim Northover
979f34e8cc GlobalISel: represent atomic loads & stores via the MachineMemOperand.
Also make sure the AArch64 backend doesn't try to convert them into normal
loads and stores.

llvm-svn: 294993
2017-02-13 22:14:16 +00:00
Tim Northover
291842e005 MIR: parse & print the atomic parts of a MachineMemOperand.
We're going to need them very soon for GlobalISel.

llvm-svn: 294992
2017-02-13 22:14:08 +00:00
Reid Kleckner
772249d679 [CodeGen] Use bitfields instead of manual masks in ArgFlagsTy, NFC
This revealed that we actually have 8 more unused flag bits, and byval
size doesn't need to be a bitfield at all.

This came up during code review here:
https://reviews.llvm.org/D29668#inline-258469

llvm-svn: 294989
2017-02-13 21:33:26 +00:00
Taewook Oh
f644f79154 Address post-commit comments for https://reviews.llvm.org/D29596. NFCI.
llvm-svn: 294985
2017-02-13 21:12:27 +00:00
Arnold Schwaighofer
51dc444272 swiftcc: Don't emit tail calls from callers with swifterror parameters
Backends don't support this yet. They would have to move to the swifterror
register before the tail call to make sure it is live-in to the call.

rdar://30495920

llvm-svn: 294982
2017-02-13 19:58:28 +00:00
Peter Collingbourne
37b11338c0 IR: Type ID summary extensions for WPD; thread summary into WPD pass.
Make the whole thing testable by adding YAML I/O support for the WPD
summary information and adding some negative tests that exercise the
YAML support.

Differential Revision: https://reviews.llvm.org/D29782

llvm-svn: 294981
2017-02-13 19:26:18 +00:00
Alexey Bataev
a44ca284d3 [SLP] Test for extractelement cost fix.
llvm-svn: 294980
2017-02-13 19:08:19 +00:00
Taewook Oh
c49af0f212 Make MachineBasicBlock::updateTerminator to update DebugLoc as well
Summary:
Currently MachineBasicBlock::updateTerminator simply drops DebugLoc for newly created branch instructions, which may cause incorrect stepping and/or imprecise sample profile data. Below is an example:

```
  1 extern int bar(int x);
  2
  3 int foo(int *begin, int *end) {
  4   int *i;
  5   int ret = 0;
  6   for (
  7       i = begin ;
  8       i != end ;
  9       i++)
 10   {
 11       ret += bar(*i);
 12   }
 13   return ret;
 14 }
```

Below is a bitcode of 'foo' at the end of LLVM-IR level optimizations with -O3:

```
define i32 @foo(i32* readonly %begin, i32* readnone %end) !dbg !4 {
entry:
  %cmp6 = icmp eq i32* %begin, %end, !dbg !9
  br i1 %cmp6, label %for.end, label %for.body.preheader, !dbg !12

for.body.preheader:                               ; preds = %entry
  br label %for.body, !dbg !13

for.body:                                         ; preds = %for.body.preheader, %for.body
  %ret.08 = phi i32 [ %add, %for.body ], [ 0, %for.body.preheader ]
  %i.07 = phi i32* [ %incdec.ptr, %for.body ], [ %begin, %for.body.preheader ]
  %0 = load i32, i32* %i.07, align 4, !dbg !13, !tbaa !15
  %call = tail call i32 @bar(i32 %0), !dbg !19
  %add = add nsw i32 %call, %ret.08, !dbg !20
  %incdec.ptr = getelementptr inbounds i32, i32* %i.07, i64 1, !dbg !21
  %cmp = icmp eq i32* %incdec.ptr, %end, !dbg !9
  br i1 %cmp, label %for.end.loopexit, label %for.body, !dbg !12, !llvm.loop !22

for.end.loopexit:                                 ; preds = %for.body
  br label %for.end, !dbg !24

for.end:                                          ; preds = %for.end.loopexit, %entry
  %ret.0.lcssa = phi i32 [ 0, %entry ], [ %add, %for.end.loopexit ]
  ret i32 %ret.0.lcssa, !dbg !24
}
```

where

```
!12 = !DILocation(line: 6, column: 3, scope: !11)
```

. As you can see, the terminator of 'entry' block, which is a loop control branch, has a DebugLoc of line 6, column 3. Howerver, after the execution of 'MachineBlock::updateTerminator' function, which is triggered by MachineSinking pass, the DebugLoc info is dropped as below (see there's no debug-location for JNE_1):

```
  bb.0.entry:
    successors: %bb.4(0x30000000), %bb.1.for.body.preheader(0x50000000)
    liveins: %rdi, %rsi

    %6 = COPY %rsi
    %5 = COPY %rdi
    %8 = SUB64rr %5, %6, implicit-def %eflags, debug-location !9
    JNE_1 %bb.1.for.body.preheader, implicit %eflags
```

This patch addresses this issue and make newly created branch instructions to keep debug-location info.

Reviewers: aprantl, MatzeB, craig.topper, qcolombet

Reviewed By: qcolombet

Subscribers: qcolombet, llvm-commits

Differential Revision: https://reviews.llvm.org/D29596

llvm-svn: 294976
2017-02-13 18:15:31 +00:00
Matthew Simpson
c00928de84 Revert "[LV] Extend trunc optimization to all IVs with constant integer steps"
This reverts commit r294967. This patch caused execution time slowdowns in a
few LLVM test-suite tests, as reported by the clang-cmake-aarch64-quick bot.
I'm reverting to investigate.

llvm-svn: 294973
2017-02-13 18:02:35 +00:00
Quentin Colombet
e407a5d391 [FastISel] Add a diagnostic to warm on fallback.
This is consistent with what we do for GlobalISel. That way, it is easy
to see whether or not FastISel is able to fully select a function.
At some point we may want to switch that to an optimization remark.

llvm-svn: 294970
2017-02-13 17:38:59 +00:00
James Molloy
29f53382f7 [ARM] Fix crash caused by r294945
I'd missed a creator of FCMP nodes - duplicateCmp().

Kindly and promptly reported by Gabor Ballabas, due to his CSiBE test suite.

llvm-svn: 294968
2017-02-13 17:18:00 +00:00
Matthew Simpson
658ae0c60e [LV] Extend trunc optimization to all IVs with constant integer steps
This patch extends the optimization of truncations whose operand is an
induction variable with a constant integer step. Previously we were only
applying this optimization to the primary induction variable. However, the cost
model assumes the optimization is applied to the truncation of all integer
induction variables (even regardless of step type). The transformation is now
applied to the other induction variables, and I've updated the cost model to
ensure it is better in sync with the transformation we actually perform.

Differential Revision: https://reviews.llvm.org/D29847

llvm-svn: 294967
2017-02-13 16:48:00 +00:00