mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00

102563 Commits

Author SHA1 Message Date
Davide Italiano
334449f848 [NewGVN] Fix a consistent order for phi nodes operands.
The way we currently define congruency for two PHIExpression(s) is:

1) The operands to the phi functions are congruent
2) The PHIs are defined in the same BasicBlock.

NewGVN works under the assumption that phi operands are in predecessor
order, or at least in some consistent order. OTOH, the following is valid IR:

patatino:
  %meh = phi i16 [ %0, %winky ], [ %conv1, %tinky ]
  %banana = phi i16 [ %0, %tinky ], [ %conv1, %winky ]
  br label %end

and the in-memory representations of the two SSA registers have an
inconsistent order. This violation of NewGVN's assumptions results in
two PHIs being found congruent when they're not. While we think it's
useful to always have a consistent order enforced, let's fix this in
NewGVN by sorting uses in predecessor order before creating a PHI
expression.
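
A minimal sketch of the idea, not NewGVN's actual code: each phi is modeled as a list of (predecessor, value) pairs, and the values are put in a fixed predecessor order before two phis are compared. The names `PhiOperand`, `canonicalValues`, `naiveCongruent`, and `sortedCongruent` are hypothetical, and sorting by block name is only a stand-in for the real predecessor order.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One incoming edge of a phi: (predecessor block name, incoming value name).
using PhiOperand = std::pair<std::string, std::string>;

// Return the incoming values ordered by predecessor name, so that two phis in
// the same block list their values in the same order.
std::vector<std::string> canonicalValues(std::vector<PhiOperand> ops) {
  std::sort(ops.begin(), ops.end(),
            [](const PhiOperand &a, const PhiOperand &b) {
              return a.first < b.first;
            });
  std::vector<std::string> values;
  for (const PhiOperand &op : ops)
    values.push_back(op.second);
  return values;
}

// Comparison that trusts the in-memory operand order (the problematic case).
bool naiveCongruent(const std::vector<PhiOperand> &a,
                    const std::vector<PhiOperand> &b) {
  if (a.size() != b.size())
    return false;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (a[i].second != b[i].second)
      return false;
  return true;
}

// Comparison after putting both operand lists in the same predecessor order.
bool sortedCongruent(const std::vector<PhiOperand> &a,
                     const std::vector<PhiOperand> &b) {
  return canonicalValues(a) == canonicalValues(b);
}
```

With the `%meh` and `%banana` operands from the IR above, the naive comparison reports the two phis as congruent, while the sorted comparison does not.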

Differential Revision:  https://reviews.llvm.org/D32990

llvm-svn: 302552
2017-05-09 16:58:28 +00:00
Craig Topper
cd4fbccfe2 [APInt] Remove return value from tcFullMultiply.
The description says it returns the number of words needed to represent the results. But the way it was coded it always returns (lhsWords + rhsWords) or (lhsWords + rhsWords - 1). But the result could be even smaller than that and it wouldn't tell you.

No one uses the result today so rather than try to fix it, just remove it.

llvm-svn: 302551
2017-05-09 16:47:33 +00:00
Daniel Berlin
28e9f6316a NewGVN: Make all of symbolic evaluation logically const.
llvm-svn: 302550
2017-05-09 16:40:04 +00:00
Craig Topper
4cf65193e1 [X86] Add more patterns for BZHI isel
This patch adds more patterns that a reasonable person might write that can be compiled to BZHI.

This adds support for

(~0U >> (32 - b)) & a;

and

a << (32 - b) >> (32 - b);

This was inspired by the code in APInt::clearUnusedBits.

This can pass an index of 32 to the bzhi instruction, which a quick test on Haswell hardware shows will not mask any bits. Although the description text in the Intel manual says the "index is saturated to OperandSize-1", the pseudocode in the same manual indicates no bits will be zeroed for this case.

I think this is still missing cases where the subtract portion is an 8-bit operation.
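
As an illustration only, the two source patterns can be checked against a plain "keep the low b bits" reference. The function names below are hypothetical, and the sketch assumes 32-bit operands with 1 <= b <= 32, since b == 0 would shift a 32-bit value by 32.

```cpp
#include <cstdint>

// Reference: keep the low b bits of a, for 0 <= b <= 32.
uint32_t keepLowBits(uint32_t a, unsigned b) {
  return b >= 32 ? a : a & ((uint32_t{1} << b) - 1);
}

// First pattern from the commit message: (~0U >> (32 - b)) & a.
uint32_t maskPattern(uint32_t a, unsigned b) {
  return (~uint32_t{0} >> (32 - b)) & a;
}

// Second pattern from the commit message: a << (32 - b) >> (32 - b).
uint32_t shiftPattern(uint32_t a, unsigned b) {
  return a << (32 - b) >> (32 - b);
}

// True when both patterns match the reference for every b in [1, 32].
bool patternsAgree(uint32_t a) {
  for (unsigned b = 1; b <= 32; ++b) {
    uint32_t expected = keepLowBits(a, b);
    if (maskPattern(a, b) != expected || shiftPattern(a, b) != expected)
      return false;
  }
  return true;
}
```

For b == 32 both patterns shift by zero and return `a` unchanged, which corresponds to the index-of-32 case discussed above.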

Differential Revision: https://reviews.llvm.org/D32616

llvm-svn: 302549
2017-05-09 16:32:11 +00:00
Sanjay Patel
4f2714d90b [InstCombineCasts] Fix checks in sext->lshr->trunc pattern.
The comment says to avoid the case where zero bits are shifted into the truncated value, 
but the code checks that the shift is smaller than the truncated value instead of the 
number of bits added by the sign extension. Fixing this allows a shift by more than the 
value size to be introduced, which is undefined behavior, so the shift is capped at the 
value size minus one, which has the expected behavior of filling the value with the sign 
bit.
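
A small sketch of the arithmetic, assuming an i8 value sign-extended to i32 (so the extension adds 24 bits). The helper names are hypothetical. It checks that for shift amounts up to 24 the sext->lshr->trunc sequence equals an arithmetic shift of the narrow value capped at 7, and that a shift of 25 pulls a zero bit into the truncated value.

```cpp
#include <algorithm>
#include <cstdint>

// Arithmetic shift right of an 8-bit value (k <= 7), written without relying
// on how >> treats negative operands.
int8_t ashr8(int8_t x, unsigned k) {
  int v = x;
  int r = v < 0 ? ~((~v) >> k) : (v >> k);
  return static_cast<int8_t>(r);
}

// trunc(lshr(sext(x to i32), k) to i8), computed on unsigned bits; k < 32.
int8_t sextLshrTrunc(int8_t x, unsigned k) {
  uint32_t wide = static_cast<uint32_t>(static_cast<int32_t>(x)); // sext
  uint32_t shifted = wide >> k;                                   // lshr
  unsigned low = shifted & 0xFFu;                                 // trunc
  return static_cast<int8_t>(low < 128 ? int(low) : int(low) - 256);
}

// True when the sequence equals ashr8(x, min(k, 7)) for every x and every
// shift amount up to the 24 bits added by the sign extension.
bool foldHolds() {
  for (int x = -128; x <= 127; ++x)
    for (unsigned k = 0; k <= 24; ++k)
      if (sextLshrTrunc(static_cast<int8_t>(x), k) !=
          ashr8(static_cast<int8_t>(x), std::min(k, 7u)))
        return false;
  return true;
}
```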

Patch by Jacob Young!

Differential Revision: https://reviews.llvm.org/D32285

llvm-svn: 302548
2017-05-09 16:24:59 +00:00
Guy Blank
52106f6b42 [AVX512] Only look at lower bit in constant scalar masks
For scalar masked instructions, only the lower bit of the mask is relevant, so for constant masks we should either do an unmasked operation or no operation, depending on the value of the lower bit.
This patch handles cases where the lower bit is '1'.

Differential Revision: https://reviews.llvm.org/D32805

llvm-svn: 302546
2017-05-09 16:16:48 +00:00
Reid Kleckner
bed1389ae3 Re-land "Use the frame index side table for byval and inalloca arguments"
This re-lands r302483. It was not the cause of PR32977.

llvm-svn: 302544
2017-05-09 16:02:20 +00:00
Reid Kleckner
fc145824a1 Re-land "Don't add DBG_VALUE instructions for static allocas in dbg.declare"
This re-lands commit r302461. It was not the cause of PR32977.

llvm-svn: 302543
2017-05-09 16:01:47 +00:00
Tim Shen
5f6285f048 [Atomic] Remove IsStore/IsLoad in the interface, and pass the instruction instead. NFC.
Now both emitLeadingFence and emitTrailingFence take the instruction
itself, instead of taking IsLoad/IsStore pairs.
Instruction::mayReadFromMemory and Instruction::mayWriteToMemory are used
for determining those two booleans.

The instruction argument is also useful for later D32763, in
emitTrailingFence. For emitLeadingFence, it seems to have cleaner
interface with the proposed change.

Differential Revision: https://reviews.llvm.org/D32762

llvm-svn: 302539
2017-05-09 15:27:17 +00:00
Aaron Ballman
6b3ff33b51 Amend r302535; ifndef and ifdef are different, as it turns out.
llvm-svn: 302537
2017-05-09 15:12:03 +00:00
Aaron Ballman
2df735d0a8 ARMRegisterBankInfo.h requires LLVM_BUILD_GLOBAL_ISEL to be defined. If it is not defined, then ARMGenRegisterBank.inc is not table generated and the inclusion of this header causes the build to fail.
llvm-svn: 302535
2017-05-09 14:59:48 +00:00
Hans Wennborg
1ddac6ae37 Revert r302469 "Make it illegal for two Functions to point to the same DISubprogram"
This caused PR32977.

Original commit message:

> Make it illegal for two Functions to point to the same DISubprogram
>
> As recently discussed on llvm-dev [1], this patch makes it illegal for
> two Functions to point to the same DISubprogram and updates
> FunctionCloner to also clone the debug info of a function to conform
> to the new requirement. To simplify the implementation it also factors
> out the creation of inlineAt locations from the Inliner into a
> general-purpose utility in DILocation.
>
> [1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
> <rdar://problem/31926379>
>
> Differential Revision: https://reviews.llvm.org/D32975

llvm-svn: 302533
2017-05-09 14:44:15 +00:00
Anna Thomas
3580c4d010 [LV] Fix insertion point for shuffle vectors in first order recurrence
Summary:
In first order recurrence vectorization, when the previous value is a phi node, we need to
set the insertion point to the first non-phi node.
We can have the previous value being a phi node, due to the generation of new
IVs as part of trunc optimization [1].

[1] https://reviews.llvm.org/rL294967

Reviewers: mssimpso, mkuper

Subscribers: mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D32969

llvm-svn: 302532
2017-05-09 14:29:33 +00:00
Aaron Ballman
f74a6f060f Removing a file that is not necessary (and was causing link diagnostics with MSVC 2015); NFC.
llvm-svn: 302531
2017-05-09 14:22:48 +00:00
Serge Pavlov
b8ce9ec478 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.html for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties is implemented using special instructions
  rather than calls to getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests that failed with the machine verifier enabled and are
listed in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527
2017-05-09 13:35:13 +00:00
Simon Dardis
76a4991023 Revert "[MIPS] Add support to match more patterns for DINS instruction"
This reverts commit rL302512. This broke the mips buildbots.

llvm-svn: 302526
2017-05-09 13:18:48 +00:00
Simon Pilgrim
50affcce7b [X86][SSE42] Lower v2i64/v4i64 ASHR(X, 63) as PCMPGTQ(0, X)
Similar to what we do for vXi8 ASHR(X, 7), use SSE42's PCMPGTQ to splat the sign instead of using the PSRAD+PSHUFD.

By avoiding bitcasts, this improves combines that utilize computeNumSignBits, permits memory folding and reduces pipe pressure. Although it does require a second register, given that this is a (cheap) zero register the impact is minimal.
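
The equivalence behind this lowering can be sketched on a single 64-bit lane. This is a scalar illustration with hypothetical function names, not the vector code the backend emits: an arithmetic shift right by 63 and a signed "0 > x" comparison both produce all ones for negative inputs and zero otherwise.

```cpp
#include <cstdint>

// Sign splat via comparison: all ones when 0 > x, otherwise zero.
int64_t splatSignByCompare(int64_t x) {
  return (0 > x) ? int64_t{-1} : int64_t{0};
}

// Sign splat as an arithmetic shift by 63, written without relying on how >>
// treats negative operands.
int64_t splatSignByShift(int64_t x) {
  return x < 0 ? ~((~x) >> 63) : (x >> 63);
}

// True when both forms agree on x.
bool sameSplat(int64_t x) {
  return splatSignByCompare(x) == splatSignByShift(x);
}
```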

Differential Revision: https://reviews.llvm.org/D32973

llvm-svn: 302525
2017-05-09 13:14:40 +00:00
Diana Picus
2185814192 Revert "[Dwarf] Disable reference verification for now (PR32972)"
This reverts commit r302520 because it breaks the unit tests.

llvm-svn: 302524
2017-05-09 13:05:43 +00:00
Renato Golin
12077cc392 [Dwarf] Disable reference verification for now (PR32972)
There is no other explanation about why this only started happening
now, even though it crashes on old code (supposedly reachable from
here).

The only common factor between the failing bots is that they use GCC
(4.9 and 5.3) to compile Clang, while the others use Clang 3.8, but the
failure is while building the tests, as an assertion, on Clang.

Commenting it out for now in hope the bots will go back green, but we
should keep looking for the real cause, and update bugzilla.

llvm-svn: 302520
2017-05-09 12:36:50 +00:00
Amara Emerson
59ff6c8c60 Introduce experimental generic intrinsics for horizontal vector reductions.
- This change allows targets to opt-in to using them instead of the log2
  shufflevector algorithm.
- The SLP and Loop vectorizers have the common code to do shuffle reductions
  factored out into LoopUtils, and now have a unified interface for generating
  reductions regardless of the preference of the target. LoopUtils now uses TTI
  to determine what kind of reductions the target wants to handle.
- For CodeGen, basic legalization support is added.
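
For context, the log2 shufflevector algorithm mentioned in the first bullet can be sketched on a plain array. This is an illustrative scalar model with a hypothetical function name; it assumes an add reduction and a power-of-two element count.

```cpp
#include <cstddef>
#include <vector>

// Add reduction in log2(N) steps; v.size() is assumed to be a power of two.
// Each step adds the upper half of the live lanes onto the lower half, which
// models one shuffle followed by one vector add. If steps is non-null it
// receives the number of halving steps performed.
int reduceAddLog2(std::vector<int> v, int *steps = nullptr) {
  int count = 0;
  for (std::size_t width = v.size() / 2; width >= 1; width /= 2) {
    for (std::size_t i = 0; i < width; ++i)
      v[i] += v[i + width];
    ++count;
  }
  if (steps)
    *steps = count;
  return v.empty() ? 0 : v[0];
}
```

An eight-element input takes three halving steps, which is the sequence a target can replace with a single reduction intrinsic when it opts in.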

Differential Revision: https://reviews.llvm.org/D30086

llvm-svn: 302514
2017-05-09 10:43:25 +00:00
Nikolai Bozhenov
3789a9bfa0 [X86] Clang option -fuse-init-array has no effect when generating for MCU target
Reviewers: Eugene.Zelenko, dschuff, craig.topper

Reviewed By: craig.topper

Subscribers: ahatanak, aaboud, DavidKreitzer, llvm-commits, cfe-commits

Differential Revision: https://reviews.llvm.org/D32543
Patch by AndreiGrischenko <andrei.l.grischenko@intel.com>

llvm-svn: 302513
2017-05-09 10:14:03 +00:00
Strahinja Petrovic
d020c9cb48 [MIPS] Add support to match more patterns for DINS instruction
This patch adds support for recognizing patterns to match
DINS instruction.

Differential Revision: https://reviews.llvm.org/D31465

llvm-svn: 302512
2017-05-09 10:02:00 +00:00
Diana Picus
f9985f10cb [ARM GlobalISel] Remove hand-written G_FADD selection
Remove the code selecting G_FADD - now that TableGen can handle more
opcodes, it's not needed anymore.

llvm-svn: 302511
2017-05-09 08:32:42 +00:00
Craig Topper
7c07003444 [ConstantRange] Rewrite shl to avoid repeated calls to getUnsignedMax and avoid creating the min APInt until we're sure we need it. Use inplace shift operations.
llvm-svn: 302510
2017-05-09 07:04:04 +00:00
Craig Topper
0a38cca54d [ConstantRange] Combine the two adds max+1 in lshr into a single addition.
llvm-svn: 302509
2017-05-09 07:04:02 +00:00
Craig Topper
4b11266674 [ConstantRange] Use APInt::isNullValue in place of comparing with 0. The compiler should be able to generate slightly better code for the former. NFC
llvm-svn: 302508
2017-05-09 05:01:29 +00:00
Reid Kleckner
1a48591876 Revert "Don't add DBG_VALUE instructions for static allocas in dbg.declare"
This reverts commit r302461.

It appears to be causing failures compiling gtest with debug info on the
Linux sanitizer bot. I was unable to reproduce the failure locally,
however.

llvm-svn: 302504
2017-05-09 01:57:44 +00:00
Teresa Johnson
7ff9f7abb3 Fix code section prefix for proper layout
Summary:
r284533 added hot and cold section prefixes based on profile
information, to enable grouping of hot/cold functions at link time.
However, it used "cold" as the prefix for cold sections, but gold only
recognizes "unlikely" (which is used by gcc for cold sections).
Therefore, cold sections were not properly being grouped. Switch to
using "unlikely"

Reviewers: danielcdh, davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32983

llvm-svn: 302502
2017-05-09 01:43:24 +00:00
Kostya Serebryany
945ac266a3 [libFuzzer] update docs on -print_coverage/-dump_coverage
llvm-svn: 302498
2017-05-09 01:34:27 +00:00
Kostya Serebryany
ef5f540cec [libFuzzer] make sure the input data is not overwritten in the fuzz target (if it is -- report an error)
llvm-svn: 302494
2017-05-09 01:17:29 +00:00
Reid Kleckner
e98eae6da6 Revert "Use the frame index side table for byval and inalloca arguments"
This reverts r302483 and its follow-up fix.

llvm-svn: 302493
2017-05-09 01:14:39 +00:00
Craig Topper
d01aa8fcf3 [APInt] Use default constructor instead of explicitly creating a 1-bit APInt in udiv and urem. NFC
The default constructor does the same thing.

llvm-svn: 302487
2017-05-08 23:49:54 +00:00
Craig Topper
06bee7941a [APInt] Remove 'else' after 'return' in udiv and urem. NFC
llvm-svn: 302486
2017-05-08 23:49:49 +00:00
Evgeniy Stepanov
49f6da0167 Ignore !associated metadata with null argument.
Fixes PR32577 (comment 10).
Such metadata may legitimately appear in LTO.

llvm-svn: 302485
2017-05-08 23:46:20 +00:00
Reid Kleckner
d320dddb9e Use the frame index side table for byval and inalloca arguments
Summary:
For inalloca functions, this is a very common code pattern:

  %argpack = type <{ i32, i32, i32 }>
  define void @f(%argpack* inalloca %args) {
  entry:
    %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0
    %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1
    %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2
    tail call void @llvm.dbg.declare(metadata i32* %a, ... "a")
    tail call void @llvm.dbg.declare(metadata i32* %b, ... "b")
    tail call void @llvm.dbg.declare(metadata i32* %c, ... "c")

Even though these GEPs can be simplified to a constant offset from EBP
or RSP, we don't do that at -O0, and each GEP is computed into a
register. Registers used to compute argument addresses are typically
spilled and clobbered very quickly after the initial computation, so
live debug variable tracking loses information very quickly if we use
DBG_VALUE instructions.

This change moves processing of dbg.declare between argument lowering
and basic block isel, so that we can ask if an argument has a frame
index or not. If the argument lives in a register as is the case for
byval arguments on some targets, then we don't put it in the side table
and during ISel we emit DBG_VALUE instructions.

Reviewers: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32980

llvm-svn: 302483
2017-05-08 23:20:27 +00:00
Sanjoy Das
ff0a2209a9 [InstNamer] Use range-for
llvm-svn: 302481
2017-05-08 23:18:43 +00:00
Sanjoy Das
cc1126650a [InstNamer] Don't check type of arguments (they're never void)
llvm-svn: 302480
2017-05-08 23:18:39 +00:00
Sanjoy Das
46bbd2c18c Delete trailing whitespace
llvm-svn: 302479
2017-05-08 23:18:36 +00:00
Greg Clayton
d186c095c9 Add const to "DWARFDie &Die" in a few functions as they can't change the DWARFDie.
llvm-svn: 302471
2017-05-08 21:29:17 +00:00
Eugene Zemtsov
e6c32cf03f Fix typo
llvm-svn: 302470
2017-05-08 21:20:53 +00:00
Adrian Prantl
d85e85bcdc Make it illegal for two Functions to point to the same DISubprogram
As recently discussed on llvm-dev [1], this patch makes it illegal for
two Functions to point to the same DISubprogram and updates
FunctionCloner to also clone the debug info of a function to conform
to the new requirement. To simplify the implementation it also factors
out the creation of inlineAt locations from the Inliner into a
general-purpose utility in DILocation.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-May/112661.html
<rdar://problem/31926379>

Differential Revision: https://reviews.llvm.org/D32975

llvm-svn: 302469
2017-05-08 21:17:08 +00:00
Greg Clayton
e2aef05f51 Fix typo "veify" to "verify".
llvm-svn: 302466
2017-05-08 20:53:00 +00:00
Sanjay Patel
af36fe6110 [InstCombine] add folds for not-of-shift-right
This is another step towards getting rid of dyn_castNotVal, 
so we can recommit:
https://reviews.llvm.org/rL300977

As the tests show, we were missing the lshr case for constants
and both ashr/lshr vector splat folds. The ashr case with constant
was being performed inefficiently in 2 steps. It's also possible
there was a latent bug in that case because we can't do that fold
if the constant is positive:
http://rise4fun.com/Alive/Bge
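
To illustrate the sign condition, here is a sketch of one such fold as I read it: ~(C >>s y) rewritten as (~C) >>u y. The exact set of folds in the patch is not shown in this log, so treat the fold and the function names as assumptions. The check shows that the rewrite holds for a negative constant and fails for a positive one.

```cpp
#include <cstdint>

// Arithmetic shift right, written without relying on how >> treats negative
// operands; y must be in [0, 31].
int32_t ashr32(int32_t x, unsigned y) {
  return x < 0 ? ~((~x) >> y) : (x >> y);
}

// Left-hand side: ~(C >>s y).
uint32_t notOfAshr(int32_t c, unsigned y) {
  return ~static_cast<uint32_t>(ashr32(c, y));
}

// Right-hand side: (~C) >>u y.
uint32_t lshrOfNot(int32_t c, unsigned y) {
  return (~static_cast<uint32_t>(c)) >> y;
}

// True when both sides agree for every shift amount in [0, 31].
bool foldIsValid(int32_t c) {
  for (unsigned y = 0; y < 32; ++y)
    if (notOfAshr(c, y) != lshrOfNot(c, y))
      return false;
  return true;
}
```

For a negative constant the arithmetic shift replicates ones, so inverting the result leaves zeros in the high bits, matching a logical shift of the inverted constant. For a positive constant the high bits differ as soon as the shift amount is nonzero.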

llvm-svn: 302465
2017-05-08 20:49:59 +00:00
Davide Italiano
f80014fd35 [PartialInlining] Capture by reference rather than by value.
llvm-svn: 302464
2017-05-08 20:44:01 +00:00
Tim Northover
a310a0689f ARM: use divmod libcalls on embedded MachO platforms too.
The separated libcalls are implemented in terms of __divmodsi4 and __udivmodsi4
anyway, so we should always use them if possible.

llvm-svn: 302462
2017-05-08 20:00:14 +00:00
Reid Kleckner
e681620142 Don't add DBG_VALUE instructions for static allocas in dbg.declare
Summary:
An llvm.dbg.declare of a static alloca is always added to the
MachineFunction dbg variable map, so these values are entirely
redundant. They survive all the way through codegen to be ignored by
DWARF emission.

Effectively revert r113967

Two bugpoint-reduced test cases from 2012 broke as a result of this
change. Despite my best efforts, I haven't been able to rewrite the test
case using dbg.value. I'm not too concerned about the lost coverage
because these were reduced from the test-suite, which we still run.

Reviewers: aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32920

llvm-svn: 302461
2017-05-08 19:58:15 +00:00
Zachary Turner
b78cfee1e6 [CodeView] Add support for random access type visitors.
Previously type visitation was done strictly sequentially, and
TypeIndexes were computed by incrementing the TypeIndex of the
last visited record.  This works fine for situations like dumping,
but not when you want to visit types in random order.  For example,
in a debug session someone might lookup a symbol by name, find that
it has TypeIndex 10,000 and then want to go straight to TypeIndex
10,000.

In order to make this work, the visitation framework needs a mode
where it can plumb TypeIndices through the callback pipeline.  This
patch adds such a mode.  In doing so, it is necessary to provide
an alternative implementation of TypeDatabase that supports random
access, so that is done as well.

Nothing actually uses these random access capabilities yet, but
this will be done in subsequent patches.

Differential Revision: https://reviews.llvm.org/D32928

llvm-svn: 302454
2017-05-08 18:38:43 +00:00
Quentin Colombet
e86bed8cbe [AArch64][RegisterBankInfo] Change the default mapping of fp loads.
This fixes PR32550, in a way that does not imply running the greedy
mode at O0.

The fix consists in checking if a load is used by any floating point
instruction and if yes, we return a default mapping with FPR instead
of GPR.

llvm-svn: 302453
2017-05-08 18:16:31 +00:00
Quentin Colombet
2987889926 [AArch64][RegisterBankInfo] Fix mapping cost for GPR.
In r292478, we changed the order of the enum that is referenced by
PMI_FirstXXX. This had the side effect of changing the cost of the
mapping of all the loads, instead of just the FPRs ones.

Reinstate the higher cost for all but GPR loads.
Note: This did not have any external visible effects:
- For Fast mode, the cost would have been higher, but we don't care
  because we don't try to use alternative mappings.
- For Greedy mode, the higher cost of the GPR loads would have
  triggered the use of the supposedly alternative mapping, which
  would in fact be the same GPR mapping but with a lower cost.

llvm-svn: 302452
2017-05-08 18:16:23 +00:00
Craig Topper
c6a3bfadb3 [ARM] Use a Changed flag to avoid making a pass's return value dependent on a compare with a Statistic object.
Statistics compile to always be 0 in release builds, so this compare would always return false. In debug builds, Statistics are global variables that remember their values across pass runs, so this compare returns true any time the pass runs after the first time it modifies something.

This was found after reviewing all usages of comparison operators on a Statistic object. We had some internal code that did a compare with a statistic that caused a mismatch in output between debug and release builds. So we did an audit out of paranoia.
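
The debug-build half of the problem can be sketched with a plain global counter standing in for a Statistic. All names here are hypothetical and the "pass" is a toy that clamps negative values to zero; it is not the ARM pass touched by this commit.

```cpp
#include <vector>

// Stand-in for a Statistic in a debug build: a global counter that keeps its
// value across pass runs.
static unsigned NumRewrites = 0;

// Problematic form: the return value answers "has any run ever changed
// something?" rather than "did this run change something?".
bool runPassComparingCounter(std::vector<int> &values) {
  for (int &v : values) {
    if (v < 0) {
      v = 0;
      ++NumRewrites;
    }
  }
  return NumRewrites > 0;
}

// Fixed form: a local Changed flag tracks only the current run.
bool runPassWithChangedFlag(std::vector<int> &values) {
  bool Changed = false;
  for (int &v : values) {
    if (v < 0) {
      v = 0;
      ++NumRewrites;
      Changed = true;
    }
  }
  return Changed;
}
```

After one run that modifies its input, the counter-based version keeps returning true even for inputs it leaves untouched, while the flag-based version reports each run independently.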

llvm-svn: 302450
2017-05-08 18:02:51 +00:00
Craig Topper
3fb44dd62f [SCEV] Don't use std::move on both inputs to APInt::operator+ or operator-. It might be confusing to the reader. NFC
llvm-svn: 302448
2017-05-08 17:39:01 +00:00
Daniel Berlin
66d32d31c4 ConstantFold: Handle gep nonnull, undef as well
llvm-svn: 302447
2017-05-08 17:37:33 +00:00
Daniel Berlin
7e86e4b1f7 ConstantFold: Fold getelementptr (i32, i32* null, i64 undef) to null.
Transforms/IndVarSimplify/2011-10-27-lftrnull will fail if this regresses.
Transforms/GVN/PRE/2011-06-01-NonLocalMemdepMiscompile.ll has been changed to still test what it was
trying to test.

llvm-svn: 302446
2017-05-08 17:37:29 +00:00
Craig Topper
dc461c9a16 [ValueTracking] Use KnownOnes to provide a better bound on known zeros for ctlz/cttz intrinsics
This patch uses KnownOnes of the input of ctlz/cttz to bound the value that can be returned from these intrinsics. This makes these intrinsics more similar to the handling for ctpop which already uses known bits to produce a similar bound.
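
A rough sketch of the bound for ctlz on a 32-bit value, using hypothetical helper names rather than the ValueTracking code: if some bit is known to be one, ctlz cannot exceed the number of zero bits above the highest known one, and every result bit above what is needed to hold that maximum is known zero.

```cpp
#include <cstdint>

// Number of leading zero bits of a 32-bit value (32 for zero).
unsigned countLeadingZeros32(uint32_t v) {
  unsigned n = 0;
  for (uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Largest value ctlz(x) can take when every bit set in knownOnes is known to
// be set in x.
unsigned maxCtlz(uint32_t knownOnes) {
  return countLeadingZeros32(knownOnes);
}

// Number of low bits needed to hold any value in [0, maxValue]; result bits
// above these are known zero.
unsigned bitsNeeded(unsigned maxValue) {
  unsigned bits = 0;
  while (maxValue != 0) {
    ++bits;
    maxValue >>= 1;
  }
  return bits;
}
```

With bit 16 known to be one, ctlz is at most 15 and fits in 4 bits. With no known ones the bound falls back to 32, which needs 6 bits.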

Differential Revision: https://reviews.llvm.org/D32521

llvm-svn: 302444
2017-05-08 17:22:34 +00:00
Sanjay Patel
2e02754f0f [InstSimplify] fix typo; NFC
llvm-svn: 302439
2017-05-08 16:35:02 +00:00
Sanjay Patel
b20117ebc9 [InstCombine] use local variable to reduce code duplication; NFCI
llvm-svn: 302438
2017-05-08 16:33:42 +00:00
Craig Topper
01c1847bc2 [ValueTracking] Introduce a version of computeKnownBits that returns a KnownBits struct. Begin using it to replace internal usages of ComputeSignBit
This introduces a new interface for computeKnownBits that returns the KnownBits object instead of requiring it to be pre-constructed and passed in by reference.

This is a much more convenient interface as it doesn't require the caller to figure out the BitWidth to pre-construct the object. It's so convenient that I believe we can use this interface to remove the special ComputeSignBit flavor of computeKnownBits.

As a step towards that idea, this patch replaces all of the internal usages of ComputeSignBit with this new interface. As you can see from the patch there were a couple places where we called ComputeSignBit which really called computeKnownBits, and then called computeKnownBits again directly. I've reduced those places to only making one call to computeKnownBits. I bet there are probably external users that do it too.

A future patch will update the external users and remove the ComputeSignBit interface. I'll also work on moving more locations to the KnownBits returning interface for computeKnownBits.

Differential Revision: https://reviews.llvm.org/D32848

llvm-svn: 302437
2017-05-08 16:22:48 +00:00
Sanjay Patel
3e7ca7a083 [InstCombine/InstSimplify] add comments about code duplication; NFC
llvm-svn: 302436
2017-05-08 16:21:55 +00:00
Zvi Rackover
0560bf08c5 InstructionSimplify: Refactor foldIdentityShuffles. NFC.
Summary:
Minor refactoring of foldIdentityShuffles() which allows the removal of a
ConstantDataVector::get() in SimplifyShuffleVectorInstruction.

Reviewers: spatel

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32955

Conflicts:
	lib/Analysis/InstructionSimplify.cpp

llvm-svn: 302433
2017-05-08 15:46:58 +00:00
Simon Pilgrim
4099d531e6 [X86][SSE] Improve combineLogicBlendIntoPBLENDV to use general masks.
Currently combineLogicBlendIntoPBLENDV can only match ASHR to detect sign splatting of a bit mask; this patch generalises this to use computeNumSignBits instead.

This is a first step in several things we can do to improve PBLENDV support:

 * Better matching of X86ISD::ANDNP patterns.
 * Handle floating point cases.
 * Better vector and bitcast support in computeNumSignBits.
 * Recognise that PBLENDV only uses the sign bit of the mask, we should be able to strip away sign splats (ASHR, PCMPGT isNeg tests etc.).

Differential Revision: https://reviews.llvm.org/D32953

llvm-svn: 302424
2017-05-08 14:16:39 +00:00
Zvi Rackover
b32fbfb1f1 IR: Add a shufflevector mask commutation helper function. NFC.
Summary:
Following up on Sanjay's suggestion in D32955, move this functionality
into ShuffleVectorInstruction.
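
What mask commutation means can be sketched on a plain integer vector. This is a standalone illustration with a hypothetical function name, not the helper added by the patch: after the two shuffle inputs are swapped, an index into the first input must point into the second and vice versa, and negative entries (standing for undef) are left alone.

```cpp
#include <vector>

// Rewrite a shuffle mask so it selects the same elements after the two input
// vectors are swapped. numElts is the element count of each input vector.
void commuteMask(std::vector<int> &mask, int numElts) {
  for (int &idx : mask) {
    if (idx < 0)
      continue; // undef lane stays undef
    idx = idx < numElts ? idx + numElts : idx - numElts;
  }
}
```

Applying the function twice returns the original mask, which is a quick sanity check for any implementation of this kind.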

Reviewers: spatel, RKSimon

Reviewed By: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32956

llvm-svn: 302420
2017-05-08 12:40:18 +00:00
Simon Pilgrim
e29dd41674 [ARM][NEON] Add support for ISD::ABS lowering
Update NEON int_arm_neon_vabs intrinsic to use the ISD::ABS opcode directly

Added constant folding tests.

Differential Revision: https://reviews.llvm.org/D32938

llvm-svn: 302417
2017-05-08 10:37:34 +00:00
Martin Storsjo
bba254df79 [ARM] Clear the constant pool cache on explicit .ltorg directives
Multiple ldr pseudoinstructions with the same constant value will
reuse the same constant pool entry. However, if the constant pool
is explicitly flushed with a .ltorg directive, we should not try
to reference constants in the previous pool any longer, since they
may be out of range.

This fixes assembling hand-written assembler source which repeatedly
loads the same constant value, across a binary size larger than the
pc-relative fixup range for ldr instructions (4096 bytes). Such
assembler source already uses explicit .ltorg instructions to emit
constant pools with regular intervals. However if we try to reuse
constants emitted in earlier pools, they end up out of range.

This makes the output of the testcase match what binutils gas does
(prior to this patch, it would fail to assemble).
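
The caching behavior described above can be modeled in a few lines. The class below is a hypothetical, heavily simplified model and not the assembler's data structure: repeated loads of one value share a label until the pool is flushed, after which a fresh entry is created.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Simplified model of an assembler constant pool with a value-to-label cache.
class ConstantPoolModel {
  std::map<uint32_t, std::string> cache_; // value -> label in the current pool
  unsigned nextId_ = 0;

public:
  // Label to reference for a load of `value`; reuses an entry of the current
  // pool when one exists.
  std::string labelFor(uint32_t value) {
    auto it = cache_.find(value);
    if (it != cache_.end())
      return it->second;
    std::string label = ".Lpool" + std::to_string(nextId_++);
    cache_.emplace(value, label);
    return label;
  }

  // Models an explicit .ltorg: the pending pool is emitted at this point, so
  // later loads must not reuse its entries, which may end up out of range.
  void flush() { cache_.clear(); }
};
```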

Differential Revision: https://reviews.llvm.org/D32847

llvm-svn: 302416
2017-05-08 10:26:24 +00:00
Simon Pilgrim
7889ea478c [AARCH64][NEON] Add support for ISD::ABS lowering
Update int_aarch64_neon_abs intrinsic to use the ISD::ABS opcode directly

Differential Revision: https://reviews.llvm.org/D32940

llvm-svn: 302415
2017-05-08 10:25:18 +00:00
Igor Breger
53f8ce5987 [GlobalISel][X86] G_GEP selection support.
Summary: [GlobalISel][X86] G_GEP selection support.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: dberris, rovka, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D32396

llvm-svn: 302412
2017-05-08 09:40:43 +00:00
Igor Breger
915957ad23 [GlobalISel][X86] G_MUL legalizer/selector support.
Summary:
G_MUL legalizer/selector/regbank support.
Use only Tablegen-erated instruction selection.
This patch deals with legal operations only.

Reviewers: zvi, guyblank

Reviewed By: guyblank

Subscribers: krytarowski, rovka, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D32698

llvm-svn: 302410
2017-05-08 09:03:37 +00:00
Craig Topper
63b5b25d3d [APInt] Modify tcMultiplyPart's overflow detection to not depend on 'i' from the earlier loop. NFC
The value of 'i' is always the smaller of DstParts and SrcParts so we can just use that fact to write all the code in terms of SrcParts and DstParts.

llvm-svn: 302408
2017-05-08 06:34:41 +00:00
Craig Topper
ee2d9260d4 [APInt] Use std::min instead of writing the same thing with the ternary operator. NFC
llvm-svn: 302407
2017-05-08 06:34:39 +00:00
Craig Topper
649f1ad9b5 [APInt] Remove 'else' after 'return' in tcMultiply methods. NFC
llvm-svn: 302406
2017-05-08 06:34:36 +00:00
Dean Michael Berris
5083dffff0 [XRay] Custom event logging intrinsic
This patch introduces an LLVM intrinsic and a target opcode for custom event
logging in XRay. Initially, its use case will be to allow users of XRay to log
some type of string ("poor man's printf"). The target opcode compiles to a noop
sled large enough to enable calling through to a runtime-determined relative
function call. At runtime, when XRay is enabled, the sled is replaced by
compiler-rt with a trampoline to the logic for creating the custom log entries.

Future patches will implement the compiler-rt parts and clang-side support for
emitting the IR corresponding to this intrinsic.

Reviewers: timshen, dberris

Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D27503

llvm-svn: 302405
2017-05-08 05:45:21 +00:00
Craig Topper
76e9306b76 [SCEV] Use APInt::operator*=(uint64_t) to avoid a temporary APInt for a constant.
llvm-svn: 302404
2017-05-08 04:55:13 +00:00
Craig Topper
b45ca040d2 [APInt] Take advantage of new operator*=(uint64_t) to remove a temporary APInt.
llvm-svn: 302403
2017-05-08 04:55:12 +00:00
Craig Topper
eb151b0f34 [APInt] Add support for multiplying by a uint64_t.
This makes multiply similar to add, sub, xor, and, and or.

llvm-svn: 302402
2017-05-08 04:55:09 +00:00
Eric Beckmann
61da778e46 Hopefully one last commit to fix this patch, addresses string reference
issues.

llvm-svn: 302401
2017-05-08 02:47:42 +00:00
Eric Beckmann
f467e9d298 Update llvm-readobj -coff-resources to display tree structure.
Summary: Continue making updates to llvm-readobj to display resource sections.  This is necessary for testing the up and coming cvtres tool.

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32609

llvm-svn: 302399
2017-05-08 02:47:07 +00:00
Craig Topper
cbf0448c49 [SCEV] Have getRangeForAffineARHelper take StartRange by const reference to avoid a copy in many of the cases.
llvm-svn: 302398
2017-05-08 02:29:15 +00:00
Eric Beckmann
918ba0feda Revert "Hopefully one last commit to fix this patch, addresses string reference"
Summary:
This reverts commit 56beec1b1cfc6d263e5eddb7efff06117c0724d2.

Revert "Quick fix to D32609, it seems .o files are not transferred in all cases."

This reverts commit 7652eecd29cfdeeab7f76f687586607a99ff4e36.

Revert "Update llvm-readobj -coff-resources to display tree structure."

This reverts commit 422b62c4d302cfc92401418c2acd165056081ed7.

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32958

llvm-svn: 302397
2017-05-08 02:25:03 +00:00
Eric Beckmann
f84fcd81e8 Hopefully one last commit to fix this patch, addresses string reference
issues.

llvm-svn: 302395
2017-05-08 01:48:55 +00:00
Eric Beckmann
c614436be9 Update llvm-readobj -coff-resources to display tree structure.
Summary: Continue making updates to llvm-readobj to display resource sections.  This is necessary for testing the up and coming cvtres tool.

Reviewers: zturner

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32609

llvm-svn: 302386
2017-05-07 22:47:22 +00:00
Craig Topper
f9e6e29ef0 [ConstantRange][SimplifyCFG] Add a helper method to allow SimplifyCFG to determine if a ConstantRange has more than 8 elements without requiring an allocation if the ConstantRange is 64-bits wide.
Previously, SimplifyCFG used getSetSize, which returns an APInt that is 1 bit wider than the ConstantRange's bit width. In the reasonably common case that the ConstantRange is 64 bits wide, this requires returning a 65-bit APInt. An APInt can only store 64 bits without a memory allocation, so this is inefficient.

The new method takes the 8 as an input and tells if the range contains more than that many elements without requiring any wider math.
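
A minimal sketch of why such a size check needs no wider math, assuming a 64-bit range stored as a half-open, possibly wrapping interval [Lower, Upper). `Range64`, its `IsFullSet` flag, and this `isSizeLargerThan` are hypothetical stand-ins for illustration, not the actual ConstantRange API. Only the full set has a size (2^64) that does not fit in 64 bits, so it is handled separately; every other size is just `Upper - Lower` in modular arithmetic.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 64-bit ConstantRange: the half-open,
// possibly wrapping interval [Lower, Upper) over uint64_t.
struct Range64 {
  uint64_t Lower;
  uint64_t Upper;
  bool IsFullSet; // assumption: an explicit flag marks the full set

  // Returns true if the range holds more than MaxSize elements.
  bool isSizeLargerThan(uint64_t MaxSize) const {
    // The full set has 2^64 elements, more than any uint64_t value.
    if (IsFullSet)
      return true;
    // Unsigned subtraction wraps modulo 2^64, which is exactly the
    // element count of a non-full (possibly wrapping) range.
    return Upper - Lower > MaxSize;
  }
};
```

A caller in the position of SimplifyCFG would ask `R.isSizeLargerThan(8)` rather than materializing the set size as a wider integer.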

llvm-svn: 302385
2017-05-07 22:22:11 +00:00
Craig Topper
712410d6fb [ConstantRange] Remove 'Of' from name of ConstantRange::isSizeStrictlySmallerThanOf so that it reads better. NFC
llvm-svn: 302383
2017-05-07 21:48:08 +00:00
Simon Pilgrim
1e7b4fdb56 [X86][AVX1] Improve 256-bit vector costs for integer unary intrinsics.
Account for subvector extraction/insertion, helps prevent the vectorizers from selecting 256-bit vectors that will have to be split anyhow on AVX1 targets. 

llvm-svn: 302378
2017-05-07 20:58:55 +00:00
Zvi Rackover
e274f9ca04 InstructionSimplify: Relanding r301766
Summary:
Re-applying r301766 with a fix to a typo and a regression test.

The log message for r301766 was:
==================================================================================
    InstructionSimplify: Canonicalize shuffle operands. NFC-ish.

    Summary:
     Apply canonicalization rules:
        1. Input vectors with no elements selected from can be replaced with undef.
        2. If only one input vector is constant it shall be the second one.

    This allows constant-folding to cover more ad-hoc simplifications that
    were in place and avoid duplication for RHS and LHS checks.

    There are more rules we may want to add in the future when we see a
    justification. e.g. mask elements that select undef elements can be
    replaced with undef.
==================================================================================

Reviewers: spatel, RKSimon

Reviewed By: spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32863

llvm-svn: 302373
2017-05-07 18:16:37 +00:00
Lang Hames
e759a41442 Make llvm-rtdyld -check preserve automatic address mappings made by RuntimeDyld.
Currently llvm-rtdyld in -check mode will map sections to back-to-back 4k
aligned slabs starting at 0x1000. Automatically remapping sections by default is
helpful because it quickly exposes relocation bugs due to use of local addresses
rather than load addresses (these would silently pass if the load address was
not remapped). These mappings can be explicitly overridden on a per-section
basis using llvm-rtdyld's -map-section option. This patch extends this scheme to
also preserve any mappings made by RuntimeDyld itself. Preserving RuntimeDyld's
automatic mappings allows us to write test cases to verify that these automatic
mappings have been applied.
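
The remapping scheme described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the llvm-rtdyld implementation: `Section`, `assignSlabs`, and the `Preserved` map are hypothetical names, and the sketch does not try to keep new slabs from overlapping preserved addresses.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical section record: a name and a size in bytes.
struct Section {
  std::string Name;
  uint64_t Size;
};

// Assigns each section a 4k-aligned slab, back to back, starting at 0x1000.
// Sections that already have a mapping in Preserved keep that address.
std::map<std::string, uint64_t>
assignSlabs(const std::vector<Section> &Sections,
            const std::map<std::string, uint64_t> &Preserved) {
  const uint64_t SlabAlign = 0x1000;
  std::map<std::string, uint64_t> Addrs = Preserved;
  uint64_t Next = 0x1000;
  for (const Section &S : Sections) {
    if (Addrs.count(S.Name))
      continue; // keep the existing (explicit or automatic) mapping
    Addrs[S.Name] = Next;
    // Round the size up to a whole number of slabs, using at least one.
    uint64_t Size = std::max<uint64_t>(S.Size, 1);
    Next += (Size + SlabAlign - 1) & ~(SlabAlign - 1);
  }
  return Addrs;
}
```

Because every remapped address differs from the address the section was loaded at, a relocation that mistakenly uses the local address produces a visibly wrong value instead of passing silently.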

This will allow the fix in https://reviews.llvm.org/D32899 to be tested with
llvm-rtdyld -check.

llvm-svn: 302372
2017-05-07 17:19:53 +00:00
Craig Topper
d2d0986b7c [SCEV] Use move semantics in ScalarEvolution::setRange
Summary: This makes setRange take ConstantRange by rvalue reference since most callers were passing an unnamed temporary ConstantRange. We can then move that ConstantRange into the DenseMap caches. For the callers that weren't passing a temporary, I've added std::move to the local variable being passed.
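
A small sketch of the pattern, assuming a standard-library map in place of DenseMap. `Range`, `RangeCache`, and the `RangeCopies` counter are hypothetical names used only to make copies observable; they are not part of ScalarEvolution.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>

static int RangeCopies = 0; // counts copies so the effect is visible

// Hypothetical stand-in for ConstantRange.
struct Range {
  uint64_t Lower = 0, Upper = 0;
  Range(uint64_t L, uint64_t U) : Lower(L), Upper(U) {}
  Range(const Range &O) : Lower(O.Lower), Upper(O.Upper) { ++RangeCopies; }
  Range(Range &&) noexcept = default;
  Range &operator=(const Range &O) {
    Lower = O.Lower;
    Upper = O.Upper;
    ++RangeCopies;
    return *this;
  }
  Range &operator=(Range &&) noexcept = default;
};

class RangeCache {
  std::unordered_map<int, Range> Cache;

public:
  // Taking an rvalue reference lets the range be moved into the cache.
  const Range &setRange(int Key, Range &&R) {
    auto It = Cache.find(Key);
    if (It == Cache.end())
      It = Cache.emplace(Key, std::move(R)).first; // move-construct entry
    else
      It->second = std::move(R); // move-assign over the old entry
    return It->second;
  }

  const Range &getRange(int Key) const {
    auto It = Cache.find(Key);
    assert(It != Cache.end() && "no cached range for this key");
    return It->second;
  }
};
```

A temporary binds directly, as in `C.setRange(1, Range(0, 10))`, while a named local has to be passed as `C.setRange(2, std::move(Local))`; neither call copies the range.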

Reviewers: sanjoy, mzolotukhin, efriedma

Reviewed By: sanjoy

Subscribers: takuto.ikuta, llvm-commits

Differential Revision: https://reviews.llvm.org/D32943

llvm-svn: 302371
2017-05-07 16:28:17 +00:00
Sanjay Patel
41b5979dcf [InstSimplify] use ConstantRange to simplify or-of-icmps
We can simplify (or (icmp X, C1), (icmp X, C2)) to 'true' or one of the icmps in many cases.
I had to check some of these with Alive to prove to myself that they're right, but everything seems
to check out. E.g., the deleted code in instcombine was completely ignoring predicates with
mismatched signedness.
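
To illustrate the idea on a small scale, here is a brute-force sketch over 8-bit values. It enumerates the set of X satisfying each compare instead of using ConstantRange, so it shows the reasoning rather than the actual InstSimplify code; `Pred`, `satisfying`, `OrResult`, and `simplifyOrOfICmps` are hypothetical names.

```cpp
#include <bitset>
#include <cstdint>

enum class Pred { EQ, NE, ULT, UGT, SLT, SGT };
enum class OrResult { AlwaysTrue, FirstCmp, SecondCmp, NoSimplification };

using ValueSet = std::bitset<256>; // one bit per possible 8-bit X

// Set of 8-bit values X for which (icmp P X, C) holds.
ValueSet satisfying(Pred P, uint8_t C) {
  ValueSet S;
  for (unsigned V = 0; V < 256; ++V) {
    uint8_t X = static_cast<uint8_t>(V);
    // Signed views of the same bit patterns (two's complement assumed).
    int8_t SX = static_cast<int8_t>(X), SC = static_cast<int8_t>(C);
    bool Holds = false;
    switch (P) {
    case Pred::EQ:  Holds = X == C;  break;
    case Pred::NE:  Holds = X != C;  break;
    case Pred::ULT: Holds = X < C;   break;
    case Pred::UGT: Holds = X > C;   break;
    case Pred::SLT: Holds = SX < SC; break;
    case Pred::SGT: Holds = SX > SC; break;
    }
    S[V] = Holds;
  }
  return S;
}

// Simplifies (or (icmp P1 X, C1), (icmp P2 X, C2)) where possible.
OrResult simplifyOrOfICmps(Pred P1, uint8_t C1, Pred P2, uint8_t C2) {
  ValueSet A = satisfying(P1, C1), B = satisfying(P2, C2);
  ValueSet Union = A | B;
  if (Union.all())
    return OrResult::AlwaysTrue; // the two sets cover every X
  if (Union == A)
    return OrResult::FirstCmp;   // second set is contained in the first
  if (Union == B)
    return OrResult::SecondCmp;  // first set is contained in the second
  return OrResult::NoSimplification;
}
```

Because the sets are built from bit patterns, predicates of mismatched signedness are handled uniformly: `X <s 0` covers 128..255 as unsigned values, so or-ing it with `X >u 200` reduces to the first compare.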

This is a follow-up to:
https://reviews.llvm.org/rL301260
https://reviews.llvm.org/D32143

llvm-svn: 302370
2017-05-07 15:11:40 +00:00
Sanjoy Das
120ac49e07 Remove unnecessary const_cast
llvm-svn: 302368
2017-05-07 05:29:36 +00:00
Sanjoy Das
9340f1ac11 Use array_pod_sort instead of std::sort
llvm-svn: 302367
2017-05-07 05:29:34 +00:00
Simon Pilgrim
dbf61bb821 [X86][AVX512] Relax assertion and just exit combine for unsupported types (PR32907)
llvm-svn: 302361
2017-05-06 20:53:52 +00:00
Simon Pilgrim
f17aa562a5 [X86][AVX512] Move v2i64/v4i64 VPABS lowering to tablegen
Extend NoVLX targets to use the 512-bit versions

llvm-svn: 302359
2017-05-06 19:11:59 +00:00
Simon Pilgrim
a125ccfde1 [X86] Reduce code for setting operation actions by merging into loops across multiple types/ops. NFCI.
llvm-svn: 302357
2017-05-06 18:17:56 +00:00
Simon Pilgrim
05c4646203 [NVPTX] Add support for ISD::ABS lowering
Use the ISD::ABS opcode directly

Differential Revision: https://reviews.llvm.org/D32944

llvm-svn: 302356
2017-05-06 17:42:09 +00:00
Simon Pilgrim
2d8e2fa524 [X86][SSE] Break register dependencies on v16i8/v8i16 BUILD_VECTOR on SSE41
rL294581 broke unnecessary register dependencies on partial v16i8/v8i16 BUILD_VECTORs, but on SSE41 we (currently) use insertion for full BUILD_VECTORs as well. By allowing full insertion to occur on SSE41 targets we can break register dependencies here as well.

llvm-svn: 302355
2017-05-06 17:30:39 +00:00
Simon Pilgrim
63ed21483f [DAGCombiner] If ISD::ABS is legal/custom, use it directly instead of canonicalizing first.
Remove an extra canonicalization step if ISD::ABS is going to be used anyway.

Updated x86 abs combine to check that we are lowering from both canonicalizations.

llvm-svn: 302337
2017-05-06 13:44:42 +00:00
Craig Topper
1433390382 [SCEV] Remove extra APInt copies from getRangeForAffineARHelper.
This changes one parameter to be a const APInt& since we only read from it. Use std::move on local APInts once they are no longer needed so we can reuse their allocations. Lastly, use operator+=(uint64_t) instead of adding 1 to an APInt twice creating a new APInt each time.

llvm-svn: 302335
2017-05-06 06:03:07 +00:00
Craig Topper
3364e88e41 [SCEV] Use std::move to avoid some APInt copies.
llvm-svn: 302334
2017-05-06 05:22:56 +00:00
Craig Topper
f682419d39 [SCEV] Use APInt's uint64_t operations instead of creating a temporary APInt to hold 1.
llvm-svn: 302333
2017-05-06 05:15:11 +00:00
Craig Topper
d98a504b8c [SCEV] Avoid a couple APInt copies by capturing by reference since the method returns a reference.
llvm-svn: 302332
2017-05-06 05:15:09 +00:00
Craig Topper
76afb39c1b [LazyValueInfo] Avoid unnecessary copies of ConstantRanges
Summary:
ConstantRange contains two APInts which can allocate memory if their width is larger than 64-bits. So we shouldn't copy it when we can avoid it.

This changes LVILatticeVal::getConstantRange() to return its internal ConstantRange by reference. This allows many places that just need a ConstantRange reference to avoid making a copy.

Several places now capture the return value of getConstantRange() by reference so they can call methods on it that don't need a new object.

Lastly, it adds std::move in one place to move a local ConstantRange into an LVILatticeVal.

Reviewers: reames, dberlin, sanjoy, anna

Reviewed By: reames

Subscribers: grandinj, llvm-commits

Differential Revision: https://reviews.llvm.org/D32884

llvm-svn: 302331
2017-05-06 03:35:15 +00:00
Kostya Serebryany
5c1bead8ef [sanitizer-coverage] implement -fsanitize-coverage=no-prune,... instead of a hidden -mllvm flag. llvm part.
llvm-svn: 302319
2017-05-05 23:14:40 +00:00