1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00
Commit Graph

483 Commits

Author SHA1 Message Date
Christopher Tetreault
f49a6d5d99 Clean up usages of asserting vector getters in Type
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.

Reviewers: stoklund, sdesmalen, efriedma

Reviewed By: sdesmalen

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D77272
2020-04-10 14:53:43 -07:00
Eli Friedman
db20f1e2c5 Remove "mask" operand from shufflevector.
Instead, represent the mask as out-of-line data in the instruction. This
should be more efficient in the places that currently use
getShuffleVector(), and paves the way for further changes to add new
shuffles for scalable vectors.

This doesn't change the syntax in textual IR. And I don't currently plan
to change the bitcode encoding in this patch, although we'll probably
need to do something once we extend shufflevector for scalable types.

I expect that once this is finished, we can then replace the raw "mask"
with something more appropriate for scalable vectors.  Not sure exactly
what this looks like at the moment, but there are a few different ways
we could handle it.  Maybe we could try to describe specific shuffles.
Or maybe we could define it in terms of a function to convert a fixed-length
array into an appropriate scalable vector, using a "step", or something
like that.

Differential Revision: https://reviews.llvm.org/D72467
2020-03-31 13:08:59 -07:00
Guozhi Wei
a059f4193c [CodeGenPrepare] Delete intrinsic call to llvm.assume to enable more tailcall
The attached test case is simplified from tcmalloc. Both function calls should be optimized as tailcall. But llvm can only optimize the first call. The second call can't be optimized because function dupRetToEnableTailCallOpts failed to duplicate ret into block case2.

There 2 problems blocked the duplication:

  1 Intrinsic call llvm.assume is not handled by dupRetToEnableTailCallOpts.
  2 The control flow is more complex than expected, dupRetToEnableTailCallOpts can only duplicate ret into its predecessor, but here we have an intermediate block between call and ret.

The solutions:

  1 Since CodeGenPrepare is already at the end of LLVM IR phase, we can simply delete the intrinsic call to llvm.assume.
  2 A general solution to the complex control flow is hard, but for this case, after exit2 is duplicated into case1, exit2 is the only successor of exit1 and exit1 is the only predecessor of exit2, so they can be combined through eliminateFallThrough. But this function is called too late, there is no more dupRetToEnableTailCallOpts after it. We can add an earlier call to eliminateFallThrough to solve it.

Differential Revision: https://reviews.llvm.org/D76539
2020-03-31 11:55:51 -07:00
Juneyoung Lee
1f40e952e7 Minor fixes to a comment in CodeGenPrepare 2020-03-25 16:34:43 +09:00
Juneyoung Lee
bc1d564c2b Minor fix to a comment in CodeGenPrepare.cpp 2020-03-17 01:10:26 +09:00
Juneyoung Lee
3cd755e246 [CodeGenPrepare] Freeze condition when transforming select to br
Summary:
This is a simple fix for CodeGenPrepare that freezes branch condition when transforming select to branch.
If it is not frozen, instsimplify or the later pipeline can potentially exploit undefined behavior.

The diff shows optimized form becase D75859 and D76048 already made a few changes to CodeGenPrepare for optimizing freeze(cmp).

Reviewers: jdoerfert, spatel, lebedev.ri, efriedma

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76179
2020-03-16 12:46:20 +09:00
Juneyoung Lee
3086134f9d Revert "[CodeGenPrepare] Freeze condition when transforming select to br"
This reverts commit 10aa7ea951e22dbd7f2ebdeb6410cfbc8a251eb1.
2020-03-16 12:45:54 +09:00
Juneyoung Lee
b73d82e804 [CodeGenPrepare] Freeze condition when transforming select to br
Summary:
This is a simple fix for CodeGenPrepare that freezes branch condition when transforming select to branch.
If it is not freezed, instsimplify or the later pipeline can potentially exploit undefined behavior.

The diff shows optimized form becase D75859 and D76048 already made a few changes to CodeGenPrepare for optimizing freeze(cmp).

Reviewers: jdoerfert, spatel, lebedev.ri, efriedma

Reviewed By: lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76179
2020-03-15 11:10:46 +09:00
Juneyoung Lee
8574f053c4 [CodeGenPrepare] Expand freeze conversion to support fcmp and icmp with null
Summary:
This is a simple patch that expands https://reviews.llvm.org/D75859 to pointer comparison and fcmp

Checked with Alive2

Reviewers: reames, jdoerfert

Reviewed By: jdoerfert

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D76048
2020-03-13 17:21:33 +09:00
Huihui Zhang
58a931b286 [SVE] Update API ConstantVector::getSplat() to use ElementCount.
Summary:
Support ConstantInt::get() and Constant::getAllOnesValue() for scalable
vector type, this requires ConstantVector::getSplat() to take in 'ElementCount',
instead of 'unsigned' number of element count.

This change is needed for D73753.

Reviewers: sdesmalen, efriedma, apazos, spatel, huntergr, willlovett

Reviewed By: efriedma

Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74386
2020-03-12 13:22:41 -07:00
Juneyoung Lee
58f4a73f70 [CodeGenPrepare] Fold br(freeze(icmp x, const)) to br(icmp(freeze x, const))
Summary:
This patch helps CodeGenPrepare move freeze into the icmp when it is used by branch.
It reenables generation of efficient conditional jumps.

This is only done when at least one of icmp's operands is constant to prevent the transformation from increasing # of freeze instructions.

Performance degradation of MultiSource/Benchmarks/Ptrdist/yacr2/yacr2.test is resolved with this patch.

Checked with Alive2

Reviewers: reames, fhahn, nlopes

Reviewed By: reames

Subscribers: jdoerfert, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75859
2020-03-12 03:16:15 +09:00
Guozhi Wei
559c35e089 [CodeGenPrepare] Handle ExtractValueInst in dupRetToEnableTailCallOpts
As the test case shows if there is an ExtractValueInst in the Ret block, function dupRetToEnableTailCallOpts can't duplicate it into the block containing call. So later no tail call is generated in CodeGen.

    This patch adds the ExtractValueInst handling code in function dupRetToEnableTailCallOpts and FoldReturnIntoUncondBranch, and later tail call can be generated for this case.

Differential Revision: https://reviews.llvm.org/D74242
2020-03-04 11:10:32 -08:00
Florian Hahn
5f8cf84ae0 Recommit "[PatternMatch] Match XOR variant of unsigned-add overflow check."
This version fixes a buildbot failure cause by picking the wrong insert
point for XORs. We cannot pick the XOR binary operator as insert point,
as it is not guaranteed that both input operands for the overflow
intrinsic are defined before it.

This reverts the revert commit
c7fc0e5da6c3c36eb5f3a874a6cdeaedb26856e0.
2020-02-23 18:33:18 +00:00
Nikita Popov
055f58eb4f [SimplifyLibCalls][IRBuilder] Accept any IRBuilder in SimplifyLibCalls
This changes the SimplifyLibCalls utility to accept an IRBuilderBase,
which allows us to pass through the IRBuilder used by InstCombine.
This will ensure that new instructions get added to the worklist.
The annotated test-case drops from 4 to 2 InstCombine iterations thanks
to this.

To achieve this, I'm adding an IRBuilderBase::OperandBundlesGuard,
which is basically the same as the existing InsertPointGuard and
FastMathFlagsGuard, but for operand bundles. Also add a
setDefaultOperandBundles() method so these can be set outside the
constructor.

Differential Revision: https://reviews.llvm.org/D74792
2020-02-21 18:26:05 +01:00
Florian Hahn
05986e5408 Revert "[PatternMatch] Match XOR variant of unsigned-add overflow check."
This reverts commit e01a3d49c224d6f8a7afc01205b05b9deaa07afa.
and commit a6a585b8030b6e8d4c50c71f54a6addb21995fe0.

This causes a failure on GreenDragon:
http://lab.llvm.org:8080/green/view/LLDB/job/lldb-cmake/9597
2020-02-19 19:37:08 +01:00
Florian Hahn
aaa658ee7c [PatternMatch] Match XOR variant of unsigned-add overflow check.
Instcombine folds (a + b <u a) to (a ^ -1 <u b) and that does not match
the expected pattern in CodeGenPerpare via UAddWithOverflow.

This causes a regression over Clang 7 on both X86 and AArch64:
https://gcc.godbolt.org/z/juhXYV

This patch extends UAddWithOverflow to also catch the XOR case, if the
XOR is only used in the ICMP. This covers just a single case, but I'd
like to make sure I am not missing anything before tackling the other
cases.

Reviewers: nikic, RKSimon, lebedev.ri, spatel

Reviewed By: nikic, lebedev.ri

Differential Revision: https://reviews.llvm.org/D74228
2020-02-19 15:25:18 +01:00
Florian Hahn
476c0607dd [TargetLower] Update shouldFormOverflowOp check if math is used.
On some targets, like SPARC, forming overflow ops is only profitable if
the math result is used: https://godbolt.org/z/DxSmdB
This patch adds a new MathUsed parameter to allow the targets
to make the decision and defaults to only allowing it
if the math result is used. That is the conservative choice.

This patch also updates AArch64ISelLowering, X86ISelLowering,
ARMISelLowering.h, SystemZISelLowering.h to allow forming overflow
ops if the math result is not used. On those targets using the
overflow intrinsic for the overflow check only generates better code.

Reviewers: nikic, RKSimon, lebedev.ri, spatel

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D74722
2020-02-19 11:28:33 +01:00
Jim Lin
0596dad096 [NFC] Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h,td}
2020-02-18 10:49:13 +08:00
Clement Courbet
7f1e47f301 [CodeGen] Fix the computation of the alignment of split stores.
Summary:
Right now the alignment of the lower half of a store is computed as
align/2, which fails for unaligned stores (align = 1), and is overly
pessimitic for, e.g. a 8 byte store aligned to 4 bytes.
Fixes PR44851
Fixes PR44877

Reviewers: gchatelet, spatel, lebedev.ri

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D74311
2020-02-12 10:37:30 +01:00
Matt Arsenault
726bd885da CodeGenPrepare: Reorder check for cold and shouldOptimizeForSize
shouldOptimizeForSize is showing up in a profile, spending around 10%
of the pass time in one function. This should probably not be so slow,
but the much cheaper attribute check should be done first anyway.
2020-02-04 11:23:13 -08:00
Fangrui Song
8dcded4bd8 [CodeGenPrepare] Delete dead !DL check
Follow-up for D73754

DL is assigned in CodeGenPrepare::runOnFunction and is guaranteed to be
non-null.
2020-02-02 09:49:06 -08:00
Fangrui Song
1e7c397a16 [CodeGenPrepare] Make TargetPassConfig required
The code paths in the absence of TargetMachine, TargetLowering or
TargetRegisterInfo are poorly tested. As rL285987 said, requiring
TargetPassConfig allows us to delete many (untested) checks littered
everywhere.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D73754
2020-02-02 09:28:45 -08:00
Guillaume Chatelet
2efa9bb646 [Alignment][NFC] Use Align with CreateAlignedStore
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu

Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D73274
2020-01-23 17:34:32 +01:00
Hiroshi Yamauchi
023bdc26f9 [PGO][PGSO] Update BFI in CodeGenPrepare::optimizeSelectInst.
Summary:
Without the BFI update, some hot blocks are incorrectly treated as cold code.

This fixes a FDO perf regression in the TSVC benchmark from D71288.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D73146
2020-01-22 08:36:54 -08:00
Sander de Smalen
3c233e1b36 [AArch64][SVE] Add patterns for unpredicated load/store to frame-indices.
This patch also fixes up a number of cases in DAGCombine and
SelectionDAGBuilder where the size of a scalable vector is used in a
fixed-width context (thus triggering an assertion failure).

Reviewers: efriedma, c-rhodes, rovka, cameron.mcinally

Reviewed By: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71215
2020-01-22 14:32:27 +00:00
Sander de Smalen
c07e22a824 Add support for (expressing) vscale.
In LLVM IR, vscale can be represented with an intrinsic. For some targets,
this is equivalent to the constexpr:

  getelementptr <vscale x 1 x i8>, <vscale x 1 x i8>* null, i32 1

This can be used to propagate the value in CodeGenPrepare.

In ISel we add a node that can be legalized to one or more
instructions to materialize the runtime vector length.

This patch also adds SVE CodeGen support for VSCALE, which maps this
node to RDVL instructions (for scaled multiples of 16bytes) or CNT[HSD]
instructions (scaled multiples of 2, 4, or 8 bytes, respectively).

Reviewers: rengolin, cameron.mcinally, hfinkel, sebpop, SjoerdMeijer, efriedma, lattner

Reviewed by: efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68203
2020-01-22 10:09:27 +00:00
Valentin Churavy
6b8e308145 [CodegenPrepare] Guard against degenerate branches
Summary:
Guard against a potential crash observed in https://github.com/JuliaLang/julia/issues/32994#issuecomment-524249628
If two branches are collapsed we can encounter a degenerate conditional branch `TBB==FBB`.
The subsequent code assumes that they differ, so we exit out early.

Reviewers: ributzka, spatel

Subscribers: loladiro, dexonsmith, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66657
2019-12-16 04:23:32 -05:00
Reid Kleckner
74bbf4a42b [IR] Split out target specific intrinsic enums into separate headers
This has two main effects:
- Optimizes debug info size by saving 221.86 MB of obj file size in a
  Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of
  object file size.
- Incremental step towards decoupling target intrinsics.

The enums are still compact, so adding and removing a single
target-specific intrinsic will trigger a rebuild of all of LLVM.
Assigning distinct target id spaces is potential future work.

Part of PR34259

Reviewers: efriedma, echristo, MaskRay

Reviewed By: echristo, MaskRay

Differential Revision: https://reviews.llvm.org/D71320
2019-12-11 18:02:14 -08:00
Hiroshi Yamauchi
1aa49b7b60 [PGO][PGSO] Instrument the code gen / target passes.
Summary:
Split off of D67120.

Add the profile guided size optimization instrumentation / queries in the code
gen or target passes. This doesn't enable the size optimizations in those passes
yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass
queries).

A second try after reverted D71072.

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71149
2019-12-09 12:42:59 -08:00
Jeremy Morse
e640a69d80 [DebugInfo] Nerf placeDbgValues, with prejudice
CodeGenPrepare::placeDebugValues moves variable location intrinsics to be
immediately after the Value they refer to. This makes tracking of locations
very easy; but it changes the order in which assignments appear to the
debugger, from the source programs order to the order in which the
optimised program computes values. This then leads to PR43986 and PR38754,
where variable locations that were in a conditional block are made
unconditional, which is highly misleading.

This patch adjusts placeDbgValues to only re-order variable location
intrinsics if they use a Value before it is defined, significantly reducing
the damage that it does. This is still not 100% safe, but the rest of
CodeGenPrepare needs polishing to correctly update debug info when
optimisations are performed to fully fix this.

This will probably break downstream debuginfo tests -- if the
instruction-stream position of variable location changes isn't the focus of
the test, an easy fix should be to manually apply placeDbgValues' behaviour
to the failing tests, moving dbg.value intrinsics next to SSA variable
definitions thus:

  %foo = inst1
  %bar = ...
  %baz = ...
  void call @llvm.dbg.value(metadata i32 %foo, ...

to

  %foo = inst1
  void call @llvm.dbg.value(metadata i32 %foo, ...
  %bar = ...
  %baz = ...

This should return your test to exercising whatever it was testing before.

Differential Revision: https://reviews.llvm.org/D58453
2019-12-09 12:52:10 +00:00
Hiroshi Yamauchi
7e81b42fc1 Revert "[PGO][PGSO] Instrument the code gen / target passes."
This reverts commit 9a0b5e14075a1f42a72eedb66fd4fde7985d37ac.

This seems to break buildbots.
2019-12-06 12:17:32 -08:00
Hiroshi Yamauchi
800ab3625d [PGO][PGSO] Instrument the code gen / target passes.
Summary:
Split off of D67120.

Add the profile guided size optimization instrumentation / queries in the code
gen or target passes. This doesn't enable the size optimizations in those passes
yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass
queries).

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71072
2019-12-06 10:43:39 -08:00
Jeremy Morse
765a5a757e [DebugInfo][CGP] Update dbg.values when sinking address computations
One of CodeGenPrepare's optimizations is to duplicate address calculations
into basic blocks, so that as much information as possible can be folded
into memory addressing operands. This is great -- but the dbg.value
variable location intrinsics are not updated in the same way. This can lead
to dbg.values referring to address computations in other blocks that will
never be encoded into the DAG, while duplicate address computations are
performed locally that could be used by the dbg.value. Some of these (such
as non-constant-offset GEPs) can't be salvaged past.

Fix this by, whenever we duplicate an address computation into a block,
looking for dbg.value users of the original memory address in the same
block, and redirecting those to the local computation.

Differential Revision: https://reviews.llvm.org/D58403
2019-12-06 11:27:19 +00:00
Hans Wennborg
0c2616b1a5 Fix a typo. 2019-11-30 13:23:49 +01:00
Reid Kleckner
68092989f3 Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211
2019-11-13 16:34:37 -08:00
Yi-Hong Lyu
886fba1618 [CGP] Make ICMP_EQ use CR result of ICMP_S(L|G)T dominators
For example:

long long test(long long a, long long b) {
  if (a << b > 0)
    return b;
  if (a << b < 0)
    return a;
  return a*b;
}

Produces:

        sld. 5, 3, 4
        ble 0, .LBB0_2
        mr 3, 4
        blr
.LBB0_2:                                # %if.end
        cmpldi  5, 0
        li 5, 1
        isel 4, 4, 5, 2
        mulld 3, 4, 3
        blr

But the compare (cmpldi 5, 0) is redundant and can be removed (CR0 already
contains the result of that comparison).

The root cause of this is that LLVM converts signed comparisons into equality
comparison based on dominance. Equality comparisons are unsigned by default, so
we get either a record-form or cmp (without the l for logical) feeding a cmpl.
That is the situation we want to avoid here.

Differential Revision: https://reviews.llvm.org/D60506
2019-11-11 17:28:50 +00:00
Guillaume Chatelet
89f87ca322 [Alignment][NFC] Remove dependency on GlobalObject::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, mehdi_amini, jvesely, nhaehnle, hiraditya, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68944

llvm-svn: 374880
2019-10-15 11:24:36 +00:00
Joerg Sonnenberger
5200dee212 Reapply r374743 with a fix for the ocaml binding
Add a pass to lower is.constant and objectsize intrinsics

This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374784
2019-10-14 16:15:14 +00:00
Dmitri Gribenko
d8ea0e7773 Revert "Add a pass to lower is.constant and objectsize intrinsics"
This reverts commit r374743. It broke the build with Ocaml enabled:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218

llvm-svn: 374768
2019-10-14 12:22:48 +00:00
Joerg Sonnenberger
ea06f385c5 Add a pass to lower is.constant and objectsize intrinsics
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374743
2019-10-13 23:00:15 +00:00
Simon Pilgrim
a23fa47859 CodeGenPrepare - silence static analyzer dyn_cast<> null dereference warnings. NFCI.
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<> directly and if not assert will fire for us.

llvm-svn: 374085
2019-10-08 17:00:01 +00:00
Guillaume Chatelet
e4601bbf20 [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68141

llvm-svn: 373207
2019-09-30 13:34:44 +00:00
Jesper Antonsson
9125982a28 [CodeGenPrepare] Mend "avoid crashing from replacing a phi twice" fix.
Summary:
An erroneously negated if-statement by an earlier (March 2019) bugfix left phi replacement/simplification under optimizeMemoryInst()  in CodeGenPrepare largely inactivated. The error was found when csmith found that the same assert as in the original bug report could still be triggered in a different way. This patch fixes the bugfix. The original bug was:
 https://bugs.llvm.org/show_bug.cgi?id=41052
... and the previous fix was D59358.

Reviewers: aprantl, skatkov

Reviewed By: skatkov

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67838

llvm-svn: 373084
2019-09-27 13:01:37 +00:00
Sanjay Patel
f51a1549df Revert [IR] allow fast-math-flags on phi of FP values
This reverts r372866 (git commit dec03223a97af0e4dfcb23da55c0f7f8c9b62d00)

llvm-svn: 372868
2019-09-25 13:29:09 +00:00
Sanjay Patel
06a326f643 [IR] allow fast-math-flags on phi of FP values
The changes here are based on the corresponding diffs for allowing FMF on 'select':
D61917

As discussed there, we want to have fast-math-flags be a property of an FP value
because the alternative (having them on things like fcmp) leads to logical
inconsistency such as:
https://bugs.llvm.org/show_bug.cgi?id=38086

The earlier patch for select made almost no practical difference because most
unoptimized conditional code begins life as a phi (based on what I see in clang).
Similarly, I don't expect this patch to do much on its own either because
SimplifyCFG promptly drops the flags when converting to select on a minimal
example like:
https://bugs.llvm.org/show_bug.cgi?id=39535

But once we have this plumbing in place, we should be able to wire up the FMF
propagation and start solving cases like that.

The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a
regression in a LoopVectorize test. We are intersecting the FMF of any
FPMathOperator there, so if a phi is not properly annotated, new math
instructions may not be either. Once we fix the propagation in SimplifyCFG, it
may be safe to remove that hack.

Differential Revision: https://reviews.llvm.org/D67564

llvm-svn: 372866
2019-09-25 13:14:12 +00:00
David Green
afc4123d6c [CGP] Ensure sinking multiple instructions does not invalidate dominance checks
In MVE, as of rL371218, we are attempting to sink chains of instructions such as:
  %l1 = insertelement <8 x i8> undef, i8 %l0, i32 0
  %broadcast.splat26 = shufflevector <8 x i8> %l1, <8 x i8> undef, <8 x i32> zeroinitializer
In certain situations though, we can end up breaking the dominance relations of
instructions. This happens when we sink the instruction into a loop, but cannot
remove the originals. The Use is updated, which might in fact be a Use from the
second instruction to the first.

This attempts to fix that by reversing the order of instruction that are sunk,
and ensuring that we update the uses on new instructions if they have already
been sunk, not the old ones.

Differential Revision: https://reviews.llvm.org/D67366

llvm-svn: 371743
2019-09-12 16:00:07 +00:00
Tim Northover
734aa27517 CodeGenPrep: add separate hook say when GEPs should be used for sinking. NFCI.
Up to now, we've decided whether to sink address calculations using GEPs or
normal arithmetic based on the useAA hook, but there are other reasons GEPs
might be preferred. So this patch splits the two questions, with a default
implementation falling back to useAA.

llvm-svn: 371721
2019-09-12 10:21:00 +00:00
Teresa Johnson
0062c013da Change TargetLibraryInfo analysis passes to always require Function
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.

This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.

Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.

There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.

Reviewers: chandlerc, hfinkel

Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66428

llvm-svn: 371284
2019-09-07 03:09:36 +00:00
Craig Topper
4054fdb457 [CGP] Remove ModifiedDT from the makeBitReverse loop
I don't think anything in this loop modifies the control flow and we don't restart any iteration after setting the flag.

This code was added in http://reviews.llvm.org/D16893 but looking at the test case added there the code that caused the dominator tree to change was merging blocks with their predecessor not the bitreverse optimization.

Differential Revision: https://reviews.llvm.org/D66366

llvm-svn: 369283
2019-08-19 18:02:24 +00:00
Sanjay Patel
45efa06f0f [CodeGenPrepare] Fix use-after-free
If OptimizeExtractBits() encountered a shift instruction with no operands at all,
it would erase the instruction, but still return false.

This previously didn’t matter because its caller would always return after
processing the instruction, but https://reviews.llvm.org/D63233 changed the
function’s caller to fall through if it returned false, which would then cause
a use-after-free detectable by ASAN.

This change makes OptimizeExtractBits return true if it removes a shift
instruction with no users, terminating processing of the instruction.

Patch by: @brentdax (Brent Royal-Gordon)

Differential Revision: https://reviews.llvm.org/D66330

llvm-svn: 369168
2019-08-16 23:10:34 +00:00