1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 19:52:54 +01:00
Commit Graph

174539 Commits

Author SHA1 Message Date
Mircea Trofin
6673639046 [llvm] Opt-in flag for X86DiscriminateMemOps
Summary:
Currently, if an instruction with a memory operand has no debug information,
X86DiscriminateMemOps will generate one based on the first line of the
enclosing function, or the last seen debug info.

This may cause confusion in certain debugging scenarios. The long term
approach would be to use the line number '0' in such cases, however, that
brings in challenges: the base discriminator value range is limited
(4096 values).

For the short term, adding an opt-in flag for this feature.

See bug 40319 (https://bugs.llvm.org/show_bug.cgi?id=40319)

Reviewers: dblaikie, jmorse, gbedwell

Reviewed By: dblaikie

Subscribers: aprantl, eraman, hiraditya

Differential Revision: https://reviews.llvm.org/D57257

llvm-svn: 352246
2019-01-25 21:49:54 +00:00
Jessica Paquette
00aa216e32 [GlobalISel][AArch64][NFC] Fix incorrect comment in selectUnmergeValues
s/scalar/vector/

llvm-svn: 352243
2019-01-25 21:28:27 +00:00
Alina Sbirlea
ea700743c3 Revert rL352238.
llvm-svn: 352241
2019-01-25 21:12:08 +00:00
Alex Bradbury
6a6ea207eb [RISCV] Add another potential combine to {double,float}-bitmanip-dagcombines.ll
(fcopysign a, (fneg b)) will be expanded to bitwise operations by
DAGTypeLegalizer::SoftenFloatRes_FCOPYSIGN if the floating point type isn't
legal. Arguably it might be worth doing a combine even if it is legal.

llvm-svn: 352240
2019-01-25 21:06:47 +00:00
Alina Sbirlea
357b85a9d2 [WarnMissedTransforms] Set default to 1.
Summary:
Set default value for retrieved attributes to 1, since the check is against 1.
Eliminates the warning noise generated when the attributes are not present.

Reviewers: sanjoy

Subscribers: jlebar, llvm-commits

Differential Revision: https://reviews.llvm.org/D57253

llvm-svn: 352238
2019-01-25 20:51:55 +00:00
Ana Pazos
015647cf0a Reapply: [RISCV] Set isAsCheapAsAMove for ADDI, ORI, XORI, LUI
This reapplies commit r352010 with RISC-V test fixes.

llvm-svn: 352237
2019-01-25 20:22:49 +00:00
Guozhi Wei
49bf5cf617 [MBP] Don't move bottom block before header if it can't reduce taken branches
If bottom of block BB has only one successor OldTop, in most cases it is profitable to move it before OldTop, except the following case:

-->OldTop<-
|    .    |
|    .    |
|    .    |
---Pred   |
     |    |
    BB-----

Move BB before OldTop can't reduce the number of taken branches, this patch detects this case and prevent the moving.

Differential Revision: https://reviews.llvm.org/D57067

llvm-svn: 352236
2019-01-25 19:45:13 +00:00
Craig Topper
359e356db7 [X86] Combine masked store and truncate into masked truncating stores.
We also need to combine to masked truncating with saturation stores, but I'm leaving that for a future patch.

This does regress some tests that used truncate wtih saturation followed by a masked store. Those now use a truncating store and use min/max to saturate.

Differential Revision: https://reviews.llvm.org/D57218

llvm-svn: 352230
2019-01-25 18:37:36 +00:00
Vedant Kumar
03b893cf5f [HotColdSplit] Introduce a cost model to control splitting behavior
The main goal of the model is to avoid *increasing* function size, as
that would eradicate any memory locality benefits from splitting. This
happens when:

  - There are too many inputs or outputs to the cold region. Argument
    materialization and reloads of outputs have a cost.

  - The cold region has too many distinct exit blocks, causing a large
    switch to be formed in the caller.

  - The code size cost of the split code is less than the cost of a
    set-up call.

A secondary goal is to prevent excessive overall binary size growth.

With the cost model in place, I experimented to find a splitting
threshold that works well in practice. To make warm & cold code easily
separable for analysis purposes, I moved split functions to a "cold"
section. I experimented with thresholds between [0, 4] and set the
default to the threshold which minimized geomean __text size.

Experiment data from building LNT+externals for X86 (N = 639 programs,
all sizes in bytes):

| Configuration | __text geom size | __cold geom size | TEXT geom size |
| **-Os**       | 1736.3           | 0, n=0           | 10961.6        |
| -Os, thresh=0 | 1740.53          | 124.482, n=134   | 11014          |
| -Os, thresh=1 | 1734.79          | 57.8781, n=90    | 10978.6        |
| -Os, thresh=2 | ** 1733.85 **    | 65.6604, n=61    | 10977.6        |
| -Os, thresh=3 | 1733.85          | 65.3071, n=61    | 10977.6        |
| -Os, thresh=4 | 1735.08          | 67.5156, n=54    | 10965.7        |
| **-Oz**       | 1554.4           | 0, n=0           | 10153          |
| -Oz, thresh=2 | ** 1552.2 **     | 65.633, n=61     | 10176          |
| **-O3**       | 2563.37          | 0, n=0           | 13105.4        |
| -O3, thresh=2 | ** 2559.49 **    | 71.1072, n=61    | 13162.4        |

Picking thresh=2 reduces the geomean __text section size by 0.14% at
-Os, -Oz, and -O3 and causes ~0.2% growth in the TEXT segment. Note that
TEXT size is page-aligned, whereas section sizes are byte-aligned.

Experiment data from building LNT+externals for ARM64 (N = 558 programs,
all sizes in bytes):

| Configuration | __text geom size | __cold geom size | TEXT geom size |
| **-Os**       | 1763.96          | 0, n=0           | 42934.9        |
| -Os, thresh=2 | ** 1760.9 **     | 76.6755, n=61    | 42934.9        |

Picking thresh=2 reduces the geomean __text section size by 0.17% at
-Os and causes no growth in the TEXT segment.

Measurements were done with D57082 (r352080) applied.

Differential Revision: https://reviews.llvm.org/D57125

llvm-svn: 352228
2019-01-25 18:30:37 +00:00
Vedant Kumar
8057a16e18 [MC] Teach the MachO object writer about N_FUNC_COLD
N_FUNC_COLD is a new MachO symbol attribute. It's a hint to the linker
to order a symbol towards the end of its section, to improve locality.

Example:

```
void a1() {}
__attribute__((cold)) void a2() {}
void a3() {}
int main() {
  a1();
  a2();
  a3();
  return 0;
}
```

A linker that supports N_FUNC_COLD will order _a2 to the end of the text
section. From `nm -njU` output, we see:

```
_a1
_a3
_main
_a2
```

Differential Revision: https://reviews.llvm.org/D57190

llvm-svn: 352227
2019-01-25 18:30:22 +00:00
Florian Hahn
1973dbeae9 [opt-viewer] Add javascript to expand/hide full message for multiline remarks.
This patch adds support for displaying remarks with multiple
lines. For such remarks, it creates a hidden div
containing the message's lines except the first one in a <pre>
tag. It also prepends a link (with '+' as text) to the regular remark
line. This link can be used to show/hide the div containing the
full remark.

In combination with D57159, this allows for better displaying of
multiline remarks in the html pages generated by opt-viewer.

The Javascript is very simple and should be supported by any recent
major browser.

Reviewers: hfinkel, anemet, thegameg, serge-sans-paille

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D57167

llvm-svn: 352223
2019-01-25 17:48:31 +00:00
Sanjay Patel
202fc46d56 [x86] simplify logic in lowerShuffleWithUndefHalf(); NFCI
This seems unnecessarily complicated because we gave names to
opposite polarity bools and have code comments that don't really
line up with the logic. 

Step 1: remove UndefUpper and assert that it is the opposite of 
UndefLower after the initial early exit.

llvm-svn: 352217
2019-01-25 17:00:41 +00:00
Florian Hahn
163d79f756 [DiagnosticInfo] Add support for preserving newlines in remark arguments.
This patch adds a new type StringBlockVal which can be used to emit a
YAML block scalar, which preserves newlines in a multiline string. It
also updates  MappingTraits<DiagnosticInfoOptimizationBase::Argument> to
use it for argument values with more than a single newline.

This is helpful for remarks that want to display more in-depth
information in a more structured way.

Reviewers: thegameg, anemet

Reviewed By: anemet

Subscribers: hfinkel, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D57159

llvm-svn: 352216
2019-01-25 16:59:06 +00:00
Tom Weaver
6b44c1dccf [TEST][COMMIT] - fix comment typo in AsmPrinter/DwarfDebug.cpp - NFC
llvm-svn: 352214
2019-01-25 16:29:35 +00:00
Javed Absar
84a3b263c4 [TblGen][NFC] Fix documentation formatting
llvm-svn: 352212
2019-01-25 16:17:57 +00:00
Alex Bradbury
eca30becf2 [RISCV][NFC] s/f32/f64 in double-arith.ll
The intrinsic names erroneously used the .f32 variant. As the return and
argument types were still double the intrinsics calls worked properly.

llvm-svn: 352211
2019-01-25 16:04:04 +00:00
Simon Pilgrim
b35fb0e22d [X86] Simplify X86ISD::ADD/SUB if we don't use the result flag
Simplify to the generic ISD::ADD/SUB if we don't make use of the result flag.

This mainly helps with ADDCARRY/SUBBORROW intrinsics which get expanded to X86ISD::ADD/SUB but could be simplified further.

Noticed in some of the test cases in PR31754

Differential Revision: https://reviews.llvm.org/D57234

llvm-svn: 352210
2019-01-25 15:58:28 +00:00
Sanjay Patel
5a5e82c5f0 [x86] narrow a shuffle that doesn't use or set any high elements
This isn't the final fix for our reduction/horizontal codegen, but it takes care 
of a lot of the problems. After we narrow the shuffle, existing combines for 
insert/extract and binops kick in, and we end up with cheaper 128-bit ops.

The avg and mul reduction tests show an existing shuffle lowering hole for 
AVX2/AVX512. I think in its most minimal form this is:
https://bugs.llvm.org/show_bug.cgi?id=40434
...but we might need multiple fixes to get it right.

Differential Revision: https://reviews.llvm.org/D57156

llvm-svn: 352209
2019-01-25 15:37:42 +00:00
Clement Courbet
9fd325a694 Revert r351954 "Add a value_type to ArrayRef."
This breaks arm self-hosted buildbots.

llvm-svn: 352206
2019-01-25 15:25:52 +00:00
Sam McCall
7010cdc6c4 [JSON] Work around excess-precision issue when comparing T_Integer numbers.
Reviewers: bkramer

Subscribers: kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D57237

llvm-svn: 352204
2019-01-25 15:05:33 +00:00
Nico Weber
76033369a6 gn build: Merge r352149
llvm-svn: 352202
2019-01-25 14:53:30 +00:00
Nico Weber
afa8033cbe gn build: Revert r352200, commit message was wrong
llvm-svn: 352201
2019-01-25 14:52:50 +00:00
Nico Weber
0ecac48cd5 gn build: Merge r352148
llvm-svn: 352200
2019-01-25 14:50:14 +00:00
Alex Bradbury
e56b57bb29 [RISCV] Add tests to demonstrate bitcasted fneg/fabs dagcombines
This target-independent code won't trigger for cases such as RV32FD where
custom SelectionDAG nodes are generated. These new tests demonstrate such
cases. Additionally, float-arith.ll was updated so that fneg.s, fsgnjn.s, and
fabs.s selection patterns are actually exercised.

llvm-svn: 352199
2019-01-25 14:33:08 +00:00
Simon Pilgrim
505985ee45 Fix line endings and trim trailing whitespace. NFCI.
llvm-svn: 352198
2019-01-25 14:29:57 +00:00
Haojian Wu
ac0fb6c2ed gitignore: ignore clangd index files.
Reviewers: kadircet

Subscribers: ilya-biryukov, ioeric, MaskRay, jkorous, arphaman, llvm-commits

Differential Revision: https://reviews.llvm.org/D57227

llvm-svn: 352197
2019-01-25 14:05:18 +00:00
Simon Pilgrim
7f21a5dceb [X86] Add addcarry/subborrow combine tests
Show failure to simplify cases with zero op/flags

llvm-svn: 352196
2019-01-25 12:26:27 +00:00
James Henderson
011ee92bbc [llvm-symbolizer] Add switch to adjust addresses by fixed offset
If a stack trace or similar has a list of addresses from an executable
or DSO loaded at a variable address (e.g. due to ASLR), the addresses
will not directly correspond to the addresses stored in the object file.
If a user wishes to use llvm-symbolizer, they have to subtract the load
address from every address. This is somewhat inconvenient, especially as
the output of --print-address will result in the adjusted address being
listed, rather than the address coming from the stack trace, making it
harder to map results between the two.

This change adds a new switch to llvm-symbolizer --adjust-vma which
takes an offset, which is then used to automatically do this
calculation. The printed address remains the input address (allowing for
easy mapping), whilst the specified offset is applied to the addresses
when performing the lookup.

The switch is conceptually similar to llvm-objdump's new switch of the
same name (see D57051), which in turn mirrors a GNU switch. There is no
equivalent switch in addr2line.

Reviewed by: grimar

Differential Revision: https://reviews.llvm.org/D57151

llvm-svn: 352195
2019-01-25 11:49:21 +00:00
Max Kazantsev
7de83f3ceb [NFC] One more crashing test on LoopSimplifyCFG
llvm-svn: 352194
2019-01-25 11:47:16 +00:00
Simon Pilgrim
504bf36a53 Fix gcc -Wparentheses warning. NFCI.
llvm-svn: 352193
2019-01-25 11:38:40 +00:00
Simon Pilgrim
ef0bbfb3e0 Fix gcc -Wparentheses warning. NFCI.
llvm-svn: 352191
2019-01-25 11:34:58 +00:00
Max Kazantsev
5f00afb14d [NFC] Add failing test on LCSSA forming
llvm-svn: 352190
2019-01-25 11:32:21 +00:00
Diana Picus
3fa34396d2 [ARM GlobalISel] Support shifts for Thumb2
Same as ARM.

On this occasion we split some of the instruction select tests for more
complicated instructions into their own files, so we can reuse them for
ARM and Thumb mode. Likewise for the legalizer tests.

llvm-svn: 352188
2019-01-25 10:48:42 +00:00
Diana Picus
d574f09322 [ARM GlobalISel] Remove rebase artifact from r351882. NFC
r351882 introduced some superfluous calls to mark G_INTTOPTR and
G_PTRTOINT as legal (looks like a rebase mishap). Remove them.

llvm-svn: 352187
2019-01-25 10:48:35 +00:00
Javed Absar
173ca6da08 [TblGen] Extend !if semantics through new feature !cond
This patch extends TableGen language with !cond operator.
Instead of embedding !if inside !if which can get cumbersome,
one can now use !cond.
Below is an example to convert an integer 'x' into a string:

    !cond(!lt(x,0) : "Negative",
          !eq(x,0) : "Zero",
          !eq(x,1) : "One,
          1        : "MoreThanOne")

Reviewed By: hfinkel, simon_tatham, greened
Differential Revision: https://reviews.llvm.org/D55758

llvm-svn: 352185
2019-01-25 10:25:25 +00:00
Douglas Yung
cd104b2063 [llvm-objcopy] Add support for -g as an alias for --strip-debug
This change adds an option -g to llvm-objcopy which is an alias for the existing option --strip-debug.

This fixes PR40003.

Reviewed by: alexshap

Differential Revision: https://reviews.llvm.org/D57217

llvm-svn: 352182
2019-01-25 09:57:20 +00:00
Simon Pilgrim
8d0bd607bf [llvm-mca][X86] Add missing shuffle tests
Match the coverage of test\CodeGen\X86\avx512-shuffle-schedule.ll so we can get rid of -print-schedule (and fix PR37160) without losing schedule tests

llvm-svn: 352179
2019-01-25 09:17:30 +00:00
Anton Korobeynikov
058d2e13b4 [MSP430] Fix absolute addressing mode printing in AsmPrinter
Align checks for absolute addressing mode with its current
implementation (SR is used as a base register).

This fixes https://bugs.llvm.org/show_bug.cgi?id=39993

Patch by Kristina Bessonova!

Differential Revision: https://reviews.llvm.org/D56785

llvm-svn: 352178
2019-01-25 09:14:05 +00:00
Max Kazantsev
68cab6e22b [NFC] Add test with multiple loops
llvm-svn: 352176
2019-01-25 08:46:00 +00:00
Zi Xuan Wu
287883a58b [PowerPC] Enhance the fast selection of cmp instruction and clean up related asserts
Fast selection of llvm icmp and fcmp instructions is not handled well about VSX instruction support.

We'd use VSX float comparison instruction instead of non-vsx float comparison instruction 
if the operand register class is VSSRC or VSFRC because i32 and i64 are mapped to VSSRC and 
VSFRC correspondingly if VSX feature is opened.

If the target does not have corresponding VSX instruction comparison for some type, 
just copy VSX-related register to common float register class and use non-vsx comparison instruction.

Differential Revision: https://reviews.llvm.org/D57078

llvm-svn: 352174
2019-01-25 07:24:59 +00:00
Craig Topper
5c87dfd905 [X86] Add non-masked versions of vpconflict intrinsics so we can use a select in the header file in clang.
I'll remove and autoupgrade the old intrinsics in a future commit.

llvm-svn: 352172
2019-01-25 07:08:07 +00:00
Alex Bradbury
0fc69297a4 [RISCV] Custom-legalise i32 SDIV/UDIV/UREM on RV64M
Follow the same custom legalisation strategy as used in D57085 for
variable-length shifts (see that patch summary for more discussion). Although
we may lose out on some late-stage DAG combines, I think this custom
legalisation strategy is ultimately easier to reason about.

There are some codegen changes in rv64m-exhaustive-w-insts.ll but they are all
neutral in terms of the number of instructions.

Differential Revision: https://reviews.llvm.org/D57096

llvm-svn: 352171
2019-01-25 05:11:34 +00:00
Max Kazantsev
fe1793aa58 [LoopSimplifyCFG] Fix inconsistency in blocks in loop markup
2nd part of D57095 with the same reason, just in another place. We never
fold branches that are not immediately in the current loop, but this check
is missing in `IsEdgeLive` As result, it may think that the edge in subloop is
dead while it's live. It's a pessimization in the current stance.

Differential Revision: https://reviews.llvm.org/D57147
Reviewed By: rupprecht	

llvm-svn: 352170
2019-01-25 05:05:02 +00:00
Alex Bradbury
b821e9787f [RISCV] Custom-legalise 32-bit variable shifts on RV64
The previous DAG combiner-based approach had an issue with infinite loops
between the target-dependent and target-independent combiner logic (see
PR40333). Although this was worked around in rL351806, the combiner-based
approach is still potentially brittle and can fail to select the 32-bit shift
variant when profitable to do so, as demonstrated in the pr40333.ll test case.

This patch instead introduces target-specific SelectionDAG nodes for
SHLW/SRLW/SRAW and custom-lowers variable i32 shifts to them. pr40333.ll is a
good example of how this approach can improve codegen.

This adds DAG combine that does SimplifyDemandedBits on the operands (only
lower 32-bits of first operand and lower 5 bits of second operand are read).
This seems better than implementing SimplifyDemandedBitsForTargetNode as there
is no guarantee that would be called (and it's not for e.g. the anyext return
test cases). Also implements ComputeNumSignBitsForTargetNode.

There are codegen changes in atomic-rmw.ll and atomic-cmpxchg.ll but the new
instruction sequences are semantically equivalent.

Differential Revision: https://reviews.llvm.org/D57085

llvm-svn: 352169
2019-01-25 05:04:00 +00:00
Matt Arsenault
f4a5a82995 AMDGPU/GlobalISel: Remove leftover setAction
Also move G_GEP actions together.

llvm-svn: 352168
2019-01-25 04:54:00 +00:00
Matt Arsenault
fe67015989 AMDGPU/GlobalISel: Scalarize add/sub
llvm-svn: 352167
2019-01-25 04:53:57 +00:00
Matt Arsenault
387fcef573 GlobalISel: fewerElementsVector for more cast types
llvm-svn: 352166
2019-01-25 04:37:33 +00:00
Matt Arsenault
9fbadced65 GlobalISel: fewerElementsVector for a few more trivial ops
llvm-svn: 352165
2019-01-25 04:03:38 +00:00
Matt Arsenault
922c4ade38 AMDGPU/GlobalISel: Legalize smulh/umulh and scalarize mul
llvm-svn: 352162
2019-01-25 03:23:04 +00:00
Vedant Kumar
ac7ab5ba9e [HotColdSplit] Describe the pass in more detail, NFC
llvm-svn: 352161
2019-01-25 03:22:38 +00:00