mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 19:52:54 +01:00

183655 Commits

Author SHA1 Message Date
Sanjay Patel
0ba8d8298e [InstCombine] add tests for mismatched cast ops for icmp; NFC
Motivating case is shown in PR42700:
https://bugs.llvm.org/show_bug.cgi?id=42700

llvm-svn: 369439
2019-08-20 20:51:50 +00:00
Jinsong Ji
fa4131d59e [llvm-extract] Update the help message for group extraction feature
Summary:
https://reviews.llvm.org/D60973 exposed the group extraction feature of
the BlockExtractor to llvm-extract.
However, the help message was not updated, so users might not be able to
know how to use this feature without looking into history/commits.

This patch just updates the help message to show how to use this group
extraction feature.

Reviewers: qcolombet, volkan

Reviewed By: qcolombet

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66438

llvm-svn: 369438
2019-08-20 20:45:16 +00:00
Craig Topper
357f5b0cc3 [X86] Add a DAG combine to transform (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))) -> (i8 (trunc (i16 (bitcast (v16i1 X))))) on KNL target
Without AVX512DQ we don't have KMOVB, so we can't really copy 8 bits of a k-register to a GPR. We have to copy 16 bits instead. We do this even if the DAG copy is from v8i1->v16i1. If we detect the (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0)))) pattern, we should rewrite the types to match the copy we do support. By doing this, we can help known bits propagate without losing the upper 8 bits of the input to the extract_subvector. This allows some zero extends to be removed since we have an isel pattern to use kmovw for (zero_extend (i16 (bitcast (v16i1 X)))).
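
As a rough illustration of why the two forms are interchangeable, the sketch below models both patterns on plain integers and compares them for every 16-bit mask. It assumes lane i of the i1 vector maps to bit i of the bitcast result; the helper names are hypothetical and are not part of LLVM.

```python
def lanes_to_int(lanes):
    """Model a bitcast from a vector of i1 lanes to an integer (lane i -> bit i)."""
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & 1) << index
    return value


def int_to_lanes(value, width):
    """Model a bitcast from an integer to a vector of `width` i1 lanes."""
    return [(value >> index) & 1 for index in range(width)]


def original_pattern(x16):
    # (i8 (bitcast (v8i1 (extract_subvector (v16i1 X), 0))))
    lanes = int_to_lanes(x16, 16)
    return lanes_to_int(lanes[0:8])


def rewritten_pattern(x16):
    # (i8 (trunc (i16 (bitcast (v16i1 X)))))
    lanes = int_to_lanes(x16, 16)
    return lanes_to_int(lanes) & 0xFF


def patterns_agree():
    """Exhaustively compare both patterns over all 16-bit mask values."""
    return all(original_pattern(x) == rewritten_pattern(x) for x in range(1 << 16))
```

The rewritten form keeps the full 16-bit value visible until the final truncate, which is what lets known-bits information about the upper 8 bits survive.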

Differential Revision: https://reviews.llvm.org/D66489

llvm-svn: 369434
2019-08-20 20:20:04 +00:00
Craig Topper
a93395c550 [X86] Add isel patterns for (i64 (zext (i8 (bitcast (v16i1 X))))) to use a KMOVW and a SUBREG_TO_REG. Similar for i8 and anyextend.
We already had patterns for extending to i32 to take advantage of
the implicit zeroing of the upper bits of a 32-bit GPR that is
done by KMOVW/KMOVB. But the extend might be all the way to i64,
in which case the existing patterns would fail and we'd get a
KMOVW/B followed by a MOVZX. By adding patterns for i64 we can
use the fact that KMOVW/B zero the upper bits of the 32-bit GPR
and the normal property that 32-bit GPR writes implicitly zero the
upper 32-bits of the full 64-bit GPR.

The anyextend patterns are slightly different since we don't care
about the upper zeros. For the i8->i64 I think this avoids selecting
the anyextend as a MOVZX to prevent a partial register issue that
doesn't exist. For i16->i64 I think we would have just emitted an
insert_subreg on top of the extract_subreg that the vXi16->i16
bitcast pattern emits. The register coalescer or peephole pass
should combine those, but this saves that work and makes i8/16
consistent.

llvm-svn: 369431
2019-08-20 19:43:48 +00:00
Martin Storsjo
c0b824d734 [TargetMachine] Don't try to create COFFSTUB references on windows on non-COFF
This avoids spurious relocation types for windows/elf targets.

Differential Revision: https://reviews.llvm.org/D66401

llvm-svn: 369426
2019-08-20 18:58:05 +00:00
Sam Clegg
d50df7bc78 [WebAssembly][lld] Fix crash when applying relocations to debug sections
Debug sections are special in that they can contain relocations against
symbols that are not present in the final output (i.e. not live).
However it is also possible to have R_WASM_TABLE_INDEX relocations
against symbols that don't have a table index assigned (since they are
not address-taken by actual code).

Fixes: https://github.com/emscripten-core/emscripten/issues/9023

Differential Revision: https://reviews.llvm.org/D66435

llvm-svn: 369423
2019-08-20 18:39:24 +00:00
Sanjay Patel
f7eabb0a58 [InstCombine] add helper function for icmp+zext/sext; NFC
llvm-svn: 369421
2019-08-20 18:15:17 +00:00
Simon Pilgrim
5d5f9fbb0b Fix typo in comment. NFCI.
llvm-svn: 369419
2019-08-20 17:54:37 +00:00
Matt Arsenault
9dc720a012 Revert "AMDGPU: Fix iterator error when lowering SI_END_CF"
This reverts r367500 and r369203. This is causing various test
failures.

llvm-svn: 369417
2019-08-20 17:45:25 +00:00
Andrea Di Biagio
60bf5e7c65 [X86][BtVer2] Use ReadAfterLd entries for the register operands of CMPXCHG.
This is a follow-up of r369365.

llvm-svn: 369412
2019-08-20 17:05:56 +00:00
Sanjay Patel
af3a0f7768 [InstCombine] make fold for icmp with sext more efficient; NFC
We were creating 2 instructions and relying on a subsequent fold
to invert a not(icmp). Create the final icmp directly instead.

llvm-svn: 369411
2019-08-20 17:03:22 +00:00
Craig Topper
a5daa6c00e [X86] Use isNullConstant instead of getConstantOperandVal == 0. NFC
llvm-svn: 369410
2019-08-20 16:55:12 +00:00
Thomas Raoux
929428465f [CodeGen] Add EarlyIfConvert test missed in previous commit
llvm-svn: 369405
2019-08-20 16:34:47 +00:00
Sam Tebbs
4c0045ed50 [ARM] Select vaddva
This patch adds vaddva selection.

Differential revision: https://reviews.llvm.org/D66410

llvm-svn: 369404
2019-08-20 16:33:34 +00:00
Aditya Nandakumar
5a7b0eb242 [GlobalISel] Handle multiple registers in dbg.value intrinsic
https://reviews.llvm.org/D66077

The value passed into dbg.value may relate to multiple registers,
each of which needs a DBG_VALUE.

This fix calls MIRBuilder.buildDirectDbgValue for each register.

Without this, IR passed in from flang-compiler/flang may fail an
assertion in getOrCreateVReg.

Patch by: peterwaller-arm.

llvm-svn: 369403
2019-08-20 16:28:37 +00:00
Nico Weber
ec4fa0e154 gn build: Merge r369298
llvm-svn: 369401
2019-08-20 16:19:50 +00:00
Jan Kratochvil
8da2801405 Regex: Add isValid() with no parameter
This gives a small performance improvement for LLDB's
RegularExpression::Execute.

Differential Revision: https://reviews.llvm.org/D66463

llvm-svn: 369396
2019-08-20 16:05:23 +00:00
Thomas Raoux
b79a880b38 [CodeGen] Add a pass to do block predication on SSA machine IR.
For targets requiring aggressive scheduling and/or software pipelining we need to
apply predication before preRA scheduling. This adds a pass re-using the early
if-cvt infrastructure but generating predicated instructions instead of
speculatively executing instructions. It allows doing if conversion on blocks
containing instructions with side-effects. The pass re-uses the target hook from
postRA if-conversion to let the target decide on the heuristic to apply.

Differential Revision: https://reviews.llvm.org/D66190

llvm-svn: 369395
2019-08-20 15:54:59 +00:00
Fangrui Song
65b109bf33 [llvm-objcopy][test] Add a test to show that argv[0] is included in error/warning messages
test/llvm-objcopy/ELF/error-format.test is similar to test/llvm-readobj/error-format.test added in D66425.

Reviewed By: grimar, jhenderson

Differential Revision: https://reviews.llvm.org/D66476

llvm-svn: 369392
2019-08-20 15:34:07 +00:00
Fangrui Song
b0559dae5f [llvm-objcopy] Append '\n' to warning messages
Currently the warning message of `llvm-strip %t.o %t.o` does not include
the trailing newline. Fix this by appending a '\n'.

This is the only warning llvm-objcopy and llvm-strip can issue.

Reviewed By: jhenderson

Differential Revision: https://reviews.llvm.org/D66475

llvm-svn: 369391
2019-08-20 15:00:07 +00:00
Sanjay Patel
5950893514 [InstCombine] improve readability for icmp with cast folds; NFC
1. Update function name and stale code comments.
2. Use variable names that are less ambiguous.
3. Move operand checks into the function as early exits.

llvm-svn: 369390
2019-08-20 14:56:44 +00:00
Jinsong Ji
54215e5d31 [BlockExtractor] Avoid assert with wrong line format
Summary:
When the line format is wrong, we may end up accessing out-of-bounds
memory. E.g., the test with an invalid line triggers this assert:
Assertion `idx < size()' failed

The fix is to report a fatal error when we find a mismatched line format.
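
A minimal sketch of the validate-before-indexing idea, assuming each line has the form `<function> <bb>[;<bb>...]`. The function name and error text are hypothetical; this is not the BlockExtractor code itself.

```python
def parse_block_file(text):
    """Parse lines of the assumed form '<function> <bb>[;<bb>...]'.

    Raises ValueError on a malformed line instead of indexing into a
    split result that may be too short.
    """
    groups = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(
                f"line {line_no}: expected '<function> <bb>[;<bb>...]', got {line!r}"
            )
        function, block_list = fields
        blocks = [name for name in block_list.split(";") if name]
        if not blocks:
            raise ValueError(f"line {line_no}: no basic block names given")
        groups.append((function, blocks))
    return groups
```

For example, `parse_block_file("foo bb1;bb2")` yields one group for `foo`, while a line containing only `foo` is rejected up front.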

Reviewers: qcolombet, volkan

Reviewed By: qcolombet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66444

llvm-svn: 369389
2019-08-20 14:46:02 +00:00
Andrea Di Biagio
fd00d5a846 [X86][BtVer2] Fix latency and throughput of atomic INC/DEC/NEG/NOT.
Latency and throughput of LOCK INC/DEC/NEG/NOT is always 19cy.
Number of uOPs is still 1.

Differential Revision: https://reviews.llvm.org/D66469

llvm-svn: 369388
2019-08-20 14:31:27 +00:00
Sanjay Patel
f2327f04db [InstCombine] simplify min/max of min/max with same operands (PR35607)
This is the original integer variant requested in:
https://bugs.llvm.org/show_bug.cgi?id=35607

As noted in the TODO and several similar TODOs around this block,
we could do this in instsimplify, but then it would cost more
because we would be trying to match min/max via ValueTracking
in 2 different places.

There are 4 commuted variants for each of smin/smax/umin/umax
that are not matched here. There are also icmp predicate variants
that are not included in the affected test file because they are
already handled by instsimplify by folding the final icmp to
true/false.

https://rise4fun.com/Alive/3KVc

  Name: smax(smax, smin)
  %c1 = icmp slt i32 %x, %y
  %c2 = icmp slt i32 %y, %x
  %min = select i1 %c1, i32 %x, i32 %y
  %max = select i1 %c2, i32 %x, i32 %y
  %c3 = icmp sgt i32 %max, %min
  %r = select i1 %c3, i32 %max, i32 %min
  =>
  %r = %max

  Name: smin(smax, smin)
  %c1 = icmp slt i32 %x, %y
  %c2 = icmp slt i32 %y, %x
  %min = select i1 %c1, i32 %x, i32 %y
  %max = select i1 %c2, i32 %x, i32 %y
  %c3 = icmp sgt i32 %max, %min
  %r = select i1 %c3, i32 %min, i32 %max
  =>
  %r = %min

  Name: umax(umax, umin)
  %c1 = icmp ult i32 %x, %y
  %c2 = icmp ult i32 %y, %x
  %min = select i1 %c1, i32 %x, i32 %y
  %max = select i1 %c2, i32 %x, i32 %y
  %c3 = icmp ult i32 %min, %max
  %r = select i1 %c3, i32 %max, i32 %min
  =>
  %r = %max

  Name: umin(umax, umin)
  %c1 = icmp ult i32 %x, %y
  %c2 = icmp ult i32 %y, %x
  %min = select i1 %c1, i32 %x, i32 %y
  %max = select i1 %c2, i32 %x, i32 %y
  %c3 = icmp ult i32 %min, %max
  %r = select i1 %c3, i32 %min, i32 %max
  =>
  %r = %min
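
The four identities above can also be sanity-checked by brute force at a small bit width. The sketch below models `icmp` and `select` on Python integers; it is an illustrative check with hypothetical helper names, not a replacement for the Alive proof.

```python
def to_signed(value, bits):
    """Reinterpret an unsigned `bits`-wide value as two's complement."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def select(cond, if_true, if_false):
    """Model the IR select instruction."""
    return if_true if cond else if_false


def folds_hold(bits, signed):
    """Check max(max, min) == max and min(max, min) == min for all x, y."""
    def less_than(a, b):  # icmp slt or icmp ult, depending on `signed`
        if signed:
            return to_signed(a, bits) < to_signed(b, bits)
        return a < b

    for x in range(1 << bits):
        for y in range(1 << bits):
            c1 = less_than(x, y)
            c2 = less_than(y, x)
            min_val = select(c1, x, y)
            max_val = select(c2, x, y)
            c3 = less_than(min_val, max_val)  # same as icmp sgt/ugt max, min
            if select(c3, max_val, min_val) != max_val:
                return False
            if select(c3, min_val, max_val) != min_val:
                return False
    return True
```

Calling `folds_hold(4, True)` covers the smin/smax cases and `folds_hold(4, False)` the umin/umax cases.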

llvm-svn: 369386
2019-08-20 13:39:17 +00:00
Simon Pilgrim
067b5fe034 [X86][FMA] Add FMA 'negated expression' combine tests for D63141
llvm-svn: 369384
2019-08-20 13:25:55 +00:00
Jan Kratochvil
94efa010f3 Regex: +regex string lifetime comment
Differential Revision: https://reviews.llvm.org/D66464

llvm-svn: 369383
2019-08-20 13:25:19 +00:00
George Rimar
816a7c4f33 [llvm-objdump] - Remove one of report_error functions and improve the error reporting.
One of the report_error functions was taking object::Archive::Child as an
argument. It feels excessive, so this patch removes it and introduces a helper
function instead. This patch also fixes a "TODO", which improves the message printed.

Differential revision: https://reviews.llvm.org/D66468

llvm-svn: 369382
2019-08-20 13:19:16 +00:00
Igor Kudrin
8f1dedbdf9 [DWARF] Fix reading 64-bit DWARF type units.
The type_offset field is 8 bytes long in DWARF64. The patch extends
TypeOffset to uint64_t and fixes its reading. The patch also fixes
checking of TypeOffset bounds as it was inaccurate in DWARF64 case.
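
To make the field widths concrete, here is a small sketch that parses a DWARF version 4 type unit header in either format. It assumes the DWARF v4 layout (unit_length, version, debug_abbrev_offset, address_size, type_signature, type_offset). The function name and the exact bounds rule are illustrative assumptions, not LLVM's code.

```python
import struct


def parse_type_unit_header_v4(data, little_endian=True):
    """Parse a DWARF v4 type unit header, handling DWARF32 and DWARF64."""
    endian = "<" if little_endian else ">"
    (initial,) = struct.unpack_from(endian + "I", data, 0)
    if initial == 0xFFFFFFFF:
        # DWARF64: 4-byte escape value followed by an 8-byte length.
        is_dwarf64 = True
        (unit_length,) = struct.unpack_from(endian + "Q", data, 4)
        length_field_size, offset_code = 12, "Q"
    else:
        is_dwarf64 = False
        unit_length = initial
        length_field_size, offset_code = 4, "I"

    # version, debug_abbrev_offset, address_size, type_signature, type_offset
    fmt = endian + "H" + offset_code + "B" + "Q" + offset_code
    version, abbrev_offset, address_size, signature, type_offset = struct.unpack_from(
        fmt, data, length_field_size
    )
    header_size = length_field_size + struct.calcsize(fmt)
    unit_end = length_field_size + unit_length
    return {
        "is_dwarf64": is_dwarf64,
        "unit_length": unit_length,
        "version": version,
        "debug_abbrev_offset": abbrev_offset,
        "address_size": address_size,
        "type_signature": signature,
        "type_offset": type_offset,
        "header_size": header_size,
        # Assumed rule: type_offset is relative to the unit start and must
        # point past the header but inside the unit.
        "type_offset_in_bounds": header_size <= type_offset < unit_end,
    }
```

With this layout the header is 23 bytes in DWARF32 and 39 bytes in DWARF64, so a 32-bit TypeOffset would truncate any DWARF64 offset above 4 GiB.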

Differential Revision: https://reviews.llvm.org/D66465

llvm-svn: 369378
2019-08-20 12:52:32 +00:00
Fangrui Song
b648113afa [llvm-readobj] Prepend argv[0] to error/warning messages
Summary:
Currently, we report:

    error: ...

Prepend argv[0] (tool name):

    llvm-readobj: error: ...

This is consistent with most GNU binutils/clang/lld, and gives a bit
more context in a long build log.
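
The same convention, sketched outside of LLVM. The helper names are hypothetical, and the tool name is taken from `sys.argv[0]` unless one is passed explicitly.

```python
import sys


def format_diagnostic(tool_name, severity, message):
    """Build a '<tool>: <severity>: <message>' line with a trailing newline."""
    return f"{tool_name}: {severity}: {message}\n"


def report_error(message, argv0=None, stream=None):
    """Write an error diagnostic prefixed with the tool name."""
    tool_name = argv0 if argv0 is not None else sys.argv[0]
    out = stream if stream is not None else sys.stderr
    out.write(format_diagnostic(tool_name, "error", message))
```

In a long build log the prefix identifies which tool in the pipeline produced a given diagnostic.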

Reviewed By: grimar, jhenderson, rupprecht

Differential Revision: https://reviews.llvm.org/D66425

llvm-svn: 369377
2019-08-20 12:49:15 +00:00
Sanjay Patel
728efc9012 [InstCombine] add tests for min/max with min/max of same operands; NFC
llvm-svn: 369376
2019-08-20 12:49:03 +00:00
Alex Bradbury
5467ceb0ea [RISCV] Implement getExprForFDESymbol to ensure RISCV_32_PCREL is used for the FDE location
Follow binutils in using RISCV_32_PCREL for the FDE initial location. As
explained in the relevant binutils commit
<a6cbf936e3>,
the ADD/SUB pair of relocations is problematic in the presence of linker
relaxation.

This patch has the same end goal as D64715 but includes test changes and
avoids adding a new global VariantKind to MCExpr.h (preferring
RISCVMCExpr VKs like the rest of the RISC-V backend).

Differential Revision: https://reviews.llvm.org/D66419

llvm-svn: 369375
2019-08-20 12:32:31 +00:00
Pavel Labath
06e6938b76 Recommit "MemoryBuffer: Add a missing error-check to getOpenFileImpl"
This recommits r368977, which was reverted in r369027 due to test
failures in lldb. The cause of this was different behavior of
readNativeFileSlice on windows and unix. These have been addressed in
r369269.

The original commit message was:
In case the function was called with a desired read size *and* the file
was not an "mmap()" candidate, the function was falling back to a
"pread()", but it was failing to check the result of that system call.
This meant that the function would return "success" even though the read
operation failed, and it returned a buffer full of uninitialized memory.
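
A loose analogue in Python, for illustration only. `os.pread` already raises `OSError` when the system call fails, so the case a caller can still overlook here is a short read. The helper below is hypothetical and treats a short read as an error rather than returning a partially filled buffer.

```python
import os


def read_file_slice(fd, size, offset):
    """Read exactly `size` bytes at `offset`, or raise OSError.

    os.pread may return fewer bytes than requested, so keep reading until
    the slice is complete and fail loudly if end-of-file comes first.
    """
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = os.pread(fd, remaining, offset)
        if not chunk:
            raise OSError(
                f"unexpected end of file: wanted {size} bytes, {remaining} missing"
            )
        chunks.append(chunk)
        offset += len(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```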

Reviewers: rnk, dblaikie

Subscribers: kristina, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66224

llvm-svn: 369370
2019-08-20 12:08:52 +00:00
Simon Pilgrim
372c62cc4c [CMake] Update C4324 MSVC warning comment to explain it's still broken at VS2019
As promised, I've updated the comment for the C4324 MSVC warning that was re-disabled at rL367409 / rG8f823e63e3edf87ab029ba32b68f3eb5d2f392b5 to put it in terms of currently supported VS versions

llvm-svn: 369368
2019-08-20 11:20:05 +00:00
Simon Pilgrim
aa50d0d398 [MCA][X86] Add tests for LOCK variants of standard X86 arithmetic ops
D66424 adds the base support for LOCK so we should be able to add special case support for all these cases in future patches

llvm-svn: 369367
2019-08-20 11:13:20 +00:00
Simon Pilgrim
e0da28268f Fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFCI.
llvm-svn: 369366
2019-08-20 10:25:57 +00:00
Andrea Di Biagio
d5dd3f579a [X86][Btver2] Fix latency and throughput of CMPXCHG instructions.
On Jaguar, CMPXCHG has a latency of 11cy, and a maximum throughput of 0.33 IPC.
Throughput is capped at 0.33 because of the implicit in/out
dependency on register EAX. In the case of repeated non-atomic CMPXCHG with the
same memory location, store-to-load forwarding occurs and values for subsequent
loads are quickly forwarded from the store buffer.

Interestingly, the functionality in LLVM that computes the reciprocal throughput
doesn't seem to know about RMW instructions. That functionality only looks at
the "consumed resource cycles" for the throughput computation. It should be
fixed/improved by a future patch. In particular, for RMW instructions, that
logic should also take into account the write latency of in/out register
operands.

An atomic CMPXCHG has a latency of ~17cy. Throughput is also limited to
~17cy/inst due to cache locking, which prevents other memory uOPs from starting
to execute before the "lock releasing" store uOP.

CMPXCHG8rr and CMPXCHG8rm are treated specially because they decode to one less
macro opcode. Their latency tends to be the same as the other RR/RM variants. RR
variants are relatively fast at 3cy (but still microcoded: 5 macro opcodes).

CMPXCHG8B is 11cy and unfortunately doesn't seem to benefit from store-to-load
forwarding. That means, throughput is clearly limited by the in/out dependency
on GPR registers. The uOP composition is sadly unknown (due to the lack of PMCs
for the Integer pipes). I have reused the same mix of consumed resource from the
other CMPXCHG instructions for CMPXCHG8B too.
LOCK CMPXCHG8B is instead 18cycles.

CMPXCHG16B is 32cycles. Up to 38cycles when the LOCK prefix is specified. Due to
the in/out dependencies, throughput is limited to 1 instruction every 32 (or 38)
cycles depending on whether the LOCK prefix is specified or not.
I wouldn't be surprised if the microcode for CMPXCHG16B is similar to 2x
microcode from CMPXCHG8B. So, I have speculatively set the JALU01 consumption to
2x the resource cycles used for CMPXCHG8B.

The two new hasLockPrefix() functions are used by the btver2 scheduling model
to check if a MCInst/MachineInst has a LOCK prefix. Calls to hasLockPrefix() have
been encoded in predicates of variant scheduling classes that describe lat/thr
of CMPXCHG.

Differential Revision: https://reviews.llvm.org/D66424

llvm-svn: 369365
2019-08-20 10:23:55 +00:00
Seiya Nuta
d310c5451f [yaml2obj/obj2yaml][MachO] Fix a test failure in big endian hosts
These section contents are dummy data (0xdeadbeef) and their endianness
does not matter.

- http://lab.llvm.org:8011/builders/clang-ppc64be-linux/builds/37265

llvm-svn: 369360
2019-08-20 09:58:31 +00:00
Igor Kudrin
2ee0278731 [DWARF] Fix DWARFUnit::getDebugInfoSize() for 64-bit DWARF.
The calculation there was correct only for DWARF32.
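
A sketch of the arithmetic involved, under the assumption that the debug info size is the unit's total size minus its header. The unit_length field does not count itself, and it occupies 4 bytes in DWARF32 but 12 bytes in DWARF64 (a 4-byte escape plus an 8-byte length). The function names are hypothetical.

```python
def unit_length_field_size(is_dwarf64):
    """Bytes occupied by the unit_length field itself."""
    return 12 if is_dwarf64 else 4


def debug_info_size(unit_length, header_size, is_dwarf64):
    """Bytes of DIE data after the unit header (assumed definition)."""
    total_unit_size = unit_length + unit_length_field_size(is_dwarf64)
    return total_unit_size - header_size
```

Hard-coding the 4-byte case under-counts a DWARF64 unit by 8 bytes.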

Differential Revision: https://reviews.llvm.org/D66421

llvm-svn: 369356
2019-08-20 09:50:44 +00:00
Seiya Nuta
52efdb7adc [yaml2obj/obj2yaml][MachO] Allow setting custom section data
Reviewers: alexshap, jhenderson, rupprecht

Reviewed By: alexshap, jhenderson

Subscribers: abrachet, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65799

llvm-svn: 369348
2019-08-20 08:49:07 +00:00
Seiya Nuta
d6e74f9c7c [llvm-objcopy][MachO] Fix method names. NFC.
Reviewers: alexshap, rupprecht, jhenderson

Reviewed By: alexshap, rupprecht

Subscribers: jakehehrlich, abrachet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65540

llvm-svn: 369346
2019-08-20 08:36:47 +00:00
George Rimar
aac454ba82 [test/Object] - Move/rewrite 2 more test cases.
This patch makes a change for test/Object tests responsible
for relocations.

* 2 tests were moved to llvm-readobj/llvm-objdump folders:
Object/elf-reloc-no-sym.test -> tools/llvm-readobj/elf-reloc-no-sym.test
Object/objdump-reloc-shared.test -> tools/llvm-objdump/relocations-in-nonreloc.test

* A precompiled binary was removed and these tests were refactored.

Differential revision: https://reviews.llvm.org/D66291

llvm-svn: 369342
2019-08-20 08:23:57 +00:00
Fangrui Song
0141652683 [MC] Delete an overload of MCExpr::evaluateKnownAbsolute and its associated hack
The hack dated back to 2010 (r121076) and was documented by r122144:

  // FIXME: The use if InSet = Addrs is a hack. Setting InSet causes us
  // absolutize differences across sections and that is what the MachO writer
  // uses Addrs for.

llvm-svn: 369337
2019-08-20 07:42:04 +00:00
Fangrui Song
a458870688 [Attributor] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after r369331
llvm-svn: 369334
2019-08-20 07:21:43 +00:00
Craig Topper
4f9e2bf436 [X86] Add back the -x86-experimental-vector-widening-legalization command line flag and all associated code, but leave it enabled by default
Google is reporting performance issues with the new default behavior
and has asked for a way to switch back to the old behavior while we
investigate and make fixes.

I've restored all of the code that had since been removed and added
additional checks of the command flag onto code paths that are
not otherwise guarded by a check of getTypeAction.

I've also modified the cost model tables to hopefully get us back
to the previous costs.

Hopefully we won't need to support this for very long since we
have no test coverage of the old behavior so we can very easily
break it.

llvm-svn: 369332
2019-08-20 06:58:00 +00:00
Johannes Doerfert
45e6a509d0 [Attributor] Create abstract attributes on-demand
Before, we created the set of abstract attributes initially and then
dealt with the fact that a lookup could fail, e.g., return a nullptr.
This patch will ensure we always return a valid object from a lookup,
allowing us not only to remove the nullptr checks but also to grow the
set of abstract attributes "in-flight" on-demand.
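
The lookup-never-fails idea can be sketched as a get-or-create registry. The class and method names below are hypothetical and only illustrate the pattern; they do not mirror the Attributor's actual interface.

```python
class AbstractAttribute:
    """Placeholder for one deduced attribute at one IR position."""

    def __init__(self, kind, position):
        self.kind = kind
        self.position = position


class AttributeRegistry:
    """Creates attributes on first lookup, so callers never see None."""

    def __init__(self):
        self._attributes = {}

    def get_or_create(self, kind, position):
        key = (kind, position)
        attribute = self._attributes.get(key)
        if attribute is None:
            # Grow the set on demand instead of failing the lookup.
            attribute = AbstractAttribute(kind, position)
            self._attributes[key] = attribute
        return attribute

    def __len__(self):
        return len(self._attributes)
```

Repeated lookups for the same kind and position return the same object, so callers can drop their null checks and the set grows only with what is actually queried.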

One can now start from those that have the best chance of improving
performance without the need to specify all they might depend on.

While this introduces some boilerplate, the usage of attributes is much
easier and cleaner now.

Reviewers: uenoku, sstefan1

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66276

llvm-svn: 369331
2019-08-20 06:15:50 +00:00
Johannes Doerfert
76537d874a [Attributor][NFC] Cleanup statistics code
llvm-svn: 369330
2019-08-20 06:09:56 +00:00
Johannes Doerfert
8c11d0e37f [Attributor] Use structured deduction for AADereferenceable
Summary:
This is analogous to D66128 but for AADereferenceable. We have the logic
concentrated in the floating value updateImpl and we use the combiner
helper classes for arguments and return values.

The regressions will go away with "on-demand" attribute creation.
Improvements are already visible in the existing tests.

Reviewers: uenoku, sstefan1

Subscribers: hiraditya, bollu, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66272

llvm-svn: 369329
2019-08-20 06:08:35 +00:00
Johannes Doerfert
95f3dc61a8 [Attributor] Use structured deduction for AANonNull
Summary:
What D66126 did for AAAlign, this patch does for AANonNull. Again, the
logic becomes more concise and localized. Again, returned pointers are
not annotated properly but that will not be an issue if this lands with
the "on-demand" generation of attributes. First improvements due to the
genericValueTraversal are already visible.

Reviewers: sstefan1, uenoku

Subscribers: hiraditya, bollu, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66128

llvm-svn: 369328
2019-08-20 06:02:39 +00:00
Johannes Doerfert
b23b231c6a [Attributor] Fix the "clamp" operator
The clamp operator should not take the known of the given state as the
known is potentially based on assumed information. This also adds TODOs
to guide improvements.

llvm-svn: 369327
2019-08-20 05:57:01 +00:00
Thomas Raoux
74ef61ec21 [NFC] Test commit, fix some comment spelling.
llvm-svn: 369326
2019-08-20 05:21:27 +00:00