1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00
Commit Graph

216192 Commits

Author SHA1 Message Date
Philip Reames
87f0cf2715 Move a definition into cpp from header in advance of other changes [nfc] 2021-05-21 09:18:04 -07:00
Jinsong Ji
aa4586c113 [PowerPC] Add stack guard tests
Copied from X86 and test powerpc triple.
Preparing for AIX stack guard tests.
2021-05-21 16:16:19 +00:00
Florian Hahn
486912a2de [X86] Lower calls with clang.arc.attachedcall bundle
This patch adds support for lowering function calls with the
`clang.arc.attachedcall` bundle. The goal is to expand such calls to the
following sequence of instructions:

    callq   @fn
    movq  %rax, %rdi
    callq   _objc_retainAutoreleasedReturnValue / _objc_unsafeClaimAutoreleasedReturnValue

This sequence of instructions triggers Objective-C runtime optimizations,
hence we want to ensure no instructions get moved in between them.
This patch achieves that by adding a new CALL_RVMARKER ISD node,
which gets turned into the CALL64_RVMARKER pseudo, which eventually gets
expanded into the sequence mentioned above.

The ObjC runtime function to call is determined by the
argument in the bundle, which is passed through as a
target constant to the pseudo.

@ahatanak is working on using this attribute in the front- & middle-end.

Together with the front- & middle-end changes, this should address
PR31925 for X86.

This is the X86 version of 46bc40e50246c1902a1ca7916c8286cb837643ee,
which added similar support for AArch64.

Reviewed By: ab

Differential Revision: https://reviews.llvm.org/D94597
2021-05-21 16:33:58 +01:00
Jim Lin
0c43c880e4 [X86] Don't fold (fneg (fma (fneg X), Y, (fneg Z))) to (fma X, Y, Z)
Check if it has no signed zeros flag (nsz) in getNegatedExpression for x86.
This patch fixed miscompilation: https://alive2.llvm.org/ce/z/XxwBAJ

Reviewed By: RKSimon, spatel

Differential Revision: https://reviews.llvm.org/D90901
2021-05-21 23:02:19 +08:00
Jim Lin
83f2168ce3 [X86] Pre-commit test for D90901
Differential Revision: https://reviews.llvm.org/D102621
2021-05-21 23:02:19 +08:00
maekawatoshiki
def1bd540b [LoopUnrollAndJam] Change LoopUnrollAndJamPass to LoopNest pass
This patch changes LoopUnrollAndJamPass from FunctionPass to LoopNest pass.
The next patch will utilize LoopNest to effectively handle loop nests.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D99149
2021-05-21 23:57:39 +09:00
Matt Arsenault
c2fa720a37 AMDGPU/GlobalISel: Add subtarget to a test
SelectionDAG forces us to have a weird ABI for 16-bit values without
legal 16-bit operations, but currently GlobalISel bypasses this and
sometimes ends up using the gfx8+ ABI in some contexts. Make sure
we're testing the normal ABI to avoid a test change in a future patch.
2021-05-21 23:57:38 +09:00
Alexey Bataev
dbae85454e [SLP]Improve handling of compensate external uses cost.
External insertelement users can be represented as a result of shuffle
of the vectorized element and noconsecutive insertlements too. Added
support for handling non-consecutive insertelements.

Differential Revision: https://reviews.llvm.org/D101555
2021-05-21 07:45:31 -07:00
Alexey Bataev
9e03ad1edb [SLP][NFC]Add a test for diamond match of broadcast tree nodes. 2021-05-21 07:05:48 -07:00
Florian Hahn
c4badc856f [VectorCombine] Add positive test for scalarizing multiple extracts.
As suggested in D100273. Also adds an out-of-bound access test
2021-05-21 13:39:37 +01:00
Djordje Todorovic
010f2336bc [DebugInfo] Salvage dbg.value() during ADCE
This has been found by using the [0].

[0] https://llvm.org/docs/HowToUpdateDebugInfo.html#\
    test-original-debug-info-preservation-in-optimizations

 Differential Revision: https://reviews.llvm.org/D100844
2021-05-21 05:25:59 -07:00
Daniil Fukalov
872293f9fe [TTI] NFC: Change getCostOfKeepingLiveOverCall to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102831
2021-05-21 15:18:12 +03:00
Daniil Fukalov
53f301ea10 [TTI] NFC: Change getRegUsageForType to return InstructionCost.
This patch migrates the TTI cost interfaces to return an InstructionCost.

See this patch for the introduction of the type: https://reviews.llvm.org/D91174
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2020-November/146408.html

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D102541
2021-05-21 15:17:23 +03:00
Simon Pilgrim
02b1451e34 [CostModel][X86] Tweak fptoui v4f32->v4i32 + v8f32->v8i32 SSE/AVX costs
Adjust for worst case for atom/slm (SSE), btver2/sandybridge (AVX1) and haswell/znver* (AVX2)
2021-05-21 12:09:31 +01:00
Simon Pilgrim
e2bdc7fda9 [CostModel][X86] Match SSE41 legalized conversion costs as well as SSE2 2021-05-21 11:42:22 +01:00
Simon Pilgrim
61e6792674 [CostModel][X86] Add uitpfp v4f32->v4i32 + v8f32->v8i32 SSE/AVX costs
These were using (default) scalarized values.
2021-05-21 11:30:15 +01:00
Luke Benes
38eda91550 Fix warning: comparison of integer expressions of different signedness. NFC
This patch resolves the Wsign-compare warning that I observed on armv7l and x86 with both gcc and clang.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D102792
2021-05-21 18:23:27 +08:00
Tony Tye
8dfab88553 [NFC][AMDGPU] Mark C code in AMDGPUUsage.rst
Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D102910
2021-05-21 10:08:05 +00:00
Stephen Tozer
15caaeb8e5 3rd Reapply "[DebugInfo] Use variadic debug values to salvage BinOps and GEP instrs with non-const operands"
This reapplies c0f3dfb9, which was reverted following the discovery of
crashes on linux kernel and chromium builds - these issues have since
been fixed, allowing this patch to re-land.

This reverts commit 4397b7095d640f9b9426c4d0135e999c5a1de1c5.
2021-05-21 11:06:20 +01:00
Joe Ellis
bbfb092906 [InstSimplify] Properly constrain {insert,extract}_subvector intrinsic fold
The previous rule:

   (insert_vector _, (extract_vector X, 0), 0) -> X

is not quite correct. The correct fold should be:

   (insert_vector Y, (extract_vector X, 0), 0) -> X
   where: Y is X, or Y is undef

This commit updates the pattern.

Reviewed By: peterwaller-arm, paulwalker-arm

Differential Revision: https://reviews.llvm.org/D102699
2021-05-21 10:05:03 +00:00
Djordje Todorovic
034cc0a244 [NFC][Debugify][Original DI] Use MapVector insted of DenseMap for DI tracking
By using MapVector instead of DenseMap, reporting issues will be in
deterministic order.

Differential Revision: https://reviews.llvm.org/D102841
2021-05-21 02:58:16 -07:00
Andy Wingo
107b591be0 [IR][Verifier] Relax restriction on alloca address spaces
In the WebAssembly target, we would like to allow alloca in two address
spaces.  The alloca instruction already has an address space argument,
but the verifier asserts that the address space of an alloca is the
default alloca address space from the datalayout.  This patch removes
this restriction.  Targets that would like to impose additional
restrictions should do so via target-specific verification passes.

Differential Revision: https://reviews.llvm.org/D101045
2021-05-21 11:52:45 +02:00
Djordje Todorovic
88aa158bd7 Recommit: "[Debugify][Original DI] Test dbg var loc preservation""
[Debugify][Original DI] Test dbg var loc preservation

    This is an improvement of [0]. This adds checking of
    original llvm.dbg.values()/declares() instructions in
    optimizations.

    We have picked a real issue that has been found with
    this (actually, picked one variable location missing
    from [1] and resolved the issue), and the result is
    the fix for that -- D100844.

    Before applying the D100844, using the options from [0]
    (but with this patch applied) on the compilation of GDB 7.11,
    the final HTML report for the debug-info issues can be found
    at [1] (please scroll down, and look for
    "Summary of Variable Location Bugs"). After applying
    the D100844, the numbers has improved a bit -- please take
    a look into [2].

    [0] https://llvm.org/docs/HowToUpdateDebugInfo.html#\
        test-original-debug-info-preservation-in-optimizations
    [1] https://djolertrk.github.io/di-check-before-adce-fix/
    [2] https://djolertrk.github.io/di-check-after-adce-fix/

    Differential Revision: https://reviews.llvm.org/D100845

The Unit test was failing because the pass from the test that
modifies the IR, in its runOnFunction() didn't return 'true',
so the expensive-check configuration triggered an assertion.
2021-05-21 02:04:29 -07:00
David Green
c47364d339 [ARM] Fix the operand used for WLS in ARMLowOverheadLoops
The Loop start instruction handled by the ARMLowOverheadLoops are:
$lr = t2DoLoopStart $r0
$lr = t2DoLoopStartTP $r1, $r0
$lr = t2WhileLoopStartLR $r0, %bb, implicit-def dead $cpsr
All three of these will have LR as the 0 argument, the trip count as the
1 argument.

This patch updated a few places in ARMLowOverheadLoops where the 0th arg
was being used for t2WhileLoopStartLR instructions as the trip count.
One place was entirely removed as it does not seem valid any more, the
case the code is trying to protect against should not be able to occur
with our correct-by-construction low overhead loops.

Differential Revision: https://reviews.llvm.org/D102620
2021-05-21 09:29:30 +01:00
Yevgeny Rouban
ebb8c67ccd Allow incomplete template types in unique_function arguments
We can't declare unique_function that has in its arguments a reference to
a template type with an incomplete argument.
For instance, we can't declare unique_function<void(SmallVectorImpl<A>&)>
when A is forward declared.

This is because SFINAE will trigger a hard error in this case, when instantiating
IsSizeLessThanThresholdT with the incomplete type.

This patch specialize AdjustedParamT for references to remove this error.

Committed on behalf of: @math-fehr (Fehr Mathieu)

Reviewed By: DaniilSuchkov, yrouban
2021-05-21 14:09:33 +07:00
Igor Kudrin
f79eaab45a [unittests][CodeGen] Mark tests that cannot be executed with GTEST_SKIP()
This helps to distinguish such tests from successfully passed ones.

Differential Revision: https://reviews.llvm.org/D102754
2021-05-21 13:39:52 +07:00
Igor Kudrin
ca34183385 [lit][gtest] Support SKIPPED tests
This updates the googletest format to support tests that use GTEST_SKIP(),
which is now available with the updated googletest framework.

Differential Revision: https://reviews.llvm.org/D102694
2021-05-21 13:39:52 +07:00
Xiang1 Zhang
b32d7b7d22 [HWASAN] No code changed, Only clang-format for HWAddressSanitizer.cpp 2021-05-21 14:00:34 +08:00
Christudasan Devadasan
44fa7cd53e GlobalISel: Help reduce operation width for instruction with two results.
The function `reduceOperationWidth` helps to legalize a vector
operation either by narrowing its type or by scalarizing the
operation itself. It currently supports instructions with one result.
This patch, in addition allows the same for instructions with two
results (for instance, G_SDIVREM).

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D100725
2021-05-21 10:34:18 +05:30
Serge Pavlov
72fd6b9af9 [APFloat] convertToDouble/Float can work on shorter types
Previously APFloat::convertToDouble may be called only for APFloats that
were built using double semantics. Other semantics like single precision
were not allowed although corresponding numbers could be converted to
double without loss of precision. The similar restriction applied to
APFloat::convertToFloat.

With this change any APFloat that can be precisely represented by double
can be handled with convertToDouble. Behavior of convertToFloat was
updated similarly. It make the conversion operations more convenient and
adds support for formats like half and bfloat.

Differential Revision: https://reviews.llvm.org/D102671
2021-05-21 11:02:51 +07:00
Stanislav Mekhanoshin
ebc40fee5f [AMDGPU] Request module used variables from LDS lowering as internal
I do not see any practical difference but technically
used.* variables are internal and a call to getGlobalVariable
misses true as a second argument. NFC as far as I can tell.

Differential Revision: https://reviews.llvm.org/D102884
2021-05-20 20:55:47 -07:00
Jinsong Ji
068fb4deaa [AIX] Print printable byte list as quoted string
.byte supports string, so if the whole byte list are printable,
we can actually print the string for readability and LIT tests maintainence.

        .byte 'H,'e,'l,'l,'o,',,' ,'w,'o,'r,'l,'d
->
        .byte "Hello, world"

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D102814
2021-05-21 02:37:55 +00:00
Nicolai Hähnle
5d6415281f [IR] Memory intrinsics are not unconditionally nosync
Remove the `nosync` attribute from the memory intrinsic definitions
(i.e. memset, memcpy, memmove).

Like native memory accesses, memory intrinsics can be volatile. This is
indicated by an immarg in the intrinsic call. All else equal, a volatile
memory intrinsic is `sync`, so we cannot annotate the intrinsic functions
themselves as `nosync`. The attributor and function-attr passes know to
take the volatile bit into account.

Since `nosync` is a default attribute, this means we have to stop using
the DefaultAttrIntrinsic tablegen class for memory intrinsics, and
specify all default attributes other than `nosync` explicitly.

Most of the test changes are trivial churn, but one test case
(in nosync.ll) was in fact incorrect before this change.

Differential Revision: https://reviews.llvm.org/D102295
2021-05-21 03:40:59 +02:00
Nicolai Hähnle
43ee457d7b [tests] Update Transforms/DeadStoreElim/multiblock-malloc-free.ll
This change is generated by running update_test_checks.py. It serves to
make subsequent diffs easier to understand.
2021-05-21 02:36:57 +02:00
Stanislav Mekhanoshin
cc39b4a525 [AMDGPU] Fix module LDS selection
Accesses to global module LDS variable start from null,
but kernel also thinks its variables start address is
null. Fixed by not using a null as an address.

Differential Revision: https://reviews.llvm.org/D102882
2021-05-20 15:59:01 -07:00
Vitaly Buka
d949d3f98f [asan] Add autogenerated test for fake stack
This will help to see result of D102462.

Test was generated with
./llvm/utils/update_test_checks.py llvm/test/Instrumentation/AddressSanitizer/fake-stack.ll --opt-binary <build_dir>/bin/opt

Differential Revision: https://reviews.llvm.org/D102867
2021-05-20 15:48:16 -07:00
Min-Yih Hsu
906bc160f9 [M68k] Support for inline asm operands w/ simple constraints
This patch adds supports for inline assembly operands and some simple
operand constraints, including register and constant operands.

Differential Revision: https://reviews.llvm.org/D102585
2021-05-20 14:00:09 -07:00
Min-Yih Hsu
7cd0b2d7ab [M68k] Allow user to preserve certain registers
Add `-ffixed-a[0-6]` and `-ffixed-d[0-7]` and the corresponding
subtarget features to prevent certain register from being allocated.

Differential Revision: https://reviews.llvm.org/D102805
2021-05-20 13:57:22 -07:00
Jan Kratochvil
528de0554b [lldb] Improve invalid DWARF DW_AT_ranges error reporting
In D98289#inline-939112 @dblaikie said:
  Perhaps this could be more informative about what makes the range list
  index of 0 invalid? "index 0 out of range of range list table (with
  range list base 0xXXX) with offset entry count of XX (valid indexes
  0-(XX-1))" Maybe that's too verbose/not worth worrying about since
  this'll only be relevant to DWARF producers trying to debug their
  DWARFv5, maybe no one will ever see this message in practice. Just
  a thought.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D102851
2021-05-20 21:37:01 +02:00
Jessica Clarke
8c42ad8897 [SelectionDAG][Mips][PowerPC][RISCV][WebAssembly] Teach computeKnownBits/ComputeNumSignBits about atomics
Unlike normal loads these don't have an extension field, but we know
from TargetLowering whether these are sign-extending or zero-extending,
and so can optimise away unnecessary extensions.

This was noticed on RISC-V, where sign extensions in the calling
convention would result in unnecessary explicit extension instructions,
but this also fixes some Mips inefficiencies. PowerPC sees churn in the
tests as all the zero extensions are only for promoting 32-bit to
64-bit, but these zero extensions are still not optimised away as they
should be, likely due to i32 being a legal type.

This also simplifies the WebAssembly code somewhat, which currently
works around the lack of target-independent combines with some ugly
patterns that break once they're optimised away.

Re-landed with correct handling in ComputeNumSignBits for Tmp == VTBits,
where zero-extending atomics were incorrectly returning 0 rather than
the (slightly confusing) required return value of 1.

Re-landed again after D102819 fixed PowerPC to correctly zero-extend all
of its atomics as it claimed to do, since the combination of that bug
and this optimisation caused buildbot regressions.

Reviewed By: RKSimon, atanasyan

Differential Revision: https://reviews.llvm.org/D101342
2021-05-20 20:34:23 +01:00
LLVM GN Syncbot
6ad4e7dfe1 [gn build] Port 0af3105b641a 2021-05-20 19:20:25 +00:00
Jon Roelofs
8ecb13b84d Revert "[Remarks] Add analysis remarks for memset/memcpy/memmove lengths"
This reverts commit 4bf69fb52b3c445ddcef5043c6b292efd14330e0.

This broke spec2k6/403.gcc under -global-isel. Details to follow once I've
reduced the problem.
2021-05-20 12:19:16 -07:00
Nico Weber
b71c3b1c87 [gn build] try reverting code part of f05fbb7795
Maybe aa8fe8fe6c7b was all that was needed to fix the build and
we can keep the code with fewer conditionals after all.
2021-05-20 15:08:39 -04:00
Nico Weber
77abe655fc [gn build] attempt again to unbreak linux after fc9696130c8 2021-05-20 15:01:35 -04:00
Nico Weber
8f7f747cf3 [gn build] use PEP-8 indents in symbol_exports.py 2021-05-20 15:00:24 -04:00
Nico Weber
56a414f103 [gn build] attempt to unbreak linux after fc9696130c8
Only emit `global:` if there are any exported symbols.

While here, `chmod +x` the symbol_exports.py script.
2021-05-20 14:55:40 -04:00
Nico Weber
c41aa09bfb [gn build] Use .export files
Just fixing an old TODO, no dramatic behavior change.

Differential Revision: https://reviews.llvm.org/D102843
2021-05-20 14:48:12 -04:00
Kevin P. Neal
d53eff6d30 [FPEnv] EarlyCSE support for constrained intrinsics, default FP environment edition
EarlyCSE cannot distinguish between floating point instructions and
constrained floating point intrinsics that are marked as running in the
default FP environment. Said intrinsics are supposed to behave exactly the
same as the regular FP instructions. Teach EarlyCSE to handle them in that
case.

Differential Revision: https://reviews.llvm.org/D99962
2021-05-20 14:40:51 -04:00
Fraser Cormack
21ba453e3b [RISCV] Ensure small mask BUILD_VECTORs aren't expanded
The default expansion for BUILD_VECTORs -- save for going through
shuffles -- is to go through the stack. This method only works when the
type is at least byte-sized, so for v2i1 and v4i1 we would crash.

This patch ensures that small mask-type BUILD_VECTORs are always handled
without crashing. We lower to a SETCC of the equivalent i8 type.

This also exposes some pre-existing issues where the lowering when
optimizing for size results in larger code than without. Those will be
tackled in future patches.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D102767
2021-05-20 19:12:29 +01:00
Reid Kleckner
c18c30409c [PGO] Don't reference functions unless value profiling is enabled
This reduces the size of chrome.dll.pdb built with optimizations,
coverage, and line table info from 4,690,210,816 to 2,181,128,192, which
makes it possible to fit under the 4GB limit.

This change can greatly reduce binary size in coverage builds, which do
not need value profiling. IR PGO builds are unaffected. There is a minor
behavior change for frontend PGO.

PGO and coverage both use InstrProfiling to create profile data with
counters. PGO records the address of each function in the __profd_
global. It is used later to map runtime function pointer values back to
source-level function names. Coverage does not appear to use this
information.

Recording the address of every function with code coverage drastically
increases code size. Consider this program:

  void foo();
  void bar();
  inline void inlineMe(int x) {
    if (x > 0)
      foo();
    else
      bar();
  }
  int getVal();
  int main() { inlineMe(getVal()); }

With code coverage, the InstrProfiling pass runs before inlining, and it
captures the address of inlineMe in the __profd_ global. This greatly
increases code size, because now the compiler can no longer delete
trivial code.

One downside to this approach is that users of frontend PGO must apply
the -mllvm -enable-value-profiling flag globally in TUs that enable PGO.
Otherwise, some inline virtual method addresses may not be recorded and
will not be able to be promoted. My assumption is that this mllvm flag
is not popular, and most frontend PGO users don't enable it.

Differential Revision: https://reviews.llvm.org/D102818
2021-05-20 11:09:24 -07:00