1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 13:11:39 +01:00

5002 Commits

Author SHA1 Message Date
Fangrui Song
a99d809eec [SimplifyLibcalls] Don't replace locked IO (fgetc/fgets/fputc/fputs/fread/fwrite) with unlocked IO (*_unlocked)
This essentially reverts some of the SimplifyLibcalls part changes of D45736 [SimplifyLibcalls] Replace locked IO with unlocked IO.

C11 7.21.5.2 The fflush function

> If stream is a null pointer, the fflush function performs this flushing action on all streams for which the behavior is defined above.

i.e. fopen'ed FILE* is inherently captured.

POSIX.1-2017 getc_unlocked, getchar_unlocked, putc_unlocked, putchar_unlocked - stdio with explicit client locking

> These functions can safely be used in a multi-threaded program if and only if they are called while the invoking thread owns the ( FILE *) object, as is the case after a successful call to the flockfile() or ftrylockfile() functions.

After a thread fopen'ed a FILE*, when it is calling foobar() which is now replaced by foobar_unlocked(),
if another thread is concurrently calling fflush(0), the behavior is undefined.

C11 7.22.4.4 The exit function

> Next, all open streams with unwritten buffered data are flushed, all open streams are closed, and all files created by the tmpfile function are removed.

The replacement is only feasible if the program is single threaded, or exit or fflush(0) is never called.
See also http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180528/556615.html
for how the replacement makes libc interceptors difficult to implement.

dalias: in a worst case, it's unbounded data corruption because of concurrent access to pointers
without synchronization.  f->wpos or rpos could get outside of the buffer, thread A could do
f->wpos += j after knowing j is in bounds, while thread B also changes it concurrently.

This can produce exploitable conditions depending on libc internals.

Revert the SimplifyLibcalls part change because the cons obviously
overweigh the pros.  Even when the replacement is feasible, the benefit
is indemonstrable, more so in an application instead of an artificial
glibc benchmark.  Theoretically the replacement could be beneficial when
calling getc_unlocked/putc_unlocked in a loop, but then it is better
using a blocked IO operation and the user is likely aware of that.

The function attribute inference is still useful and thus kept.

Reviewed By: xbolva00

Differential Revision: https://reviews.llvm.org/D75933
2020-03-10 11:11:58 -07:00
Simon Moll
24a623a765 [instcombine] remove fsub to fneg hacks; only emit fneg
Summary: Rewrite the fsub-0.0 idiom to fneg and always emit fneg for fp
negation. This also extends the scalarization cost in instcombine for unary
operators to result in the same IR rewrites for fneg as for the idiom.

Reviewed By: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D75467
2020-03-10 16:57:02 +01:00
Florian Hahn
bc7c21fd6c [InstCombine] Support vectors in SimplifyAddWithRemainder.
SimplifyAddWithRemainder currently also matches for vector types, but
tries to create an integer constant, which causes a crash.

By using Constant::getIntegerValue() we can support both the scalar and
vector cases.

The 2 added test cases crash without the fix.

Reviewers: spatel, lebedev.ri

Reviewed By: spatel, lebedev.ri

Differential Revision: https://reviews.llvm.org/D75906
2020-03-10 14:29:40 +00:00
Sanjay Patel
04f0791d5b [InstCombine] regenerate test checks; NFC
tmp -> t because 'tmp' tends to cause problems for the auto-generation script.
2020-03-10 09:57:41 -04:00
Sanjay Patel
7af984d05f [InstCombine] fold gep-of-select-of-constants (PR45084)
As shown in:
https://bugs.llvm.org/show_bug.cgi?id=45084
...we failed to combine a gep with constant indexes with a
pointer operand that is a select of constants.

Differential Revision: https://reviews.llvm.org/D75807
2020-03-10 09:25:13 -04:00
Sanjay Patel
b22a808ae3 [InstCombine] add/adjust tests for select-gep; NFC
Goes with D75807
2020-03-10 09:25:13 -04:00
JF Bastien
11b6dfb970 Test that volatile load type isn't changed
Summary: As discussed in D75505, it's not particularly useful to change the type of a load to/from floating-point/integer because it's followed by a bitcast, and it might lead to surprising code generation. Check that this doesn't generally happen.

Reviewers: lebedev.ri

Subscribers: jkorous, dexonsmith, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75644
2020-03-09 11:19:23 -07:00
Nikita Popov
70e47f8656 [InstSimplify] Simplify calls with "returned" attribute
If a call argument has the "returned" attribute, we can simplify
the call to the value of that argument. The "-inst-simplify" pass
already handled this for the constant integer argument case via
known bits, which is invoked in SimplifyInstruction. However,
non-constant (or non-int) arguments are not handled at all right now.

This addresses one of the regressions from D75801.

Differential Revision: https://reviews.llvm.org/D75815
2020-03-09 18:53:47 +01:00
Nikita Popov
3ff2f152d1 [InstCombine] Don't simplify calls without uses
When simplifying a call without uses, replaceInstUsesWith() is
going to do nothing, but we'll skip all following folds. We can
only run into this problem with calls that both simplify and are
not trivially dead if unused, which currently seems to happen only
with calls to undef, as the test diff shows. When extending
SimplifyCall() to handle "returned" attributes, this becomes a much
bigger problem, so I'm fixing this first.

Differential Revision: https://reviews.llvm.org/D75814
2020-03-09 18:47:46 +01:00
Nikita Popov
0394d877a8 [InstCombine] Fix known bits handling in SimplifyDemandedUseBits
Fixes a regression from D75801. SimplifyDemandedUseBits() is also
supposed to compute the known bits (of the demanded subset) of the
instruction. For unknown instructions it does so by directly calling
computeKnownBits(). For known instructions it will compute known
bits itself. However, for instructions where only some cases are
handled directly (e.g. a constant shift amount) the known bits
invocation for the unhandled case is sometimes missing. This patch
adds the missing calls and thus removes the main discrepancy with
ExpensiveCombines mode.

Differential Revision: https://reviews.llvm.org/D75804
2020-03-07 18:16:41 +01:00
Nikita Popov
9cf5fbac61 [InstCombine] Regenerate test checks; NFC 2020-03-07 17:58:33 +01:00
Nikita Popov
e37fe0311a [InstCombine] Add additional known bits folding tests; NFC 2020-03-07 17:17:03 +01:00
Nikita Popov
57a74f64e3 [InstCombine] Highlight tests using expensive combines; NFC 2020-03-07 17:16:47 +01:00
Sanjay Patel
ce0f3e8557 [InstCombine] regenerate complete test checks; NFC 2020-03-07 10:20:38 -05:00
Sanjay Patel
f0de4cc6f2 [InstCombine] add test for gep (select),... (PR45084); NFC 2020-03-07 10:00:31 -05:00
Roman Lebedev
7c5b5c8fe7 [InstComine] Forego of one-use check in (X - (X & Y)) --> (X & ~Y) if Y is a constant
Summary:
This is potentially more friendly for further optimizations,
analysies, e.g.: https://godbolt.org/z/G24anE

This resolves phase-ordering bug that was introduced
in D75145 for https://godbolt.org/z/2gBwF2
https://godbolt.org/z/XvgSua

Reviewers: spatel, nikic, dmgreen, xbolva00

Reviewed By: nikic, xbolva00

Subscribers: hiraditya, zzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75757
2020-03-06 21:39:07 +03:00
Roman Lebedev
914faf6193 [NFC][InstCombine] Add 'x - (x & y)' tests with multi-use 'and'
If %y is constant, we could still perform the fold
2020-03-06 19:41:19 +03:00
Nikita Popov
88c43a1ace [InstCombine] Use IRBuilder to create bitcast
This makes sure that the constant expression bitcast goes through
target-dependent constant folding, and thus avoids an additional
iteration of InstCombine.
2020-03-04 18:28:38 +01:00
Roman Lebedev
628089dc06 [NFC][InstCombine] Add test with non-CSE'd casts of load
in @t0 we can still change type of load and get rid of casts.
2020-03-03 11:27:27 +03:00
Reid Kleckner
20236045b1 [WinEH] Fix inttoptr+phi optimization in presence of catchswitch
getFirstInsertionPt's return value must be checked for validity before
casting it to Instruction*. Don't attempt to insert casts after a phi in
a catchswitch block.

Fixes PR45033, introduced in D37832.

Reviewed By: davidxl, hfinkel

Differential Revision: https://reviews.llvm.org/D75381
2020-03-01 07:49:28 -08:00
Simon Pilgrim
6019f1db6d [X86][F16C] Remove cvtph2ps intrinsics and use generic half2float conversion (PR37554)
This removes everything but int_x86_avx512_mask_vcvtph2ps_512 which provides the SAE variant, but even this can use the fpext generic if the rounding control is the default.

Differential Revision: https://reviews.llvm.org/D75162
2020-02-29 18:57:35 +00:00
Nikita Popov
730509657a [InstCombine] DCE instructions earlier
When InstCombine initially populates the worklist, it already
performs constant folding and DCE. However, as the instructions
are initially visited in program order, this DCE can pick up only
the last instruction of a dead chain, the rest would only get
picked up in the main InstCombine run.

To avoid this, we instead perform the DCE in separate pass over the
collected instructions in reverse order, which will allow us to
pick up full dead instruction chains. We already need to do this
reverse iteration anyway to populate the worklist, so this
shouldn't add extra cost.

This by itself only fixes a small part of the problem though:
The same basic issue also applies during the main InstCombine loop.
We generally always want DCE to occur as early as possible,
because it will allow one-use folds to happen. Address this by also
performing DCE while adding deferred instructions to the main worklist.

This drops the number of tests that perform more than 2 InstCombine
iterations from ~80 to ~40. There's some spurious test changes due
to operand order / icmp toggling.

Differential Revision: https://reviews.llvm.org/D75008
2020-02-27 18:45:59 +01:00
Simon Moll
9f544f2f1b Remove BinaryOperator::CreateFNeg
Use UnaryOperator::CreateFNeg instead.

Summary:
With the introduction of the native fneg instruction, the
fsub -0.0, %x idiom is obsolete. This patch makes LLVM
emit fneg instead of the idiom in all places.

Reviewed By: cameron.mcinally

Differential Revision: https://reviews.llvm.org/D75130
2020-02-27 09:06:03 -08:00
Simon Pilgrim
9b56ed0634 [InstCombine] Add PR14365 test cases + vector equivalents. 2020-02-27 15:54:14 +00:00
Nikita Popov
abfdd395ea [InstCombine] Remove trivially empty ranges from end
InstCombine removes pairs of start+end intrinsics that don't
have anything in between them. Currently this is done by starting
at the start intrinsic and scanning forwards. This patch changes
it to start at the end intrinsic and scan backwards.

The motivation here is as follows: When we process the start
intrinsic, we have not yet looked at the following instructions,
which may still get folded/removed. If they do, we will only be
able to remove the start/end pair on the next iteration. When we
process the end intrinsic, all the instructions before it have
already been visited, and we don't run into this problem.

Differential Revision: https://reviews.llvm.org/D75011
2020-02-26 20:04:11 +01:00
Roman Lebedev
b9cd5773e4 [InstCombine] foldShiftIntoShiftInAnotherHandOfAndInICmp(): fix miscompile (PR44802)
Much like with reassociateShiftAmtsOfTwoSameDirectionShifts(),
as input, we have the following pattern:
  icmp eq/ne (and ((x shift Q), (y oppositeshift K))), 0
We want to rewrite that as:
  icmp eq/ne (and (x shift (Q+K)), y), 0  iff (Q+K) u< bitwidth(x)

While we know that originally (Q+K) would not overflow
(because  2 * (N-1) u<= iN -1), we may have looked past extensions of
shift amounts. so it may now overflow in smaller bitwidth.

To ensure that does not happen, we need to ensure that the total maximal
shift amount is still representable in that smaller bitwidth.
If the overflow would happen, (Q+K) u< bitwidth(x) check would be bogus.

https://bugs.llvm.org/show_bug.cgi?id=44802
2020-02-25 18:23:58 +03:00
Roman Lebedev
57856c0b2d [NFC][InstCombine] Add shift amount reassociation in bittest miscompile example from PR44802
https://bugs.llvm.org/show_bug.cgi?id=44802
2020-02-25 18:23:58 +03:00
Roman Lebedev
4e2b207539 [InstCombine] reassociateShiftAmtsOfTwoSameDirectionShifts(): fix miscompile (PR44802)
As input, we have the following pattern:
  Sh0 (Sh1 X, Q), K
We want to rewrite that as:
  Sh x, (Q+K)  iff (Q+K) u< bitwidth(x)
While we know that originally (Q+K) would not overflow
(because  2 * (N-1) u<= iN -1), we may have looked past extensions of
shift amounts. so it may now overflow in smaller bitwidth.

To ensure that does not happen, we need to ensure that the total maximal
shift amount is still representable in that smaller bitwidth.
If the overflow would happen, (Q+K) u< bitwidth(x) check would be bogus.

https://bugs.llvm.org/show_bug.cgi?id=44802
2020-02-25 18:23:51 +03:00
Roman Lebedev
d971ae303f [NFC][InstCombine] Add shift amount reassociation miscompile example from PR44802
https://bugs.llvm.org/show_bug.cgi?id=44802
2020-02-25 18:23:16 +03:00
Nuno Lopes
db1959b9b6 [NFC] fix test nan value 2020-02-23 12:42:47 +00:00
Krzysztof Parzyszek
11344326c1 [Hexagon] Simplify intrinsic (vandvrt (vandqrt q b) m) -> q if possible
When each byte in b&m is non-zero, this conversion Q->V->Q is a no-op.
2020-02-21 13:56:04 -06:00
Nikita Popov
01133bb6c1 [InstCombine] Improve simplify demanded bits worklist management
This fixes a small mistake from D72944: The worklist add should
happen before assigning the new operand, not after.

In case an actual replacement happens, the old operand needs to
be added for DCE. If no actual replacement happens, then old/new
are the same, so it doesn't matter.

This drops one iteration from the annotated test case.
2020-02-21 18:51:41 +01:00
Nikita Popov
055f58eb4f [SimplifyLibCalls][IRBuilder] Accept any IRBuilder in SimplifyLibCalls
This changes the SimplifyLibCalls utility to accept an IRBuilderBase,
which allows us to pass through the IRBuilder used by InstCombine.
This will ensure that new instructions get added to the worklist.
The annotated test-case drops from 4 to 2 InstCombine iterations thanks
to this.

To achieve this, I'm adding an IRBuilderBase::OperandBundlesGuard,
which is basically the same as the existing InsertPointGuard and
FastMathFlagsGuard, but for operand bundles. Also add a
setDefaultOperandBundles() method so these can be set outside the
constructor.

Differential Revision: https://reviews.llvm.org/D74792
2020-02-21 18:26:05 +01:00
Nikita Popov
453ec55b92 Reapply [IRBuilder] Always respect inserter/folder
Some IRBuilder methods that were originally defined on
IRBuilderBase do not respect custom IRBuilder inserters/folders,
because those were not accessible prior to D73835. Fix this by
making use of existing (and now accessible) IRBuilder methods,
which will handle inserters/folders correctly.

There are some changes in OpenMP and Instrumentation tests, where
bitcasts now get constant folded. I've also highlighted one
InstCombine test which now finishes in two rather than three
iterations, thanks to new instructions being inserted into the
worklist.

Differential Revision: https://reviews.llvm.org/D74787
2020-02-19 20:51:38 +01:00
Nikita Popov
ad15d536fe Revert "[IRBuilder] Always respect inserter/folder"
This reverts commit f12fb2d99b8dd0dbef1c79f1d401200150f2d0bd.

I missed some changes in instrumentation test cases.
2020-02-19 17:51:55 +01:00
Nikita Popov
e70b7afc11 [IRBuilder] Always respect inserter/folder
Some IRBuilder methods that were originally defined on
IRBuilderBase do not respect custom IRBuilder inserters/folders,
because those were not accessible prior to D73835. Fix this by
making use of existing (and now accessible) IRBuilder methods,
which will handle inserters/folders correctly.

There are some changes in OpenMP tests, where bitcasts now get
constant folded. I've also highlighted one InstCombine test which
now finishes in two rather than three iterations, thanks to new
instructions being inserted into the worklist.

Differential Revision: https://reviews.llvm.org/D74787
2020-02-19 17:44:43 +01:00
Nikita Popov
a5bd3602df [InstCombine] Fix worklist management when simplifying demanded bits
When simplifying demanded bits, we currently only report the
instruction on which SimplifyDemandedBits was called as changed.
However, this is a recursive call, and the actually modified
instruction will usually be further up the chain. Additionally,
all the intermediate instructions should also be revisited,
as additional combines may be possible after the demanded bits
simplification. We fix this by explicitly adding them back to the
worklist.

Differential Revision: https://reviews.llvm.org/D72944
2020-02-18 17:55:40 +01:00
Nikita Popov
f35bb9153d [InstCombine] Fix multi-use handling in cttz transform
The select-of-cttz transform can currently duplicate cttz intrinsics
and zext/trunc ops. The cause is that it unnecessarily duplicates
the intrinsic and the zext/trunc when setting the "undef_on_zero"
flag to false. However, it's always legal to set the flag from true
to false, so we can make this replacement even if there are extra users.

Differential Revision: https://reviews.llvm.org/D74685
2020-02-18 17:55:00 +01:00
Nikita Popov
830db6ad23 [InstCombine] Relax preconditions for ashr+and+icmp fold (PR44754)
Fix for https://bugs.llvm.org/show_bug.cgi?id=44754. We already have
a fold that converts icmp (and (ashr X, C3), C2), C1 into
icmp (and C2'), C1', but it imposed overly strict requirements on the
transform.

Relax this by checking that both C2 and C1 don't shift out bits
(in a signed sense) when forming the new constants.

Alive proofs (https://rise4fun.com/Alive/PTz0):

    Name: ashr_legal
    Pre: ((C2 << C3) >> C3) == C2 && ((C1 << C3) >> C3) == C1
    %a = ashr i16 %x, C3
    %b = and i16 %a, C2
    %c = icmp i16 %b, C1
    =>
    %d = and i16 %x, C2 << C3
    %c = icmp i16 %d, C1 << C3

    Name: ashr_shiftout_eq
    Pre: ((C2 << C3) >> C3) == C2 && ((C1 << C3) >> C3) != C1
    %a = ashr i16 %x, C3
    %b = and i16 %a, C2
    %c = icmp eq i16 %b, C1
    =>
    %c = false

Note that >> corresponds to ashr here. The case of an equality
comparison has some special handling in this transform, because
it will form to a true/false result if the condition on the comparison
constant it violated.

Differential Revision: https://reviews.llvm.org/D74294
2020-02-18 17:49:46 +01:00
Nikita Popov
d9bc9d4b9a [InstCombine] Add more tests for icmp+and+ashr; NFC 2020-02-18 17:47:48 +01:00
Florian Hahn
9599c6b985 [InstCombine] Simplify a umul overflow check to a != 0 && b != 0.
This patch adds a simplification if an OR weakens the overflow condition
for umul.with.overflow by treating any non-zero result as overflow. In that
case, we overflow if both umul.with.overflow operands are != 0, as in that
case the result can only be 0, iff the multiplication overflows.

Code like this is generated by code using __builtin_mul_overflow with
negative integer constants, e.g.
   bool test(unsigned long long v, unsigned long long *res) {
     return __builtin_mul_overflow(v, -4775807LL, res);
   }

```
----------------------------------------
Name: D74141
  %res = umul_overflow {i8, i1} %a, %b
  %mul = extractvalue {i8, i1} %res, 0
  %overflow = extractvalue {i8, i1} %res, 1
  %cmp = icmp ne %mul, 0
  %ret = or i1 %overflow, %cmp
  ret i1 %ret
=>
  %t0 = icmp ne i8 %a, 0
  %t1 = icmp ne i8 %b, 0
  %ret = and i1 %t0, %t1
  ret i1 %ret
  %res = umul_overflow {i8, i1} %a, %b
  %mul = extractvalue {i8, i1} %res, 0
  %cmp = icmp ne %mul, 0
  %overflow = extractvalue {i8, i1} %res, 1

Done: 1
Optimization is correct!

```

Reviewers: nikic, lebedev.ri, spatel, Bigcheese, dexonsmith, aemerson

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D74141
2020-02-18 09:11:55 +01:00
Florian Hahn
72aa083958 [InstCombine] Precommit umul.with.overflow sign check test.
Precommit tests for D74141.
2020-02-18 08:46:50 +01:00
Nikita Popov
f51c934143 [InstCombine] Add multiuse tests for cttz transform; NFC
These show incorrect duplication of instructions.
2020-02-16 15:52:09 +01:00
Yuanfang Chen
dd53274771 Revert "Revert "Reland "[Support] make report_fatal_error abort instead of exit"""
This reverts commit 80a34ae31125aa46dcad47162ba45b152aed968d with fixes.

Previously, since bots turning on EXPENSIVE_CHECKS are essentially turning on
MachineVerifierPass by default on X86 and the fact that
inline-asm-avx-v-constraint-32bit.ll and inline-asm-avx512vl-v-constraint-32bit.ll
are not expected to generate functioning machine code, this would go
down to `report_fatal_error` in MachineVerifierPass. Here passing
`-verify-machineinstrs=0` to make the intent explicit.
2020-02-13 10:16:06 -08:00
Yuanfang Chen
2dbac841f9 Revert "Revert "Revert "Reland "[Support] make report_fatal_error abort instead of exit""""
This reverts commit bb51d243308dbcc9a8c73180ae7b9e47b98e68fb.
2020-02-13 10:08:05 -08:00
Yuanfang Chen
93e82c22ef Revert "Revert "Reland "[Support] make report_fatal_error abort instead of exit"""
This reverts commit 80a34ae31125aa46dcad47162ba45b152aed968d with fixes.

On bots llvm-clang-x86_64-expensive-checks-ubuntu and
llvm-clang-x86_64-expensive-checks-debian only,
llc returns 0 for these two tests unexpectedly. I tweaked the RUN line a little
bit in the hope that LIT is the culprit since this change is not in the
codepath these tests are testing.
llvm\test\CodeGen\X86\inline-asm-avx-v-constraint-32bit.ll
llvm\test\CodeGen\X86\inline-asm-avx512vl-v-constraint-32bit.ll
2020-02-13 10:02:53 -08:00
Yuanfang Chen
c7fb4c55c4 Revert "Reland "[Support] make report_fatal_error abort instead of exit""
This reverts commit rGcd5b308b828e, rGcd5b308b828e, rG8cedf0e2994c.

There are issues to be investigated for polly bots and bots turning on
EXPENSIVE_CHECKS.
2020-02-11 20:41:53 -08:00
Yuanfang Chen
83a2f3c1ba Reland "[Support] make report_fatal_error abort instead of exit"
Summary:
Reland D67847 after D73742 is committed. Replace `sys::Process::Exit(1)`
with `abort` in `report_fatal_error`.

After this patch, for tools turning on `CrashRecoveryContext`,
crash handler installed by `CrashRecoveryContext` is called unless
they installed a non-returning handler using `llvm::install_fatal_error_handler`
like `cc1_main` currently does.

Reviewers: rnk, MaskRay, aganea, hans, espindola, jhenderson

Subscribers: jholewinski, qcolombet, dschuff, jyknight, emaste, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, rupprecht, jocewei, jsji, Jim, dmgreen, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D74456
2020-02-11 18:20:40 -08:00
Sanjay Patel
c9aa97f7e3 [InstCombine] fix use check when canonicalizing abs/nabs
We were checking for extra uses of the negated operand even
if we were not going to create it as part of this canonicalization.

This was showing up as a regression when we limit EarlyCSE as
proposed in D74285.
2020-02-10 14:57:37 -05:00
Sanjay Patel
1b2f10ea80 [InstCombine] add tests for abs with extra use of operand; NFC 2020-02-10 14:57:37 -05:00