Truncate the movmsk scalar integer result to the equivalent scalar integer width as before but then bitcast to the requested type.
We still have the issue identified in PR41594 but D61114 should handle this.
llvm-svn: 359176
On Mips32r2 bitcast can be expanded to two sw instructions and an ldc1
when using bitcast i64 to double or an sdc1 and two lw instructions when
using bitcast double to i64. By introducing custom lowering that uses
mtc1/mthc1 we can avoid excessive instructions.
Patch by Mirko Brkusanin.
Differential Revision: https://reviews.llvm.org/D61069
llvm-svn: 359171
This should fix the MachO/x86-64 eh-frame regression test by ensuring that
the symbols __ZTIi and ___gxx_personality_v0 are defined on all platforms.
llvm-svn: 359169
Summary:
When refactoring vectorization flags, vectorization was disabled by default in the new pass manager.
This patch re-enables is for both managers, and changes the assumptions opt makes, based on the new defaults.
Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization.
Reviewers: chandlerc, jgorbe
Subscribers: jlebar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61091
llvm-svn: 359167
For well-known type IDs, include the name of the type.
To not duplicate the ID->name map, make llvm-readobj call this new
function as well. It has slightly different output, so this also
requires updating a few tests.
Differential Revision: https://reviews.llvm.org/D61086
llvm-svn: 359153
Summary:
This emits labels around heapallocsite calls and S_HEAPALLOCSITE debug
info in codeview. Currently only changes FastISel, so emitting labels still
needs to be implemented in SelectionDAG.
Reviewers: rnk
Subscribers: aprantl, hiraditya, cfe-commits, llvm-commits
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D61083
llvm-svn: 359149
If we have a vector FP division with a splatted divisor, use the existing transform
that converts 'x/y' into 'x * (1.0/y)' to allow more conversions. This can then
potentially be converted into a scalar FP division by existing combines (rL358984)
as seen in the tests here.
That can be a potentially big perf difference if scalar fdiv has better timing
(including avoiding possible frequency throttling for vector ops).
Differential Revision: https://reviews.llvm.org/D61028
llvm-svn: 359147
Using initial-exec TLS variables is a reasonable performance
optimisation for system libraries. Use the correct PIC mechanism to get
hold of the GOT to avoid text relocations.
Differential Revision: https://reviews.llvm.org/D61026
llvm-svn: 359146
Summary: The code did not check if operand was undef before casting it to Instruction.
Reviewers: RKSimon, ABataev, dtemirbulatov
Reviewed By: ABataev
Subscribers: uabelho
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61024
llvm-svn: 359136
While this doesn't come up in reasonable cases currently (the only user
defined types not in type units are ones without linkage - which makes
for near-ODR violations, because it'd be a type with linkage referencing
a type without linkage - such a type can't be validly defined in more
than one TU, so arguably it shouldn't be in a type unit to begin with -
but it's a convenient way to demonstrate an issue that will become more
revalent with homed modular debug info type definitions - which also
don't need to be in type units but more legitimately so).
Precursor to the Clang change to de-type-unit (by omitting the
'identifier') types homed due to strong linkage vtables. (making that
change without this one would lead to major type duplication in type
units)
llvm-svn: 359122
ReplaceAllUsesWith doesn't remove the node that was replaced. So its left around in the graph messing up use counts on other nodes.
One thing to note, is that this isn't valid if the node being deleted is the root node of an LEA match that gets rejected. In that case the node needs to stay alive because the isel table walking code would still have a reference to it that its going to try to match next. I don't think that's the case here though because the nodes being deleted here should be "and", "srl", and "zero_extend" none of which can be the root node of an LEA match.
Differential Revision: https://reviews.llvm.org/D61048
llvm-svn: 359121
Frame Descriptor Entries (FDEs) have a pointer back to a Common Information
Entry (CIE) that describes how the rest FDE should be parsed. JITLink had been
assuming that FDEs always referred to the most recent CIE encountered, but the
spec allows them to point back to any previously encountered CIE. This patch
fixes JITLink to look up the correct CIE for the FDE.
The testcase is a MachO binary with an FDE that refers to a CIE that is not the
one immediately proceeding it (the layout can be viewed wit
'dwarfdump --eh-frame <testcase>'. This test case had to be a binary as llvm-mc
now sorts FDEs (as of r356216) to ensure FDEs *do* point to the most recent CIE.
llvm-svn: 359105
If the types don't match, we can't just remove the shuffle.
There may be some other opportunity for optimization here,
but this should prevent the crashing seen in:
https://bugs.llvm.org/show_bug.cgi?id=41414
llvm-svn: 359095
If two .res files contain the same resource, cvtres.exe (and hence
link.exe) reject the input with this message:
CVTRES : fatal error CVT1100: duplicate resource. type:STRING, name:101, language:0x0409
LINK : fatal error LNK1123: failure during conversion to COFF: file invalid or corrupt
llvm-cvtres (and lld-link) used to silently pick one of the duplicate
resources instead. This patch makes them report an error as well.
We slightly improve on cvtres by printing the name of two .res files
containing duplicate entries as well.
Differential Revision: https://reviews.llvm.org/D61049
llvm-svn: 359083
* Add support for uniquing strings in the remark streamer and emitting the string table in the remarks section.
* Add parsing support for the string table in the RemarkParser.
From this remark:
```
--- !Missed
Pass: inline
Name: NoDefinition
DebugLoc: { File: 'test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c',
Line: 7, Column: 3 }
Function: printArgsNoRet
Args:
- Callee: printf
- String: ' will not be inlined into '
- Caller: printArgsNoRet
DebugLoc: { File: 'test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c',
Line: 6, Column: 0 }
- String: ' because its definition is unavailable'
...
```
to:
```
--- !Missed
Pass: 0
Name: 1
DebugLoc: { File: 3, Line: 7, Column: 3 }
Function: 2
Args:
- Callee: 4
- String: 5
- Caller: 2
DebugLoc: { File: 3, Line: 6, Column: 0 }
- String: 6
...
```
And the string table in the .remarks/__remarks section containing:
```
inline\0NoDefinition\0printArgsNoRet\0
test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c\0printf\0
will not be inlined into \0 because its definition is unavailable\0
```
This is mostly supposed to be used for testing purposes, but it gives us
a 2x reduction in the remark size, and is an incremental change for the
updates to the remarks file format.
Differential Revision: https://reviews.llvm.org/D60227
llvm-svn: 359050
Summary:
If two arguments are both readonly, then they have no memory dependency
that would violate noalias, even if they do actually overlap.
Reviewers: hfinkel, efriedma
Reviewed By: efriedma
Subscribers: efriedma, hiraditya, llvm-commits, tstellar
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60239
llvm-svn: 359047
Add selection support for G_INTRINSIC_ROUND, add a selection test, and add
check lines to arm64-vfloatintrinsics.ll and f16-instructions.ll.
llvm-svn: 359046
Add G_INTRINSIC_ROUND to isPreISelGenericFloatingPointOpcode to ensure that its
input and output are assigned the correct register bank.
Add a regbankselect test to verify that we get what we expect here.
llvm-svn: 359044
The simple case of:
```
int *callee();
void *caller(void *a) {
if (a == NULL)
return callee();
return a;
}
```
would generate a regular call instead of a tail call because we don't
look through the bitcast of the call to `callee` when duplicating the
return blocks.
Differential Revision: https://reviews.llvm.org/D60837
llvm-svn: 359041
In preparation of https://reviews.llvm.org/D60837, add this test where
we don't perform a tail call because we don't look through a bitcast.
llvm-svn: 359040
Summary:
Always convert switches to br_tables unless there is only one case,
which is equivalent to a simple branch. This reduces code size for wasm,
and we defer possible jump table optimizations to the VM.
Addresses PR41502.
Reviewers: kripken, sunfish
Subscribers: dschuff, sbc100, jgravelle-google, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60966
llvm-svn: 359038
Summary:
This way we can change the order of tests or delete some of them without
affecting tests for other functions.
Reviewers: tlively
Subscribers: sunfish, dschuff, sbc100, jgravelle-google, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D60929
llvm-svn: 359036
Apparently FileCheck wasn't actually matching the fallback check lines in
arm64-vfloatintrinsics.ll properly. So, there were selection fallbacks for
G_INTRINSIC_TRUNC there.
Actually hook it up into AArch64InstructionSelector.cpp and write a proper
selection test.
I guess I'll figure out the FileCheck magic to make the fallback checks work
properly in arm64-vfloatintrinsics.ll.
llvm-svn: 359030
DominatorTree::dominate.
ARC contract pass has an optimization that replaces the uses of the
argument of an ObjC runtime function call with the call result.
For example:
; Before optimization
%1 = tail call i8* @foo1()
%2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1)
store i8* %1, i8** @g0, align 8
; After optimization
%1 = tail call i8* @foo1()
%2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1)
store i8* %2, i8** @g0, align 8 // %1 is replaced with %2
Before replacing the argument use, DominatorTree::dominate is called to
determine whether the user instruction is dominated by the ObjC runtime
function call instruction. The call to DominatorTree::dominate can be
expensive if the two instructions belong to the same basic block and the
size of the basic block is large. This patch checks the basic block size
and just bails out if the size exceeds the limit set by command line
option "arc-contract-max-bb-size".
rdar://problem/49477063
Differential Revision: https://reviews.llvm.org/D60900
llvm-svn: 359027
Originally committed in r358931
Reverted in r358997
Seems this change made Apple accelerator tables miss names (because
names started respecting the CU NameTableKind GNU & assuming that
shouldn't produce accelerated names too), which is never correct (apple
accelerator tables don't have separators or CU lists - if present, they
must describe all names in all CUs).
Original Description:
Currently to opt in to debug_names in DWARFv5, the IR must contain
'nameTableKind: Default' which also enables debug_pubnames.
Instead, only allow one of {debug_names, apple_names, debug_pubnames,
debug_gnu_pubnames}.
nameTableKind: Default gives debug_names in DWARFv5 and greater,
debug_pubnames in v4 and earlier - and apple_names when tuning for lldb
on MachO.
nameTableKind: GNU always gives gnu_pubnames
llvm-svn: 359026
Summary:
The opt level was not being passed down to the ThinLTO backend when
invoked via clang (for distributed ThinLTO).
This exposed an issue where the new PM was asserting if the Thin or
regular LTO backend pipelines were invoked with -O0 (not a new issue,
could be provoked by invoking in-process *LTO backends via linker using
new PM and -O0). Fix this similar to the old PM where -O0 only does the
necessary lowering of type metadata (WPD and LowerTypeTest passes) and
then quits, rather than asserting.
Reviewers: xur
Subscribers: mehdi_amini, inglorion, eraman, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits, pcc
Tags: #clang, #llvm
Differential Revision: https://reviews.llvm.org/D61022
llvm-svn: 359025
Add it to isPreISelGenericFloatingPointOpcode, and add a regbankselect test.
Update arm64-vfloatintrinsics.ll now that we can select it.
llvm-svn: 359022
Same patch as G_FCEIL etc.
Add the missing switch case in widenScalar, add G_INTRINSIC_TRUNC to the correct
rule in AArch64LegalizerInfo.cpp, and add a test.
llvm-svn: 359021
Same as G_FCEIL, G_FABS, etc. Just move it into that rule.
Add a legalizer test for G_FMA, which we didn't have before and update
arm64-vfloatintrinsics.ll.
llvm-svn: 359015