mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 13:11:39 +01:00

4540 Commits

Author SHA1 Message Date
Nick Desaulniers
5d27c8ae50 [Inline] prevent inlining on stack protector mismatch
It's common for code that manipulates the stack via inline assembly, or
that has to set up its own stack canary (such as the Linux kernel), to
want to avoid stack protectors in certain functions. In this case, we've
been bitten by numerous bugs where a callee with a stack protector is
inlined into an __attribute__((no_stack_protector)) caller, which
generally breaks the caller's assumptions about not having a stack
protector. LTO exacerbates the issue.

While developers can avoid this by putting all no_stack_protector
functions in one translation unit together and compiling those with
-fno-stack-protector, it's generally not as ergonomic as a function
attribute, and still doesn't work for LTO. See also:
https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/
https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u

SSP attributes can be ordered by strength. Weakest to strongest, they
are: ssp, sspstrong, sspreq.  Callees with differing SSP attributes may be
inlined into each other, and the strongest attribute will be applied to the
caller. (No change)

After this change:
* A callee with no SSP attributes will no longer be inlined into a
  caller with SSP attributes.
* The reverse is also true: a callee with an SSP attribute will not be
  inlined into a caller with no SSP attributes.
* The alwaysinline attribute overrides these rules.
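
As a rough illustration of the rules above, the inlining decision can be
modeled as below. This is a self-contained sketch, not the code from the
patch: the `SSP` enum and the `mayInline` / `mergedCallerLevel` helpers are
hypothetical names, and the real check works on LLVM function attributes.

```cpp
#include <algorithm>

// Hypothetical model of the stack-protector levels, weakest to strongest.
// None stands for a function that carries no SSP attribute at all.
enum class SSP { None = 0, Ssp = 1, SspStrong = 2, SspReq = 3 };

// Inlining is allowed when both functions have an SSP attribute or neither
// does. alwaysinline on the callee overrides the mismatch rule.
bool mayInline(SSP Caller, SSP Callee, bool CalleeAlwaysInline) {
  if (CalleeAlwaysInline)
    return true;
  return (Caller == SSP::None) == (Callee == SSP::None);
}

// When two functions that both have SSP attributes are merged by inlining,
// the caller ends up with the stronger of the two levels.
SSP mergedCallerLevel(SSP Caller, SSP Callee) {
  return std::max(Caller, Callee);
}
```

For example, `mayInline(SSP::None, SSP::SspStrong, false)` is false, while
`mergedCallerLevel(SSP::Ssp, SSP::SspReq)` yields `SSP::SspReq`.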

Functions that get synthesized by the compiler may not get inlined as a
result if they are not created with the same stack protector function
attribute as their callers.

Alternative approach to https://reviews.llvm.org/D87956.

Fixes pr/47479.

Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

Reviewed By: rnk, MaskRay

Differential Revision: https://reviews.llvm.org/D91816
2020-12-02 11:00:16 -08:00
Leonard Chan
6d2c502fb1 [llvm] Fix for failing test from fdbd84c6c819d4462546961f6086c1524d5d5ae8
When handling a DSOLocalEquivalent operand change:

- Remove assertion checking that the `To` type and current type are the
  same type. This is not always a requirement.
- Add a missing bitcast from an old DSOLocalEquivalent to the type of
  the new one.
2020-12-01 15:47:55 -08:00
Wei Wang
5e0aabe083 [Remarks][2/2] Expand remarks hotness threshold option support in more tools
This is #2 of 2 changes that make the remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with
the hotness threshold from the profile summary via the special value 'auto'.

This change expands remarks hotness threshold option
-fdiagnostics-hotness-threshold in clang and *-remarks-hotness-threshold in
other tools to utilize hotness threshold from profile summary.

Remarks hotness filtering relies on several driver options. The table below
lists how the different options interact and affect the final remarks output:

| profile | hotness | threshold | remarks printed |
|---------|---------|-----------|-----------------|
| No      | No      | No        | All             |
| No      | No      | Yes       | None            |
| No      | Yes     | No        | All             |
| No      | Yes     | Yes       | None            |
| Yes     | No      | No        | All             |
| Yes     | No      | Yes       | None            |
| Yes     | Yes     | No        | All             |
| Yes     | Yes     | Yes       | >=threshold     |
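
The table collapses to a small decision function. The sketch below is only
a model of the table: the names `RemarksOutput` and `remarksPrinted` are
hypothetical, and reading the three columns as "profile available",
"hotness requested" and "threshold given" is an assumption.

```cpp
// Hypothetical result type mirroring the "remarks printed" column.
enum class RemarksOutput { All, None, AtOrAboveThreshold };

// Models the table: without a threshold everything is printed; with a
// threshold, remarks are filtered by hotness only when both a profile and
// hotness are present, and nothing is printed otherwise.
RemarksOutput remarksPrinted(bool HasProfile, bool HasHotness,
                             bool HasThreshold) {
  if (!HasThreshold)
    return RemarksOutput::All;
  if (HasProfile && HasHotness)
    return RemarksOutput::AtOrAboveThreshold;
  return RemarksOutput::None;
}
```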

In the presence of profile summary, it is often more desirable to directly use
the hotness threshold from profile summary. The new argument value 'auto'
indicates threshold will be synced with hotness threshold from profile summary
during compilation. The "auto" threshold relies on the availability of profile
summary. In case of missing such information, no remarks will be generated.

Differential Revision: https://reviews.llvm.org/D85808
2020-11-30 21:55:50 -08:00
Wei Wang
d0b74589e5 [Remarks][1/2] Expand remarks hotness threshold option support in more tools
This is #1 of 2 changes that make the remarks hotness threshold option
available in more tools. The changes also allow the threshold to sync with
the hotness threshold from the profile summary via the special value 'auto'.

This change modifies the interface of lto::setupLLVMOptimizationRemarks() to
accept remarks hotness threshold. Update all the tools that use it with remarks
hotness threshold options:

* lld: '--opt-remarks-hotness-threshold='
* llvm-lto2: '--pass-remarks-hotness-threshold='
* llvm-lto: '--lto-pass-remarks-hotness-threshold='
* gold plugin: '-plugin-opt=opt-remarks-hotness-threshold='

Differential Revision: https://reviews.llvm.org/D85809
2020-11-30 21:55:49 -08:00
Leonard Chan
03ffcb1a94 [llvm] Fix for failing test from cf8ff75bade763b054476321dcb82dcb2e7744c7
Handle null values when handling operand changes for DSOLocalEquivalent.
2020-11-30 17:22:28 -08:00
Nikita Popov
1f71c3e563 [DL] Inline getAlignmentInfo() implementation (NFC)
Apart from getting the entry in the table (which is already a
separate function), the remaining logic is different for all
alignment types and is better combined with getAlignment().

This is a minor efficiency improvement, and should make further
improvements like using separate storage for different alignment
types simpler.
2020-11-30 20:56:15 +01:00
Nick Lewycky
25d19be185 Creating a named struct requires only a Context and a name, but looking up a struct by name requires a Module. The method on Module merely accesses the LLVMContextImpl and no data from the module itself, so this patch moves getTypeByName to a static method on StructType that takes a Context and a name.
There's a small number of users of this function, they are all updated.

This updates the C API adding a new method LLVMGetTypeByName2 that takes a context and a name.

Differential Revision: https://reviews.llvm.org/D78793
2020-11-30 11:34:12 -08:00
Sanjay Patel
26ba573719 [IR][LoopRotate] remove assertion that phi must have at least one operand
This was suggested in D92247 - I initially committed an alternate
fix ( bfd2c216ea ) to avoid the crash/assert shown in
https://llvm.org/PR48296 ,
but that was reverted because it caused msan failures on other
tests. We can try to revive that patch using the test included
here, but I do not have an immediate plan to isolate that problem.
2020-11-30 11:32:42 -05:00
Sanjay Patel
187ef7e7e0 [IR] improve code comment/logic in removePredecessor(); NFC
This was suggested in the post-commit review of ce134da4b1.
2020-11-30 10:51:30 -05:00
Sanjay Patel
ac871b528f Revert "[IR][LoopRotate] avoid leaving phi with no operands (PR48296)"
This reverts commit bfd2c216ea8ef09f8fb1f755ca2b89f86f74acbb.
This appears to be causing stage2 msan failures on buildbots:
  FAIL: LLVM :: Transforms/SimplifyCFG/X86/bug-25299.ll (65872 of 71835)
  ******************** TEST 'LLVM :: Transforms/SimplifyCFG/X86/bug-25299.ll' FAILED ********************
  Script:
  --
  : 'RUN: at line 1';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/opt < /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SimplifyCFG/X86/bug-25299.ll -simplifycfg -S | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/Transforms/SimplifyCFG/X86/bug-25299.ll
  --
  Exit Code: 2
  Command Output (stderr):
  --
  ==87374==WARNING: MemorySanitizer: use-of-uninitialized-value
      #0 0x9de47b6 in getBasicBlockIndex /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/include/llvm/IR/Instructions.h:2749:5
      #1 0x9de47b6 in simplifyCommonResume /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:4112:23
      #2 0x9de47b6 in simplifyResume /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:4039:12
      #3 0x9de47b6 in (anonymous namespace)::SimplifyCFGOpt::simplifyOnce(llvm::BasicBlock*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:6330:16
      #4 0x9dcca13 in run /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:6358:16
      #5 0x9dcca13 in llvm::simplifyCFG(llvm::BasicBlock*, llvm::TargetTransformInfo const&, llvm::SimplifyCFGOptions const&, llvm::SmallPtrSetImpl<llvm::BasicBlock*>*) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Transforms/Utils/SimplifyCFG.cpp:6369:8
      #6 0x974643d in iterativelySimplifyCFG(
2020-11-30 10:15:42 -05:00
Sanjay Patel
6b1415dcf0 [IR][LoopRotate] avoid leaving phi with no operands (PR48296)
https://llvm.org/PR48296 shows an example where we delete all of the operands
of a phi without actually deleting the phi, and that is currently considered
invalid IR. The reduced test included here would crash for that reason.

A suggested follow-up is to loosen the assert to allow 0-operand phis
in unreachable blocks.

Differential Revision: https://reviews.llvm.org/D92247
2020-11-30 09:28:45 -05:00
Juneyoung Lee
26954b5028 [ConstantFold] Don't fold and/or i1 poison to poison (NFC)
.. because it causes miscompilation when combined with select i1 -> and/or.

It is the select fold which is incorrect; but it is costly to disable the fold, so hack this one.
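
To see why the combination miscompiles, here is a tiny model of i1 values
in which an empty optional stands for poison. The names `Val`, `selectI1`
and `andI1` are hypothetical and the semantics are a simplified assumption,
not LLVM code.

```cpp
#include <optional>

// An i1 value; std::nullopt models poison.
using Val = std::optional<bool>;

// select only propagates poison from the condition or the chosen arm.
Val selectI1(Val Cond, Val TrueVal, Val FalseVal) {
  if (!Cond)
    return std::nullopt;
  return *Cond ? TrueVal : FalseVal;
}

// 'and' as it would behave if "and i1 x, poison" were folded to poison:
// poison in either operand poisons the result.
Val andI1(Val A, Val B) {
  if (!A || !B)
    return std::nullopt;
  return *A && *B;
}
```

With a false condition and a poison operand, `selectI1(false, std::nullopt,
false)` is a well-defined false, but the rewritten form `andI1(false,
std::nullopt)` is poison, so rewriting the select into an `and` makes the
result less defined.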

D92270
2020-11-30 22:58:31 +09:00
Jay Foad
e98f4ccb3f [LegacyPM] Simplify PMTopLevelManager::collectLastUses. NFC.
2020-11-30 10:36:19 +00:00
Nikita Popov
9af4629d4a [DL] Optimize address space zero lookup (NFC)
Information for pointer size/alignment/etc is queried a lot, but
the binary search based implementation makes this fairly slow.

Add an explicit check for address space zero and skip the search
in that case -- we need to specially handle the zero address space
anyway, as it serves as the fallback for all address spaces that
were not explicitly defined.
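
The idea can be sketched as follows. This is a standalone model, not the
DataLayout implementation: `PointerSpec`, `lookupPointerSpec` and
`exampleSpecs` are hypothetical names, and the table is assumed to be sorted
by address space with the entry for address space zero stored first.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical per-address-space pointer description.
struct PointerSpec {
  uint32_t AddrSpace;
  uint32_t SizeInBits;
};

// Assumes Specs is sorted by AddrSpace and Specs[0] describes address
// space 0, which also serves as the fallback for undeclared spaces.
const PointerSpec &lookupPointerSpec(const std::vector<PointerSpec> &Specs,
                                     uint32_t AddrSpace) {
  if (AddrSpace != 0) {
    // Slow path: binary search among the declared address spaces.
    auto It = std::lower_bound(
        Specs.begin(), Specs.end(), AddrSpace,
        [](const PointerSpec &S, uint32_t AS) { return S.AddrSpace < AS; });
    if (It != Specs.end() && It->AddrSpace == AddrSpace)
      return *It;
  }
  // Fast path for address space 0, and fallback for everything else.
  return Specs[0];
}

// Small example table: 64-bit pointers by default, 32-bit in spaces 3 and 5.
std::vector<PointerSpec> exampleSpecs() {
  return {{0, 64}, {3, 32}, {5, 32}};
}
```

A query for address space 4, which is not declared in `exampleSpecs()`,
falls back to the address space 0 entry.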

I initially wanted to simply replace the binary search with a
linear search, which would handle both address space zero and the
general case efficiently, but I was not sure whether there are
any degenerate targets that use more than a handful of declared
address spaces (in-tree, even AMDGPU only declares six).
2020-11-29 22:49:55 +01:00
Sanjay Patel
1e8aaec6b3 [IR] simplify code in removePredecessor(); NFCI
As suggested in D92247 (and independent of whatever we decide to do there),
this code is confusing as-is. Hopefully, this is at least mildly better.

We might be able to do better still, but we have a function called
"removePredecessor" with this behavior:
"Note that this function does not actually remove the predecessor." (!)
2020-11-29 09:55:04 -05:00
Sanjay Patel
ce313ba7f9 [IR] remove redundant code comments; NFC
As noted in D92247 (and independent of that patch):

http://llvm.org/docs/CodingStandards.html#doxygen-use-in-documentation-comments

"Don’t duplicate the documentation comment in the header file and in the
implementation file. Put the documentation comments for public APIs into
the header file."
2020-11-29 09:29:59 -05:00
Juneyoung Lee
45b0ec5d7b [ConstantFold] Fold more operations to poison
This patch folds more operations to poison.

Alive2 proof: https://alive2.llvm.org/ce/z/mxcb9G (it does not contain tests about div/rem because they fold to poison when raising UB)

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D92270
2020-11-29 21:19:48 +09:00
Juneyoung Lee
9bed1bd10d [ConstantFold] Fold operations to poison if possible
This patch updates ConstantFold, so operations are folded into poison if possible.

<alive2 proofs>
casts: https://alive2.llvm.org/ce/z/WSj7rw
binary operations (arithmetic): https://alive2.llvm.org/ce/z/_7dEyJ
binary operations (bitwise): https://alive2.llvm.org/ce/z/cezjVN
vector/aggregate operations: https://alive2.llvm.org/ce/z/BQ7hWz
unary ops: https://alive2.llvm.org/ce/z/yBRs4q
other ops: https://alive2.llvm.org/ce/z/iXbcFD

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D92203
2020-11-29 02:28:40 +09:00
Francesco Petrogalli
4a2f3f7420 [AllocaInst] Update getAllocationSizeInBits to return TypeSize.
Reviewed By: peterwaller-arm, sdesmalen

Differential Revision: https://reviews.llvm.org/D92020
2020-11-27 16:39:10 +00:00
Jay Foad
59eb6dddd5 [LegacyPM] Avoid a redundant map lookup in setLastUser. NFC.
As a bonus this makes it (IMO) obvious that the iterator is not
invalidated, so remove the comment explaining that.
2020-11-27 10:42:01 +00:00
Jay Foad
6fe64ab931 [LegacyPM] Remove unused undocumented parameter. NFC.
The Direction parameter to AnalysisResolver::getAnalysisIfAvailable has
never been documented or used for anything.
2020-11-27 10:41:38 +00:00
Zhengyang Liu
fdfc9baedb Fix use-of-uninitialized-value in rG75f50e15bf8f
Differential Revision: https://reviews.llvm.org/D71126
2020-11-26 01:39:22 -07:00
Zhengyang Liu
f2658edb7a Adding PoisonValue for representing poison value explicitly in IR
Define ConstantData::PoisonValue.
Add support for poison value to LLLexer/LLParser/BitcodeReader/BitcodeWriter.
Add support for poison value to llvm-c interface.
Add support for poison value to OCaml binding.
Add m_Poison in PatternMatch.

Differential Revision: https://reviews.llvm.org/D71126
2020-11-25 17:33:51 -07:00
Arthur Eubanks
cb9b83342f Make CallInst::updateProfWeight emit i32 weights instead of i64
Typically branch_weights are i32, not i64.
This fixes entry_counts_cold.ll under NPM.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D90539
2020-11-24 18:13:59 -08:00
Simon Pilgrim
b1b0060ec0 [IR] Constant::getAggregateElement - early-out for ScalableVectorType
We can't call getNumElements() for ScalableVectorType types - just bail for now, although ConstantAggregateZero/UndefValue could return a reasonable value.

Fixes crash shown in OSS-Fuzz #25272 https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=25272
2020-11-24 12:03:27 +00:00
Matt Arsenault
ec50a05ed6 Verifier: Fix assert when verifying non-pointer byval or preallocated
This would fail on a cast<PointerType> when verifying the attribute if
these attributes were incorrectly used with a non-pointer type.
2020-11-20 20:08:43 -05:00
Hongtao Yu
db4396f62a [CSSPGO] IR intrinsic for pseudo-probe block instrumentation
This change introduces a new IR intrinsic named `llvm.pseudoprobe` for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story.

A pseudo probe is used to collect the execution count of the block where the probe is instrumented. This requires a pseudo probe to be persistent. The LLVM PGO instrumentation also instruments in similar places by placing a counter in the form of atomic read/write operations or runtime helper calls. Since these operations are very persistent, i.e. optimization-resilient, in theory we can borrow the atomic read/write implementation from PGO counters and cut it off at the end of compilation with all the atomics converted into binary data. This was our initial design and we’ve seen promising sample correlation quality with it. However, the atomics approach has a couple of issues:

1. IR optimizations are blocked unexpectedly. Those atomic instructions are not going to be physically present in the binary code, but since they stay in the IR until the very end of compilation, they can still prevent certain IR optimizations and result in lower code quality.
2. The counter atomics may not be fully cleaned up from the code stream eventually.
3. Extra work is needed for re-targeting.

We choose to implement pseudo probes based on a special LLVM intrinsic, which is expected to have most of the semantics that come with an atomic operation but, as far as possible, does not block desired optimizations. More specifically, the semantics associated with the new intrinsic require a pseudo probe to be virtually executed exactly the same number of times before and after an IR optimization. The intrinsic also comes with certain flags that are carefully chosen so that the places they are probing are not going to be messed up by the optimizer while most of the IR optimizations still work. The core flag given to the special intrinsic is `IntrInaccessibleMemOnly`, which means the intrinsic accesses memory and does have a side effect so that it is not removable, but it does not access memory locations that are accessible by any original instructions. This way the intrinsic does not alias with any original instruction and thus it does not block optimizations as much as an atomic operation does. We also assign a function GUID and a block index to an intrinsic so that they are uniquely identified and not merged in order to achieve good correlation quality.

Let's now look at an example. Given the following LLVM IR:

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
   %cmp = icmp eq i32 %x, 0
   br i1 %cmp, label %bb1, label %bb2
bb1:
   br label %bb3
bb2:
   br label %bb3
bb3:
   ret void
}
```

The instrumented IR will look like below. Note that each `llvm.pseudoprobe` intrinsic call represents a pseudo probe at a block, of which the first parameter is the GUID of the probe’s owner function and the second parameter is the probe’s ID.

```
define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 {
bb0:
   %cmp = icmp eq i32 %x, 0
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 1)
   br i1 %cmp, label %bb1, label %bb2
bb1:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 2)
   br label %bb3
bb2:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 3)
   br label %bb3
bb3:
   call void @llvm.pseudoprobe(i64 837061429793323041, i64 4)
   ret void
}

```

Reviewed By: wmi

Differential Revision: https://reviews.llvm.org/D86490
2020-11-20 10:39:24 -08:00
Alex Richardson
775dd2a2a2 [AMDGPU] Set the default globals address space to 1
This will ensure that passes that add new global variables will create them
in address space 1 once the passes have been updated to no longer default
to the implicit address space zero.
This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it wasn't
already present, to ensure bitcode backwards compatibility.
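
The upgrade step can be pictured with the string-level sketch below. It is a
simplified assumption of what such an upgrade does, not the AutoUpgrade.cpp
code: the helper names are hypothetical, and the real logic would also have
to restrict itself to AMDGPU modules.

```cpp
#include <sstream>
#include <string>

// Returns true if any '-'-separated component of the layout string is a
// "G<n>" specifier, i.e. a globals address space is already declared.
bool hasGlobalsAddrSpace(const std::string &Layout) {
  std::istringstream In(Layout);
  std::string Component;
  while (std::getline(In, Component, '-'))
    if (!Component.empty() && Component[0] == 'G')
      return true;
  return false;
}

// Appends "-G1" when no globals address space is present, so that old
// layout strings end up with globals in address space 1.
std::string addDefaultGlobalsAddrSpace(const std::string &Layout) {
  if (hasGlobalsAddrSpace(Layout))
    return Layout;
  return Layout.empty() ? "G1" : Layout + "-G1";
}
```

Under these assumptions a layout such as "e-p:64:64" becomes "e-p:64:64-G1",
while a layout that already contains a G component is left untouched.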

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D84345
2020-11-20 15:46:53 +00:00
Alex Richardson
9c96f39f77 Add a default address space for globals to DataLayout
This is similar to the existing alloca and program address spaces (D37052)
and should be used when creating/accessing global variables.
We need this in our CHERI fork of LLVM to place all globals in address space 200.
This ensures that values are accessed using CHERI load/store instructions
instead of the normal MIPS/RISC-V ones.

The problem this is trying to fix is that most of the time the type of
globals is created using a simple PointerType::getUnqual() (or ::get() with
the default address-space value of 0). This does not work for us and we get
assertion/compilation/instruction selection failures whenever a new call
is added that uses the default value of zero.

In our fork we have removed the default parameter value of zero for most
address space arguments and use DL.getProgramAddressSpace() or
DL.getGlobalsAddressSpace() whenever possible. If this change is accepted,
I will upstream follow-up patches to use DL.getGlobalsAddressSpace() instead
of relying on the default value of 0 for PointerType::get(), etc.

This patch and the follow-up changes will not have any functional changes
for existing backends with the default globals address space of zero.
A follow-up commit will change the default globals address space for
AMDGPU to 1.

Reviewed By: dylanmckay

Differential Revision: https://reviews.llvm.org/D70947
2020-11-20 15:46:52 +00:00
Leonard Chan
c24d9d2b01 [llvm][IR] Add dso_local_equivalent Constant
The `dso_local_equivalent` constant is a wrapper for functions that represents a
value which is functionally equivalent to the global passed to this. That is, if
this accepts a function, calling this constant should have the same effects as
calling the function directly. This could be a direct reference to the function,
the `@plt` modifier on X86/AArch64, a thunk, or anything that's equivalent to the
resolved function as a call target.

When lowered, the returned address must have a constant offset at link time from
some other symbol defined within the same binary. The address of this value is
also insignificant. The name is leveraged from `dso_local` where use of a function
or variable is resolved to a symbol in the same linkage unit.

In this patch:
- Addition of `dso_local_equivalent` and handling it
- Update Constant::needsRelocation() to strip constant inbound GEPs and take
  advantage of `dso_local_equivalent` for relative references

This is useful for the [Relative VTables C++ ABI](https://reviews.llvm.org/D72959)
which makes vtables readonly. This works by replacing the dynamic relocations for
function pointers in them with static relocations that represent the offset between
the vtable and virtual functions. If a function is externally defined,
`dso_local_equivalent` can be used as a generic wrapper for the function to still
allow for this static offset calculation to be done.

See [RFC](http://lists.llvm.org/pipermail/llvm-dev/2020-August/144469.html) for more details.

Differential Revision: https://reviews.llvm.org/D77248
2020-11-19 10:26:17 -08:00
Nick Desaulniers
b2b1b97849 Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch"
This reverts commit b7926ce6d7a83cdf70c68d82bc3389c04009b841.

Going with a simpler approach.
2020-11-17 17:27:14 -08:00
Christopher Tetreault
97d1986ad4 [SVE] Take constant fold fast path for splatted vscale vectors
This should be a perfectly reasonable operation for scalable vectors.
Currently, it only works for zeroinitializer values of
ScalableVectorType, but the fundamental operation is sound and it should
be possible to make it work for other splats.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D77442
2020-11-17 12:45:31 -08:00
Simon Pilgrim
62500c8769 [IR] ShuffleVectorInst::isIdentityWithPadding - bail on non-fixed-type vector shuffles.
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=27416
2020-11-17 16:16:51 +00:00
Arthur Eubanks
535a7a1f46 [Debugify] Skip debugifying on special/immutable passes
With a function pass manager, it would insert debuginfo metadata before
getting to function passes while processing the pass manager, causing
debugify to skip while running the function passes.

Skip special passes + verifier + printing passes. Compared to the legacy
implementation of -debugify-each, this additionally skips verifier
passes. Probably no need to update the legacy version since it will be
obsolete soon.

This fixes 2 instcombine tests using -debugify-each under NPM.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D91558
2020-11-16 20:39:46 -08:00
Florian Hahn
e2fe6ad000 [IRGen] Add !annotation metadata for auto-init stores.
This patch updates Clang's IRGen to add !annotation nodes with an
"auto-init" annotation to all stores for auto-initialization.

As discussed in 'RFC: Combining Annotation Metadata and Remarks'
(http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html)
this allows using optimization remarks to track down where auto-init
code was inserted (and not removed by optimizations).

There are a few cases in the tests where !annotation gets dropped by
optimizations. Those optimizations will be updated in subsequent
patches.

This patch is based on a patch by Francis Visoiu Mistrih.

Reviewed By: thegameg, paquette

Differential Revision: https://reviews.llvm.org/D91417
2020-11-16 10:37:02 +00:00
Simon Moll
28f80af0cf [VP][NFC] Rename to HANDLE_VP_TO_OPC
Use the less surprising shorthand OPC instead of OC.
2020-11-16 10:24:18 +01:00
Kazu Hirata
a80ed1a2ff [IR] Use llvm::is_contained in BasicBlock::removePredecessor (NFC)
2020-11-15 21:15:31 -08:00
Roman Lebedev
f948b65a66 Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM"
See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485:
the implementation is known to be broken for certain inputs,
the bug report was up for a significant amount of time,
and there has been no activity to address it.
Therefore, just completely rip out all of misexpect handling.

I suspect fixing it requires redesigning the internals of MD_misexpect.
Should anyone commit to fixing the implementation problem,
starting from a clean slate may be better anyway.

This reverts commit 7bdad08429411e7d0ecd58cd696b1efe3cff309e,
and some of its follow-ups that don't stand on their own.
2020-11-14 13:12:38 +03:00
Yuanfang Chen
06613d74b4 [CGProfile] allows bitcast in metadata node storing function pointers
For example, during RAUW in IRMover, the `Function` ValueAsMetadata in "CG Profile" could become a bitcast.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D88433
2020-11-13 09:28:21 -08:00
Florian Hahn
041da6277f Add !annotation metadata and remarks pass.
This patch adds a new !annotation metadata kind which can be used to
attach annotation strings to instructions.

It also adds a new pass that emits summary remarks per function with the
counts for each annotation kind.
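
The per-function summary amounts to a simple tally, sketched below on a toy
representation. `Inst`, `Function`, `countAnnotations` and `exampleFunction`
are hypothetical stand-ins for illustration and do not use the LLVM classes
or the actual pass.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy instruction: just the annotation strings attached to it.
struct Inst {
  std::vector<std::string> Annotations;
};

// Toy function: a flat list of instructions.
using Function = std::vector<Inst>;

// Counts how many instructions carry each annotation kind.
std::map<std::string, unsigned> countAnnotations(const Function &F) {
  std::map<std::string, unsigned> Counts;
  for (const Inst &I : F)
    for (const std::string &Kind : I.Annotations)
      ++Counts[Kind];
  return Counts;
}

// Example input: two "auto-init" stores, one of which also carries a
// made-up "bounds-check" annotation, plus one unannotated instruction.
Function exampleFunction() {
  Function F;
  F.push_back(Inst{{"auto-init"}});
  F.push_back(Inst{{}});
  F.push_back(Inst{{"auto-init", "bounds-check"}});
  return F;
}
```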

The intended use case for this new metadata is annotating
'interesting' instructions, and the remarks should provide additional
insight into transformations applied to a program.

To motivate this, consider these specific questions we would like to get answered:

* How many stores added for automatic variable initialization remain after optimizations? Where are they?
* How many runtime checks inserted by a frontend could be eliminated? Where are the ones that did not get eliminated?

Discussed on llvm-dev as part of 'RFC: Combining Annotation Metadata and Remarks'
(http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html)

Reviewed By: thegameg, jdoerfert

Differential Revision: https://reviews.llvm.org/D91188
2020-11-13 13:24:10 +00:00
serge-sans-paille
82b6e6053d llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These functions store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Kazushi (Jam) Marukawa
aa1acddbb3 [VE] Support vld intrinsics
Add intrinsics for vector load instructions.  Add a regression test also.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91332
2020-11-13 07:34:42 +09:00
Sebastian Neubauer
7e4be9501b [AMDGPU] Add amdgpu_gfx calling convention
Add a calling convention called amdgpu_gfx for real function calls
within graphics shaders. For the moment, this uses the same calling
convention as other calls in amdgpu, with registers excluded for return
address, stack pointer and stack buffer descriptor.

Differential Revision: https://reviews.llvm.org/D88540
2020-11-09 16:51:44 +01:00
Roman Lebedev
57330778e3 [IR] CmpInst: Add getFlippedSignednessPredicate()
And refactor a few places to use it
2020-11-06 11:31:09 +03:00
Roman Lebedev
05a5fb18a8 [IR] CmpInst: add isEquality(Pred)
Currently there is only a member version of isEquality(),
which requires an actual [IF]CmpInst to be available,
which isn't always possible, and is inconsistent with
the general pattern here.
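
The shape of the change can be illustrated with a toy predicate type. The
sketch below covers integer predicates only, and `Pred`, `Cmp` and
`isEqualityPredicate` are hypothetical names rather than the CmpInst API.

```cpp
// Hypothetical subset of integer comparison predicates.
enum class Pred { EQ, NE, UGT, UGE, ULT, ULE, SGT, SGE, SLT, SLE };

// Predicate-only query: usable when no comparison instruction exists yet,
// for example while deciding which predicate to build.
constexpr bool isEqualityPredicate(Pred P) {
  return P == Pred::EQ || P == Pred::NE;
}

// Toy instruction whose member query simply forwards to the predicate-only
// version, so both entry points always agree.
struct Cmp {
  Pred P;
  bool isEquality() const { return isEqualityPredicate(P); }
};
```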

I wanted to use it in a new patch, but it wasn't there..
2020-11-06 11:31:09 +03:00
Roman Lebedev
a6b210b265 [IR] CmpInst: add getUnsignedPredicate()
There's already getSignedPredicate(); it is not symmetrical to not have
its opposite. I wanted to use it in new code, but it wasn't there..
2020-11-06 11:31:08 +03:00
David Green
41688b499e [CostModel] Make target intrinsics cheap by default
This patch changes the intrinsics cost model to assume that by default
target intrinsics are cheap. This didn't seem to be the case for all
intrinsics, and is potentially an MVE problem due to our scalarization
overheads. Cheap seems to be a good default in general though.

Differential Revision: https://reviews.llvm.org/D90597
2020-11-03 09:58:28 +00:00
Arthur Eubanks
bb84082e59 Revert "Use uint64_t for branch weights instead of uint32_t"
This reverts commit 10f2a0d662d8d72eaac48d3e9b31ca8dc90df5a4.

More uint64_t overflows.
2020-10-31 00:25:32 -07:00
Arthur Eubanks
f52f1e83f5 Use uint64_t for branch weights instead of uint32_t
CallInst::updateProfWeight() creates branch_weights with i64 instead of i32.
To be more consistent everywhere and remove lots of casts from uint64_t
to uint32_t, use i64 for branch_weights.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D88609
2020-10-30 10:03:46 -07:00
Craig Disselkoen
fe85e24882 C API: support scalable vectors
This adds support for scalable vector types in the C API and in
llvm-c-test, and also adds a test to ensure that llvm-c-test can properly
roundtrip operations involving scalable vectors.

While creating this diff, I discovered that the C API cannot properly roundtrip
_constant expressions_ involving shufflevector / scalable vectors, but that
seems to be a separate enough issue that I plan to address it in a future diff
(unless reviewers feel it should be addressed here).

Differential Revision: https://reviews.llvm.org/D89816
2020-10-28 18:19:34 -04:00