Summary:
The motivation is to support better the -object_path_lto option on
Darwin. The linker needs to write down the generate object files on
disk for later use by lldb or dsymutil (debug info are not present
in the final binary). We're moving this into libLTO so that we can
be smarter when a cache is enabled and hard-link when possible
instead of duplicating the files.
Reviewers: tejohnson, deadalnix, pcc
Subscribers: dexonsmith, llvm-commits
Differential Revision: https://reviews.llvm.org/D27507
llvm-svn: 289631
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Similarly we clear bit 0 for optimizing operand 0.
Also calculate UndefElts correctly.
Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support.
llvm-svn: 289629
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused.
Also calculate UndefElts correctly.
Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support.
llvm-svn: 289628
Bots are broken and needs to be fixed before having this on by default.
The feature was committed in r289619.
I tried to disable it in r289624 and failed because it was initialized in two places.
llvm-svn: 289626
Follow-up to r289256, address a FIXME to avoid resetting the column
number. This reduced .debug_line by 2.6% in a RelWithDebInfo
self-build of clang.
llvm-svn: 289620
Summary:
Given a flag (-mllvm -reverse-iterate) this patch will enable iteration of SmallPtrSet in reverse order.
The idea is to compile the same source with and without this flag and expect the code to not change.
If there is a difference in codegen then it would mean that the codegen is sensitive to the iteration order of SmallPtrSet.
This is enabled only with LLVM_ENABLE_ABI_BREAKING_CHECKS.
Reviewers: chandlerc, dexonsmith, mehdi_amini
Subscribers: mgorny, emaste, llvm-commits
Differential Revision: https://reviews.llvm.org/D26718
llvm-svn: 289619
Summary:
LinkDyLib is only used (before arg processing) to set up the default for
LinkMode. So reset LinkMode as well, and process before --link-shared or
--link-static to allow those flags to continue to override it.
Reviewers: beanz
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D27736
llvm-svn: 289608
This change re-lands r289215, by reverting r289482. The underlying
issue that caused it to be reverted has been fixed by Tim Northover in
r289496.
Original commit message for r289215:
[SCEVExpander] Use llvm data structures; NFC
Original commit message for r289482:
Revert "[SCEVExpander] Use llvm data structures; NFC"
This reverts r289215 (git SHA1 cb7b86a1). It breaks the ubsan build
because a DenseMap that keys off of `AssertingVH<T>` will hit UB when it
tries to cast the empty and tombstone keys to `T *` (due to insufficient
alignment).
This is the relevant stack trace (thanks to Mike Aizatsky):
#0 0x25cf100 in llvm::AssertingVH<llvm::PHINode>::getValPtr() const llvm/include/llvm/IR/ValueHandle.h:212:39
#1 0x25cea20 in llvm::AssertingVH<llvm::PHINode>::operator=(llvm::AssertingVH<llvm::PHINode> const&) llvm/include/llvm/IR/ValueHandle.h:234:19
#2 0x25d0092 in llvm::DenseMapBase<llvm::DenseMap<llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >, llvm::AssertingVH<llvm::PHINode>, llvm::detail::DenseSetEmpty, llvm::DenseMapInfo<llvm::AssertingVH<llvm::PHINode> >, llvm::detail::DenseSetPair<llvm::AssertingVH<llvm::PHINode> > >::clear() llvm/include/llvm/ADT/DenseMap.h:113:23
llvm-svn: 289602
Summary:
This patch will add loop metadata on the pre and post loops generated by IRCE.
Currently, we have metadata for disabling optimizations such as vectorization,
unrolling, loop distribution and LICM versioning (and confirmed that these
optimizations check for the metadata before proceeding with the transformation).
The pre and post loops generated by IRCE need not go through loop opts (since
these are slow paths).
Added two test cases as well.
Reviewers: sanjoy, reames
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26806
llvm-svn: 289588
We currently check if the exact trip count is known and is smaller than the
"tiny loop" bound. We should be checking the maximum bound on the trip count
instead.
Differential Revision: https://reviews.llvm.org/D27690
llvm-svn: 289583
Summary:
This patch aims to generalize matching of the strided store accesses to more general masks.
The more general rule is to have consecutive accesses based on the stride:
[x, y, ... z, x+1, y+1, ...z+1, x+2, y+2, ...z+2, ...]
All elements in the masks need not form a contiguous space, there may be gaps.
As before, undefs are allowed and filled in with adjacent element loads.
Reviewers: HaoLiu, mssimpso
Subscribers: mkuper, delena, llvm-commits
Differential Revision: https://reviews.llvm.org/D23646
llvm-svn: 289573
This is not always behaving as expected as it turns out block live-in
lists are only correct most of the time. Still waiting for reviews on
https://reviews.llvm.org/D27559 to have them correct all of the time.
See also http://llvm.org/PR31361, rdar://25117107
This reverts commit r288567.
This reverts commit r288561.
llvm-svn: 289570
since bpf instruction stream is multiple of 8 change llvm-objdump
to print decimal instruction number instead of hex address, so that
users don't have to do this math manually to match kernel verifier output
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
llvm-svn: 289569
We were using the correct pseudo-instruction, but because the operand's flags
weren't set correctly we still ended up emitting incorrect relocations during
MC lowering.
llvm-svn: 289566
Many places pass around a DWARFDebugInfoEntryMinimal and a DWARFUnit. It is easy to get things wrong by using the wrong DWARFUnit with a DWARFDebugInfoEntryMinimal. This patch creates a DWARFDie class that contains the DWARFUnit and DWARFDebugInfoEntryMinimal objects so that they can't get out of sync. All attribute extraction has been moved out of DWARFDebugInfoEntryMinimal and into DWARFDie. DWARFDebugInfoEntryMinimal was also renamed to DWARFDebugInfoEntry.
DWARFDie objects are temporary objects that are used by clients and contain 2 pointers that you always need to have anyway. Keeping them grouped will avoid errors and simplify many of the attribute extracting APIs by not having to pass in a DWARFUnit.
Differential Revision: https://reviews.llvm.org/D27634
llvm-svn: 289565
Windows uses some macros to replace DeleteFile() by DeleteFileA() or
DeleteFileW(). This was causing an error at link time.
DeleteFile was renamed to RemoveFile().
Differential Revision: https://reviews.llvm.org/D27577
llvm-svn: 289563
Implement DirName from scratch to avoid dependencies on external libraries.
It's based on MSDN documentation for Naming Files, Paths, and Namespaces.
The algorithm can't simply start from the end and look backwards for the
first separator, because we need to preserve the prefix that represent
the root location. We shouldn't remove anything there. In Windows we
have many different options, like:
\\Server\Share\ , \ , C: , C:\ , \\?\C:\ , \\?\UNC\Server\Share\
We remove the last separator in the rest of the path, if it exists.
It was implemented to have a similar behaviour to dirname() in linux,
removing trailing separators, returning "." when the path doesn't
contain separators, etc.
Differential Revision: https://reviews.llvm.org/D27579
llvm-svn: 289562
I added a new flag RunningCB to know if the Fuzzer's main thread is
running the CB function, instead of using (!CurrentUnitSize).
(!CurrentUnitSize) doesn't work properly. For example, in FuzzerLoop.cpp,
inside ShuffleAndMinimize() function, we execute the callback with an
empty string (size=0). Previous implementation failed to detect timeouts
in that execution.
Also, I add a regression test for that case.
Differential Revision: https://reviews.llvm.org/D27433
llvm-svn: 289561
Reorganize #includes to follow LLVM Coding Standards.
Include some missing headers. Required to use `Printf()`.
Aside from that, this patch contains no functional change.
It is purely a re-organization.
Differential Revision: https://reviews.llvm.org/D27363
llvm-svn: 289560
std:🧵:hardware_concurrency() returns an unsigned, so I modify
NumberOfCpuCores() to return unsigned too.
The number of cpus is used to define the number of workers, so I decided
to update the worker and jobs flags to be declared as unsigned too.
Differential Revision: https://reviews.llvm.org/D27685
llvm-svn: 289559
Use unsigned for PID instead of signed int. GetCurrentProcessId() returns
an unsigned (DWORD) so we must be sure we can deal with all possible values.
I use a long unsigned to be sure it can hold a 32 bit unsigned (DWORD).
Differential Revision: https://reviews.llvm.org/D27281
llvm-svn: 289558
Add new flags to FuzzingOptions to represent the different conditions
on the signal handling. These options are passed when calling
SetSignalHandler().
This changes simplify the implementation of Windows's exception
handling. Now we can define a unique handler for all the exceptions.
Differential Revision: https://reviews.llvm.org/D27238
llvm-svn: 289557
StringLiteral is a wrapper around a string literal useful for
replacing global tables of char arrays with global tables of
StringRefs that can initialized in a constexpr context, avoiding
the invocation of a global constructor.
Differential Revision: https://reviews.llvm.org/D27686
llvm-svn: 289551
Summary:
This is last in of a series of patches to evolve ADCE.cpp to support
removing of unnecessary control flow.
This patch adds the code to update the control and data flow graphs
to remove the dead control flow.
Also update unit tests to test the capability to remove dead,
may-be-infinite loop which is enabled by the switch
-adce-remove-loops.
Previous patches:
D23824 [ADCE] Add handling of PHI nodes when removing control flow
D23559 [ADCE] Add control dependence computation
D23225 [ADCE] Modify data structures to support removing control flow
D23065 [ADCE] Refactor anticipating new functionality (NFC)
D23102 [ADCE] Refactoring for new functionality (NFC)
Reviewers: dberlin, majnemer, nadav, mehdi_amini
Subscribers: llvm-commits, david2050, freik, twoh
Differential Revision: https://reviews.llvm.org/D24918
llvm-svn: 289548
Match a pattern where a wide type scalar value is loaded by several narrow loads and combined by shifts and ors. Fold it into a single load or a load and a bswap if the targets supports it.
Assuming little endian target:
i8 *a = ...
i32 val = a[0] | (a[1] << 8) | (a[2] << 16) | (a[3] << 24)
=>
i32 val = *((i32)a)
i8 *a = ...
i32 val = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3]
=>
i32 val = BSWAP(*((i32)a))
This optimization was discussed on llvm-dev some time ago in "Load combine pass" thread. We came to the conclusion that we want to do this transformation late in the pipeline because in presence of atomic loads load widening is irreversible transformation and it might hinder other optimizations.
Eventually we'd like to support folding patterns like this where the offset has a variable and a constant part:
i32 val = a[i] | (a[i + 1] << 8) | (a[i + 2] << 16) | (a[i + 3] << 24)
Matching the pattern above is easier at SelectionDAG level since address reassociation has already happened and the fact that the loads are adjacent is clear. Understanding that these loads are adjacent at IR level would have involved looking through geps/zexts/adds while looking at the addresses.
The general scheme is to match OR expressions by recursively calculating the origin of individual bits which constitute the resulting OR value. If all the OR bits come from memory verify that they are adjacent and match with little or big endian encoding of a wider value. If so and the load of the wider type (and bswap if needed) is allowed by the target generate a load and a bswap if needed.
Reviewed By: hfinkel, RKSimon, filcab
Differential Revision: https://reviews.llvm.org/D26149
llvm-svn: 289538