This is a recommit of r258620 which causes PR26293.
The original message:
Now LIR can turn following codes into memset:
typedef struct foo {
int a;
int b;
} foo_t;
void bar(foo_t *f, unsigned n) {
for (unsigned i = 0; i < n; ++i) {
f[i].a = 0;
f[i].b = 0;
}
}
void test(foo_t *f, unsigned n) {
for (unsigned i = 0; i < n; i += 2) {
f[i] = 0;
f[i+1] = 0;
}
}
llvm-svn: 258777
These two functions are hard to reason about. This commit makes the code
more comprehensible:
- Use four distinct variables (OldIdxIn, OldIdxOut, NewIdxIn, NewIdxOut)
with a fixed value instead of a changing iterator I that points to
different things during the function.
- Remove the early explanation before the function in favor of more
detailed comments inside the function. Should have more/clearer comments now
stating which conditions are tested and which invariants hold at
different points in the functions.
The behaviour of the code was not changed.
I hope that this will make it easier to review the changes in
http://reviews.llvm.org/D9067 which I will adapt next.
Differential Revision: http://reviews.llvm.org/D16379
llvm-svn: 258756
This patch was originally committed as r257885, but was reverted due to windows
failures. The cause of these failures has been fixed under r258677, hence
re-committing the original patch.
llvm-svn: 258683
VPMADD52LUQ - Packed Multiply of Unsigned 52-bit Integers and Add the Low 52-bit Products to Qword Accumulators
VPMADD52HUQ - Packed Multiply of Unsigned 52-bit Unsigned Integers and Add High 52-bit Products to 64-bit Accumulators
Differential Revision: http://reviews.llvm.org/D16407
llvm-svn: 258680
SCCP has code identical to changeToUnreachable's behavior, switch it
over to just call changeToUnreachable.
No functionality change intended.
llvm-svn: 258654
InstCombine and SCCP both want to remove dead code in a very particular
way but using identical means to do so. Share the code between the two.
No functionality change is intended.
llvm-svn: 258653
Summary: Helper so we don't have to enumerate nvptx && nvptx64 everywhere.
Reviewers: echristo
Subscribers: llvm-commits, jhen, tra
Differential Revision: http://reviews.llvm.org/D16494
llvm-svn: 258639
Summary:
Update ObjectTransformLayer::addObjectSet to take the object set by
value rather than reference and pass it to the base layer with move
semantics rather than copy, to match r258185's changes to
ObjectLinkingLayer.
Update the unit test to verify that ObjectTransformLayer's signature stays
in sync with ObjectLinkingLayer's.
Reviewers: lhames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D16414
llvm-svn: 258630
If the intrinsic is overloaded and works on multiple types,
it cannot resolve to a single corresponding builtin and requires
handling in clang. This just causes crashes now.
llvm-svn: 258559
The intrinsic target prefix should match the target name
as it appears in the triple.
This is not yet complete, but gets most of the important ones.
llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled
for compatability for now.
llvm-svn: 258557
\src\llvm-rw\include\llvm/Support/AlignOf.h(254) :
error C2872: 'detail' : ambiguous symbol
could be 'llvm::detail'
or 'llvm::support::detail'
llvm-svn: 258553
Make the variable a member of the writer trait object owned
now by the writer. Also use a different generator interface
to pass the infoObject from the writer.
llvm-svn: 258544
Using an array instead of ArrayRef would allow type inference, but
(short of using C99) one would still need to write
typedef uint16_t VT[];
LE.write(VT{0x1234, 0x5678});
llvm-svn: 258535
This reapplies r258296 and r258366, and also fixes an existing bug in
SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the
offset in a GlobalAddressSDNode, which is uncovered by those patches.
llvm-svn: 258482
This reverts r258296 and the follow up r258366. With this change, we
miscompiled the following program on Windows:
#include <string>
#include <iostream>
static const char kData[] = "asdf jkl;";
int main() {
std::string s(kData + 3, sizeof(kData) - 3);
std::cout << s << '\n';
}
llvm-svn: 258465
The X86 musttail implementation finds register parameters to forward by
running the calling convention algorithm until a non-register location
is returned. However, assigning a vector memory location has the side
effect of increasing the function's stack alignment. We shouldn't
increase the stack alignment when we are only looking for register
parameters, so this change conditionalizes it.
llvm-svn: 258442
Include the needed headfile to fix the buildbot failure due to r258420 [PGO] Passmanagerbuilder change that enable IR level PGO instrumentation.
llvm-svn: 258423
This patch includes the passmanagerbuilder change that enables IR level PGO instrumentation. It adds two passmanagerbuilder options: -profile-generate=<profile_filename> and -profile-use=<profile_filename>. The new options are primarily for debug purpose.
Reviewers: davidxl, silvas
Differential Revision: http://reviews.llvm.org/D15828
llvm-svn: 258420
Summary:
And use it in PPCLoopDataPrefetch.cpp.
@hfinkel, please let me know if your preference would be to preserve the
ppc-loop-prefetch-cache-line option in order to be able to override the
value of TTI::getCacheLineSize for PPC.
Reviewers: hfinkel
Subscribers: hulx2000, mcrosier, mssimpso, hfinkel, llvm-commits
Differential Revision: http://reviews.llvm.org/D16306
llvm-svn: 258419
Summary:
This is now the same as the behaviour of the GNU assembler. This was done
as it is required in order to build the Linux kernel with the integrated
assembler enabled.
Reviewers: dsanders, vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D13594
llvm-svn: 258400