This fixes two bugs is lib/Object that the use in llvm-ar found:
* In OS X created archives, the name can be padded with nulls. Strip them.
* In the constructor, remember the first non special member and use that in
begin_children. This makes sure we skip all special members, not just the
first one.
The change to llvm-ar itself consist of
* Using lib/Object for reading archives instead of ArchiveReader.cpp.
* Writing the modified archive directly, instead of creating an in memory
representation.
The old Archive library was way more general than what is needed, as can
be seen by the diffstat of this patch.
Having llvm-ar using lib/Object now opens the way for creating regular symbol
tables for both native objects and bitcode files so that we can use those
archives for LTO.
llvm-svn: 186197
Fixes a 35% degradation compared to unvectorized code in
MiBench/automotive-susan and an equally serious regression on a private
image processing benchmark.
radar://14351991
llvm-svn: 186188
Address calculation for gather/scather in vectorized code can incur a
significant cost making vectorization unbeneficial. Add infrastructure to add
cost.
Tests and cost model for targets will be in follow-up commits.
radar://14351991
llvm-svn: 186187
In particular:
movsbw %al, %ax --> cbtw
movswl %ax, %eax --> cwtl
movslq %eax, %rax --> cltq
According to Intel's manual those have the same performance characteristics but
come with a smaller encoding.
llvm-svn: 186174
against a constant."
This reverts commit r186107. It didn't handle wrapping arithmetic in the
loop correctly and thus caused the following C program to count from
0 to UINT64_MAX instead of from 0 to 255 as intended:
#include <stdio.h>
int main() {
unsigned char first = 0, last = 255;
do { printf("%d\n", first); } while (first++ != last);
}
Full test case and instructions to reproduce with just the -indvars pass
sent to the original review thread rather than to r186107's commit.
llvm-svn: 186152
Normal (sext (setcc ...)) sequences are optimised into
(select_cc ..., -1, 0) by DAGCombiner::visitSIGN_EXTEND.
However, this is deliberately not done for vectors, and after
vector type legalization we have (sext_inreg (setcc ...)) instead.
I wondered about trying to extend DAGCombiner to handle this case too,
but it seemed to be a loss on some other targets I tried, even those for
which SETCC isn't "legal" and SELECT_CC is.
llvm-svn: 186149
GPR and FPR constraints like "{r2}" and "{f2}" weren't handled correctly
because the name-to-regno mapping depends on the value type and
(because of that) the internal names in RegStrings are not the
same as the AsmName.
CC constraints like "{cc}" didn't work either because there was no
associated register class.
llvm-svn: 186148
If the source of these instructions is spilled we should load the destination.
If the destination is spilled we should store the source.
llvm-svn: 186147
Summary:
This patch adds explicit calling convention types for the Win64 and
System V/x86-64 ABIs. This allows code to override the default, and use
the Win64 convention on a target that wants to use SysV (and
vice-versa). This is needed to implement the `ms_abi` and `sysv_abi` GNU
attributes.
Reviewers:
CC:
llvm-svn: 186144
Before we could vectorize PHINodes scanning successors was a good way of finding candidates. Now we can vectorize the phinodes which is simpler.
llvm-svn: 186139
We had patterns to match v4i32 immAllZerosV -> V_SET0, but not patterns for
v8i16 (which occurs in the test case) or v16i8. The same was true for
V_SETALLONES (so I added the associated patterns for those as well).
Another bug found by llvm-stress.
llvm-svn: 186108
Patch by Michele Scandale!
Adds a special handling of the case where, during the loop exit
condition rewriting, the exit value is a constant of bitwidth lower
than the type of the induction variable: instead of introducing a
trunc operation in order to match correctly the operand types, it
allows to convert the constant value to an equivalent constant,
depending on the initial value of the induction variable and the trip
count, in order have an equivalent comparison between the induction
variable and the new constant.
llvm-svn: 186107
This fixes a bug (found by csmith) at -O0 where we attempt to create a RLWIMI
with an out-of-range operand. Most uses of the isRunOfOnes function are guarded
by a condition that the value is not zero. This was not true in two places, and
in both places a zero input would result in an out-of-rage MB value (= 32).
To fix this, isRunOfOnes returns false on a zero input (and I've remove one
now-redundant guard).
llvm-svn: 186101
We can vectorize them because in the case where we wrap in the address space the
unvectorized code would have had to access a pointer value of zero which is
undefined behavior in address space zero according to the LLVM IR semantics.
(Thank you Duncan, for pointing this out to me).
Fixes PR16592.
llvm-svn: 186088
- Coallocate entires for AttributeSetImpls and Nodes after the class itself.
- Remove mutable iterators from immutable classes.
- Remove unused context field from AttributeImpl.
- Derive Enum/Align/String attribute implementations from AttributeImpl instead
of having a whole new inheritance tree for them.
- Derive AlignAttributeImpl from EnumAttributeImpl.
llvm-svn: 186075
RISBG can handle some ANDs for which no AND IMMEDIATE exists.
It also acts as a three-operand AND for some cases where an
AND IMMEDIATE could be used instead.
It might be worth adding a pass to replace RISBG with AND IMMEDIATE
in cases where the register operands end up being the same and where
AND IMMEDIATE is smaller.
llvm-svn: 186072
RISBG has three 8-bit operands (I3, I4 and I5). I'd originally
restricted all three to 6 bits, since that's the only range we intended
to use at the time. However, the top bit of I4 acts as a "zero" flag for
RISBG, while the top bit of I3 acts as a "test" flag for RNSBG & co.
This patch therefore allows them to have the full 8-bit range.
I've left the fifth operand as a 6-bit value for now since the
upper 2 bits have no defined meaning.
llvm-svn: 186070
predecessors of the two blocks it is attempting to merge supply the
same incoming values to any phi in the successor block. This change
allows merging in the case where there is one or more incoming values
that are undef. The undef values are rewritten to match the non-undef
value that flows from the other edge. Patch by Mark Lacey.
llvm-svn: 186069
MF is normally initialized in AsmPrinter::SetupMachineFunction, but if the file
contains only globals (no functions), then we need this to be initialized
because, when encountering an error, lowerConstant() references it.
This should fix the non-deterministic failures of
test/CodeGen/X86/nonconst-static-iv.ll, etc.
llvm-svn: 186068
When computing currently-live registers, the register scavenger excludes undef
uses. As a result, undef uses are ignored when computing the restore points of
registers spilled into the emergency slots. While the register scavenger
normally excludes from consideration, when scavenging, registers used by the
current instruction, we need to not exclude undef uses. Otherwise, we might end
up requiring more emergency spill slots than we have (in the case where the
undef use *is* the currently-spilled register).
Another bug found by llvm-stress.
llvm-svn: 186067