This adds an SimplifyLibCalls case which converts the special __sinpi and
__cospi (float & double variants) into a __sincospi_stret where appropriate to
remove duplicated work.
Patch by Tim Northover
llvm-svn: 193943
Doing this with a hash map doesn't change behavior and avoids calling
isIdenticalTo O(n^2) times. This should probably eventually move into a utility
class shared with EarlyCSE and the limited CSE in the SLPVectorizer.
llvm-svn: 193926
I hit some problems with future work due to the member subprogram of
'a_b's type having a subprogram (an implicit default ctor, !52 in the
pre-commit source) with no name. Clang now generates a name for such a
function but in this case doesn't even emit debug info for it as it is
unused (Clang never emits the body of the ctor, instead just emitting
memset if needed).
llvm-svn: 193892
When the loop vectorizer was part of the SCC inliner pass manager gvn would
run after the loop vectorizer followed by instcombine. This way redundancy
(multiple uses) were removed and instcombine could perform scalarization on the
induction variables. Having moved the loop vectorizer to later we no longer run
any form of redundancy elimination before we perform instcombine. This caused
vectorized induction variables to survive that did not before.
On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops.
This should also help lpbench and paq8p.
I ran a Release (without Asserts) build over the test-suite and did not see any
negative impact on compile time.
radar://15339680
llvm-svn: 193891
The point is to ensure that the attribute in question
(DW_AT_data_member_location) is associated with the prior tag, so ensure
that we don't see another tag starting between the intended tag and the
desired attribute.
llvm-svn: 193884
In a failed attempt to allow the gnu-public-names.ll test case to not
hardcode the size of the unit that the pubnames section referred to I've
at least managed to have unit headers and pubnames headers print out in
a similar style.
This failed to achieve the desired goal because the header in a unit
specifies the length of the unit without the length element of the
header whereas the length in the pubnames includes this element, so the
numbers are off by 4 bytes. I don't know of any arithmetic powers in
FileCheck so the test case can't simply say "CU_LENGTH + 4".
llvm-svn: 193872
linkonce_odr_auto_hide was in incomplete attempt to implement a way
for the linker to hide symbols that are known to be available in every
TU and whose addresses are not relevant for a particular DSO.
It was redundant in that it all its uses are equivalent to
linkonce_odr+unnamed_addr. Unlike those, it has never been connected
to clang or llvm's optimizers, so it was effectively dead.
Given that nothing produces it, this patch just nukes it
(other than the llvm-c enum value).
llvm-svn: 193865
Add a Virtualization ARM subtarget feature along with adding proper build
attribute emission for Tag_Virtualization_use (encodes Virtualization and
TrustZone) and Tag_MPextension_use.
Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to
something that is more maintainable. This changes the focus of this
testcase away from testing CPU defaults (which is tested elsewhere), onto
specifically testing that attributes are encoded correctly.
llvm-svn: 193859
Fix Tag_ABI_HardFP_use build attribute to handle single precision FP,
replace deprecated Tag_ABI_HardFP_use value of 3 with 0 and also add
some tests for Tag_ABI_VFP_args.
llvm-svn: 193856
This adds another heuristic to BPI, similar to the existing heuristic that
considers (x == 0) unlikely to be true. As suggested in the PACT'98 paper by
Deitrich, Cheng, and Hwu, -1 is often used to indicate an invalid index, and
equality comparisons with -1 are also unlikely to succeed. Local
experimentation supports this hypothesis: This yields a 1-2% speedup in the
test-suite sqlite benchmark on the PPC A2 core, with no significant
regressions.
llvm-svn: 193855
When a dependence check fails we can still try to vectorize loops with runtime
array bounds checks.
This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the
scalar performance on a corei7-avx.
radar://15339680
llvm-svn: 193853
This commit only changes comments and documentation in OCaml bindings. The official name of the language is OCaml, and the usage is now consistent.
Patch by Peter Zotov
llvm-svn: 193836