1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

124370 Commits

Author SHA1 Message Date
Craig Topper
233dd30406 [X86] int_x86_avx2_permps and X86ISD::VPERMV should take an integer vector for its shuffle indices.
llvm-svn: 254269
2015-11-29 22:53:22 +00:00
Dan Gohman
d0486e9b93 [WebAssembly] Delete unused functions. NFC.
llvm-svn: 254268
2015-11-29 22:48:57 +00:00
Dan Gohman
d9b4e1da4b [WebAssembly] Minor clang-format and selected clang-tidy cleanups. NFC.
llvm-svn: 254267
2015-11-29 22:32:02 +00:00
Sanjay Patel
1fcfdccb6c fix typos in comments; NFC
llvm-svn: 254266
2015-11-29 22:09:34 +00:00
Davide Italiano
ae7cdf685f [SimplifyLibCalls] Don't crash if the function doesn't have a name.
llvm-svn: 254265
2015-11-29 21:58:56 +00:00
Davide Italiano
75c47db0da [SimplifyLibCalls] Cross out implemented transformations.
llvm-svn: 254264
2015-11-29 21:00:43 +00:00
Davide Italiano
85963c8ad6 [SimplifyLibCalls] Tranform log(pow(x, y)) -> y*log(x).
This one is enabled only under -ffast-math. There are cases where the
difference between the value computed and the correct value is huge
even for ffast-math, e.g. as Steven pointed out:

x = -1, y = -4
log(pow(-1), 4) = 0
4*log(-1) = NaN

I checked what GCC does and apparently they do the same optimization
(which result in the dramatic difference). Future work might try to
make this (slightly) less worse.

Differential Revision:	http://reviews.llvm.org/D14400

llvm-svn: 254263
2015-11-29 20:58:04 +00:00
Diego Novillo
d74fdde81f SamplePGO - Do not use std::to_string in diagnostics.
This fixes buildbots in systems that std::to_string is not present. It
also tidies the output of the diagnostic to render doubles a bit better
(thanks Ben Kramer for help with string streams and format).

llvm-svn: 254261
2015-11-29 18:23:26 +00:00
Craig Topper
268b85343f Use a lambda instead of std::bind and std::mem_fn I introduced in r254242. NFC
llvm-svn: 254260
2015-11-29 18:05:22 +00:00
Simon Pilgrim
a5289493dc [X86][SSE] Added support for lowering to ADDSUBPS/ADDSUBPD with commuted inputs
We could already recognise shuffle(FSUB, FADD) -> ADDSUB, this allow us to recognise shuffle(FADD, FSUB) -> ADDSUB by commuting the shuffle mask prior to matching.

llvm-svn: 254259
2015-11-29 16:41:04 +00:00
Rafael Espindola
94e602fadd Add a passing test.
When a comdat is discarded, any globals defined in it become undefined.

llvm-svn: 254258
2015-11-29 15:52:12 +00:00
Rafael Espindola
cb0eb08c74 Don't depend on the order the IR is copied.
llvm-svn: 254257
2015-11-29 15:22:49 +00:00
Rafael Espindola
731dd51848 Don't depend on the order the IR is copied.
llvm-svn: 254256
2015-11-29 15:08:39 +00:00
Rafael Espindola
da72729200 Make this test less strict.
We just want to test what is copied, no the order.

llvm-svn: 254255
2015-11-29 14:53:06 +00:00
Rafael Espindola
19b681ca64 Simplify. NFC.
llvm-svn: 254254
2015-11-29 14:33:06 +00:00
Igor Breger
31205fdf6a AVX512:Implemented encoding for the vmovq.s instruction.
Differential Revision: http://reviews.llvm.org/D14810

llvm-svn: 254248
2015-11-29 07:41:26 +00:00
Craig Topper
f93f4fa48a Remove an intermediate lambda. NFC
llvm-svn: 254246
2015-11-29 05:38:08 +00:00
Xinliang David Li
2f7575a9de Minor code cleanups
- Add const keyword
 - fix code comments
 - move forward decl to the common file

llvm-svn: 254244
2015-11-29 04:52:34 +00:00
Craig Topper
0aa3cc5a39 Remove unnecessary intermediate lambda. NFC
llvm-svn: 254243
2015-11-29 04:37:14 +00:00
Craig Topper
021094c82b [SelectionDAG] Use std::any_of instead of a manually coded loop. NFC
llvm-svn: 254242
2015-11-29 04:37:11 +00:00
Rafael Espindola
6e56ab6340 Correctly handle llvm.global_ctors merging.
We were not handling the case where an entry must be dropped and the
destination module has no llvm.global_ctors.

llvm-svn: 254241
2015-11-29 03:29:42 +00:00
Rafael Espindola
cff4b2f76a Fix a crash when writing merged bitcode.
Playing with mutateType in here was making getValueType and getType
incompatible.

llvm-svn: 254240
2015-11-29 03:21:30 +00:00
Davide Italiano
b0e9d52803 [SimplifyLibCalls] Use any_of(). Suggested by David Blaikie!
llvm-svn: 254239
2015-11-28 22:27:48 +00:00
Benjamin Kramer
cfb0838f67 [SimplifyLibCalls] Fix inverted condition that lead to an uninitialized memory read below.
Found by msan!

llvm-svn: 254238
2015-11-28 21:43:12 +00:00
Simon Pilgrim
a25f6b7f44 [X86][AVX] Regenerate ADDSUB tests
Tidied up triple and regenerate tests using update_llc_test_checks.py

llvm-svn: 254237
2015-11-28 19:20:49 +00:00
Xinliang David Li
1c888b8d73 [PGO] Move value profile format related structures and APIs to common file
This is the last step to enable profile runtime to share the same value prof
data format and reader/writer code with llvm host tools. The VP related 
data structures are moved to a section in InstrProfData.inc enabled with macro
INSTR_PROF_VALUE_PROF_DATA, and common API implementations are enabled with
INSTR_PROF_COMMON_API_IMPL. There should be no functional change.

llvm-svn: 254235
2015-11-28 19:07:09 +00:00
Renato Golin
c0ff495fd9 Revert "[ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM."
This reverts commit r254201 and r254202, as it broke test-suite,
self-hosting and sanitizer tests on ARM buildbots.

llvm-svn: 254234
2015-11-28 17:23:46 +00:00
Simon Pilgrim
b81b924383 [X86][FMA] Added 512-bit tests to match 128/256-bit tests coverage
As discussed on D14909

llvm-svn: 254233
2015-11-28 16:04:24 +00:00
Simon Pilgrim
00dd5461a1 [X86][FMA] More thorough FMA tests
Added FMADD/FMSUB/FNMADD/FNMSUB tests for all types

Added load folding tests for 512-bit vectors

NOTE: Many of the AVX512 FMA instructions don't yet commute/fold correctly

As discussed on D14909

llvm-svn: 254232
2015-11-28 14:28:44 +00:00
Simon Pilgrim
5bec33202c [X86][AVX2] Tidied up PBROADCAST tests
Tidied up triple and regenerate tests using update_llc_test_checks.py

llvm-svn: 254231
2015-11-28 14:15:40 +00:00
NAKAMURA Takumi
29848f1d25 llvm/test/CodeGen/SystemZ/alloca-04.ll REQUIRES asserts due to -debug-pass.
llvm-svn: 254230
2015-11-28 13:05:49 +00:00
Jonas Paulsson
4e06f54193 [Stack realignment] Handling of aligned allocas.
This patch implements dynamic realignment of stack objects for targets
with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo
is changed so that for a target that has StackRealignable set to
false, over-aligned static allocas are considered to be variable-sized
objects and are handled with DYNAMIC_STACKALLOC nodes.

It would be good to group aligned allocas into a single big alloca as
an optimization, but this is yet todo.

SystemZ benefits from this, due to its stack frame layout.

New tests SystemZ/alloca-03.ll for aligned allocas, and
SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions.

Review and help from Ulrich Weigand and Hal Finkel.

llvm-svn: 254227
2015-11-28 11:02:32 +00:00
Craig Topper
ed0259dd22 Use range-based for loops. NFC
llvm-svn: 254222
2015-11-28 08:23:04 +00:00
Craig Topper
3b1fed5c86 [TableGen] Use SmallString instead of std::string to build up a string to avoid heap allocations. NFC
llvm-svn: 254221
2015-11-28 08:23:02 +00:00
Xinliang David Li
8167574a24 [PGO] Add return code for vp rt record init routine to indicate error condition
llvm-svn: 254220
2015-11-28 05:47:34 +00:00
Xinliang David Li
16f0d8f3a9 [PGO] Allow value profile writer interface to allocated target buffer
Raw profile writer needs to write all data of one kind in one continuous block,
so the buffer needs to be pre-allocated and passed to the writer method in
pieces for function profile data. The change adds the support for raw value data
writing.

llvm-svn: 254219
2015-11-28 05:37:01 +00:00
Xinliang David Li
c87d1d6b3e Function name cleanup (NFC)
llvm-svn: 254218
2015-11-28 05:06:00 +00:00
Xinliang David Li
7cdfa09575 [PGO] Extract VP data integrity check code into a helper function (NFC)
llvm-svn: 254217
2015-11-28 04:56:07 +00:00
Keno Fischer
39e3d1456f [autoconf] Fix MinGW build
This is the autoconf analog of r251201. I realize autoconf is
deprecated, but while it's in tree, it should at least be kept working.

Also add the deprecation message to configure.ac such that AutoRegen
actually picks ip up.

llvm-svn: 254215
2015-11-28 00:54:12 +00:00
Rafael Espindola
5718bba1b1 Pass .ll directly to llvm-link.
llvm-svn: 254214
2015-11-27 23:47:15 +00:00
Rafael Espindola
337e7e7a9d Pass .ll directly to llvm-link
llvm-svn: 254213
2015-11-27 23:21:45 +00:00
Diego Novillo
d08de97276 SamplePGO - Add initial support for inliner annotations.
This adds two thresholds to the sample profiler to affect inlining
decisions: the concept of global hotness and coldness.

Functions that have accumulated more than a certain fraction of samples at
runtime, are annotated with the InlineHint attribute. Conversely,
functions that accumulate less than a certain fraction of samples, are
annotated with the Cold attribute.

This is very similar to the hints emitted by Clang when using
instrumentation profiles.

Notice that this is a very blunt instrument. A function may have
globally collected a significant fraction of samples, but that does not
necessarily mean that every callsite for that function is hot.

Ideally, we would annotate each callsite with the samples collected at
that callsite. This way, the inliner can incorporate all these weights
into its cost model.

Once the inliner offers this functionality, we can change the hints
emitted here to a more precise per-callsite annotation. For now, this is
providing some measure of speedups with our internal benchmarks. I've
observed speedups of up to 23% (though the geo mean is about 3%). I expect
these numbers to improve as the inliner gets better annotations.

llvm-svn: 254212
2015-11-27 23:14:51 +00:00
Diego Novillo
c52a667205 SamplePGO - Fix default threshold for hot callsites.
Based on testing of internal benchmarks, I'm lowering this threshold to
a value of 0.1%.  This means that SamplePGO will respect 99.9% of the
original inline decisions when following a profile.

The performance difference is noticeable in some tests. With the
previous threshold, the speedups over baseline -O2 was about 0.63%. With
the new default, the speedups are around 3% on average.

The point of this threshold is not to do more aggressive inlining. When
an inlined callsite crosses this threshold, SamplePGO will redo the
inline decision so that it can better apply the input profile.

By respecting most original inline decisions, we can apply more of the
input profile because the shape of the code follows the profile more
closely.

In the next series, I'll be looking at adding some inline hints for the
cold callsites and for toplevel functions that are hot/cold as well.

llvm-svn: 254211
2015-11-27 23:14:49 +00:00
Rafael Espindola
edc5f030b0 Modernize the test a bit
Remove out of date comment.
Pass .ll files to llvm-link.

llvm-svn: 254210
2015-11-27 23:13:17 +00:00
Rafael Espindola
4a063d8813 Simplify the linking of recursive data.
Now the ValueMapper has two callbacks. The first one maps the
declaration. The ValueMapper records the mapping and then materializes
the body/initializer.

llvm-svn: 254209
2015-11-27 20:28:19 +00:00
Artyom Skrobov
5d9b865f7a Follow-up fix for r254201
llvm-svn: 254202
2015-11-27 16:20:34 +00:00
Artyom Skrobov
7b957b4af2 [ARM] Generate ABI_optimization_goals build attribute, as described in the ARM ARM.
Summary:
Since this build attribute corresponds to a whole module, and
different functions in a module may differ in the optimizations
enabled for them, this attribute is emitted after all functions,
and only in the case that the optimization goals for all
functions match.

Reviewers: logan, hans

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D14934

llvm-svn: 254201
2015-11-27 15:30:51 +00:00
Oliver Stannard
94a98bdf3b [AArch64] Add ARMv8.2-A FP16 scalar instructions
ARMv8.2-A adds 16-bit floating point versions of all existing VFP
floating-point instructions. This is an optional extension, so all of
these instructions require the FeatureFullFP16 subtarget feature.

Most of these instructions are the same as the 32- and 64-bit versions,
but with the type field (bits 23-22) set to 0b11. Previously the top bit
of the size field was always 0, so the instruction classes only provided
a 1-bit size field, which I have widened to 2 bits.

Differential Revision: http://reviews.llvm.org/D15014

llvm-svn: 254198
2015-11-27 13:04:48 +00:00
Adhemerval Zanella
fd321f0647 [sanitizer] [dfsan] Unify aarch64 mapping
This patch changes the DFSan instrumentation for aarch64 to instead
of using fixes application mask defined by SANITIZER_AARCH64_VMA
to read the application shadow mask value from compiler-rt. The value
is initialized based on runtime VAM detection.

Along with this patch a compiler-rt one will also be added to export
the shadow mask variable.

llvm-svn: 254196
2015-11-27 12:42:39 +00:00
Davide Italiano
916150a366 [SimplifyLibCalls] Use range-based loop. NFC.
llvm-svn: 254193
2015-11-27 08:05:40 +00:00