1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 12:33:33 +02:00
Commit Graph

128307 Commits

Author SHA1 Message Date
Teresa Johnson
904406f866 Fix new gold test to specify emulation mode.
The thinlto_linkonceresolution.ll gold linker test introduced in r262727
included a target triple, but didn't set the emulation mode, which is
necessary since the default linker target may be different.

Patch by H.J. Lu

llvm-svn: 262745
2016-03-04 21:19:08 +00:00
Dan Gohman
6d636cd56e [WebAssembly] Add another possible code-size optimization to README.txt
llvm-svn: 262740
2016-03-04 20:09:57 +00:00
Renato Golin
52bc44295a [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262738
2016-03-04 19:19:36 +00:00
Tom Stellard
c656bfbad2 AMDGPU/SI: Add support for spiling SGPRs to scratch buffer
Summary:
This is necessary for when we run out of VGPRs and can no
longer use v_{read,write}_lane for spilling SGPRs.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17592

llvm-svn: 262732
2016-03-04 18:31:18 +00:00
Teresa Johnson
e5e9e689da Fix bot failure from r262721: unintented change in gold-plugin save-temps
The split code gen task ID should not be appended to save-temps output
file when the parallelism factor is 1 (not actually splitting).

llvm-svn: 262731
2016-03-04 18:16:00 +00:00
Sanjoy Das
e0159ebc33 [Statepoint docs] Delete trailing whitespace
llvm-svn: 262730
2016-03-04 18:14:09 +00:00
Tom Stellard
eaca07fc18 AMDGPU/SI: Enable frame index scavenging during PrologEpilogueInserter
Summary:
This allows us to use virtual registers when we need extra registers
for inserting spill instructions in SIRegisterInfo:eliminateFrameIndex().

Once all the frame indices have been eliminated, the
PrologEpilogueInserter does an extra pass over the program to replace
all virtual registers with physical ones.

This allows us to make more efficient use of our emergency spill slots,
so we only need to create one.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17591

llvm-svn: 262728
2016-03-04 18:02:01 +00:00
Teresa Johnson
215f45b06e [ThinLTO] Ensure prevailing linkonce emitted as weak in ThinLTO backends
Summary:
Since IR files are all compiled into separate independent object files
in ThinLTO mode, the prevailing linkonce symbols must be emitted in its
object file even if it is no longer referenced there, e.g. if no
references remain in the module after inlining, since it may be
referenced by another ThinLTO compiled object file. This is done by
changing LDPR_PREVAILING_DEF_IRONLY* symbols to LDPR_PREVAILING_DEF,
which converts the prevailing linkonce to weak. We also don't need the
other prevailing IRONLY handling for internalization, which is not
currently performed for ThinLTO.

Test case included.

Reviewers: davidxl, rafael

Subscribers: rafael, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16173

llvm-svn: 262727
2016-03-04 17:48:35 +00:00
Krzysztof Parzyszek
0eb70ee94a [Hexagon] Fix lowering of calls with the return type of i1
This fixes an assertion in test/CodeGen/Hexagon/ifcvt-edge-weight.ll
when run with -debug-only=isel

llvm-svn: 262726
2016-03-04 17:38:05 +00:00
Zoran Jovanovic
ed3c27bd7d [mips][microMIPS] Prevent usage of OR16_MMR6 instruction when code for microMIPS is generated.
Author: milena.vujosevic.janicic
Reviewers: dsanders
Differential Revision: http://reviews.llvm.org/D17373

llvm-svn: 262725
2016-03-04 17:34:31 +00:00
Teresa Johnson
b00a3e6034 [ThinLTO] Launch importing backends in parallel threads from gold plugin
Summary:
Launch ThinLTO backends (LTO and codegen pipelines with importing) in
parallel using a ThreadPool, after creating the combined index.
The number of threads is controlled by the existing -jobs gold plugin
option, or the hardware concurrency if not specified.

The old behavior of exiting after creating the combined index can be
invoked via a new thinlto-index-only plugin option.

This commit involves just the ThinLTO-specific pieces of D15390, the NFC
and other restructuring pieces were committed independently:
  r262677: Add hardware_concurrency interface to llvm::thread (NFC)
  r262719: Change split code gen to use ThreadPool
  r262721: Refactor gold-plugin codegen to prepare for ThinLTO threads (NFC)

Reviewers: pcc, joker.eph, rafael

Subscribers: rafael, davidxl, llvm-commits, joker.eph

Differential Revision: http://reviews.llvm.org/D15390

llvm-svn: 262724
2016-03-04 17:06:02 +00:00
Teresa Johnson
e2461d628d Refactor gold-plugin codegen to prepare for ThinLTO threads (NFC)
This is the NFC part remaining from D15390, which refactors the
current codegen() into a CodeGen class with various modular methods and
other helper functions that will be used by the follow-on ThinLTO piece.

llvm-svn: 262721
2016-03-04 16:36:06 +00:00
Teresa Johnson
31ee1fa4e9 Change split code gen to use ThreadPool
Part of D15390.

llvm-svn: 262719
2016-03-04 15:39:13 +00:00
Simon Pilgrim
5516ba8111 [X86][AVX512] Added some basic X86ISD::VPERMV3 shuffle combining tests
None of these actually combine yet as we haven't enabled X86ISD::VPERMV3 for target shuffle combining

llvm-svn: 262718
2016-03-04 15:19:42 +00:00
Sam Kolton
5660f4b2b1 Test commit access
llvm-svn: 262714
2016-03-04 12:29:14 +00:00
Simon Pilgrim
910b95e254 [X86][SSSE3] Added combine test for unary shuffle (pshufb) only referencing elements from the second input of a binary shuffle (punpcklbw)
llvm-svn: 262710
2016-03-04 11:15:23 +00:00
Valery Pykhtin
c7ff55dc02 test commit
llvm-svn: 262709
2016-03-04 10:59:50 +00:00
Benjamin Kramer
585cc07d12 Make headers self-contained again.
llvm-svn: 262702
2016-03-04 10:49:30 +00:00
Nikolay Haustov
3529c0cbe0 AMDGPU/SI: add llvm.amdgcn.image.atomic.* intrinsics
These correspond to IMAGE_ATOMIC_* and are going to be used by Mesa for the
GL_ARB_shader_image_load_store extension.

Initial change by Nicolai H.hnle

Differential Revision: http://reviews.llvm.org/D17401

llvm-svn: 262701
2016-03-04 10:39:50 +00:00
Justin Bogner
6b5f8cc6e0 Annotate our undefined behaviour to sneak it past the sanitizers
We have known UB in some ilists where we static cast half nodes to
(larger) derived types and use the address. See llvm.org/PR26753.

This needs to be fixed, but in the meantime it'd be nice if running
ubsan didn't complain. This adds annotations in the two places where
ubsan complains while running check-all of a sanitized clang build.

llvm-svn: 262683
2016-03-04 01:52:47 +00:00
Easwaran Raman
a916b2e62f Fix a memory leak.
llvm-svn: 262682
2016-03-04 01:18:40 +00:00
Justin Bogner
7c056deabe CodeGen: Tune the SmallVector size in LiveRange
The vast majority of LiveRanges (ie, 4/5) have exactly 1 segment and 1
value number, and a good chunk of the rest have 2 of each, so
allocating space for 4 is wasteful. This is especially noticeable when
dealing with a very large number of vregs, and I have an internal case
where dropping this to 2 shaves over 5% off of peak memory when
compiling a particularly large function.

llvm-svn: 262681
2016-03-04 00:58:39 +00:00
Easwaran Raman
587391856c Fix a use-after-free bug introduced in r262636
llvm-svn: 262679
2016-03-04 00:44:01 +00:00
Teresa Johnson
be253c0570 Add hardware_concurrency interface to llvm::thread (NFC)
Part of D15390.

llvm-svn: 262677
2016-03-04 00:25:54 +00:00
Evgeniy Stepanov
f31dabc41a [gold] Handle modules that are not included in the link.
Gold has a newly added LDPT_GET_SYMBOLS_V3 callback that can
distinguish between a module that is not included in the link, and
one that is included but has its entire interface preempted by others.

Fixes PR26674.

llvm-svn: 262676
2016-03-04 00:23:29 +00:00
Easwaran Raman
da6baa5c34 Fix memory leak in tests.
llvm-svn: 262674
2016-03-03 23:55:41 +00:00
Mike Aizatsky
f9401b724f [libfuzzer] arbitrary function adapter.
The adapter automates converting sequence of bytes into arbitrary
arguments.

Differential Revision: http://reviews.llvm.org/D17829

llvm-svn: 262673
2016-03-03 23:45:29 +00:00
Philip Reames
347970f4ae [docs] Add a description of current problem areas to the statepoint docs
Triggered by a question on llvm-dev about status

llvm-svn: 262671
2016-03-03 23:24:44 +00:00
Guozhi Wei
a7fc8e012e [InstCombine] Combine A->B->A BitCast
This patch enhances InstCombine to handle following case:

        A  ->  B    bitcast
        PHI
        B  ->  A    bitcast

llvm-svn: 262670
2016-03-03 23:21:38 +00:00
NAKAMURA Takumi
57f709ace4 llvm/test/CodeGen/ARM/rem_crash.ll: Avoid unsupported targets to specify explicit triple.
We will see it for targeting win32;

  LLVM ERROR: CPU: 'generic' does not support ARM mode execution!

llvm-svn: 262668
2016-03-03 22:38:39 +00:00
Kostya Serebryany
fc674d4015 [libFuzzer] when interrupted, call _Exit() instead of exit()
llvm-svn: 262667
2016-03-03 22:36:37 +00:00
Simon Pilgrim
c831916cbc [X86][AVX512BW] Fixed 512-bit PSHUFB shuffle mask decode and added combine test.
PSHUFB decoder was assuming that input was 128 or 256-bit vector only.

llvm-svn: 262661
2016-03-03 21:55:01 +00:00
Lang Hames
c3c1858604 [RuntimeDyld] Fix '_' stripping in RTDyldMemoryManager::getSymbolAddressInProcess.
The RTDyldMemoryManager::getSymbolAddressInProcess method accepts a
linker-mangled symbol name, but it calls through to dlsym to do the lookup (via
DynamicLibrary::SearchForAddressOfSymbol), and dlsym expects an unmangled
symbol name.

Historically we've attempted to "demangle" by removing leading '_'s on all
platforms, and fallen back to an extra search if that failed. That's broken, as
it can cause symbols to resolve incorrectly on platforms that don't do mangling
if you query '_foo' and the process also happens to contain a 'foo'.

Fix this by demangling conditionally based on the host platform. That's safe
here because this function is specifically for symbols in the host process, so
the usual cross-process JIT looking concerns don't apply.

M    unittests/ExecutionEngine/ExecutionEngineTest.cpp
M    lib/ExecutionEngine/RuntimeDyld/RTDyldMemoryManager.cpp

llvm-svn: 262657
2016-03-03 21:23:15 +00:00
Philip Reames
b31e6a2515 [ValueTracking] "constant fold" an experimental hidden option
llvm-svn: 262648
2016-03-03 19:50:32 +00:00
Philip Reames
f86417771d [ValueTracking] Remove dead code from an old experiment
This experiment was originally about trying to use facts implied dominating conditions to infer more precise known bits.  While the compile time was found to be acceptable on several large code bases, we never found sufficiently profitable examples to justify turning on the code by default.  Given this, it's time to abandon the experiment.  

Several folks have commented that they've found this useful for experimentation, but nothing has come of those experiments.  Given how easy the patch is to apply, there's no reason to leave the code in tree.  

For anyone interested in further investigation in this area, I recommend finding the summary email I sent on one of the original review threads.  In particular, I now believe the use-list based approach is strictly worse than the dom-tree-walking approach.  

llvm-svn: 262646
2016-03-03 19:44:06 +00:00
Sanjay Patel
fe4b2c4bc6 [InstCombine] transform bitcasted bitwise logic ops with constants (PR26702)
Given that we're not actually reducing the instruction count in the included
regression tests, I think we would call this a canonicalization step.

The motivation comes from the example in PR26702:
https://llvm.org/bugs/show_bug.cgi?id=26702

If we hoist the bitwise logic ahead of the bitcast, the previously unoptimizable
example of:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  %not = xor <4 x i32> %lobit, <i32 -1, i32 -1, i32 -1, i32 -1>
  %bc = bitcast <4 x i32> %not to <2 x i64>
  %notnot = xor <2 x i64> %bc, <i64 -1, i64 -1>
  %bc2 = bitcast <2 x i64> %notnot to <4 x i32>
  ret <4 x i32> %bc2
}

Simplifies to the expected:

define <4 x i32> @is_negative(<4 x i32> %x) {
  %lobit = ashr <4 x i32> %x, <i32 31, i32 31, i32 31, i32 31>
  ret <4 x i32> %lobit
}

Differential Revision: http://reviews.llvm.org/D17583

llvm-svn: 262645
2016-03-03 19:19:04 +00:00
Easwaran Raman
c5bba219e7 Fix breakage caused by r262636.
Use LLVM_ATTRIBUTE_UNUSED instead of __attribute_((unused))

llvm-svn: 262643
2016-03-03 18:53:20 +00:00
Sanjoy Das
fb176dec48 [ConstantRange] Rename test; NFC
llvm-svn: 262640
2016-03-03 18:31:33 +00:00
Sanjoy Das
c51e182cd8 [SCEV] Prove no-overflow via constant ranges
Exploit ScalarEvolution::getRange's newly acquired smartness (since
r262438) by using that to infer nsw and nuw when possible.

llvm-svn: 262639
2016-03-03 18:31:29 +00:00
Sanjoy Das
7b29c5b2d5 [SCEV] Be less eager about demoting zexts to sexts
After r262438 we can have provably positive NSW SCEV expressions whose
zero extensions cannot be simplified (since r262438 makes SCEV better at
computing constant ranges).  This means demoting sexts of positive add
recurrences eagerly can result in an unsimplified zero extension where
we could have had a simplified sign extension.  This change fixes the
issue by teaching SCEV to demote sext of a positive SCEV expression to a
zext only if the sext could not be simplified.

llvm-svn: 262638
2016-03-03 18:31:23 +00:00
Sanjoy Das
53336f6738 [ConstantRange] Generalize makeGuaranteedNoWrapRegion to work on ranges
This will be used in a later patch to ScalarEvolution.  Right now only
the unit tests exercise the newly added code.

llvm-svn: 262637
2016-03-03 18:31:16 +00:00
Easwaran Raman
ff8cc9e544 Infrastructure for PGO enhancements in inliner
This patch provides the following infrastructure for PGO enhancements in inliner:

Enable the use of block level profile information in inliner
Incremental update of block frequency information during inlining
Update the function entry counts of callees when they get inlined into callers.

Differential Revision: http://reviews.llvm.org/D16381

llvm-svn: 262636
2016-03-03 18:26:33 +00:00
Simon Pilgrim
a9b6ea15aa [X86][AVX] Better support for the variable mask form of VPERMILPD/VPERMILPS
The variable mask form of VPERMILPD/VPERMILPS were only partially implemented, with much of it still performed as an intrinsic.

This patch properly defines the instructions in terms of X86ISD::VPERMILPV, permitting the opcode to be easily combined as a target shuffle.

Differential Revision: http://reviews.llvm.org/D17681

llvm-svn: 262635
2016-03-03 18:13:53 +00:00
Dehao Chen
3ccc40ea9e Use LineLocation instead of CallsiteLocation to index callsite profile.
Summary: With discriminator, LineLocation can uniquely identify a callsite without the need to specifying callee name. Remove Callee function name from the key, and put it in the value (FunctionSamples).

Reviewers: davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17827

llvm-svn: 262634
2016-03-03 18:09:32 +00:00
Simon Pilgrim
3f8d3f6e89 [X86] Tidied up 256-bit -> 2 x 128-bit vector shift extraction.
lowerShift was manually splitting BUILD_VECTOR cases when it could just call Extract128BitVector which does this anyway.

llvm-svn: 262633
2016-03-03 17:54:35 +00:00
Simon Pilgrim
dce6c6bcd0 [X86] Pulled out repeated code testing for constant vector shift amount. NFCI.
llvm-svn: 262631
2016-03-03 17:35:43 +00:00
Amjad Aboud
19124e966f MCU target has its own ABI, however X86 interrupt handler calling convention overrides this ABI.
Fixed the ordering to check first for X86 interrupt handler then for MCU target.

Differential Revision: http://reviews.llvm.org/D17801

llvm-svn: 262628
2016-03-03 17:17:54 +00:00
Ahmed Bougacha
24ce82aeec [X86] Don't assume that shuffle non-mask operands starts at #0.
That's not the case for VPERMV/VPERMV3, which cover all possible
combinations (the C intrinsics use a different order; the AVX vs
AVX512 intrinsics are different still).

Since:
  r246981 AVX-512: Lowering for 512-bit vector shuffles.
VPERMV is recognized in getTargetShuffleMask.

This breaks assumptions in most callers, as they expect
the non-mask operands to start at index 0.
VPERMV has the mask as operand #0; VPERMV3 has it in the middle.

Instead of the faulty assumption, have getTargetShuffleMask return
its operands as well.

One alternative we considered was to change the operand order of
VPERMV, but we agreed to stick to the instruction order, as there
are more AVX512 weirdness to cover (vpermt2/vpermi2 in particular).

Differential Revision: http://reviews.llvm.org/D17041

llvm-svn: 262627
2016-03-03 16:53:50 +00:00
Matthew Simpson
a066e2421a [LoopUtils, LV] Fix PR26734
The vectorization of first-order recurrences (r261346) caused PR26734. When
detecting these recurrences, we need to ensure that the previous value is
actually defined inside the loop. This patch includes the fix and test case.

llvm-svn: 262624
2016-03-03 16:12:01 +00:00
Sanjay Patel
0d79c5acf8 [AArch64] fold 'isPositive' vector integer operations (PR26819)
This is one of the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26819

Shift and negate is what InstCombine prefers to produce (and I tried to make it do more of that
in http://reviews.llvm.org/rL262424 ), so we should recognize that pattern as something that might
come from autovectorization even if it's unlikely to be produced from C NEON intrinsics.

The patch is based on the x86 equivalent:
http://reviews.llvm.org/rL262036

Differential Revision: http://reviews.llvm.org/D17834

llvm-svn: 262623
2016-03-03 15:56:08 +00:00