1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 05:53:07 +01:00
Commit Graph

82304 Commits

Author SHA1 Message Date
Chen Li
cb288bc6fb [LoopUnswitch] Check OptimizeForSize before traversing over all basic blocks in current loop
Summary: This patch moves the check of OptimizeForSize before traversing over all basic blocks in current loop. If OptimizeForSize is set to true, no non-trivial unswitch is ever allowed. Therefore, the early exit will help reduce compilation time. This patch should be NFC. 

Reviewers: reames, weimingz, broune

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11997

llvm-svn: 244868
2015-08-13 05:24:29 +00:00
Ahmed Bougacha
d8d7836955 [CodeGen] Mark the promoted FCOPYSIGN result FP_ROUND as TRUNCating.
Now that we can properly promote mismatched FCOPYSIGNs (r244858), we
can mark the FP_ROUND on the result as truncating, to expose folding.

FCOPYSIGN doesn't change anything but the sign bit, so
  (fp_round (fcopysign (fpext a), b))
is equivalent to (modulo the sign bit):
  (fp_round (fpext a))
which is a no-op.

llvm-svn: 244862
2015-08-13 01:32:30 +00:00
Ahmed Bougacha
8e769a9269 [AArch64] Also custom-lowering mismatched vector/f16 FCOPYSIGN.
We can lower them using our cool tricks if we fpext/fptrunc the second
input, like we do for f32/f64.

Follow-up to r243924, r243926, and r244858.

llvm-svn: 244860
2015-08-13 01:13:56 +00:00
Ahmed Bougacha
9904f054b0 [CodeGen] Assert on getNode(FP_EXTEND) with a smaller dst type.
This would have caught the problem in r244858.

llvm-svn: 244859
2015-08-13 01:10:29 +00:00
Ahmed Bougacha
9d4c209fe9 [CodeGen] When Promoting, don't extend the 2nd FCOPYSIGN operand.
We don't care about its type, and there's even a combine that'll fold
away the FP_EXTEND if we let it run. However, until it does, we'll have
something broken like:
  (f32 (fp_extend (f64 v)))

Scalar f16 follow-up to r243924.

llvm-svn: 244858
2015-08-13 01:09:43 +00:00
Ahmed Bougacha
6f77d4d919 [CodeGen] Simplify getNode(*EXT/TRUNC) type size assert. NFC.
We already check that vectors have the same number of elements, we
don't need to use the scalar types explicitly: comparing the size of
the whole vector is enough.

llvm-svn: 244857
2015-08-13 01:08:48 +00:00
Rafael Espindola
de2ec4a63f There is only one saver of strings.
llvm-svn: 244854
2015-08-13 01:07:02 +00:00
Chandler Carruth
adb2df90fa [LIR] Make the LoopIdiomRecognize pass get analyses essentially the same
way as every other pass. This simplifies the code quite a bit and is
also more idiomatic! <ba-dum!>

llvm-svn: 244853
2015-08-13 01:03:26 +00:00
Chandler Carruth
3d8b7669f2 [LIR] Remove the dedicated class for popcount recognition and sink the
code into methods on LoopIdiomRecognize.

This simplifies the code somewhat and also makes it much easier to move
the analyses around. Ultimately, the separate class wasn't providing
significant value over methods -- it contained the precondition basic
block and the current loop. The current loop is already available and
the precondition block wasn't needed everywhere and is easy to pass
around.

In several cases I just moved things to be static functions because they
already accepted most of their inputs as arguments.

This doesn't fix the way we manage analyses yet, that will be the next
patch, but it already makes the code over 50 lines shorter.

No functionality changed.

llvm-svn: 244851
2015-08-13 00:44:29 +00:00
Rafael Espindola
7937d8a6c3 Return ErrorOr from FileOutputBuffer::create. NFC.
llvm-svn: 244848
2015-08-13 00:31:39 +00:00
Chandler Carruth
1c383505c5 [LIR] Move all the helpers to be private and re-order the methods in
a way that groups things logically. No functionality changed.

llvm-svn: 244845
2015-08-13 00:10:03 +00:00
Chandler Carruth
737d5e3ec5 [LIR] Remove the 'LIRUtils' abstraction which was unnecessary and adding
complexity.

There is only one function that was called from multiple locations, and
that was 'getBranch' which has a reasonable one-line spelling already:
dyn_cast<BranchInst>(BB->getTerminator). We could make this shorter, but
it doesn't seem to add much value. Instead, we should avoid calling it
so many times on the same basic blocks, but that will be in a subsequent
patch.

The other functions are only called in one location, so inline them
there, and take advantage of this to use direct early exit and reduce
indentation. This makes it much more clear what is being tested for, and
in fact makes it clear now to me that there are simpler ways to do this
work. However, this patch just does the mechanical inlining. I'll clean
up the functionality of the code to leverage loop simplified form more
effectively in a follow-up.

Despite lots of early line breaks due to early-exit, this is still
shorter than it was before.

llvm-svn: 244841
2015-08-12 23:55:56 +00:00
Chandler Carruth
a59073732a [LIR] Run clang-format over LoopIdiomRecognize in preparation for
a significant code cleanup here.

The handling of analyses in this pass is overly complex and can be
simplified significantly, but the right way to do that is to simplify
all of the code not just the analyses, and that'll require pretty
extensive edits that would be noisy with formatting changes mixed into
them.

llvm-svn: 244828
2015-08-12 23:06:37 +00:00
Chandler Carruth
97b830a9d8 [PM/AA] Remove the AliasDebugger pass.
This debugger was designed to catch places where the old update API was
failing to be used correctly. As I've removed the update API, it no
longer serves any purpose. We can introduce new debugging aid passes
around any future work w.r.t. updating AAs.

Note that I've updated the documentation here, but really I need to
rewrite the documentation to carefully spell out the ideas around
stateful AA and how things are changing in the AA world. However, I'm
hoping to do that as a follow-up to the refactoring of the AA
infrastructure to work in both old and new pass managers so that I can
write the documentation specific to that world.

Differential Revision: http://reviews.llvm.org/D11984

llvm-svn: 244825
2015-08-12 22:54:47 +00:00
Philip Reames
b755be339a [RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints
To be clear: this is an *optimization* not a correctness change.

CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuze many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way where the branch terminating the original block will be controlled by a condition driven by unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll.

One simple fix would have been to just adjust PlaceSafepoints to move one back in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post insertion fixup seems to be more robust.

I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information because by moving the icmp, we're extending the live range of it's inputs potentially.

Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target specific lowering. In the long run, we may even want to combine the two though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly.

Differential Revision: http://reviews.llvm.org/D11819

llvm-svn: 244821
2015-08-12 22:11:45 +00:00
Alex Lorenz
143b52b95d MIR Parser: Allow the MI IR references to reference global values.
This commit fixes a bug where MI parser couldn't resolve the named IR
references that referenced named global values.

llvm-svn: 244817
2015-08-12 21:27:16 +00:00
Alex Lorenz
25000271ea MIR Serialization: Serialize the fixed stack pseudo source values.
llvm-svn: 244816
2015-08-12 21:23:17 +00:00
Cong Hou
0ae6b03638 NFC. Convert comments in MachineBasicBlock.cpp into new style.
llvm-svn: 244815
2015-08-12 21:18:54 +00:00
Alex Lorenz
1a0f2b312b MIR Parser: Move the parsing of fixed stack object indices into new method. NFC
This commit moves the code that parses the frame indices for the fixed stack
objects from the method 'parseFixedStackObjectOperand' to a new method named
'parseFixedStackFrameIndex', so that it can be reused when parsing fixed stack
pseudo source values.

llvm-svn: 244814
2015-08-12 21:17:02 +00:00
Alex Lorenz
79aa831733 MIR Serialization: Serialize the jump table pseudo source values.
llvm-svn: 244813
2015-08-12 21:11:08 +00:00
Alex Lorenz
6378c53dfe MIR Serialization: Serialize the GOT pseudo source values.
llvm-svn: 244809
2015-08-12 21:00:22 +00:00
Philip Reames
15b1aa2963 [RewriteStatepointsForGC] Handle extractelement fully in the base pointer algorithm
When rewriting the IR such that base pointers are available for every live pointer, we potentially need to duplicate instructions to propagate the base. The original code had only handled PHI and Select under the belief those were the only instructions which would need duplicated. When I added support for vector instructions, I'd added a collection of hacks for ExtractElement which caught most of the common cases. Of course, I then found the one test case my hacks couldn't cover. :)

This change removes all of the early hacks for extract element. By defining extractelement as a BDV (rather than trying to look through it), we can extend the rewriting algorithm to duplicate the extract as needed.  Note that a couple of peephole optimizations were left in for the moment, because while we now handle extractelement as a first class citizen, we're not yet handling insertelement.  That change will follow in the near future.  

llvm-svn: 244808
2015-08-12 21:00:20 +00:00
Alex Lorenz
5c5fa37678 MIR Serialization: Serialize the stack pseudo source values.
llvm-svn: 244806
2015-08-12 20:44:16 +00:00
Sanjay Patel
4294d5f8bd fix typo; NFC
llvm-svn: 244805
2015-08-12 20:36:18 +00:00
Alex Lorenz
8f7638b24c MIR Serialization: Serialize the constant pool pseudo source values.
llvm-svn: 244803
2015-08-12 20:33:26 +00:00
Lenny Maiorani
1850ddfeb6 Fix missing space in libfuzzer's help text.
llvm-svn: 244800
2015-08-12 20:00:10 +00:00
Chandler Carruth
da8f360fe7 [PM/AA] Add missing static dependency edges from DSE and memdep to TLI.
I forgot to add these in r244780 and r244778. Sorry about that.

Also order the static dependencies in a lexicographical order.

llvm-svn: 244787
2015-08-12 18:10:45 +00:00
Chandler Carruth
2e28175329 [PM/AA] Explicitly depend on TLI rather than getting it out of the
AliasAnalysis.

Same as the other commits, the TLI access from an alias analysis is
going away and isn't very clean -- it is better to explicitly mark the
dependencies.

llvm-svn: 244785
2015-08-12 18:06:08 +00:00
Chandler Carruth
19b400baf8 [PM/AA] Stop getting the TargetLibraryInfo out of the AliasAnalysis and
just depend on it directly.

This was particularly frustrating because there was a really wide
mixture of using a member variable and re-extracting it from the AA that
happened to be around. I think the result is much more clear.

I've also deleted all of the pointless null checks and used references
across the APIs where I could to make it explicit that this cannot be
null in a useful fashion.

llvm-svn: 244780
2015-08-12 18:01:44 +00:00
JF Bastien
9205d59c82 WebAssembly: floating-point comparisons
Summary:
D11924 implemented part of the floating-point comparisons, this patch implements the rest:
 * Tell ISelLowering that all booleans are either 0 or 1.
 * Expand the eq/ne/lt/le/gt/ge floating-point comparisons to the canonical ones (similar to what Mips32r6InstrInfo.td does).
 * Add tests for ord/uno.
 * Add tests for ueq/one/ult/ule/ugt/uge.
 * Fix existing comparison tests to remove the (res & 1) code, which setBooleanContents stops from generating.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11970

llvm-svn: 244779
2015-08-12 17:53:29 +00:00
Chandler Carruth
3fd9750c3e [PM/AA] Have memdep explicitly get and use TargetLibraryInfo rather than
relying on sneaking it out of its AliasAnalysis.

This abuse of AA (to shuffle TLI around rather than explicitly depending
on it) is going away with my refactor of AA.

llvm-svn: 244778
2015-08-12 17:47:44 +00:00
Adam Nemet
df626a149f [LoopVer] Optionally allow using memchecks from LAA
r243382 changed the behavior to always require a set of memchecks to be
passed to LoopVer.  This change restores the prior behavior as an
alternative to the new behavior.  This allows the checks to be
implicitly taken from the LAA object.

Patch by Ashutosh Nema!

llvm-svn: 244763
2015-08-12 16:51:19 +00:00
Sanjay Patel
a24f811c83 80-cols; NFC
llvm-svn: 244755
2015-08-12 15:12:25 +00:00
James Molloy
b13284a73a [ValueTracking] Tweak a comment slightly
Hal asked for this change in D11146, but I missed it when I committed originally.

llvm-svn: 244754
2015-08-12 15:11:43 +00:00
Sanjay Patel
463099751f fix typo; NFC
llvm-svn: 244753
2015-08-12 15:09:09 +00:00
John Brawn
dcf7a81cdd Redo "Make global aliases have symbol size equal to their type"
r242520 was reverted in r244313 as the expected behaviour of the alias
attribute in C is that the alias has the same size as the aliasee. However
we can re-introduce adding the size on the alias when the aliasee does not,
from a source code or object perspective, exist as a discrete entity. This
happens when the aliasee is not a symbol, or when that symbol is private.

Differential Revision: http://reviews.llvm.org/D11943

llvm-svn: 244752
2015-08-12 15:05:39 +00:00
John Brawn
8a59b4f141 [GlobalMerge] Only emit aliases for internal linkage variables for non-Mach-O
On Mach-O emitting aliases for the variables that make up a MergedGlobals
variable can cause problems when linking with dead stripping enabled so don't
do that, except for external variables where we must emit an alias.

llvm-svn: 244748
2015-08-12 13:36:48 +00:00
Zoran Jovanovic
3c2a065d19 [mips][microMIPS] Create microMIPS64r6 subtarget and implement DALIGN, DAUI, DAHI, DATI, DEXT, DEXTM and DEXTU instructions
Differential Revision: http://reviews.llvm.org/D10923

llvm-svn: 244744
2015-08-12 12:45:16 +00:00
Michael Kuperstein
43bbce4282 [X86] Disable mul -> shl + lea combine when compiling for minsize
Differential Revision: http://reviews.llvm.org/D11904

llvm-svn: 244740
2015-08-12 11:27:26 +00:00
Michael Kuperstein
8c8a758faa [X86] Allow x86 call frame optimization to fold more loads into pushes
This abstracts away the test for "when can we fold across a MachineInstruction"
into the the MI interface, and changes call-frame optimization use the same test
the peephole optimizer users.

Differential Revision: http://reviews.llvm.org/D11945

llvm-svn: 244729
2015-08-12 10:14:58 +00:00
Matt Arsenault
ac50a3d981 AMDGPU: Fix assert on dbg_value instructions
llvm-svn: 244728
2015-08-12 09:04:44 +00:00
Simon Pilgrim
9c9b4332ca unused variable warning fix.
llvm-svn: 244725
2015-08-12 08:23:36 +00:00
Simon Pilgrim
45d6ddee89 [InstCombine] Move SSE/AVX vector blend folding to instcombiner
As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely).

InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask).

I also moved all the relevant combine tests into InstCombine/blend_x86.ll

Differential Revision: http://reviews.llvm.org/D11934

llvm-svn: 244723
2015-08-12 08:08:56 +00:00
Saleem Abdulrasool
23546702ae X86: hoist a condition into a variable (NFC)
The same value is used multiple times through the function.  Hoist the condition
into a variable.  This should fix a silly static analysis warning where the
conditions flip around.  No functional change intended.

llvm-svn: 244713
2015-08-12 02:01:36 +00:00
Kostya Serebryany
a9d3e6b2dc [libFuzzer] add two flags, -tbm_depth and -tbm_width to control how the trace-based-mutations are applied
llvm-svn: 244712
2015-08-12 01:55:37 +00:00
Kostya Serebryany
2bdb9ad059 [libFuzzer] add colons to the stats output to avoid confusion
llvm-svn: 244708
2015-08-12 01:04:27 +00:00
Kostya Serebryany
5a4f36556e [libFuzzer] use raw C IO to reduce the risk of a deadlock in a signal handler.
llvm-svn: 244707
2015-08-12 00:55:09 +00:00
Sanjay Patel
7b4cd645e8 [x86] enable machine combiner reassociations for 256-bit vector FP mul/add
llvm-svn: 244705
2015-08-12 00:29:10 +00:00
Alex Lorenz
ce2812bb8e PseudoSourceValue: Transform the mips subclass to target independent subclasses
This commit transforms the mips-specific 'MipsCallEntry' subclass of the
'PseudoSourceValue' class into two, target-independent subclasses named
'GlobalValuePseudoSourceValue' and 'ExternalSymbolPseudoSourceValue'.

This change makes it easier to serialize the pseudo source values by removing
target-specific pseudo source values.

Reviewers: Akira Hatanaka
llvm-svn: 244698
2015-08-11 23:23:17 +00:00
Alex Lorenz
7b1d22a17d PseudoSourceValue: Replace global manager with a manager in a machine function.
This commit removes the global manager variable which is responsible for
storing and allocating pseudo source values and instead it introduces a new
manager class named 'PseudoSourceValueManager'. Machine functions now own an
instance of the pseudo source value manager class.

This commit also modifies the 'get...' methods in the 'MachinePointerInfo'
class to construct pseudo source values using the instance of the pseudo
source value manager object from the machine function.

This commit updates calls to the 'get...' methods from the 'MachinePointerInfo'
class in a lot of different files because those calls now need to pass in a
reference to a machine function to those methods.

This change will make it easier to serialize pseudo source values as it will
enable me to transform the mips specific MipsCallEntry PseudoSourceValue
subclass into two target independent subclasses.

Reviewers: Akira Hatanaka
llvm-svn: 244693
2015-08-11 23:09:45 +00:00
Alex Lorenz
4047ccf510 PseudoSourceValue: Introduce a 'PSVKind' enumerator.
This commit introduces a new enumerator named 'PSVKind' in the
'PseudoSourceValue' class. This enumerator is now used to distinguish between
the various kinds of pseudo source values.

This change is done in preparation for the changes to the pseudo source value
object management and to the PseudoSourceValue's class hierarchy - the next two
PseudoSourceValue commits will get rid of the global variable that manages the
pseudo source values and the mips specific MipsCallEntry subclass.

Reviewers: Akira Hatanaka
llvm-svn: 244687
2015-08-11 22:32:00 +00:00
Alex Lorenz
122f1442e4 PseudoSourceValue: Update comments and fix lowercase variable names. NFC.
This commit updates the documentation comments in PseudoSourceValue.cpp and
PseudoSourceValue.h based on the LLVM's documentation style. It also fixes
several instances of variable names that started with a lowercase letter.

This change is done in preparation for the changes to the pseudo source value
object management and to the PseudoSourceValue's class hierarchy.

llvm-svn: 244686
2015-08-11 22:23:19 +00:00
Alex Lorenz
09a0186673 Reformat PseudoSourceValue.cpp and PseudoSourceValue.h. NFC.
This commit reformats the files lib/CodeGen/PseudoSourceValue.cpp and
include/llvm/CodeGen/PseudoSourceValue.h using clang-format. This change is
done in preparation for the changes to the pseudo source value object
management and to the PseudoSourceValue's class hierarchy.

llvm-svn: 244685
2015-08-11 22:17:22 +00:00
Mark Heffernan
030525f5bf Use 32-bit divides instead of 64-bit divides where possible.
For NVPTX, try to use 32-bit division instead of 64-bit division when the dividend and divisor
fit in 32 bits. This speeds up some internal benchmarks significantly. The underlying reason
is that many index computations are carried out in 64-bits but never actually exceed the
capacity of a 32-bit word.

llvm-svn: 244684
2015-08-11 22:16:34 +00:00
Paul Robinson
d1a4dd1fd2 Make DW_AT_[MIPS_]linkage_name optional, and off by default for SCE.
Mangled "linkage" names can be huge, and if the debugger (or other
tools) have no use for them, the size savings can be very impressive
(on the order of 40%).

Add one test for controlling behavior, and modify a number of tests to
either stop using linkage names, or make llc emit them (so these tests
will still run when the default triple is for PS4).

Differential Revision: http://reviews.llvm.org/D11374

llvm-svn: 244678
2015-08-11 21:36:45 +00:00
Sanjoy Das
2078fb4d89 Fix PR24354.
`InstCombiner::OptimizeOverflowCheck` was asserting an
invariant (operands to binary operations are ordered by decreasing
complexity) that wasn't really an invariant.  Fix this by instead having
`InstCombiner::OptimizeOverflowCheck` establish the invariant if it does
not hold.

llvm-svn: 244676
2015-08-11 21:33:55 +00:00
Sanjay Patel
33647178af don't repeat function names in comments; NFC
llvm-svn: 244672
2015-08-11 21:24:04 +00:00
Sanjay Patel
0510654efe fix 80-cols; NFC
llvm-svn: 244668
2015-08-11 21:11:56 +00:00
JF Bastien
fcf9f2fb1b NFC SelectionDAGDumper: fix typo
Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11959

llvm-svn: 244667
2015-08-11 21:10:07 +00:00
JF Bastien
a0847105a1 WebAssembly: implement comparison.
Some of the FP comparisons (ueq, one, ult, ule, ugt, uge) are currently broken, I'll fix them in a follow-up.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11924

llvm-svn: 244665
2015-08-11 21:02:46 +00:00
Sanjay Patel
39bab9e7a2 [x86] enable machine combiner reassociations for 128-bit vector single/double multiplies
llvm-svn: 244657
2015-08-11 20:19:23 +00:00
Chen Li
9199f20dfe [LowerSwitch] Skip dead blocks for processSwitchInst()
Summary: This patch adds check for dead blocks and skip them for processSwitchInst(). This will help reduce compilation time.

Reviewers: reames, hans

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11953

llvm-svn: 244656
2015-08-11 20:16:17 +00:00
JF Bastien
aace81abf2 WebAssembly: implement WebAssemblyTargetLowering::getTargetNodeName
Summary: Implementation is the same as in AArch64.

Subscribers: aemerson, jfb, llvm-commits, sunfish

Differential Revision: http://reviews.llvm.org/D11956

llvm-svn: 244655
2015-08-11 20:13:18 +00:00
Sanjay Patel
b7a56def6f fix minsize detection: minsize attribute implies optimizing for size
Also, add a test for optsize because this was not part of any existing regression test.

llvm-svn: 244651
2015-08-11 19:39:36 +00:00
Jingyue Wu
41c8aefa76 SelectionDAG: Prefer to combine multiplication with less uses for fma
Summary:
For example:

  s6 = s0*s5;
  s2 = s6*s6 + s6;
  ...
  s4 = s6*s3;

We notice that it is possible for s2 is folded to fma (s0, s5, fmul (s6 s6)).
This only happens when Aggressive is true, otherwise hasOneUse() check
already prevents from folding the multiplication with more uses.

Test Plan: test/CodeGen/NVPTX/fma-assoc.ll

Patch by Xuetian Weng

Reviewers: hfinkel, apazos, jingyue, ohsallen, arsenm

Subscribers: arsenm, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11855

llvm-svn: 244649
2015-08-11 19:21:46 +00:00
Chen Li
13db1247c9 [LowerSwitch] Fix a bug when LowerSwitch deletes the default block
Summary: LowerSwitch crashed with the attached test case after deleting the default block. This happened because the current implementation of deleting dead blocks is wrong. After the default block being deleted, it contains no instruction or terminator, and it should no be traversed anymore. However, since the iterator is advanced before processSwitchInst() function is executed, the block advanced to could be deleted inside processSwitchInst(). The deleted block would then be visited next and crash dyn_cast<SwitchInst>(Cur->getTerminator()) because Cur->getTerminator() returns a nullptr. This patch fixes this problem by recording dead default blocks into a list, and delete them after all processSwitchInst() has been done. It still possible to visit dead default blocks and waste time process them. But it is a compile time issue, and I plan to have another patch to add support to skip dead blocks.

Reviewers: kariddi, resistor, hans, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11852

llvm-svn: 244642
2015-08-11 18:12:26 +00:00
Rafael Espindola
574b6734d9 Use llvm::make_unique to fix the MSVC build.
llvm-svn: 244641
2015-08-11 18:11:17 +00:00
Sanjay Patel
f473c5974a fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244631
2015-08-11 17:04:31 +00:00
Teresa Johnson
a8fc4e7498 Enable EliminateAvailableExternally pass in the LTO pipeline.
Summary:
For LTO we need to enable this pass in the LTO pipeline,
as it is skipped during the "-flto -c" compile step (when PrepareForLTO is
set).

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11919

llvm-svn: 244622
2015-08-11 16:26:41 +00:00
Sanjay Patel
39f0213518 Variable names should start with an upper case letter; NFC
llvm-svn: 244618
2015-08-11 16:05:43 +00:00
Sanjay Patel
feea9289bf fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244617
2015-08-11 15:56:31 +00:00
John Brawn
557ccbef5f [GlobalMerge] Use private linkage for MergedGlobals variables
Other objects can never reference the MergedGlobals symbol so external linkage
is never needed. Using private instead of internal linkage means the object is
more similar to what it looks like when global merging is not enabled, with
the only difference being that the merged variables are addressed indirectly
relative to the start of the section they are in.

Also add aliases for merged variables with internal linkage, as this also makes
the object be more like what it is when they are not merged.

Differential Revision: http://reviews.llvm.org/D11942

llvm-svn: 244615
2015-08-11 15:48:04 +00:00
Sanjay Patel
2bbc3b14af fix code that was accidentally commented out in previous commit
llvm-svn: 244610
2015-08-11 15:08:29 +00:00
Sanjay Patel
67e99fd161 fix typos in comments; NFC
llvm-svn: 244609
2015-08-11 15:04:51 +00:00
Sanjay Patel
7aac75b6df fix typo in comment; NFC
llvm-svn: 244607
2015-08-11 14:45:08 +00:00
Sanjay Patel
e13d1d30df fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244604
2015-08-11 14:31:14 +00:00
Michael Kuperstein
8ea9afb887 [X86] Allow merging of immediates within a basic block for code size savings
First step in preventing immediates that occur more than once within a single
basic block from being pulled into their users, in order to prevent unnecessary
large instruction encoding .Currently enabled only when optimizing for size.

Patch by: zia.ansari@intel.com
Differential Revision: http://reviews.llvm.org/D11363

llvm-svn: 244601
2015-08-11 14:10:58 +00:00
James Molloy
e0929cde28 [AArch64] Match fminnum/fmaxnum for vector fminnm/fmaxnm instead of an intrinsic.
Lower Intrinsic::aarch64_neon_fmin/fmax to fminnum/fmannum and match that instead. Minimal functional change:

  - Extra tests added because coverage of scalar fminnm/fmaxnm instructions was nonexistant.
  - f16 test updated because now we actually generate scalar fminnm/fmaxnm we no longer need to bail out to a libcall!

llvm-svn: 244595
2015-08-11 12:06:37 +00:00
James Molloy
655902e549 [AArch64] Replace the custom AArch64ISD::FMIN/MAX nodes with ISD::FMINNAN/MAXNAN
NFCI. This just removes custom ISDNodes that are no longer needed.

llvm-svn: 244594
2015-08-11 12:06:33 +00:00
James Molloy
d56d688228 [ARM] Match fminnan/fmaxnan for vector vmin/vmax instead of an intrinsic
Lower Intrinsic::arm_neon_vmins/vmaxs to fminnan/fmaxnan and match that instead. This is important because SDAG will soon be able to select FMINNAN itself, so we need a unified lowering path for intrinsics and SDAG.

NFCI.

llvm-svn: 244593
2015-08-11 12:06:28 +00:00
James Molloy
c131e948e8 [ARM] Match fminnum/fmaxnum for vector vminnm/vmaxnm instead of an intrinsic
Lower the intrinsic to a FMINNUM/FMAXNUM node and select that instead. This is important because soon SDAG will be able to select FMINNUM/FMAXNUM itself, so we need an integrated lowering path between SDAG and intrinsics.

NFCI.

llvm-svn: 244592
2015-08-11 12:06:25 +00:00
James Molloy
83cfd780e5 [ARM] Replace ARMISD::VMINNM/VMAXNM with ISD::FMINNUM/FMAXNUM
NFCI. This replaces another custom ISDNode with a generic equivalent.

llvm-svn: 244591
2015-08-11 12:06:22 +00:00
James Molloy
9564f7ade6 [ARM] Replace ARMISD::FMIN/FMAX with the shiny new ISD::FMINNAN/FMAXNAN.
NFCI. This removes a custom ISDNode.

llvm-svn: 244590
2015-08-11 12:06:15 +00:00
Marina Yatsina
a28fbe6a96 [X86] Add SAL mnemonics for Intel syntax
SAL and SHL instructions perform the same operation

Differential Revision: http://reviews.llvm.org/D11882

llvm-svn: 244588
2015-08-11 12:05:06 +00:00
Marina Yatsina
fc986c89c0 [X86] Fix REPE, REPZ, REPNZ for intel syntax
REPE, REPZ, REPNZ, REPNE should have mnemonics for Intel syntax as well.
Currently using these instructions causes compilation errors for Intel syntax.

Differential Revision: http://reviews.llvm.org/D11794

llvm-svn: 244584
2015-08-11 11:28:10 +00:00
Marina Yatsina
d8e14460d5 [X86] Fix imul alias for intel syntax
The "imul reg, imm" alias is not defined for intel syntax. 
In intel syntax there is no w/l/q suffix for the imul instruction.

Differential Revision: http://reviews.llvm.org/D11887

llvm-svn: 244582
2015-08-11 10:43:04 +00:00
James Molloy
cf490f8ffc Add new ISD nodes: ISD::FMINNAN and ISD::FMAXNAN
The intention of these is to be a corollary to ISD::FMINNUM/FMAXNUM,
differing only on how NaNs are treated. FMINNUM returns the non-NaN
input (when given one NaN and one non-NaN), FMINNAN returns the NaN
input instead.

This patch includes support for scalarizing, widening and splitting
vectors, but not expansion or softening. The reason is that these
should never be needed - FMINNAN nodes are only going to be created
in one place (SDAGBuilder::visitSelect) and there we'll check if the
node is legal or custom. I could preemptively add expand and soften
code, but I'm fairly opposed to adding code I can't test. It's bad
enough I can't create tests with this patch, but at least this code
will be exercised by the ARM and AArch64 backends fairly shortly.

llvm-svn: 244581
2015-08-11 09:13:05 +00:00
James Molloy
ecd6525b24 Add support for floating-point minnum and maxnum
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.

matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.

C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.

ARM and AArch64 at least have different instructions for these different cases.

llvm-svn: 244580
2015-08-11 09:12:57 +00:00
Vasileios Kalintiris
761ce121c9 [mips] Remap move as or.
Summary:
This patch remaps the assembly idiom 'move' to 'or' instead of 'daddu' or
'addu'. The use of addu/daddu instead of or as move was highlighted as a
performance issue during the analysis of a recent 64bit design. Originally
move was encoded as 'or' by binutils but was changed for the r10k cpu family
due to their pipeline which had 2 arithmetic units and a single logical unit,
and so could issue multiple (d)addu based moves at the same time but only 1
logical move.

This patch preserves the disassembly behaviour so that disassembling a old style
(d)addu move still appears as move, but assembling move always gives an or

Patch by Simon Dardis.

Reviewers: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11796

llvm-svn: 244579
2015-08-11 08:56:25 +00:00
Michael Kuperstein
ebd10e5c0e [X86] When optimizing for minsize, use POP for small post-call stack clean-up
When optimizing for size, replace "addl $4, %esp" and "addl $8, %esp"
following a call by one or two pops, respectively. We don't try to do it in
general, but only when the stack adjustment immediately follows a call - which
is the most common case.

That allows taking a short-cut when trying to find a free register to pop into,
instead of a full-blown liveness check. If the adjustment immediately follows a
call, then every register the call clobbers but doesn't define should be dead at
that point, and can be used.

Differential Revision: http://reviews.llvm.org/D11749

llvm-svn: 244578
2015-08-11 08:48:48 +00:00
Michael Kuperstein
e5fcd53d38 Allow PeepholeOptimizer to fold a few more cases
The condition for clearing the folding candidate list was clamped together
with the "uninteresting instruction" condition. This is too conservative,
e.g. we don't need to clear the list when encountering an IMPLICIT_DEF.

Differential Revision: http://reviews.llvm.org/D11591

llvm-svn: 244577
2015-08-11 08:19:43 +00:00
Michael Kuperstein
6d16ba7233 [GMR] Be a bit smarter about which globals don't alias when doing recursive lookups
Should hopefully fix the remainder of PR24288.

Differential Revision: http://reviews.llvm.org/D11900

llvm-svn: 244575
2015-08-11 08:06:44 +00:00
Lang Hames
3f10ab0d27 [RuntimeDyld][AArch64] Add explicit addends before calling relocationValueRef.
relocationValueRef uses the addend, so it has to be set before the call.

llvm-svn: 244574
2015-08-11 06:27:53 +00:00
Nick Lewycky
1bff8578d4 Fix unused variable 'X' in release builds.
llvm-svn: 244571
2015-08-11 05:57:10 +00:00
JF Bastien
7a8b1de402 WebAssembly: NFC fix release build break, unused variable.
Summary: Caused by D11914, pointed out by blaikie.

Subscribers: llvm-commits, jfb, dblaikie

Differential Revision: http://reviews.llvm.org/D11929

llvm-svn: 244570
2015-08-11 04:52:24 +00:00
David Majnemer
8646dfdbfe [IR] Verify EH pad predecessors
Make sure that an EH pad's predecessors are using their unwind edge to
transfer control to the EH pad.

llvm-svn: 244563
2015-08-11 02:48:30 +00:00
JF Bastien
b4d2511cd9 WebAssembly: add basic floating-point tests
Summary: I somehow forgot to add these when I added the basic floating-point opcodes. Also remove ceil/floor/trunc/nearestint for now, and add them only when properly tested.

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11927

llvm-svn: 244562
2015-08-11 02:45:15 +00:00
Kostya Serebryany
1c2b96fda9 [libFuzzer] add -only_ascii flag
llvm-svn: 244559
2015-08-11 01:44:42 +00:00
David Majnemer
fd52148995 [WinEHPrepare] Add rudimentary support for the new EH instructions
This adds somewhat basic preparation functionality including:
- Formation of funclets via coloring basic blocks.
- Cloning of polychromatic blocks to ensure that funclets have unique
  program counters.
- Demotion of values used between different funclets.
- Some amount of cleanup once we have removed predecessors from basic
  blocks.
- Verification that we are left with a CFG that makes some amount of
  sense.

N.B. Arguments and numbering still need to be done.

Differential Revision: http://reviews.llvm.org/D11750

llvm-svn: 244558
2015-08-11 01:15:26 +00:00
Cameron Esfahani
0c35a7deea Explicitly clear the MI operand list when getInstruction() is called. Call MI.clear() within MCD::OPC_Decode case and inside of translateInstruction() for the X86 target. Remove now unnecessary MI.clear() from ARMDisassembler.
Summary: Explicitly clear the MI operand list when getInstruction() is called.

Reviewers: hfinkel, t.p.northover, hvarga, kparzysz, jyknight, qcolombet, uweigand

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11665

llvm-svn: 244557
2015-08-11 01:15:07 +00:00
Tyler Nowicki
12178daefd Print vectorization analysis when loop hint is specified.
This patch and a relatec clang patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwasyPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>’. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints.

llvm-svn: 244555
2015-08-11 01:09:15 +00:00
Tyler Nowicki
1bb9d68857 Moved LoopVectorizeHints and related functions before LoopVectorizationLegality and LoopVectorizationCostModel.
llvm-svn: 244552
2015-08-11 00:52:54 +00:00
JF Bastien
6198dac24e WebAssembly: simply assert on SNaN and NaNs with payloads
Summary: convertToHexString doesn't represent them correctly at this point in time. This is a follow-up to sunfish's suggestion in D11914.

Subscribers: llvm-commits, sunfish, jfb

Differential Revision: http://reviews.llvm.org/D11925

llvm-svn: 244551
2015-08-11 00:49:20 +00:00
Tyler Nowicki
b772c3e968 Simplify processLoop() by moving loop hint verification into Hints::allowVectorization().
llvm-svn: 244550
2015-08-11 00:35:44 +00:00
Alex Lorenz
49c54efc4a MIR Serialization: Serialize UsedPhysRegMask from the machine register info.
This commit serializes the UsedPhysRegMask register mask from the machine
register information class. The mask is serialized as an inverted
'calleeSavedRegisters' mask to keep the output minimal.

This commit also allows the MIR parser to infer this mask from the register
mask operands if the machine function doesn't specify it.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 244548
2015-08-11 00:32:49 +00:00
Sanjay Patel
7c1a954f13 use range-based for loops; NFCI
llvm-svn: 244545
2015-08-11 00:26:05 +00:00
Kostya Serebryany
c88d3123b8 [libFuzzer] don't crash if the condition in a switch has unusual type (e.g. i72)
llvm-svn: 244544
2015-08-11 00:24:39 +00:00
Adam Nemet
3f19b8eced [LAA] Change name from addRuntimeCheck to addRuntimeChecks, NFC
This was requested by Hal in D11205.

llvm-svn: 244540
2015-08-11 00:09:37 +00:00
Alex Lorenz
b0cf6f0102 MIR Parser: Report an error when a stack object is redefined.
llvm-svn: 244536
2015-08-10 23:50:41 +00:00
Joerg Sonnenberger
547554aeba Add lduw and lwua aliases for SPARCv9.
llvm-svn: 244535
2015-08-10 23:47:22 +00:00
Alex Lorenz
4b65d19694 MIR Parser: Report an error when a fixed stack object is redefined.
llvm-svn: 244534
2015-08-10 23:45:02 +00:00
Joerg Sonnenberger
ed1bfffcbb Load/store for float registers from/to alternate space.
llvm-svn: 244532
2015-08-10 23:33:17 +00:00
Sanjay Patel
cd2fc88ce7 use range-based for loop; NFCI
llvm-svn: 244531
2015-08-10 23:29:41 +00:00
Alex Lorenz
3b5305d636 MIR Serialization: Serialize the liveout register mask machine operands.
llvm-svn: 244529
2015-08-10 23:24:42 +00:00
Sanjay Patel
2ea34a4b40 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244528
2015-08-10 23:07:26 +00:00
Adam Nemet
97b2779195 [LoopVer] Remove unused pointer partition argument, NFC.
llvm-svn: 244527
2015-08-10 23:05:31 +00:00
Tyler Nowicki
3f1d874bb9 Extend late diagnostics to include late test for runtime pointer checks.
This patch moves checking the threshold of runtime pointer checks to the vectorization requirements (late diagnostics) and emits a diagnostic that infroms the user the loop would be vectorized if not for exceeding the pointer-check threshold. Clang will also append the options that can be used to allow vectorization.

llvm-svn: 244523
2015-08-10 23:01:55 +00:00
JF Bastien
5df3b56df8 WebAssembly: print immediates
Summary:
For now output using C99's hexadecimal floating-point representation.

This patch also cleans up how machine operands are printed: instead of special-casing per type of machine instruction, the code now handles operands generically.

Reviewers: sunfish

Subscribers: llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D11914

llvm-svn: 244520
2015-08-10 22:36:48 +00:00
Joerg Sonnenberger
d2510224dd Add support for the signx instrution alias of SPARCv9.
llvm-svn: 244519
2015-08-10 22:32:25 +00:00
Cong Hou
f64a63c16f NFC. Fix some format issues in lib/CodeGen/MachineBasicBlock.cpp.
llvm-svn: 244518
2015-08-10 22:27:10 +00:00
Alex Lorenz
8ed7a00db9 MachineVerifier: Handle the optional def operand in a PATCHPOINT instruction.
The PATCHPOINT instructions have a single optional defined register operand,
but the machine verifier can't verify the optional defined register operands.
This commit makes sure that the machine verifier won't report an error when a
PATCHPOINT instruction doesn't have its optional defined register operand.
This change will allow us to enable the machine verifier for the code
generation tests for the patchpoint intrinsics.

Reviewers: Juergen Ributzka
llvm-svn: 244513
2015-08-10 21:47:36 +00:00
Sanjay Patel
8a6e14cb21 remove function names from comments; NFC
llvm-svn: 244509
2015-08-10 21:28:16 +00:00
Alex Lorenz
77ac257c14 StackMap: FastISel: Add an appropriate number of immediate operands to the
frame setup instruction.

This commit ensures that the stack map lowering code in FastISel adds an
appropriate number of immediate operands to the frame setup instruction.

The previous code added just one immediate operand, which was fine for a target
like AArch64, but on X86 the ADJCALLSTACKDOWN64 instruction needs two explicit
operands. This caused the machine verifier to report an error when the old code
added just one.

Reviewers: Juergen Ributzka

Differential Revision: http://reviews.llvm.org/D11853

llvm-svn: 244508
2015-08-10 21:27:03 +00:00
JF Bastien
f82b3e73a6 x86: Emit LAHF/SAHF instead of PUSHF/POPF
NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (priviledged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS, this commit now generated LAHF/SAHF instead, for all of x86 (not just NaCl) because it leads to an overall performance gain over PUSHF/POPF.

As with the previous patch this code generation is pretty bad because it occurs very later, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire.

I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]], the results on an Intel Haswell E5-2690 CPU at 2.9GHz are:

| Time per call (ms)  | Runtime (ms) | Benchmark                      |
| 0.000012514         |      6257    | sete.i386                      |
| 0.000012810         |      6405    | sete.i386-fast                 |
| 0.000010456         |      5228    | sete.x86-64                    |
| 0.000010496         |      5248    | sete.x86-64-fast               |
| 0.000012906         |      6453    | lahf-sahf.i386                 |
| 0.000013236         |      6618    | lahf-sahf.i386-fast            |
| 0.000010580         |      5290    | lahf-sahf.x86-64               |
| 0.000010304         |      5152    | lahf-sahf.x86-64-fast          |
| 0.000028056         |     14028    | pushf-popf.i386                |
| 0.000027160         |     13580    | pushf-popf.i386-fast           |
| 0.000023810         |     11905    | pushf-popf.x86-64              |
| 0.000026468         |     13234    | pushf-popf.x86-64-fast         |

Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seems to be worth teaching LLVM about individual flags, at least not for this purpose.

Reviewers: rnk, jvoung, t.p.northover

Subscribers: llvm-commits

Differential revision: http://reviews.llvm.org/D6629

llvm-svn: 244503
2015-08-10 20:59:36 +00:00
Sanjay Patel
bea667f5ae fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244499
2015-08-10 20:45:44 +00:00
Simon Pilgrim
65266a8e22 [InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombiner
As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations.

Differential Revision: http://reviews.llvm.org/D11886

llvm-svn: 244495
2015-08-10 20:21:15 +00:00
Tyler Nowicki
6edbef9016 Late evaluation of the fast-math vectorization requirement.
This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command line options would that allow floating-point commutativity. Specifically those are enableing fast-math or specifying a loop hint. 

llvm-svn: 244489
2015-08-10 19:51:46 +00:00
Tyler Nowicki
455075e570 Modify diagnostic messages to clearly indicate the why interleaving wasn't done.
Sometimes interleaving is not beneficial, as determined by the cost-model and sometimes it is disabled by a loop hint (by the user). This patch modifies the diagnostic messages to make it clear why interleaving wasn't done.

llvm-svn: 244485
2015-08-10 19:14:16 +00:00
James Y Knight
2a6af41342 [Sparc] Implement i64 load/store support for 32-bit sparc.
The LDD/STD instructions can load/store a 64bit quantity from/to
memory to/from a consecutive even/odd pair of (32-bit) registers. They
are part of SparcV8, and also present in SparcV9. (Although deprecated
there, as you can store 64bits in one register).

As recommended on llvmdev in the thread "How to enable use of 64bit
load/store for 32bit architecture" from Apr 2015, I've modeled the
64-bit load/store operations as working on a v2i32 type, rather than
making i64 a legal type, but with few legal operations. The latter
does not (currently) work, as there is much code in llvm which assumes
that if i64 is legal, operations like "add" will actually work on it.

The same assumption does not hold for v2i32 -- for vector types, it is
workable to support only load/store, and expand everything else.

This patch:
- Adds a new register class, IntPair, for even/odd pairs of registers.

- Modifies the list of reserved registers, the stack spilling code,
  and register copying code to support the IntPair register class.

- Adds support in AsmParser. (note that in asm text, you write the
  name of the first register of the pair only. So the parser has to
  morph the single register into the equivalent paired register).

- Adds the new instructions themselves (LDD/STD/LDDA/STDA).

- Hooks up the instructions and registers as a vector type v2i32. Adds
  custom legalizer to transform i64 load/stores into v2i32 load/stores
  and bitcasts, so that the new instructions can actually be
  generated, and marks all operations other than load/store on v2i32
  as needing to be expanded.

- Copies the unfortunate SelectInlineAsm hack from ARMISelDAGToDAG.
  This hack undoes the transformation of i64 operands into two
  arbitrarily-allocated separate i32 registers in
  SelectionDAGBuilder. and instead passes them in a single
  IntPair. (Arbitrarily allocated registers are not useful, asm code
  expects to be receiving a pair, which can be passed to ldd/std.)

Also adds a bunch of test cases covering all the bugs I've added along
the way.

Differential Revision: http://reviews.llvm.org/D8713

llvm-svn: 244484
2015-08-10 19:11:39 +00:00
Chad Rosier
e027a2d1cc [AArch64] Convert a conditional check that will always be true to an assert. NFC.
llvm-svn: 244479
2015-08-10 18:42:45 +00:00
Igor Laevsky
dc6b4a78a4 [IndVarSimplify] Make cost estimation in RewriteLoopExitValues smarter
Differential Revision: http://reviews.llvm.org/D11687

llvm-svn: 244474
2015-08-10 18:23:58 +00:00
Mark Heffernan
ba9e336c90 Add new llvm.loop.unroll.enable metadata.
This change adds the unroll metadata "llvm.loop.unroll.enable" which directs
the optimizer to unroll a loop fully if the trip count is known at compile time, and
unroll partially if the trip count is not known at compile time. This differs from
"llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not
known at compile time.

The "llvm.loop.unroll.enable" is intended to be added for loops annotated with
"#pragma unroll".

llvm-svn: 244466
2015-08-10 17:28:08 +00:00
Chad Rosier
216eb1ed4b Typo. Move comment closer to relevant code. NFC.
llvm-svn: 244465
2015-08-10 17:17:19 +00:00
Sanjay Patel
d654d315bb fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244464
2015-08-10 17:15:17 +00:00
Sanjay Patel
5fcdfefe10 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244463
2015-08-10 17:00:44 +00:00
Sanjay Patel
c0b2c539ba fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244460
2015-08-10 16:47:47 +00:00
Sanjay Patel
c41be0722a fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244458
2015-08-10 16:43:20 +00:00
Yaron Keren
b598ba7c7c Add missing include guard to FuzzerInternal.h, NFC.
llvm-svn: 244457
2015-08-10 16:37:40 +00:00
Aaron Ballman
04b689be68 Silence a sign mismatch warning; NFC.
llvm-svn: 244452
2015-08-10 15:22:39 +00:00
Silviu Baranga
fccd898d4b [TTI] Add a hook for specifying per-target defaults for Interleaved Accesses
Summary:
This adds a hook to TTI which enables us to selectively turn on by default
interleaved access vectorization for targets on which we have have performed
the required benchmarking.

Reviewers: rengolin

Subscribers: rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11901

llvm-svn: 244449
2015-08-10 14:50:54 +00:00
Fraser Cormack
c174b92fa2 Prevent the scalarizer from caching incorrect entries
The scalarizer can cache incorrect entries when walking up a chain of
insertelement instructions. This occurs when it encounters more than one
instruction that it is not actively searching for, as it unconditionally caches
every element it finds. The fix is to only cache the first element that it
isn't searching for so we don't overwrite correct entries.

Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D11559

llvm-svn: 244448
2015-08-10 14:48:47 +00:00
Michael Kruse
8153cc8a10 [RegionInfo] Add debug-time region viewer functions
Summary:
Analogously to Function::viewCFG(), RegionInfo::view() and RegionInfo::viewOnly() are meant to be called in debugging sessions. They open a viewer to show how RegionInfo currently understands the region hierarchy.

The functions viewRegion(Function*) and viewRegionOnly(Function*) invoke a fresh region analysis of the function in contrast to viewRegion(RegionInfo*) and viewRegionOnly(RegionInfo*) which show the current analysis result.

Reviewers: grosser

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11875

llvm-svn: 244444
2015-08-10 13:21:59 +00:00
Michael Kruse
3373ce735a [RegionInfo] Use RegionInfo* instead of RegionInfoPass* as graph type
This allows printing region graphs when only the RegionInfo (e.g. Region::getRegionInfo()), but no RegionInfoPass object is available.

Specifically, we will use this to print RegionInfo graphs in the debugger.

Differential version: http://reviews.llvm.org/D11874

Reviewed-by: grosser
llvm-svn: 244442
2015-08-10 12:57:23 +00:00
Robert Lougher
ac5d349432 Trace copies when checking for rematerializability in spill weight calculation
PR24139 contains an analysis of poor register allocation. One of the findings
was that when calculating the spill weight, a rematerializable interval once
split is no longer rematerializable. This is because the isRematerializable
check in CalcSpillWeights.cpp does not follow the copies introduced by live
range splitting (after splitting, the live interval register definition is a
copy which is not rematerializable).

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D11686

llvm-svn: 244439
2015-08-10 11:59:44 +00:00
Marina Yatsina
b999526f3d Test commit to verify commit access
llvm-svn: 244438
2015-08-10 11:33:10 +00:00
Yaron Keren
8d83d7181d Rangify for loop, NFC.
llvm-svn: 244434
2015-08-10 07:04:29 +00:00
Saleem Abdulrasool
e810d83543 X86: remove a dead store (NFC)
The SP was always unconditionally assigned to later, but initialised early.
This delays the initialisation, and avoids the dead store.  Identified by
clang static analysis.  No functional change intended.

llvm-svn: 244423
2015-08-09 20:39:09 +00:00
Adam Nemet
4451b15c71 [LAA] Remove unused pointer partition argument from needsChecking(), NFC
This is no longer used in any of the callers.  Also remove the logic of
handling this argument.

llvm-svn: 244421
2015-08-09 20:06:08 +00:00
Adam Nemet
03ba5040ec [LAA] Remove unused pointer partition argument from generateChecks, NFC
LoopDistribution does its own filtering now.

llvm-svn: 244420
2015-08-09 20:06:06 +00:00
David Majnemer
3ac7e03604 [PHITransAddr] Don't assume that instruction operands are translatable
We can only PHI translate instructions.  In our attempt to PHI translate
a bitcast, we attempt to translate its operand; however, the operand
might be an argument or a global instead of an instruction.  Benignly
bail out when this happens.

This fixes PR24397.

Differential Revision: http://reviews.llvm.org/D11879

llvm-svn: 244418
2015-08-09 15:43:02 +00:00
Sanjay Patel
17b3c58695 [x86] enable machine combiner reassociations for 128-bit vector single/double adds
llvm-svn: 244403
2015-08-08 19:08:20 +00:00
Benjamin Kramer
69a3fdb314 Fix some comment typos.
llvm-svn: 244402
2015-08-08 18:27:36 +00:00
Craig Topper
09426f8318 [X86] Add ADX and RDSEED to Skylake processor.
llvm-svn: 244396
2015-08-08 07:31:15 +00:00
Craig Topper
f9b8d39b42 Add SlowBTMem to Sandy Bridge and newer Intel CPUs. Reading through Agner Fog's table suggests there have been no improvements to these processors relative to Westmere for bit test instructions.
llvm-svn: 244395
2015-08-08 07:20:04 +00:00
David Majnemer
eabf812a59 [InstCombine] Don't try to sink EH pad instructions
Found by inspection, this change should not effect the existing
landingpad behavior.

llvm-svn: 244391
2015-08-08 03:51:49 +00:00
Craig Topper
5033eede06 Add model numbers for Skylake CPUs and an additional Broadwell model.
llvm-svn: 244385
2015-08-08 01:29:15 +00:00
Craig Topper
b81b70117f Add Intel family 6 model 93 as Silvermont.
llvm-svn: 244384
2015-08-08 01:16:05 +00:00
Tom Stellard
15de4ba0ef AMDGPU/SI: Another attempt to fix Windows bots broken by r244372
llvm-svn: 244383
2015-08-08 01:11:07 +00:00
Matt Arsenault
a7a9e929b5 Remove unnecessary includes
llvm-svn: 244382
2015-08-08 00:41:53 +00:00
Matt Arsenault
04b53db7bf AMDGPU: Implement AMDGPUOperand::print()
llvm-svn: 244381
2015-08-08 00:41:51 +00:00
Matt Arsenault
b0ba266970 AMDGPU/SI: Remove VCCReg
llvm-svn: 244380
2015-08-08 00:41:48 +00:00
Matt Arsenault
ec3023a130 AMDGPU/SI: Remove source uses of VCCReg
llvm-svn: 244379
2015-08-08 00:41:45 +00:00
Tom Stellard
cd6ab5c821 AMDGPU/SI: Attempt to fix Windows bots broken by r244372
llvm-svn: 244376
2015-08-08 00:17:59 +00:00
Rafael Espindola
088669ce42 Convert getSymbolSection to return an ErrorOr.
This function can actually fail since the symbol contains an index to the
section and that can be invalid.

llvm-svn: 244375
2015-08-07 23:27:14 +00:00
Tom Stellard
6f30f702e1 AMDGPU: Add pass to lower OpenCL image and sampler arguments.
The pass adds new kernel arguments for image attributes, and
resolves calls to dummy attribute and resource id getter functions.

Patch by: Zoltan Gilian

llvm-svn: 244372
2015-08-07 23:19:30 +00:00
Adam Nemet
c67a08de7f [LAA] Remove unused pointer partition argument from getNumberOfChecks, NFC
This is unused after filtering checks was moved to the clients.

As a result, we can just return the number of the checks in the
precomputed set.

llvm-svn: 244369
2015-08-07 22:44:21 +00:00
Adam Nemet
36760f098d [LAA] Make the set of runtime checks part of the state of LAA, NFC
This is the full set of checks that clients can further filter. IOW,
it's client-agnostic.  This makes LAA complete in the sense that it now
provides the two main results of its analysis precomputed:

1. memory dependences via getDepChecker().getInsterestingDependences()
2. run-time checks via getRuntimePointerCheck().getChecks()

However, as a consequence we now compute this information pro-actively.
Thus if the client decides to skip the loop based on the dependences
we've computed the checks unnecessarily.  In order to see whether this
was a significant overhead I checked compile time on SPEC2k6 LTO bitcode
files.  The change was in the noise.

The checks are generated in canCheckPtrAtRT, at the same place where we
used to call groupChecks to merge checks.

llvm-svn: 244368
2015-08-07 22:44:15 +00:00
Quentin Colombet
d05bd6348b [AArch64][LoadStoreOptimizer] Turn a test into an assert. NFC.
At this point the given Opc must be valid, otherwise we should
not look for a matching pair to form paired load or store.

Thanks to Chad to point out this piece of code!

llvm-svn: 244366
2015-08-07 22:40:51 +00:00
Tom Stellard
dfc5991339 AMDGPU/SI: Use InstAlias instead of MnemonicAlias for VOPC instructions
Summary:
With InstAlias, we don't need to print the _e32 portion of the mnemonic
when we print the $dst operand.  This change makes it possible to
include vcc in the asm string when we switch VOPC over to having
implicit vcc defs.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11813

llvm-svn: 244362
2015-08-07 22:00:56 +00:00
Alex Lorenz
7b9ddd3d95 MIR Serialization: Serialize the base alignment for the machine memory operands.
llvm-svn: 244357
2015-08-07 20:48:30 +00:00
Alex Lorenz
39e2231276 MIR Serialization: Serialize the offsets for the machine memory operands.
llvm-svn: 244356
2015-08-07 20:26:52 +00:00
Alex Lorenz
7a90618f38 MIR Parser: Extract the parsing of the operand's offset into a new method. NFC.
This commit extract the code that parses the 64-bit offset from the method
'parseOperandsOffset' to a new method 'parseOffset' so that we can reuse it
when parsing the offset for the machine memory operands.

llvm-svn: 244355
2015-08-07 20:21:00 +00:00
Matt Arsenault
b7690a5199 AMDGPU: Assume SMRD access for constant address space
Since r243294 these are selected to SMRD and
moved later if required.

llvm-svn: 244354
2015-08-07 20:18:34 +00:00
Craig Topper
ff53ab3a44 Add Intel family 6 model 90 as Silvermont. Fixes PR24392.
llvm-svn: 244352
2015-08-07 20:09:42 +00:00
Adam Nemet
59fd1cc3c9 [LAA] Remove unused pointer partition argument from print(), NFC
This is now handled in the client.  No need for LAA to provide this
variant.

llvm-svn: 244349
2015-08-07 19:44:48 +00:00
Chen Li
ef70e358a3 [ConstantFoldTerminator] Preserve make.implicit metadata when converting SwitchInst to BranchInst
Summary: llvm::ConstantFoldTerminator function can convert SwitchInst with single case (and default) to a conditional BranchInst. This patch adds support to preserve make.implicit metadata on this conversion.

Reviewers: sanjoy, weimingz, chenli

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D11841

llvm-svn: 244348
2015-08-07 19:30:12 +00:00
Simon Pilgrim
c31bcce147 [InstCombine] Fix SSE2/AVX2 vector logical shift by constant
This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64-bit and not just the lowest element as it currently assumes.

e.g.

%1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>)

In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) | 15) - giving a zero result.

In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821).

Differential Revision: http://reviews.llvm.org/D11760

llvm-svn: 244341
2015-08-07 18:22:50 +00:00
Nico Weber
b3a3522ac1 Add functions to save and restore the PrettyStackTrace state.
PrettyStackTraceHead is a LLVM_THREAD_LOCAL, which means it's just a global
in LLVM_ENABLE_THREADS=NO builds.  If a CrashRecoveryContext is used with
code that uses PrettyStackEntries, and a crash happens, PrettyStackTraceHead is
currently not reset to its pre-crash value.  These functions make it possible
to add a cleanup to such code that does this.

(Not reseting the value then causes the assert in ~PrettyStackTraceEntry() to
fire if the code outside of the CrashRecoveryContext also uses
PrettyStackEntries -- for example, clang when building a module.)

Part of PR11974.

llvm-svn: 244338
2015-08-07 17:47:03 +00:00
Nico Weber
198bc9060d Add a comment.
llvm-svn: 244337
2015-08-07 17:32:06 +00:00
Chad Rosier
defd3944ce [ARM] Remove an unused reference to MachineRegisterInfo. NFC.
llvm-svn: 244334
2015-08-07 17:02:29 +00:00
Tom Stellard
162c8c165f AMDGPU/SI: Use correct encoding of vopc for VI in the assembler
Summary: We were using the SI encoding for VI.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11812

llvm-svn: 244332
2015-08-07 16:45:33 +00:00
Tom Stellard
ae73825453 AMDGPU/SI: v_mac_legacy_f32 does not exist on VI
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11810

llvm-svn: 244322
2015-08-07 15:34:30 +00:00
Tom Stellard
9bf2e6849b AMDGPU/SI: Remove unused outs parameter from VOPC TableGen classes
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11809

llvm-svn: 244321
2015-08-07 15:34:27 +00:00
Rafael Espindola
0d29f912e2 Add dynamic_table iterators back to ELF.h.
In tree they are only used by llvm-readobj, but it is also used by
https://github.com/mono/CppSharp.

While at it, add some missing error checking.

llvm-svn: 244320
2015-08-07 15:25:20 +00:00
Frederic Riss
d75746ed42 [MC/Dwarf] Allow to specify custom parameters for linetable emission.
NFC patch for current users, but llvm-dsymutil will use the new
functionality to adapt to the input linetable.

Based on a patch by Adrian Prantl.

llvm-svn: 244318
2015-08-07 15:14:08 +00:00
Silviu Baranga
0957d36dd5 Fix unused variable warning introduced in r244314
llvm-svn: 244315
2015-08-07 12:05:46 +00:00
Silviu Baranga
f1f3f2018c [ARM] Update ReconstructShuffle to handle mismatched types
Summary:
Port the ReconstructShuffle function from AArch64 to ARM
to handle mismatched incoming types in the BUILD_VECTOR
node.

This fixes an outstanding FIXME in the ReconstructShuffle
code.

Reviewers: t.p.northover, rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D11720

llvm-svn: 244314
2015-08-07 11:40:46 +00:00
John Brawn
093484721d Revert "Make global aliases have symbol size equal to their type"
This reverts r242520, as it caused pr24379. Also removes part of the test added
by r243874 that checks the size of alias symbols.

llvm-svn: 244313
2015-08-07 10:56:21 +00:00
NAKAMURA Takumi
dc1fe4b07d ShrinkWrap.cpp: Tweak r244235 for a non-functional member, PredicateFtor. [-Wdocumentation]
llvm-svn: 244309
2015-08-07 07:40:23 +00:00
JF Bastien
c6f87ed268 WebAssembly: textual emission uses expected opcode names
Summary: WebAssembly's tablegen instructions have the names WebAssembly expects, but by LLVM convention they're uppercase and suffixed with their type after an underscore. Leave the C++ code that way, but print outt he names WebAssembly expects (lowercase, no type). We could teach tablegen to do this later, maybe by using `!cast<string>(node)` in the .td files.

Reviewers: sunfish

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D11776

llvm-svn: 244305
2015-08-07 01:57:03 +00:00
Duncan P. N. Exon Smith
5607ee4237 ValueMapper: Resolve uniquing cycles more aggressively
As a follow-up to r244181, resolve uniquing cycles underneath distinct
nodes on the fly.  This prevents uniquing cycles in early operands from
affecting later operands.  It also removes an iteration through distinct
nodes' operands.

No real functional change here, just more prompt resolution of temporary
nodes.

llvm-svn: 244302
2015-08-07 00:44:55 +00:00
Duncan P. N. Exon Smith
c55e88eaa6 ValueMapper: Pull out helper to resolve cycles, NFC
Pull out a helper for resolving uniquing cycles of `Metadata` to remove
the boiler-plate of downcasting to `MDNode`.

llvm-svn: 244301
2015-08-07 00:39:26 +00:00
Alex Lorenz
28c3634e6a MIR Serialization: Fix serialization of unnamed IR block references.
The block address machine operands can reference IR blocks in other functions.
This commit fixes a bug where the references to unnamed IR blocks in other
functions weren't serialized correctly.

llvm-svn: 244299
2015-08-06 23:57:04 +00:00
Alex Lorenz
cc4cd5aa06 MIR Parser: Simplify the token's string value handling.
This commit removes the 'StringOffset' and 'HasStringValue' fields from the
MIToken struct and simplifies the 'stringValue' method which now returns
the new 'StringValue' field.

This commit also adopts a different way of initializing the lexed tokens -
instead of constructing a new MIToken instance, the lexer resets the old token
using the new 'reset' method and sets its attributes using the new
'setStringValue', 'setOwnedStringValue', and 'setIntegerValue' methods.

Reviewers: Sean Silva

Differential Revision: http://reviews.llvm.org/D11792

llvm-svn: 244295
2015-08-06 23:17:42 +00:00
Juergen Ributzka
fb208d24f5 [AArch64][FastISel] Always use AND before checking the branch flag.
When we are not emitting the condition for the branch, because the condition is
in another BB or SDAG did the selection for us, then we have to mask the flag in
the register with AND.

This is required when the condition comes from a truncate, because SDAG only
truncates down to a legal size of i32.

This fixes rdar://problem/22161062.

llvm-svn: 244291
2015-08-06 22:44:15 +00:00
Juergen Ributzka
0453fd2de1 Revert "[AArch64][FastISel] Add more truncation tests." and "[AArch64][FastISel] Always use an AND instruction when truncating to non-legal types."
This reverts commit r243198 and 243304.

Turns out this wasn't the correct fix for this problem. It works only within
FastISel, but fails when the truncate is selected by SDAG.

llvm-svn: 244287
2015-08-06 22:13:48 +00:00
David Majnemer
16f420a9d2 Revert accidentally committed WinEHPrepare changes
This reverts commit r244272, r244273, r244274, and r244275.

llvm-svn: 244278
2015-08-06 21:13:51 +00:00
David Majnemer
115b42e41a [IR] Remove TerminateInst's "NameStr" argument
TerminateInst can't have a name because it doesn't produce a result.  No
functionality change is intended, this is just a cleanup.

llvm-svn: 244276
2015-08-06 21:08:36 +00:00
David Majnemer
e856f2df71 PHIs don't need to be postprocessed
llvm-svn: 244275
2015-08-06 21:08:34 +00:00
David Majnemer
16b27ceef8 Handle PHI nodes prefacing EH pads too
llvm-svn: 244274
2015-08-06 21:08:32 +00:00
David Majnemer
123d7d227a handle phi nodes
llvm-svn: 244273
2015-08-06 21:08:30 +00:00
David Majnemer
e7cdbd1fa7 [WinEHPrepare] Add rudimentary support for the new EH instructions
Summary:
This adds somewhat basic preparation functionality including:
- Formation of funclets via coloring basic blocks.
- Cloning of polychromatic blocks to ensure that funclets have unique
  program counters.
- Demotion of values used between different funclets.
- Some amount of cleanup once we have removed predecessors from basic
  blocks.
- Verification that we are left with a CFG that makes some amount of
  sense.

N.B. Arguments and numbering still need to be done.

Reviewers: rnk, JosephTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11750

llvm-svn: 244272
2015-08-06 21:07:55 +00:00
Frederic Riss
6027545a69 Thread premissions through sys::fs::create_director{y|ies}
llvm-svn: 244268
2015-08-06 21:04:55 +00:00
Sanjoy Das
85ee23b440 [IndVars] Fix PR24356.
Unsigned predicates increase or decrease agnostic of the signs of their
increments.

llvm-svn: 244265
2015-08-06 20:43:41 +00:00
Sanjoy Das
46a633bcfd [IndVars] Improved logging under DEBUG(); NFC.
Before this, we'd print the modified comparision in the "Simplified
comparison" case.  That looked misleading.

llvm-svn: 244264
2015-08-06 20:43:28 +00:00
Pete Cooper
598f1f2fd1 Convert a bunch of loops to foreach. NFC.
After r244074, we now have a successors() method to iterate over
all the successors of a TerminatorInst.  This commit changes a bunch
of eligible loops to use it.

llvm-svn: 244260
2015-08-06 20:22:46 +00:00
Tom Stellard
17adc6509b AMDGPU/SI: Add Fiji support
Patch by: Alex Deucher

llvm-svn: 244255
2015-08-06 19:43:02 +00:00
Tom Stellard
544c5ba738 AMDGPU/SI: Add support for 32-bit immediate SMRD offsets on CI
Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11604

llvm-svn: 244254
2015-08-06 19:28:38 +00:00
Tom Stellard
5d9e608825 AMDGPU/SI: Use ComplexPatterns for SMRD addressing modes
Summary: This allows us to consolidate several of the TableGen patterns.

Reviewers: arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11602

llvm-svn: 244253
2015-08-06 19:28:30 +00:00
Nico Weber
4a603af6f2 Fix nested CrashRecoveryContexts with LLVM_ENABLE_THREADS=OFF, allow them.
libclang uses a CrashRecoveryContext, and building a module does too. If a
module gets built through libclang, nested CrashRecoveryContexts are used.  They
work fine with threads as things are stored in ThreadLocal variables, but in
LLVM_ENABLE_THREADS=OFF builds the two recovery contexts would write to the
same globals.

To fix, keep active CrashRecoveryContextImpls in a list and have the global
point to the innermost one, and do something similar for
tlIsRecoveringFromCrash.

Necessary (but not sufficient) for PR11974 and PR20325

http://reviews.llvm.org/D11770

llvm-svn: 244251
2015-08-06 19:21:25 +00:00
Kostya Serebryany
90b784ccc2 [libFuzzer] move the mutators to public interface so that custom mutators may reuse these functions directly
llvm-svn: 244250
2015-08-06 19:19:55 +00:00
Nico Rieck
4acab3dcc9 Rename inst_range() to instructions() for consistency. NFC
llvm-svn: 244248
2015-08-06 19:10:45 +00:00
Kit Barton
5eb6e29fcf Fix possible infinite loop in shrink wrapping when searching for save/restore
points.

There is an infinite loop that can occur in Shrink Wrapping while searching 
for the Save/Restore points. 

Part of this search checks whether the save/restore points are located in
different loop nests and if so, uses the (post) dominator trees to find the
immediate (post) dominator blocks. However, if the current block does not have
any immediate (post) dominators then this search will result in an infinite
loop. This can occur in code containing an infinite loop.

The modification checks whether the immediate (post) dominator is different from
the current save/restore block. If it is not, then the search terminates and the
current location is not considered as a valid save/restore point for shrink wrapping.

Phabricator: http://reviews.llvm.org/D11607
llvm-svn: 244247
2015-08-06 19:01:57 +00:00
Peter Collingbourne
93aa8db8d4 LibDriver: Replace references to lld-link2 with lld-link.
llvm-svn: 244246
2015-08-06 19:00:42 +00:00
Quentin Colombet
4323bdacbd [Reassociation] Fix miscompile for va_arg arguments.
iisUnmovableInstruction() had a list of instructions hardcoded which are
considered unmovable. The list lacked (at least) an entry for the va_arg
and cmpxchg instructions.
Fix this by introducing a new Instruction::mayBeMemoryDependent()
instead of maintaining another instruction list.

Patch by Matthias Braun <matze@braunis.de>.

Differential Revision: http://reviews.llvm.org/D11577

rdar://problem/22118647

llvm-svn: 244244
2015-08-06 18:44:34 +00:00
Alex Lorenz
8ecabffd63 MIR Parser: Report an error when parsing duplicate memory operand flags.
llvm-svn: 244240
2015-08-06 18:26:36 +00:00
Cong Hou
f7490fad1b Revert r244154 which causes some build failure. See https://llvm.org/bugs/show_bug.cgi?id=24377.
llvm-svn: 244239
2015-08-06 18:17:29 +00:00
Kit Barton
05694fa00a This patch changes the interface to enable the shrink wrapping optimization.
It adds a new constructor, which takes a std::function predicate function that
is run at the beginning of shrink wrapping to determine whether the optimization
should run on the given machine function. The std::function can be overridden by
each target, allowing target-specific decisions to be made on each machine
function.

This is necessary for PowerPC, as the decision to run shrink wrapping is
partially based on the ABI. Futhermore, this operates nicely with the GCC iFunc
capability, which allows option overrides on a per-function basis.

Phabricator: http://reviews.llvm.org/D11421
llvm-svn: 244235
2015-08-06 18:02:53 +00:00
Chad Rosier
9a57801f77 [AArch64] Use a static function and other minor cleanup for readability. NFC.
llvm-svn: 244233
2015-08-06 17:37:18 +00:00
Alex Lorenz
0d1da50e29 MIR Serialization: Serialize the 'invariant' machine memory operand flag.
llvm-svn: 244230
2015-08-06 16:55:53 +00:00
Richard Diamond
eb6992a477 Fix an alignment error in llvm::expandAtomicRMWToCmpXchg without breaking the build where X86 isn't enabled.
Summary: Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test.

Reviewers: rengolin, jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11804

llvm-svn: 244229
2015-08-06 16:55:03 +00:00
Alex Lorenz
b105a060cc MIR Serialization: Serialize the 'non-temporal' machine memory operand flag.
llvm-svn: 244228
2015-08-06 16:49:30 +00:00
Chad Rosier
e41a0f9fda [AArch64] Improve the readability of the ld/st optimization pass. NFC.
llvm-svn: 244222
2015-08-06 15:50:12 +00:00
Douglas Katzman
3a696c107c [SPARC] Don't compare arch name as a string, use the enum instead.
Fixes PR22695

llvm-svn: 244221
2015-08-06 15:44:12 +00:00
Renato Golin
d87c97f34a Revert "Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test."
This reverts commit r244155, as it was breaking the buildbots for too long.
Should be reapplied with proper fix.

llvm-svn: 244205
2015-08-06 10:37:59 +00:00
NAKAMURA Takumi
db476d2a29 llvm/lib/IR/AttributeImpl.h: Move comment block not to cover typedef, introduced in r244164. [-Wdocumentation]
llvm-svn: 244204
2015-08-06 09:49:17 +00:00
Michael Liao
5abd44bd1a Removing tailing whitespaces
llvm-svn: 244203
2015-08-06 09:06:20 +00:00
Michael Kuperstein
a32d396d60 [X86] Improve EmitLoweredSelect for contiguous CMOV pseudo instructions.
This change improves EmitLoweredSelect() so that multiple contiguous CMOV pseudo
instructions with the same (or exactly opposite) conditions get lowered using a single
new basic-block. This eliminates unnecessary extra basic-blocks (and CFG merge points)
when contiguous CMOVs are being lowered.

Patch by: kevin.b.smith@intel.com
Differential Revision: http://reviews.llvm.org/D11428

llvm-svn: 244202
2015-08-06 08:45:34 +00:00
Chandler Carruth
6cbdb95607 [PM/AA] Clean up and homogenize comments throughout basic-aa.
llvm-svn: 244200
2015-08-06 08:17:06 +00:00
Chandler Carruth
5ffeb1ea3f [PM/AA] Run clang-format over all of basic-aa before making more
substantive edits.

llvm-svn: 244198
2015-08-06 07:57:58 +00:00
Chandler Carruth
a0655c50ee [PM/AA] Hoist the interface for BasicAA into a header file.
This is the first mechanical step in preparation for making this and all
the other alias analysis passes available to the new pass manager. I'm
factoring out all the totally boring changes I can so I'm moving code
around here with no other changes. I've even minimized the formatting
churn.

I'll reformat and freshen comments on the interface now that its located
in the right place so that the substantive changes don't triger this.

llvm-svn: 244197
2015-08-06 07:33:15 +00:00
Peter Collingbourne
523abeacd5 COFF: Assign the correct symbol type to internal functions.
The COFFSymbolRef::isFunctionDefinition() function tests for several conditions
that are not related to whether a symbol is a function, but rather whether
the symbol meets the requirements for a function definition auxiliary record,
which excludes certain symbols such as internal functions and undefined
references. The test we need to determine the symbol type is much simpler:
we only need to compare the complex type against IMAGE_SYM_DTYPE_FUNCTION.

llvm-svn: 244195
2015-08-06 05:26:35 +00:00
Chandler Carruth
595690977b [PM/AA] Simplify the AliasAnalysis interface by removing a wrapper
around a DataLayout interface in favor of directly querying DataLayout.

This wrapper specifically helped handle the case where this no
DataLayout, but LLVM now requires it simplifynig all of this. I've
updated callers to directly query DataLayout. This in turn exposed
a bunch of places where we should have DataLayout readily available but
don't which I've fixed. This then in turn exposed that we were passing
DataLayout around in a bunch of arguments rather than making it readily
available so I've also fixed that.

No functionality changed.

llvm-svn: 244189
2015-08-06 02:05:46 +00:00
Kostya Serebryany
acf2228ee8 [libFuzzer] add one more mutation strategy: byte shuffling
llvm-svn: 244188
2015-08-06 01:29:13 +00:00
Alex Lorenz
b06d114835 MIR Serialization: Initial serialization of the machine operand target flags.
This commit implements the initial serialization of the machine operand target
flags. It extends the 'TargetInstrInfo' class to add two new methods that help
to provide text based serialization for the target flags.

This commit can serialize only the X86 target flags, and the target flags for
the other targets will be serialized in the follow-up commits.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 244185
2015-08-06 00:44:07 +00:00
Duncan P. N. Exon Smith
ee028e8d75 ValueMapper: Rotate distinct node remapping algorithm
Rotate the algorithm for remapping distinct nodes in order to simplify
how uniquing cycles get resolved.  This removes some of the recursion,
and, most importantly, exposes all uniquing cycles at the top-level.
Besides being a little more efficient -- temporary MDNodes won't live as
long -- the clearer logic should help protect against bugs like those
fixed in r243961 and r243976.

What are uniquing cycles?  Why do they present challenges when remapping
metadata?

    !0 = !{!1}
    !1 = !{!0}

!0 and !1 form a simple uniquing cycle.  When remapping from one
metadata graph to another, every uniquing cycle gets "duplicated"
through a dance:

    !0-temp = !{!1?}     ; map(!0): clone !0, VM[!0] = !0-temp
    !1-temp = !{!0?}     ; ..map(!1): clone !1, VM[!1] = !1-temp
    !1-temp = !{!0-temp} ; ..map(!1): remap !1's operands
    !2      = !{!0-temp} ; ..map(!1): uniquify: !1-temp => !2
    !0-temp = !{!2}      ; map(!0): remap !0's operands
    !3      = !{!2}      ; map(!0): uniquify: !0-temp => !3

    ; Result
    !2 = !{!3}
    !3 = !{!2}

(In the two "uniquify" steps above, the operands of !X-temp are compared
to the operands of !X.  If they're the same, then !X-temp gets RAUW'ed
to !X; if they're different, then !X-temp is promoted to a new unique
node.  The latter case always hits in for uniquing cycles, so we
duplicate all the nodes involved.)

Why is this a problem?  Uniquable Metadata nodes that have temporary
node as transitive operands keep RAUW support until the temporary nodes
get finalized.  With non-cycles, this happens automatically: when a
uniquable node's count of unresolved operands drops to zero, it
immediately sheds its own RAUW support (possibly triggering the same in
any node that references it).  However, uniquing cycles create a
reference cycle, and uniqued nodes that transitively reference a
uniquing cycle are "stuck" in an unresolved state until someone calls
`MDNode::resolveCycles()` on a node in the unresolved subgraph.

Distinct nodes should help here (and mostly do): since they aren't
uniqued anywhere, they are guaranteed not to be RAUW'ed.  They
effectively form a barrier between uniqued nodes, breaking some uniquing
cycles, and shielding uniqued nodes from uniquing cycles.

Unfortunately, with this barrier in place, the unresolved subgraph(s)
can be disjoint from the top-level node.  The mapping algorithm needs to
find at least one representative from each disjoint subgraph.  But which
nodes are *stuck*, and which will get resolved automatically?  And which
nodes are in the unresolved subgraph?  The old logic was conservative.

This commit rotates the logic for distinct nodes, so that we have access
to unresolved nodes at the top-level call to `llvm::MapMetadata()`.
Each time we return to the top-level, we know that all temporaries have
been RAUW'ed away.  Here, it's safe (and necessary) to call
`resolveCycles()` immediately on unresolved operands.

This should also perform better than the old algorithm.  The recursion
stack is shorter, temporary nodes don't live as long, and there are
fewer tracking references to unresolved nodes.  As the debug info graph
introduces more 'distinct' nodes, remapping should incrementally get
cheaper and cheaper.

Aside from possible performance improvements (and reduced cruft in the
`LLVMContext`), there should be no functionality change here.

llvm-svn: 244181
2015-08-05 23:52:42 +00:00
Kostya Serebryany
c721977710 [libFuzzer] avoid build warnings in non-assert build (useful warning in this case)
llvm-svn: 244177
2015-08-05 23:44:42 +00:00
Wei Mi
3d7c898206 Add a stat to show how often the limit to decompose GEPs in BasicAA is reached.
Differential Revision: http://reviews.llvm.org/D9689

llvm-svn: 244174
2015-08-05 23:40:30 +00:00
Duncan P. N. Exon Smith
41ec46939f ValueMapper: Simplify remap() helper function, NFC
Rename `remap()` to `remapOperands()`, and restrict its contract to
remapping operands.  Previously, it also called `mapToMetadata()`, but
this logic is hard to reason about externally.  In particular, this
refactors `mapUniquedNode()` to avoid redundant mapping calls, taking
advantage of the RAUWs that are already in place.

llvm-svn: 244168
2015-08-05 23:22:34 +00:00
JF Bastien
3c8571deeb x86: NFC remove needless InstrCompiler cast
Summary: The casts from String to PatFrag weren't needed if we instead provided an SDNode. This fix was suggested by @pete in D11382.

Subscribers: pete, llvm-commits

Differential Revision: http://reviews.llvm.org/D11788

llvm-svn: 244167
2015-08-05 23:15:37 +00:00
Bjarke Hammersholt Roune
f7c0d778e2 [NVPTX] Use LDG for pointer induction variables.
More specifically, make NVPTXISelDAGToDAG able to emit cached loads (LDG) for pointer induction variables.

Also fix latent bug where LDG was not restricted to kernel functions. I believe that this could not be triggered so far since we do not currently infer that a pointer is global outside a kernel function, and only loads of global pointers are considered for cached loads.

llvm-svn: 244166
2015-08-05 23:11:57 +00:00
Kostya Serebryany
4338e69a99 [libFuzzer] in dfsan mode, set labels every time we start recording traces as opposed to doing it at process startup. This ensures that the labels are fresh.
llvm-svn: 244165
2015-08-05 23:02:57 +00:00
James Y Knight
45f6b5bc69 Add a TrailingObjects template class.
This is intended to help support the idiom of a class that has some
other objects (or multiple arrays of different types of objects)
appended on the end, which is used quite heavily in clang.

Differential Revision: http://reviews.llvm.org/D11272

llvm-svn: 244164
2015-08-05 22:57:34 +00:00
Reid Kleckner
f884663717 If the "CodeView" module flag is set, emit codeview instead of DWARF
Summary:
Emit both DWARF and CodeView if "CodeView" and "Dwarf Version" module
flags are set.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11756

llvm-svn: 244158
2015-08-05 22:26:20 +00:00
Alex Lorenz
3768a6786c MIR Serialization: Serialize the machine operand's offset.
This commit serializes the offset for the following operands: target index,
global address, external symbol, constant pool index, and block address.

llvm-svn: 244157
2015-08-05 22:26:15 +00:00
Richard Diamond
688e7b03a7 Divide the primitive size in bits by eight so the initial load's alignment is in
bytes as expected. Tested with the included unit test.

llvm-svn: 244155
2015-08-05 22:10:57 +00:00
Cong Hou
26ec441a7b Record whether the weights on out-edges from a MBB are normalized.
1. Create a utility function normalizeEdgeWeights() in MachineBranchProbabilityInfo that normalizes a list of edge weights so that the sum of then can fit in uint32_t.
2. Provide an interface in MachineBasicBlock to normalize its successors' weights.
3. Add a flag in MachineBasicBlock that tracks whether its successors' weights are normalized.
4. Provide an overload of getSumForBlock that accepts a non-const pointer to a MBB so that it can force normalizing this MBB's successors' weights.
5. Update several uses of getSumForBlock() by eliminating the once needed weight scale.

Differential Revision: http://reviews.llvm.org/D11442

llvm-svn: 244154
2015-08-05 22:01:20 +00:00
Kostya Serebryany
80051e17c0 [libFuzzer] add option -report_slow_units=Nsec to control when slow units are printed
llvm-svn: 244152
2015-08-05 21:43:48 +00:00
Kostya Serebryany
5be4cb583e [libFuzzer] add a missing test file
llvm-svn: 244151
2015-08-05 21:32:13 +00:00
David Blaikie
9b9c39d803 -Wdeprecated: Remove some dead code that was relying on a questionable (rule-of-3-violating) copy ctor in MCInstPrinter
llvm-svn: 244133
2015-08-05 21:15:48 +00:00