llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

Author	SHA1	Message	Date
Alexander Potapenko	44a90df4d9	Revert r223239, which broke some bots. llvm-svn: 223240	2014-12-03 16:03:08 +00:00
Alexander Potapenko	80e9b87af0	Fix the metadata number used by llvm.gcov to match the number of the inserted metadata node. llvm-svn: 223239	2014-12-03 15:15:58 +00:00
Erik Eckstein	b61d06dbaf	InstCombine: simplify signed range checks Try to convert two compares of a signed range check into a single unsigned compare. Examples: (icmp sge x, 0) & (icmp slt x, n) --> icmp ult x, n (icmp slt x, 0) \| (icmp sgt x, n) --> icmp ugt x, n llvm-svn: 223224	2014-12-03 10:39:15 +00:00
Hal Finkel	e95528845d	[PowerPC] Print all inline-asm consts as signed numbers Almost all immediates in PowerPC assembly (both 32-bit and 64-bit) are signed numbers, and it is important that we print them as such. To make sure that happens, we change PPCTargetLowering::LowerAsmOperandForConstraint so that it does all intermediate checks on a signed-extended int64_t value, and then creates the resulting target constant using MVT::i64. This will ensure that all negative values are printed as negative values (mirroring what is done in other backends to achieve the same sign-extension effect). This came up in the context of inline assembly like this: "add%I2 %0,%0,%2", ..., "Ir"(-1ll) where we used to print: addi 3,3,4294967295 and gcc would print: addi 3,3,-1 and gas accepts both forms, but our builtin assembler (correctly) does not. Now we print -1 like gcc does. While here, I replaced a bunch of custom integer checks with isInt<16> and friends from MathExtras.h. Thanks to Paul Hargrove for the bug report. llvm-svn: 223220	2014-12-03 09:37:50 +00:00
Charlie Turner	ab73ef8264	Emit ABI_FP_rounding attribute. LLVM understands a -enable-sign-dependent-rounding-fp-math codegen option. When the user has specified this option, the Tag_ABI_FP_rounding attribute should be emitted with value 1. This option currently does not appear to disable transformations and optimizations that assume default floating point rounding behavior, AFAICT, but the intention should be recorded in the build attributes, regardless of what the compiler actually does with the intention. Change-Id: If838578df3dc652b6f2796b8d152545674bcb30e llvm-svn: 223218	2014-12-03 08:12:26 +00:00
Charlie Turner	7f4e539ba5	Add tests for default value of Tag_ABI_FP_rounding. Change-Id: I051866d073fc6ce87ce3e693a3762da6d81f4393 llvm-svn: 223217	2014-12-03 07:59:50 +00:00
Rafael Espindola	02dc2705ae	Ask the module for its the identified types. When lazy reading a module, the types used in a function will not be visible to a TypeFinder until the body is read. This patch fixes that by asking the module for its identified struct types. If a materializer is present, the module asks it. If not, it uses a TypeFinder. This fixes pr21374. I will be the first to say that this is ugly, but it was the best I could find. Some of the options I looked at: * Asking the LLVMContext. This could be made to work for gold, but not currently for ld64. ld64 will load multiple modules into a single context before merging them. This causes us to see types from future merges. Unfortunately, MappedTypes is not just a cache when it comes to opaque types. Once the mapping has been made, we have to remember it for as long as the key may be used. This would mean moving MappedTypes to the Linker class and having to drop the Linker::LinkModules static methods, which are visible from C. * Adding an option to ignore function bodies in the TypeFinder. This would fix the PR by picking the worst result. It would work, but unfortunately we are currently quite dependent on the upfront type merging. I will try to reduce our dependency, but it is not clear that we will be able to get rid of it for now. The only clean solution I could think of is making the Module own the types. This would have other advantages, but it is a much bigger change. I will propose it, but it is nice to have this fixed while that is discussed. With the gold plugin, this patch takes the number of types in the LTO clang binary from 52817 to 49669. llvm-svn: 223215	2014-12-03 07:18:23 +00:00
Matt Arsenault	0533e2261e	R600/SI: Remove i1 pseudo VALU ops Select i1 logical ops directly to 64-bit SALU instructions. Vector i1 values are always really in SGPRs, with each bit for each item in the wave. This saves about 4 instructions when and/or/xoring any condition, and also helps write conditions that need to be passed in vcc. This should work correctly now that the SGPR live range fixing pass works. More work is needed to eliminate the VReg_1 pseudo regclass and possibly the entire SILowerI1Copies pass. llvm-svn: 223206	2014-12-03 05:22:35 +00:00
Tom Stellard	213a8062a7	StructurizeCFG: Use LoopInfo analysis for better loop detection We were assuming that each back-edge in a region represented a unique loop, which is not always the case. We need to use LoopInfo to correctly determine which back-edges are loops. llvm-svn: 223199	2014-12-03 04:28:32 +00:00
Tom Stellard	a8af03e062	R600/SI: Enable inline assembly We just needed to remove the assertion in AMDGPURegisterInfo::getFrameRegister(), which is called when initializing the parser for inline assembly. llvm-svn: 223197	2014-12-03 04:08:00 +00:00
Matt Arsenault	a51749b87e	R600/SI: Change mubuf offsets to print as decimal This matches SC's behavior. llvm-svn: 223194	2014-12-03 03:12:13 +00:00
Nick Lewycky	be2ab4ddd3	Emit the entry block first and the exit block second, then all the blocks in between afterwards. This is what gcc always does, and some out of tree tools depend on that. llvm-svn: 223193	2014-12-03 02:45:01 +00:00
Peter Collingbourne	837799f13b	Prologue support Patch by Ben Gamari! This redefines the `prefix` attribute introduced previously and introduces a `prologue` attribute. There are a two primary usecases that these attributes aim to serve, 1. Function prologue sigils 2. Function hot-patching: Enable the user to insert `nop` operations at the beginning of the function which can later be safely replaced with a call to some instrumentation facility 3. Runtime metadata: Allow a compiler to insert data for use by the runtime during execution. GHC is one example of a compiler that needs this functionality for its tables-next-to-code functionality. Previously `prefix` served cases (1) and (2) quite well by allowing the user to introduce arbitrary data at the entrypoint but before the function body. Case (3), however, was poorly handled by this approach as it required that prefix data was valid executable code. Here we redefine the notion of prefix data to instead be data which occurs immediately before the function entrypoint (i.e. the symbol address). Since prefix data now occurs before the function entrypoint, there is no need for the data to be valid code. The previous notion of prefix data now goes under the name "prologue data" to emphasize its duality with the function epilogue. The intention here is to handle cases (1) and (2) with prologue data and case (3) with prefix data. References ---------- This idea arose out of discussions[1] with Reid Kleckner in response to a proposal to introduce the notion of symbol offsets to enable handling of case (3). [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-May/073235.html Test Plan: testsuite Differential Revision: http://reviews.llvm.org/D6454 llvm-svn: 223189	2014-12-03 02:08:38 +00:00
Ahmed Bougacha	ac111e2226	[X86][MC] Intel syntax: accept implicit memory operand sizes larger than 80. The X86AsmParser intel handling was refactored in r216481, making it try each different memory operand size to see which one matches. Operand sizes larger than 80 ("[xyz]mmword ptr") were forgotten, which led to an "invalid operand" error for code such as: movdqa [rax], xmm0 llvm-svn: 223187	2014-12-03 02:03:26 +00:00
Hal Finkel	2b306926ed	[PowerPC] Fix readcyclecounter to be custom expanded for all 32-bit targets We need to use the custom expansion of readcyclecounter on all 32-bit targets (even those with 64-bit registers). This should fix the ppc64 buildbot. llvm-svn: 223182	2014-12-03 00:19:17 +00:00
Tim Northover	0c91cccef8	AArch64: strengthen Darwin ABI alignment assumptions A global variable without an explicit alignment specified should be assumed to be ABI-aligned according to its type, like on other platforms. This allows us to use better memory operations when accessing it. rdar://18533701 llvm-svn: 223180	2014-12-02 23:53:43 +00:00
Tim Northover	85149ea9e5	AArch64: don't be too greedy when folding :lo12: accesses into mem ops. This frequently leads to cases like: ldr xD, [xN, :lo12:var] add xA, xN, :lo12:var ldr xD, [xA, #8] where the ADD would have been needed anyway, and the two distinct addressing modes can prevent the formation of an ldp. Because of how we handle ADRP (aggressively forming an ADRP/ADD pseudo-inst at ISel time), this pattern also results in duplicated ADRP instructions (one on its own to cover the ldr, and one combined with the add). llvm-svn: 223172	2014-12-02 23:13:39 +00:00
Michael Zolotukhin	37717c3cf1	PR21302. Vectorize only bottom-tested loops. rdar://problem/18886083 llvm-svn: 223171	2014-12-02 22:59:06 +00:00
Michael Zolotukhin	ce3e203aab	Apply loop-rotate to several vectorizer tests. Such loops shouldn't be vectorized due to the loops form. After applying loop-rotate (+simplifycfg) the tests again start to check what they are intended to check. llvm-svn: 223170	2014-12-02 22:59:02 +00:00
Simon Pilgrim	12f78b2c48	[X86][SSE] Keep 4i32 vector insertions in integer domain on SSE4.1 targets 4i32 shuffles for single insertions into zero vectors lowers to X86vzmovl which was using (v)blendps - causing domain switch stalls. This patch fixes this by using (v)pblendw instead. The updated tests on test/CodeGen/X86/sse41.ll still contain a domain stall due to the use of insertps - I'm looking at fixing this in a future patch. Differential Revision: http://reviews.llvm.org/D6458 llvm-svn: 223165	2014-12-02 22:31:23 +00:00
Hal Finkel	337f550328	[PowerPC] Implement readcyclecounter for PPC32 We've long supported readcyclecounter on PPC64, but it is easier there (the read of the 64-bit time-base register can be accomplished via a single instruction). This now provides an implementation for PPC32 as well. On PPC32, the time-base register is still 64 bits, but can only be read 32 bits at a time via two separate SPRs. The ISA manual explains how to do this properly (it involves re-reading the upper bits and looping if the counter has wrapped while being read). This requires PPC to implement a custom integer splitting legalization for the READCYCLECOUNTER node, turning it into a target-specific SDAG node, which then gets turned into a pseudo-instruction, which is then expanded to the necessary sequence (which has three SPR reads, the comparison and the branch). Thanks to Paul Hargrove for pointing out to me that this was still unimplemented. llvm-svn: 223161	2014-12-02 22:01:00 +00:00
Lang Hames	8e0b335b01	[AArch64][Stackmaps] Optimize stackmap shadows on AArch64. Reduce the number of nops emitted for stackmap shadows on AArch64 by counting non-stackmap instructions up to the next branch target towards the requested shadow. <rdar://problem/14959522> llvm-svn: 223156	2014-12-02 21:36:24 +00:00
Tom Stellard	07557193b4	R600/SI: Move more information into SIProgramInfo struct llvm-svn: 223154	2014-12-02 21:28:53 +00:00
Matt Arsenault	0ddaaf1c85	R600: Cleanup some tests and add missing testcases llvm-svn: 223151	2014-12-02 21:02:20 +00:00
Daniel Sanders	8a04298b22	[mips] Fix passing of small structures for big-endian O32. Summary: Like N32/N64, they must be passed in the upper bits of the register. The new code could be merged with the existing if-statements but I've refrained from doing this since it will make porting the O32 implementation to tablegen harder later. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6463 llvm-svn: 223148	2014-12-02 20:40:27 +00:00
Roman Divacky	3adb65e8f7	Introduce CPUStringIsValid() into MCSubtargetInfo and use it for ARM .cpu parsing. Previously .cpu directive in ARM assembler didnt switch to the new CPU and therefore acted as a nop. This implemented real action for .cpu and eg. allows to assembler FreeBSD kernel with -integrated-as. llvm-svn: 223147	2014-12-02 20:03:22 +00:00
Philip Reames	02104421ff	[Statepoints 3/4] Statepoint infrastructure for garbage collection: SelectionDAGBuilder This is the third patch in a small series. It contains the CodeGen support for lowering the gc.statepoint intrinsic sequences (223078) to the STATEPOINT pseudo machine instruction (223085). The change also includes the set of helper routines and classes for working with gc.statepoints, gc.relocates, and gc.results since the lowering code uses them. With this change, gc.statepoints should be functionally complete. The documentation will follow in the fourth change, and there will likely be some cleanup changes, but interested parties can start experimenting now. I'm not particularly happy with the amount of code or complexity involved with the lowering step, but at least it's fairly well isolated. The statepoint lowering code is split into it's own files and anyone not working on the statepoint support itself should be able to ignore it. During the lowering process, we currently spill aggressively to stack. This is not entirely ideal (and we have plans to do better), but it's functional, relatively straight forward, and matches closely the implementations of the patchpoint intrinsics. Most of the complexity comes from trying to keep relocated copies of values in the same stack slots across statepoints. Doing so avoids the insertion of pointless load and store instructions to reshuffle the stack. The current implementation isn't as effective as I'd like, but it is functional and 'good enough' for many common use cases. In the long term, I'd like to figure out how to integrate the statepoint lowering with the register allocator. In principal, we shouldn't need to eagerly spill at all. The register allocator should do any spilling required and the statepoint should simply record that fact. Depending on how challenging that turns out to be, we may invest in a smarter global stack slot assignment mechanism as a stop gap measure. Reviewed by: atrick, ributzka llvm-svn: 223137	2014-12-02 18:50:36 +00:00
Bruno Cardoso Lopes	fda1c647dd	[SwitchLowering] Handle destinations on multiple phi instructions Follow up from r222926. Also handle multiple destinations from merged cases on multiple and subsequent phi instructions. rdar://problem/19106978 llvm-svn: 223135	2014-12-02 18:31:53 +00:00
Ahmed Bougacha	3b0fbe2e87	[MachineCSE] Clear kill-flag on registers imp-def'd by the CSE'd instruction. Go through implicit defs of CSMI and MI, and clear the kill flags on their uses in all the instructions between CSMI and MI. We might have made some of the kill flags redundant, consider: subs ... %NZCV<imp-def> <- CSMI csinc ... %NZCV<imp-use,kill> <- this kill flag isn't valid anymore subs ... %NZCV<imp-def> <- MI, to be eliminated csinc ... %NZCV<imp-use,kill> Since we eliminated MI, and reused a register imp-def'd by CSMI (here %NZCV), that register, if it was killed before MI, should have that kill flag removed, because it's lifetime was extended. Also, add an exhaustive testcase for the motivating example. Reviewed by: Juergen Ributzka <juergen@apple.com> llvm-svn: 223133	2014-12-02 18:09:51 +00:00
Tim Northover	53d419429f	AArch64: make register block rules apply to vector types too. The blocking code originated in ARM, which is more aggressive about casting types to a canonical representative before doing anything else, so I missed out most vector HFAs and broke the ABI. This should fix it. llvm-svn: 223126	2014-12-02 17:15:22 +00:00
Tom Stellard	221c239920	R600/SI: Set the ATC bit on all resource descriptors for the HSA runtime llvm-svn: 223125	2014-12-02 17:05:41 +00:00
Bruno Cardoso Lopes	498cbaf867	[LICM] Avoind store sinking if no preheader is available Load instructions are inserted into loop preheaders when sinking stores and later removed if not used by the SSA updater. Avoid sinking if the loop has no preheader and avoid crashes. This fixes one more side effect of not handling indirectbr instructions properly on LoopSimplify. llvm-svn: 223119	2014-12-02 14:22:34 +00:00
Asiri Rathnayake	9fa2089501	Add support for ARM modified-immediate assembly syntax. Certain ARM instructions accept 32-bit immediate operands encoded as a 8-bit integer value (0-255) and a 4-bit rotation (0-30, even). Current ARM assembly syntax support in LLVM allows the decoded (32-bit) immediate to be specified as a single immediate operand for such instructions: mov r0, #4278190080 The ARMARM defines an extended assembly syntax allowing the encoding to be made more explicit, as in: mov r0, #255, #8 ; (same 32-bit value as above) The behaviour of the two instructions can be different w.r.t flags, which is documented under "Modified immediate constants" in ARMARM. This patch enables support for this extended syntax at the MC layer. llvm-svn: 223113	2014-12-02 10:53:20 +00:00
Charlie Turner	e286c5ea3a	Emit Tag_ABI_FP_denormal correctly in fast-math mode. The default ARM floating-point mode does not support IEEE 754 mode exactly. Of relevance to this patch is that input denormals are flushed to zero. The way in which they're flushed to zero depends on the architecture, * For VFPv2, it is implementation defined as to whether the sign of zero is preserved. * For VFPv3 and above, the sign of zero is always preserved when a denormal is flushed to zero. When FP support has been disabled, the strategy taken by this patch is to assume the software support will mirror the behaviour of the hardware support for the target if it existed. That is, for architectures which can only have VFPv2, it is assumed the software will flush to positive zero. For later architectures it is assumed the software will flush to zero preserving sign. Change-Id: Icc5928633ba222a4ba3ca8c0df44a440445865fd llvm-svn: 223110	2014-12-02 08:22:29 +00:00
Sonam Kumari	c72e2ad442	[signext.ll] Removal Of Duplicate Test Cases Removed the duplicate test case existing in signext.ll file. llvm-svn: 223109	2014-12-02 05:29:47 +00:00
Hal Finkel	f84be383f3	Simplify pointer comparisons involving memory allocation functions System memory allocation functions, which are identified at the IR level by the noalias attribute on the return value, must return a pointer into a memory region disjoint from any other memory accessible to the caller. We can use this property to simplify pointer comparisons between allocated memory and local stack addresses and the addresses of global variables. Neither the stack nor global variables can overlap with the region used by the memory allocator. Fixes PR21556. llvm-svn: 223093	2014-12-01 23:38:06 +00:00
Philip Reames	cbcf6ea7d6	[Statepoints 1/4] Statepoint infrastructure for garbage collection: IR Intrinsics The statepoint intrinsics are intended to enable precise root tracking through the compiler as to support garbage collectors of all types. The addition of the statepoint intrinsics to LLVM should have no impact on the compilation of any program which does not contain them. There are no side tables created, no extra metadata, and no inhibited optimizations. A statepoint works by transforming a call site (or safepoint poll site) into an explicit relocation operation. It is the frontend's responsibility (or eventually the safepoint insertion pass we've developed, but that's not part of this patch series) to ensure that any live pointer to a GC object is correctly added to the statepoint and explicitly relocated. The relocated value is just a normal SSA value (as seen by the optimizer), so merges of relocated and unrelocated values are just normal phis. The explicit relocation operation, the fact the statepoint is assumed to clobber all memory, and the optimizers standard semantics ensure that the relocations flow through IR optimizations correctly. This is the first patch in a small series. This patch contains only the IR parts; the documentation and backend support will be following separately. The entire series can be seen as one combined whole in http://reviews.llvm.org/D5683. Reviewed by: atrick, ributzka llvm-svn: 223078	2014-12-01 21:18:12 +00:00
Jingyue Wu	66d4df8027	[NVPTX] Do not emit .weak symbols for NVPTX Summary: ".weak" symbols cannot be consumed by ptxas (PR21685). This patch makes the weak directive in MCAsmPrinter customizable, and disables emitting ".weak" symbols for NVPTX. Test Plan: weak-linkage.ll Reviewers: jholewinski Reviewed By: jholewinski Subscribers: majnemer, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D6455 llvm-svn: 223077	2014-12-01 21:16:17 +00:00
Reid Kleckner	1591491217	Parse 'ghccc' in .ll files as the GHC convention (cc 10) Previously we just used "cc 10" in the .ll files, but that isn't very human readable. llvm-svn: 223076	2014-12-01 21:04:44 +00:00
Ahmed Bougacha	17e4b2bbd7	[AArch64] Don't combine "select (setcc i1 LHS, RHS), vL, vR". r208210 introduced an optimization that improves the vector select codegen by doing the setcc on vectors directly. This is a problem they the setcc operands are i1s, because the optimization would create vectors of i1, which aren't legal. Part of PR21549. Differential Revision: http://reviews.llvm.org/D6308 llvm-svn: 223075	2014-12-01 20:59:00 +00:00
Ahmed Bougacha	e6c3a5f724	[AArch64] Fix v2i8->i16 bitcast legalization. r213378 improved f16 bitcasts, so that they go directly through subregs, instead of through the stack. That code now causes an assertion failure for bitcasts from other 16-bits types (most importantly v2i8). Correct that by doing the custom lowering for i16 bitcasts only when the input is an f16. Part of PR21549. Differential Revision: http://reviews.llvm.org/D6307 llvm-svn: 223074	2014-12-01 20:52:32 +00:00
Peter Zotov	a7ec66eea2	[OCaml] Move Llvm.clone_module to its own Llvm_transform_utils module. This way most code won't link this (substantially large) library, if compiled statically with LLVM. llvm-svn: 223072	2014-12-01 19:50:39 +00:00
Peter Zotov	a05f11b29b	[OCaml] [cmake] Add CMake buildsystem for OCaml. Closes PR15325. llvm-svn: 223071	2014-12-01 19:50:23 +00:00
Ahmed Bougacha	d23753b378	[MachineVerifier] Accept a MBB with a single landing pad successor. The MachineVerifier used to check that there was always exactly one unconditional branch to a non-landingpad (normal) successor. If that normal successor to an invoke BB is unreachable, it seems reasonable to only have one successor, the landing pad. On targets other than AArch64 (and on AArch64 with a different testcase), the branch folder turns the branch to the landing pad into a fallthrough. The MachineVerifier, which relies on AnalyzeBranch, is unable to check the condition, and doesn't complain. However, it does in this specific testcase, where the branch to the landing pad remained. Make the MachineVerifier accept it. llvm-svn: 223059	2014-12-01 18:43:53 +00:00
Tim Northover	6096fde320	ARM: lower tail calls correctly when using GHC calling convention. Patch by Ben Gamari. llvm-svn: 223055	2014-12-01 17:46:39 +00:00
Hans Wennborg	f8109e3ddf	Revert r223049, r223050 and r223051 while investigating test failures. I didn't foresee affecting the Clang test suite :/ llvm-svn: 223054	2014-12-01 17:36:43 +00:00
Hans Wennborg	75717ff698	SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable They would get optimized away later, but we might as well not emit them. llvm-svn: 223051	2014-12-01 17:08:38 +00:00
Hans Wennborg	15017c6fe4	SimplifyCFG: don't remove unreachable default switch destinations An unreachable default destination can be exploited by other optimizations, and SDag lowering is now prepared to handle them efficiently. For example, branches to the unreachable destination will be optimized away, such as in the case of range checks for switch lookup tables. On 64-bit Linux, this reduces the size of a clang bootstrap by 80 kB (and Chromium by 30 kB). llvm-svn: 223050	2014-12-01 17:08:35 +00:00
Hans Wennborg	09865c30b6	SelectionDAG switch lowering: Replace unreachable default with most popular case. This can significantly reduce the size of the switch, allowing for more efficient lowering. I also worked with the idea of exploiting unreachable defaults by omitting the range check for jump tables, but always ended up with a non-neglible binary size increase. It might be worth looking into some more. llvm-svn: 223049	2014-12-01 17:08:32 +00:00
Rafael Espindola	0bd0e50fbc	Partial revert of r222986. The explicit set of destination types is not fully redundant when lazy loading since the TypeFinder will not find types used only in function bodies. This keeps the logic to drop the name of mapped types since it still helps with avoiding further renaming. llvm-svn: 223043	2014-12-01 16:32:20 +00:00
Vladimir Medic	7915fd41c4	The andi16, addiusp and jraddiusp micromips instructions were missing dedicated decoder methods in MipsDisassembler.cpp to properly decode immediate operands. These methods are added together with corresponding tests. llvm-svn: 223006	2014-12-01 11:12:04 +00:00
Jay Foad	1e2a95bd74	[PowerPC] Fix unwind info with dynamic stack realignment Summary: PowerPC DWARF unwind info defined CFA as SP + offset even in a function where the stack had been dynamically realigned. This clearly doesn't work because the offset from SP to CFA is not a constant. Fix it by defining CFA as BP instead. This was causing the AddressSanitizer null_deref test to fail 50% of the time, depending on whether SP happened to be 32-byte aligned on entry to a particular function or not. Reviewers: willschm, uweigand, hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6410 llvm-svn: 222996	2014-12-01 09:42:32 +00:00
Sonam Kumari	b1f24c8a5b	Removed extra whitespace. (Testing commit access). NFC. llvm-svn: 222994	2014-12-01 09:27:46 +00:00
Charlie Turner	8386827fe8	Add post-decode checking of HVC instruction. Add checkDecodedInstruction for post-decode checking of instructions, to catch the corner cases like HVC that don't fit into the general pattern. Needed to check for an invalid condition field in instruction encoding despite HVC not taking a predicate. Patch by Matthew Wahab. Change-Id: I48e28de981d7a9e43569594da3c45fb478b4f795 llvm-svn: 222992	2014-12-01 08:50:27 +00:00
Yury Gribov	cfbc9e88f4	[asan] Change dynamic alloca instrumentation to only consider allocas that are dominating all exits from function. Reviewed in http://reviews.llvm.org/D6412 llvm-svn: 222991	2014-12-01 08:47:58 +00:00
Charlie Turner	5186769a0c	Add Thumb HVC and ERET virtualisation extension instructions. Patch by Matthew Wahab. Change-Id: I131f71c1150d5fa797066a18e09d526c19bf9016 llvm-svn: 222990	2014-12-01 08:39:19 +00:00
Charlie Turner	b03bbf42d6	Add ARM ERET and HVC virtualisation extension instructions. Patch by Matthew Wahab. Change-Id: Iad75f078fbaa4ecc7d7a4820ad9b3930679cbbbb llvm-svn: 222989	2014-12-01 08:33:28 +00:00
Akira Hatanaka	cad434d590	[stack protector] Set edge weights for newly created basic blocks. This commit fixes a bug in stack protector pass where edge weights were not set when new basic blocks were added to lists of successor basic blocks. Differential Revision: http://reviews.llvm.org/D5766 llvm-svn: 222987	2014-12-01 04:27:03 +00:00
Rafael Espindola	0376bacc1a	Change how we keep track of which types are in the dest module. Instead of keeping an explicit set, just drop the names of types we choose to map to some other type. This has the advantage that the name of the unused will not cause the context to rename types on module read. llvm-svn: 222986	2014-12-01 04:15:59 +00:00
Rafael Espindola	85fbf2f916	Add a test showing what the linker IdentifiedStructTypes is for. Without this it could just be deleted and all tests would pass. llvm-svn: 222985	2014-12-01 03:20:57 +00:00
Rafael Espindola	c79482b0b3	Relax an assert a bit to avoid a crash on unreachable code. Patch by Duncan Exon Smith with a small tweak by me. llvm-svn: 222984	2014-12-01 02:55:24 +00:00
Hal Finkel	efd08ad91f	[PowerPC] Add asm support for cache-inhibited ld/st instructions Add assembler support for the fixed-point cache-inhibited load/store instructions. These are hypervisor-level only, so don't get too excited ;) Fixes PR21650. llvm-svn: 222976	2014-11-30 10:15:56 +00:00
Hans Wennborg	22fda335d6	Switch lowering: Fix broken 'Figure out which block is next' code This doesn't seem to have worked in a long time, but other optimizations would clean it up. llvm-svn: 222961	2014-11-29 21:17:05 +00:00
Jozef Kolek	e3a600e129	[mips][microMIPS] Implement NOP aliases This patch implements microMIPS 16-bit (MOVE16 $0, $0) and 32-bit (SLL $0, $0, 0) NOP aliases. http://reviews.llvm.org/D6440 llvm-svn: 222953	2014-11-29 13:29:24 +00:00
Duncan P. N. Exon Smith	57cead164b	DebugIR: Delete -debug-ir llvm-svn: 222945	2014-11-29 03:15:47 +00:00
Matt Arsenault	5c2fb0e261	R600/SI: Fix assertion on sign extend of 3 vectors This was trying to create an MVT with 3x vectors which created an invalid EVT llvm-svn: 222942	2014-11-28 22:51:38 +00:00
Duncan P. N. Exon Smith	73ce6dbb2b	Revert "Masked Vector Load and Store Intrinsics." This reverts commit r222632 (and follow-up r222636), which caused a host of LNT failures on an internal bot. I'll respond to the commit on the list with a reproduction of one of the failures. Conflicts: lib/Target/X86/X86TargetTransformInfo.cpp llvm-svn: 222936	2014-11-28 21:29:14 +00:00
David Majnemer	87a17a0975	InstCombine: FoldOrOfICmps harder We may be in a situation where the icmps might not be near each other in a tree of or instructions. Try to dig out related compare instructions and see if they combine. N.B. This won't fire on deep trees of compares because rewritting the tree might end up creating a net increase of IR. We may have to resort to something more sophisticated if this is a real problem. llvm-svn: 222928	2014-11-28 19:58:29 +00:00
Bruno Cardoso Lopes	a072cee758	[LICM] Store sink and indirectbr instructions Loop simplify skips exit-block insertion when exits contain indirectbr instructions. This leads to an assertion in LICM when trying to sink stores out of non-dedicated loop exits containing indirectbr instructions. This patch fix this issue by re-checking for dedicated exits in LICM prior to store sink attempts. Differential Revision: http://reviews.llvm.org/D6414 rdar://problem/18943047 llvm-svn: 222927	2014-11-28 19:47:46 +00:00
Bruno Cardoso Lopes	02319ec9b1	[SwitchLowering] Handle multiple destinations on condensed case stmts Switch cases statements with sequential values that branch to the same destination BB may often be handled together in a single new source BB. In this scenario we need to remove remaining incoming values from PHI instructions in the destination BB, as to match the number of source branches. Differential Revision: http://reviews.llvm.org/D6415 rdar://problem/19040894 llvm-svn: 222926	2014-11-28 19:47:33 +00:00
Sanjay Patel	019be40c8c	Enable FeatureFastUAMem for btver2 Allow unaligned 16-byte memop codegen for btver2. No functional changes for any other subtargets. Replace the existing supposed small memcpy test with an actual test of a small memcpy. The previous test wasn't using FileCheck either. This patch should allow us to close PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6360 llvm-svn: 222925	2014-11-28 18:40:18 +00:00
Rafael Espindola	880240ae91	Add back r222727 with a fix. The original patch would fail when: * A dst opaque type (%A) is matched with a src type (%A). * A src opaque (%E) type is then speculatively matched with %A and the speculation fails afterward. * When rolling back the speculation we would cancel the source %A to dest %A mapping. The fix is to keep an explicit list of which resolutions are speculative. Original message: Fix overly aggressive type merging. If we find out that two types are not isomorphic, we learn nothing about opaque sub types in both the source and destination. llvm-svn: 222923	2014-11-28 16:41:24 +00:00
Rafael Espindola	3962e8e2eb	Add a testcase reduced from clang lto bootstrap on OS X. llvm-svn: 222921	2014-11-28 15:45:31 +00:00
Charlie Turner	dc3f8f7176	Fix wrong encoding of MRSBanked. Patch by Matthew Wahab. Change-Id: Ia2a001ca2760028ea360fe77b56f203a219eefbc llvm-svn: 222920	2014-11-28 15:01:06 +00:00
Evgeniy Stepanov	118b7804cb	[msan] Fix origin propagation for select of floats. MSan does not assign origin for instrumentation temps (i.e. the ones that do not come from the application code), but "select" instrumentation erroneously tried to use one of those. https://code.google.com/p/memory-sanitizer/issues/detail?id=78 llvm-svn: 222918	2014-11-28 11:17:58 +00:00
Charlie Turner	0795093430	Test all <build attribute, value> pairs. Add more tests to make sure the encoding/decoding of build attributes works correctly for all permissible values of build attributes. For cases where there are an infinite number of such values, a representative subset has been settled for. Change-Id: I2643c9624c211b2d56405306e16eec2d487bc5d6 llvm-svn: 222917	2014-11-28 11:14:47 +00:00
Tim Northover	e8b34aaff0	AArch64: treat [N x Ty] as a block during procedure calls. The AAPCS treats small structs and homogeneous floating (or vector) aggregates specially, and guarantees they either get passed as a contiguous block of registers, or prevent any future use of those registers and get passed on the stack. This concept can fit quite neatly into LLVM's own type system, mapping an HFA to [N x float] and so on, and small structs to [N x i64]. Doing so allows front-ends to emit AAPCS compliant code without having to duplicate the register counting logic. llvm-svn: 222903	2014-11-27 21:02:42 +00:00
Zoran Jovanovic	15712f82b0	[mips][microMIPS] Implement SWM16 and LWM16 instructions Differential Revision: http://reviews.llvm.org/D5579 llvm-svn: 222901	2014-11-27 18:28:59 +00:00
Jozef Kolek	c85a4a5656	[mips][microMIPS] Implement BREAK16 and SDBBP16 instructions Patch by Radovan Obradovic. Differential Revision: http://reviews.llvm.org/D5048 llvm-svn: 222900	2014-11-27 18:18:42 +00:00
Daniel Sanders	d7d5d4cdb1	[mips] Add synci instruction. Patch by Amaury Pouly Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6421 llvm-svn: 222899	2014-11-27 17:28:10 +00:00
Will Newton	a200eb1b54	Widen ELFYAML relocation type to 32 bits The current 8 bits is sufficient for ELF32 targets but ELF64 requires 32 bits. Add a test for AArch64 that exposes the issue. llvm-svn: 222898	2014-11-27 17:20:48 +00:00
Rafael Espindola	fe0669d419	Commit back the correct bits of r222760 (was r222538). I also added a test. Original message: Allow FDE references outside the +/-2GB range supported by PC relative offsets for code models other than small/medium. For JIT application, memory layout is less controlled and can result in truncations otherwise. Patch from Akos Kiss. Differential Revision: http://reviews.llvm.org/D6079 llvm-svn: 222897	2014-11-27 17:13:56 +00:00
Rafael Espindola	008818ce29	Revert "Reapply 222538 and update tests to explicitly request small code model and PIC:" This reverts commit r222760. It changed our behaviour on PIC so we don't match gas anymore. It also included lots of unnecessary changes to tests. If those changes are desirable, there should be an independent discussion as they are out of scope for that patch. I will recommit the other bits. llvm-svn: 222896	2014-11-27 17:13:51 +00:00
Duncan P. N. Exon Smith	bcb91dbf3f	Revert "Fix overly aggressive type merging." This reverts commit r222727, which causes LTO bootstrap failures. Last passing @ r222698: http://lab.llvm.org:8080/green/job/clang-Rlto_master_build/532/ First failing @ r222843: http://lab.llvm.org:8080/green/job/clang-Rlto_master_build/533/ Internal bootstraps pointed at a much narrower range: r222725 is passing, and r222731 is failing. LTO crashes while handling libclang.dylib: http://lab.llvm.org:8080/green/job/clang-Rlto_master_build/533/consoleFull#-158682280549ba4694-19c4-4d7e-bec5-911270d8a58c GEP is not of right type for indices! %InfoObj.i.i = getelementptr inbounds %"class.llvm::OnDiskIterableChainedHashTable"* %.lcssa, i64 0, i32 0, i32 4, !dbg !123627 %"class.clang::serialization::reader::ASTIdentifierLookupTrait" = type { %"class.clang::ASTReader.31859", %"class.clang::serialization::ModuleFile.31870", %"class.clang::IdentifierInfo"* }LLVM ERROR: Broken function found, compilation aborted! clang: error: linker command failed with exit code 1 (use -v to see invocation) Looks like the new algorithm doesn't merge types aggressively enough. llvm-svn: 222895	2014-11-27 17:01:10 +00:00
Erik Eckstein	b81721adb5	reinstate r222872: Peephole optimization in switch table lookup: reuse the guarding table comparison if possible. Fixed missing dominance check. Original commit message: This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch. Example: if (idx < tablesize) r = table[idx]; // table does not contain default_value else r = default_value; if (r != default_value) ... Is optimized to: cond = idx < tablesize; if (cond) r = table[idx]; else r = default_value; if (cond) ... Jump threading will then eliminate the second if(cond). llvm-svn: 222891	2014-11-27 15:13:14 +00:00
Evgeniy Stepanov	a93bac024f	[msan] Remove indirect call wrapping code. This functionality was only used in MSanDR, which is deprecated. llvm-svn: 222889	2014-11-27 14:54:02 +00:00
Jozef Kolek	90243462d4	[mips][microMIPS] Implement disassembler support for 16-bit instructions LI16, ADDIUR1SP, ADDIUR2 and ADDIUS5 Differential Revision: http://reviews.llvm.org/D6419 llvm-svn: 222887	2014-11-27 14:41:44 +00:00
Charlie Turner	e4a6c7fd89	Stop uppercasing build attribute data. The string data for string-valued build attributes were being unconditionally uppercased. There is no mention in the ARM ABI addenda about case conventions, so it's technically implementation defined as to whether the data are capitialised in some way or not. However, there are good reasons not to captialise the data. * It's less work. * Some vendors may legitimately have case-sensitive checks for these attributes which would fail on LLVM generated object files. * There could be locale issues with uppercasing. The original reasons for uppercasing appear to have stemmed from an old codesourcery toolchain behaviour, see http://comments.gmane.org/gmane.comp.compilers.llvm.cvs/87133 This patch makes the object file emitted no longer captialise string data, it encodes as seen in the assembly source. Change-Id: Ibe20dd6e60d2773d57ff72a78470839033aa5538 llvm-svn: 222882	2014-11-27 12:13:56 +00:00
Suyog Sarda	375e0af627	Use FileCheck instead of grep. Change by Ankur Garg. Differential Revision: http://reviews.llvm.org/D6430 llvm-svn: 222879	2014-11-27 11:22:49 +00:00
Erik Eckstein	6a5635f3ac	Revert "Peephole optimization in switch table lookup: reuse the guarding table comparison if possible." It is breaking the clang bootstrag. llvm-svn: 222877	2014-11-27 10:59:08 +00:00
Suyog Sarda	a17e6ed740	Use FileCheck instead of grep. Change by Sonam. Differential Revision: http://reviews.llvm.org/D6432 llvm-svn: 222876	2014-11-27 10:57:24 +00:00
Erik Eckstein	15d20044f2	Peephole optimization in switch table lookup: reuse the guarding table comparison if possible. This optimization tries to reuse the generated compare instruction, if there is a comparison against the default value after the switch. Example: if (idx < tablesize) r = table[idx]; // table does not contain default_value else r = default_value; if (r != default_value) ... Is optimized to: cond = idx < tablesize; if (cond) r = table[idx]; else r = default_value; if (cond) ... \endcode Jump threading will then eliminate the second if(cond). llvm-svn: 222872	2014-11-27 08:33:51 +00:00
David Majnemer	9698d3487d	InstCombine: Restore optimizations lost in r210006 This restores our ability to optimize: (X & C) == 0 ? X ^ C : X into X \| C (X & C) != 0 ? X ^ C : X into X & ~C llvm-svn: 222871	2014-11-27 07:25:21 +00:00
David Majnemer	d9ae958b9b	InstSimplify: Restore optimizations lost in r210006 This restores our ability to optimize: (X & C) ? X & ~C : X into X & ~C (X & C) ? X : X & ~C into X (X & C) ? X \| C : X into X (X & C) ? X : X \| C into X \| C llvm-svn: 222868	2014-11-27 06:32:46 +00:00
David Majnemer	7e9b94486e	Revert "Added inst combine transforms for single bit tests from Chris's note" This reverts commit r210006, it miscompiled libapr which is used in who knows how many projects. A test has been added to ensure that we don't regress again. I'll work on a rewrite of what the optimization was trying to do later. llvm-svn: 222856	2014-11-26 23:00:38 +00:00
Colin LeMahieu	0872710917	[Hexagon] Adding cmp* immediate form instructions. llvm-svn: 222849	2014-11-26 19:43:12 +00:00
Jozef Kolek	ecfa20e7f7	[mips][microMIPS] Implement disassembler support for 16-bit instructions LBU16, LHU16, LW16, SB16, SH16 and SW16 Differential Revision: http://reviews.llvm.org/D6405 llvm-svn: 222847	2014-11-26 18:56:38 +00:00
Colin LeMahieu	1f8cdbb85d	[Hexagon] Adding and64, or64, and xor64 instructions. llvm-svn: 222846	2014-11-26 18:55:59 +00:00
Will Newton	da953f13b9	Update AArch64 ELF relocations to ABI 1.0 This mostly entails adding relocations, however there are a couple of changes to existing relocations: 1. R_AARCH64_NONE is defined to be zero rather than 256 R_AARCH64_NONE has been defined to be zero for a long time elsewhere e.g. binutils and glibc since the submission of the AArch64 port in 2012 so this is required for compatibility. 2. R_AARCH64_TLSDESC_ADR_PAGE renamed to R_AARCH64_TLSDESC_ADR_PAGE21 I don't think there is any way for relocation names to leak out of LLVM so this should not break anything. Tested with check-all with no regressions. llvm-svn: 222821	2014-11-26 10:49:18 +00:00
Elena Demikhovsky	868b76ae69	AVX-512: Scalar ERI intrinsics including SAE mode and memory operand. Added AVX512_maskable_scalar template, that should cover all scalar instructions in the future. The main difference between AVX512_maskable_scalar<> and AVX512_maskable<> is using X86select instead of vselect. I need it, because I can't create vselect node for MVT::i1 mask for scalar instruction. http://reviews.llvm.org/D6378 llvm-svn: 222820	2014-11-26 10:46:49 +00:00
Will Newton	7576aeb371	Update ARM ELF relocations to ABI 2.09 Add R_ARM_IRELATIVE. llvm-svn: 222817	2014-11-26 10:36:03 +00:00
Simon Pilgrim	0e1c44b939	[X86][SSE] Improvements to byte shift shuffle matching Since (v)pslldq / (v)psrldq instructions resolve to a single input argument it is useful to match it much earlier than we currently do - this prevents more complicated shuffles (notably insertion into a zero vector) matching before it. Differential Revision: http://reviews.llvm.org/D6409 llvm-svn: 222796	2014-11-25 22:34:59 +00:00
Colin LeMahieu	535f692186	[Hexagon] Adding add64 and sub64 instructions. llvm-svn: 222795	2014-11-25 22:15:44 +00:00
Colin LeMahieu	732a0febbc	Reverting 222792 llvm-svn: 222793	2014-11-25 21:39:57 +00:00
Colin LeMahieu	bf565f821c	[Hexagon] Adding compare with immediate instructions. llvm-svn: 222792	2014-11-25 21:30:28 +00:00
Rafael Espindola	3e2785424e	This test requires asserts because of -stats. Sorry about that. llvm-svn: 222788	2014-11-25 20:56:56 +00:00
Rafael Espindola	4252588d37	gold plugin: call llvm_shutdown so that -stats works. llvm-svn: 222787	2014-11-25 20:52:49 +00:00
Cameron McInally	c32dadfa69	[AVX512] Add 512b integer shift by variable intrinsics and patterns. llvm-svn: 222786	2014-11-25 20:41:51 +00:00
Colin LeMahieu	4ed8312072	[Hexagon] [NFC] Adding trailing whitespace to test files. llvm-svn: 222785	2014-11-25 20:22:24 +00:00
Colin LeMahieu	9b0719975d	[Hexagon] Adding C2_mux instruction. llvm-svn: 222784	2014-11-25 20:20:09 +00:00
Hans Wennborg	24f12c7a0b	Remove useless rdar:// comment from switch_to_lookup_table.ll test. llvm-svn: 222772	2014-11-25 18:45:23 +00:00
Colin LeMahieu	4a42d4abd9	[Hexagon] Replacing cmp* instructions with ones that contain encoding bits. llvm-svn: 222771	2014-11-25 18:20:52 +00:00
Hans Wennborg	333795bf71	LazyValueInfo: Actually re-visit partially solved block-values in solveBlockValue() If solveBlockValue() needs results from predecessors that are not already computed, it returns false with the intention of resuming when the dependencies have been resolved. However, the computation would never be resumed since an 'overdefined' result had been placed in the cache, preventing any further computation. The point of placing the 'overdefined' result in the cache seems to have been to break cycles, but we can check for that when inserting work items in the BlockValue stack instead. This makes the "stop and resume" mechanism of solveBlockValue() work as intended, unlocking more analysis. Using this patch shaves 120 KB off a 64-bit Chromium build on Linux. I benchmarked compiling bzip2.c at -O2 but couldn't measure any difference in compile time. Tests by Jiangning Liu from r215343 / PR21238, Pete Cooper, and me. Differential Revision: http://reviews.llvm.org/D6397 llvm-svn: 222768	2014-11-25 17:23:05 +00:00
Joerg Sonnenberger	0740338c61	Small model and JIT generally don't go well with each other. On LP64 platforms, it will work or not depending on the choosen memory layout, so neither PASS nor XFAIL is appropiate. As UNSUPPORTED as per-test target doesn't exist (yet), remove the test instead to unbreak the builds. llvm-svn: 222767	2014-11-25 17:14:22 +00:00
Rafael Espindola	6791ceb3c6	Set the body of a new struct as soon as it is created. This changes the order in which different types are passed to get, but one order is not inherently better than the other. The main motivation is that this simplifies linkDefinedTypeBodies now that it is only linking "real" opaque types. It is also means that we only have to call it once and that we don't need getImpl. A small change in behavior is that we don't copy type names when resolving opaque types. This is an improvement IMHO, but it can be added back if desired. A test is included with the new behavior. llvm-svn: 222764	2014-11-25 15:33:40 +00:00
Joerg Sonnenberger	829c958371	Reapply 222538 and update tests to explicitly request small code model and PIC: Allow FDE references outside the +/-2GB range supported by PC relative offsets for code models other than small/medium. For JIT application, memory layout is less controlled and can result in truncations otherwise. Patch from Akos Kiss. Differential Revision: http://reviews.llvm.org/D6079 llvm-svn: 222760	2014-11-25 13:37:55 +00:00
Joerg Sonnenberger	6dd10513e3	Mark as explicit failing on x86-64 -- small memory model doesn't agree with default address selections. llvm-svn: 222759	2014-11-25 13:28:56 +00:00
Zoran Jovanovic	c3664f6f8a	[mips][micromips] Use call instructions with short delay slots Differential Revision: http://reviews.llvm.org/D6338 llvm-svn: 222752	2014-11-25 10:50:00 +00:00
Chandler Carruth	6264bc6537	[InstCombine] Change LLVM To canonicalize toward the value type being stored rather than the pointer type. This change is analogous to r220138 which changed the canonicalization for loads. The rationale is the same: memory does not have a type, operations (and thus the values they produce) have a type. We should match that type as closely as possible rather than reading some form of semantics into the pointer type. With this change, loads and stores should no longer be made with nonsensical types for the values that tehy load and store. This is particularly important when trying to match specific loaded and stored types in the process of doing other instcombines, which is what led me down this twisty maze of miscanonicalization. I've put quite some effort into looking through IR to find places where LLVM's optimizer was being unreasonably conservative in the face of mismatched load and store types, however it is possible (let's say, likely!) I have missed some. If you see regressions here, or from r220138, the likely cause is some part of LLVM failing to cope with load and store types differing. Test cases appreciated, it is important that we root all of these out of LLVM. llvm-svn: 222748	2014-11-25 10:09:51 +00:00
Suyog Sarda	cbd1299080	Change the test case file to use FileCheck instead of grep. NFC. Change by Ankur Garg. Differential Revision: http://reviews.llvm.org/D6382 llvm-svn: 222740	2014-11-25 08:44:56 +00:00
Chandler Carruth	7feb19d89c	Revert r220349 to re-instate r220277 with a fix for PR21330 -- quite clearly only exactly equal width ptrtoint and inttoptr casts are no-op casts, it says so right there in the langref. Make the code agree. Original log from r220277: Teach the load analysis to allow finding available values which require inttoptr or ptrtoint cast provided there is datalayout available. Eventually, the datalayout can just be required but in practice it will always be there today. To go with the ability to expose available values requiring a ptrtoint or inttoptr cast, helpers are added to perform one of these three casts. These smarts are necessary to finish canonicalizing loads and stores to the operational type requirements without regressing fundamental combines. I've added some test cases. These should actually improve as the load combining and store combining improves, but they may fundamentally be highlighting some missing combines for select in addition to exercising the specific added logic to load analysis. llvm-svn: 222739	2014-11-25 08:20:27 +00:00
David Majnemer	a4a80e242b	Forgot to add a file for r222734 llvm-svn: 222736	2014-11-25 07:45:56 +00:00
David Majnemer	705e1231b2	COFF: Add another test for r222124 llvm-svn: 222734	2014-11-25 07:42:36 +00:00
Rafael Espindola	34bccf8e6c	Fix overly aggressive type merging. If we find out that two types are not isomorphic, we learn nothing about opaque sub types in both the source and destination. llvm-svn: 222727	2014-11-25 05:59:24 +00:00
Simon Atanasyan	b1231365ea	[Object][Mips] Return address of MIPS symbol with cleared microMIPS indicator bit llvm-svn: 222726	2014-11-25 05:57:55 +00:00
Rafael Espindola	906b92c9d0	Link the type of aliases. They are not more or less "well typed" than GlobalVariables. llvm-svn: 222725	2014-11-25 04:43:59 +00:00
Juergen Ributzka	c90ddb75a2	[FastISel][AArch64] Fix and extend the tbz/tbnz pattern matching. The pattern matching failed to recognize all instances of "-1", because when comparing against "-1" we didn't use an APInt of the same bitwidth. This commit fixes this and also adds inverse versions of the conditon to catch more cases. llvm-svn: 222722	2014-11-25 04:16:15 +00:00
Rafael Espindola	690ff4c264	Add an interesting test that we already get right. NFC. llvm-svn: 222720	2014-11-25 03:47:57 +00:00
David Majnemer	4e5c0f46f5	InstSimplify: Handle some simple tautological comparisons This handles cases where we are comparing a masked value against itself. The analysis could be further improved by making it recursive but such expense is not currently justified. llvm-svn: 222716	2014-11-25 02:55:48 +00:00
Hal Finkel	d19aa4cdf8	[PowerPC] Add the 'attn' instruction The attn instruction is not part of the Power ISA, but is documented in the A2 user manual, and is accepted by the GNU assembler for the A2 and the POWER4+. Reported as part of PR21650. llvm-svn: 222712	2014-11-25 00:30:11 +00:00
Hal Finkel	515f6e50f5	[PowerPC] Implement combineRepeatedFPDivisors This does not matter on newer cores (where we can use reciprocal estimates in fast-math mode anyway), but for older cores this allows us to generate better fast-math code where we have multiple FDIVs with a common divisor. llvm-svn: 222710	2014-11-24 23:45:21 +00:00
Matt Arsenault	454e837bd2	Bug 21610: Canonicalize min/max fcmp selects to use ordered comparisons llvm-svn: 222705	2014-11-24 23:15:18 +00:00
Matt Arsenault	ae1a2b7c22	Convert test to FileCheck and use CHECK-LABEL llvm-svn: 222704	2014-11-24 23:03:17 +00:00
Rafael Espindola	9584ebd5dc	Add a disable-output option to the gold plugin. This corresponds to the opt option and is handy for profiling. llvm-svn: 222687	2014-11-24 21:18:14 +00:00
Rafael Espindola	c8934bcfb9	Pass the .ll files to llvm-link directly. NFC. llvm-svn: 222681	2014-11-24 20:35:59 +00:00
Kostya Serebryany	cb8f8175c9	[asan/coverage] change the way asan coverage instrumentation is done: instead of setting the guard to 1 in the generated code, pass the pointer to guard to __sanitizer_cov and set it there. No user-visible functionality change expected llvm-svn: 222675	2014-11-24 18:49:53 +00:00
Ulrich Weigand	24b899d017	[PowerPC] Fix PR 21652 - copy st_other bits on symbol assignment When processing an assignment in the integrated assembler that sets a symbol to the value of another symbol, we need to copy the st_other bits that encode the local entry point offset. Modeled after MipsTargetELFStreamer::emitAssignment handling of the ELF::STO_MIPS_MICROMIPS flag. llvm-svn: 222672	2014-11-24 18:09:47 +00:00
Colin LeMahieu	1dc5a7ca7c	[Hexagon] Adding asrh instruction, removing unused multiclasses. llvm-svn: 222670	2014-11-24 18:04:42 +00:00
Colin LeMahieu	b3868ebb81	[Hexagon] Adding aslh instruction. llvm-svn: 222668	2014-11-24 17:44:19 +00:00
Colin LeMahieu	e1cd9ff6b5	[Hexagon] Adding zxth instruction. llvm-svn: 222662	2014-11-24 17:11:34 +00:00
Colin LeMahieu	80e59674e9	[Hexagon] Adding zxtb instruction. llvm-svn: 222660	2014-11-24 16:48:43 +00:00
David Majnemer	291966cd3b	InstCombine: Don't create an unused instruction We would create an instruction but not inserting it. Not inserting the unused instruction would lead us to verification failure. This fixes PR21653. llvm-svn: 222659	2014-11-24 16:41:13 +00:00
Jozef Kolek	a4e87d7a74	[mips][microMIPS] Fix JRADDIUSP instruction Fix JRADDIUSP instruction, remove delay slot flag because this instruction doesn't have delay slot. Differential Revision: http://reviews.llvm.org/D6365 llvm-svn: 222658	2014-11-24 16:14:10 +00:00
Jozef Kolek	dd0dbf282b	[mips][microMIPS] Implement LBU16, LHU16, LW16, SB16, SH16 and SW16 instructions Differential Revision: http://reviews.llvm.org/D5122 llvm-svn: 222653	2014-11-24 14:39:13 +00:00
Jozef Kolek	0d926f8fc9	[mips][microMIPS] Implement disassembler support for 16-bit instructions With the help of new method readInstruction16() two bytes are read and decodeInstruction() is called with DecoderTableMicroMips16, if this fails four bytes are read and decodeInstruction() is called with DecoderTableMicroMips32. Differential Revision: http://reviews.llvm.org/D6149 llvm-svn: 222648	2014-11-24 13:29:59 +00:00
Andrea Di Biagio	3646b17160	[X86] Improved target specific combine on VSELECT dag nodes. This patch teaches function 'transformVSELECTtoBlendVECTOR_SHUFFLE' how to convert VSELECT dag nodes to shuffles on targets that do not have SSE4.1. On pre-SSE4.1 targets, we can still perform blend operations using movss/movsd. Also, removed a target specific combine that performed a premature lowering of VSELECT nodes to target specific MOVSS/MOVSD nodes. llvm-svn: 222647	2014-11-24 12:23:15 +00:00
David Majnemer	2445c6caf5	InstCombine: Don't assume DataLayout is always available We tried to get the result of DataLayout::getLargestLegalIntTypeSize but we didn't have a DataLayout. This resulted in opt crashing. This fixes PR21651. llvm-svn: 222645	2014-11-24 07:26:20 +00:00
Michael Kuperstein	b12b19a24a	[X86] Fixes bug in build_vector v4x32 lowering r222375 made some improvements to build_vector lowering of v4x32 and v4xf32 into an insertps, but it missed a case where: 1. A single extracted element is used twice. 2. The lower of the two non-zero indexes should be preserved, and the higher should be used for the dest mask. This caused a crash, since the source value for the insertps ends-up uninitialized. Differential Revision: http://reviews.llvm.org/D6377 llvm-svn: 222635	2014-11-23 13:09:06 +00:00
Elena Demikhovsky	36a2243ab7	Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align /, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8 %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 222632	2014-11-23 08:07:43 +00:00
Matt Arsenault	1b03538afe	R600: Fix extloads of i1 on R600/Evergreen llvm-svn: 222631	2014-11-23 02:57:54 +00:00
Matt Arsenault	5edd328299	R600/SI: Add additional tests for i1 loads llvm-svn: 222629	2014-11-23 02:57:50 +00:00
Matt Arsenault	1d9ec67967	R600/SI: Fix broken check lines and modernize prefixes Use -LABEL and remove -CHECK llvm-svn: 222628	2014-11-23 02:57:49 +00:00
Matt Arsenault	a30225b261	R600/SI: Fix missing -verify-machineinstrs on a test llvm-svn: 222627	2014-11-23 02:57:47 +00:00
David Majnemer	0b413925f3	InstCombine: Propagate exact for (sdiv X, Pow2) -> (udiv X, Pow2) llvm-svn: 222625	2014-11-22 20:00:41 +00:00
David Majnemer	ba33e07fad	InstCombine: Propagate exact for (sdiv X, Y) -> (udiv X, Y) llvm-svn: 222624	2014-11-22 20:00:38 +00:00
David Majnemer	e3d9e29780	InstCombine: Propagate exact for (sdiv -X, C) -> (sdiv X, -C) llvm-svn: 222623	2014-11-22 20:00:34 +00:00
David Majnemer	23e1540ef9	InstCombine: Propagate exact in (udiv (lshr X,C1),C2) -> (udiv x,C1<<C2) llvm-svn: 222620	2014-11-22 18:16:54 +00:00
David Majnemer	1847177b9b	InstCombine: Propagate NSW/NUW for X*(1<<Y) -> X<<Y llvm-svn: 222613	2014-11-22 08:57:02 +00:00
David Majnemer	3c7153d5d6	InstCombine: Propagate NSW for -X * -Y -> X * Y llvm-svn: 222612	2014-11-22 07:25:19 +00:00
David Majnemer	26583aff1f	InstSimplify: Simplify (sub 0, X) -> X if it's NUW This is a generalization of the X - (0 - Y) -> X transform. llvm-svn: 222611	2014-11-22 07:15:16 +00:00
Chandler Carruth	4c0a2a8001	[x86] Add some tests for a common unpack pattern of vector shuffle that has a remarkably unique and efficient lowering. While we get this some of the time already, we miss a few cases and there wasn't a principled reason we got it. We should at least test this. v8 already has tests for this pattern. llvm-svn: 222607	2014-11-22 05:44:43 +00:00
David Majnemer	c405b87f53	InstCombine: Preserve nsw when folding X*(2^C) -> X << C llvm-svn: 222606	2014-11-22 04:52:55 +00:00
David Majnemer	96d9c67b69	InstCombine: Preserve nsw/nuw for ((X << C2)C1) -> (X (C1 << C2)) llvm-svn: 222605	2014-11-22 04:52:52 +00:00
David Majnemer	6191590b23	InstCombine: Preserve nsw for (mul %V, -1) -> (sub 0, %V) llvm-svn: 222604	2014-11-22 04:52:38 +00:00
Gerolf Hoflehner	cb87bd4853	[InstCombine] Re-commit of r218721 (Optimize icmp-select-icmp sequence) Fixes the self-host fail. Note that this commit activates dominator analysis in the combiner by default (like the original commit did). llvm-svn: 222590	2014-11-21 23:36:44 +00:00
Joerg Sonnenberger	00a4fe60d0	Fix transformation of add with pc argument to adr for non-immediate arguments. llvm-svn: 222587	2014-11-21 22:39:34 +00:00
Kostya Serebryany	ec6bd28ded	[asan] remove old experimental code llvm-svn: 222586	2014-11-21 22:34:29 +00:00
Tom Stellard	3929e69978	R600/SI: Add a failing test case for offset order in ds_read2 instructions llvm-svn: 222585	2014-11-21 22:31:47 +00:00
Tom Stellard	a112fe4e40	R600/SI: Emit s_mov_b32 m0, -1 before every DS instruction This s_mov_b32 will write to a virtual register from the M0Reg class and all the ds instructions now take an extra M0Reg explicit argument. This change is necessary to prevent issues with the scheduler mixing together instructions that expect different values in the m0 registers. llvm-svn: 222583	2014-11-21 22:31:44 +00:00
Tom Stellard	484f10138e	R600/SI: Add SIFoldOperands pass This pass attempts to fold the source operands of mov and copy instructions into their uses. llvm-svn: 222581	2014-11-21 22:06:37 +00:00
Jozef Kolek	52fa965cf8	[mips][microMIPS] This patch implements functionality in MIPS delay slot filler such as if delay slot filler have to put NOP instruction into the delay slot of microMIPS BEQ or BNE instruction which uses the register $0, then instead of emitting NOP this instruction is replaced by the corresponding microMIPS compact branch instruction, i.e. BEQZC or BNEZC. Differential Revision: http://reviews.llvm.org/D3566 llvm-svn: 222580	2014-11-21 22:04:35 +00:00
Tom Stellard	267a21d6d7	R600/SI: Use hex notation for constant in test llvm-svn: 222578	2014-11-21 22:00:13 +00:00
Colin LeMahieu	4986bc53c5	[Hexagon] Adding sxth instruction. llvm-svn: 222577	2014-11-21 21:54:59 +00:00
Colin LeMahieu	9a7b747bf6	[Hexagon] Adding sxtb instruction. Renaming some identically named classes that will be removed after converting referencing defs. llvm-svn: 222575	2014-11-21 21:35:52 +00:00
Manman Ren	8be1069f3f	Debug Info: revert r222195, r222210 and r222239. This is no longer needed after David's fix at r222377 + r222485. rdar://18958417 llvm-svn: 222563	2014-11-21 19:55:23 +00:00
Sanjay Patel	776e5485fb	Add a feature flag for slow 32-byte unaligned memory accesses [x86]. This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen for Sandy Bridge and Ivy Bridge. There is no functionality change intended for those chips. Previously, the absence of AVX2 was being used as a proxy to detect this feature. But that hindered codegen for AVX-enabled AMD chips such as btver2 that do not have the 32-byte unaligned access slowdown. Performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6355 llvm-svn: 222544	2014-11-21 17:40:04 +00:00
Chandler Carruth	5e598c0342	[x86] Restructure the checking patterns for v16 and v32 avx2 vector shuffle lowering to allow much better blend matching. Specifically, with the new structure the code seems clearer to me and we correctly can hit the cases where merging two 128-bit lanes is a clear win and can be shuffled cheaply afterward. llvm-svn: 222539	2014-11-21 14:53:03 +00:00
Chandler Carruth	7491f1f32f	[x86] Make the previous logic significantly less conservative and get a bunch more improvements. Non-lane-crossing is fine, the key is that lane merging only makes sense for single-input shuffles. Not sure why I got so turned around here. The code all works, I was just using the wrong model for it. This only updates v4 and v8 lowering. The v16 and v32 lowering requires restructuring the entire check sequence. llvm-svn: 222537	2014-11-21 14:33:24 +00:00
Andrea Di Biagio	0a8cf1ad5a	[DAG] Teach how to turn a build_vector into a shuffle if some of the operands are zero. Before this patch, the DAGCombiner only tried to convert build_vector dag nodes into shuffles if all operands were either extract_vector_elt or undef. This patch improves that logic and teaches the DAGCombiner how to deal with build_vector dag nodes where one or more operands are zero. A build_vector dag node with some zero operands is turned into a shuffle only if the resulting shuffle mask is legal for the target. llvm-svn: 222536	2014-11-21 14:32:06 +00:00
Chandler Carruth	8387bec088	[x86] Teach the x86 vector shuffle lowering to detect mergable 128-bit lanes. By special casing these we can often either reduce the total number of shuffles significantly or reduce the number of (high latency on Haswell) AVX2 shuffles that potentially cross 128-bit lanes. Even when these don't actually cross lanes, they have much higher latency to support that. Doing two of them and a blend is worse than doing a single insert across the 128-bit lanes to blend and then doing a single interleaved shuffle. While this seems like a narrow case, it kept cropping up on me and the difference is huge as you can see in many of the test cases. I first hit this trying to perfectly fix the interleaving shuffle patterns used by Halide for AVX2. llvm-svn: 222533	2014-11-21 13:56:05 +00:00
Chandler Carruth	5646862a2e	[x86] Remove more windows line endings that slipped into this file... llvm-svn: 222528	2014-11-21 12:33:46 +00:00
Chandler Carruth	2db7c4cf32	[x86] Add a bunch of test cases to 256-bit shuffles that exercise merging 128-bit subvectors and also shuffling all the elements of those subvectors. Currently we generate pretty bad code for many of these, but I'm testing a patch that should dramatically improve this in addition to making the shuffle lowering robust to other changes. llvm-svn: 222525	2014-11-21 12:17:50 +00:00
Alexey Volkov	235268b4ed	[X86] For Silvermont CPU use 16-bit division instead of 64-bit for small positive numbers Differential Revision: http://reviews.llvm.org/D5938 llvm-svn: 222521	2014-11-21 11:19:34 +00:00
Yury Gribov	cb671c0b2c	[asan] Add new hidden compile-time flag asan-instrument-allocas to sanitize variable-sized dynamic allocas. Patch by Max Ostapenko. Reviewed at http://reviews.llvm.org/D6055 llvm-svn: 222519	2014-11-21 10:29:50 +00:00
Hao Liu	9cb82be410	DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor info FMULs by the reciprocal. E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip) A hook is added to allow the target to control whether it needs to do such combine. Reviewed in http://reviews.llvm.org/D6334 llvm-svn: 222510	2014-11-21 06:39:58 +00:00
Hal Finkel	ac26448a5c	[PPC] Use SeparateConstOffsetFromGEP This mirrors r222331, which enabled SeparateConstOffsetFromGEP on AArch64, in the PowerPC backend. Yields, on a POWER7 machine, a 30% speedup on SingleSource/Benchmarks/Shootout/nestedloop (this might just be from LICM, there is a store moved out of the inner loop) and a potential speedup on MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode. Regardless, it makes some code look cleaner, and synchronizing the backends in this regard seems like a generally good thing. llvm-svn: 222504	2014-11-21 04:35:51 +00:00
David Majnemer	8a561be3da	SROA: The alloca type isn't a candidate promotion type for vectors The alloca's type is irrelevant, only those types which are used in a load or store of the exact size of the slice should be considered. This manifested as an assertion failure when we compared the various types: we had a size mismatch. This fixes PR21480. llvm-svn: 222499	2014-11-21 02:34:55 +00:00
Quentin Colombet	8fea50c066	[X86] Do not custom lower UINT_TO_FP when the target type does not match the custom lowering. <rdar://problem/19026326> llvm-svn: 222489	2014-11-21 00:47:19 +00:00
Michael Zolotukhin	7e8ae7cad7	Fix a trip-count overflow issue in LoopUnroll. Currently LoopUnroll generates a prologue loop before the main loop body to execute first N%UnrollFactor iterations. Also, this loop is used if trip-count can overflow - it's determined by a runtime check. However, we've been mistakenly optimizing this loop to a linear code for UnrollFactor = 2, not taking into account that it also serves as a safe version of the loop if its trip-count overflows. llvm-svn: 222451	2014-11-20 20:19:55 +00:00
Saleem Abdulrasool	0de13e90eb	X86: use the correct alloca symbol for Windows Itanium Windows itanium targets the MSVCRT, and the stack probe symbol is provided by MSVCRT. This corrects the emission of stack probes on i686-windows-itanium. llvm-svn: 222439	2014-11-20 18:01:26 +00:00
Renato Golin	daa4ba29b8	MCJIT tests passing on ARM after r222414 fixed the relocation llvm-svn: 222430	2014-11-20 13:32:16 +00:00
Jyoti Allur	0aaf89456e	[ELF] Prevent ARM ELF object writer from generating deprecated relocation code R_ARM_PLT32 llvm-svn: 222414	2014-11-20 05:58:11 +00:00
David Majnemer	44cf22795f	Add a test for r221870 bad-relocs.obj.coff-i386 has a relocation whose symbol index is outside the symbol table. llvm-svn: 222413	2014-11-20 05:32:10 +00:00
Colin LeMahieu	384b462d47	[Hexagon] Adding A2_xor instruction with IR selection pattern and test. llvm-svn: 222399	2014-11-19 23:22:23 +00:00
Chad Rosier	eafdf66096	Revert "[Reassociate] As the expression tree is rewritten make sure the operands are" This reverts commit r222142. This is causing/exposing an execution-time regression in spec2006/gcc and coremark on AArch64/A57/Ofast. Conflicts: test/Transforms/Reassociate/optional-flags.ll llvm-svn: 222398	2014-11-19 23:21:20 +00:00
Colin LeMahieu	35e8a8aa73	[Hexagon] Adding A2_or instruction with IR selection pattern and test. llvm-svn: 222396	2014-11-19 22:58:04 +00:00
Andrea Di Biagio	b770dd344e	[X86] Improved lowering of v4x32 build_vector dag nodes. This patch improves the lowering of v4f32 and v4i32 build_vector dag nodes that are known to have at least two non-zero elements. With this patch, a build_vector that performs a blend with zero is converted into a shuffle. This is done to let the shuffle legalizer expand the dag node in a optimal way. For example, if we know that a build_vector performs a blend with zero, we can try to lower it as a movq/blend instead of always selecting an insertps. This patch also improves the logic that lowers a build_vector into a insertps with zero masking. See for example the extra test cases added to test sse41.ll. Differential Revision: http://reviews.llvm.org/D6311 llvm-svn: 222375	2014-11-19 19:34:29 +00:00
Tom Stellard	271c0a936e	R600/SI: Make SIInstrInfo::isOperandLegal() more strict A register operand that has a common sub-class with its instruction's defined register class is not always legal. For example, SReg_32 and M0Reg both have a common sub-class, but we can't use an SReg_32 in instructions that expect a M0Reg. This prevents the llvm.SI.sendmsg.ll test from failing when the fold operand pass is added. llvm-svn: 222368	2014-11-19 16:58:49 +00:00
Zoran Jovanovic	ebf19d975c	[mips][micromips] Implement SWM32 and LWM32 instructions Differential Revision: http://reviews.llvm.org/D5519 llvm-svn: 222367	2014-11-19 16:44:02 +00:00
Suyog Sarda	e6a1f30c00	Vectorize a reduction chain feeding into a 'return' statement. e.x return (a[0]+b[0]) + (a[1]+b[1]) Differential Revision: http://reviews.llvm.org/D6227 llvm-svn: 222364	2014-11-19 16:07:38 +00:00
Jozef Kolek	2b6a42be6d	[mips][microMIPS] Fix opcodes of MFHC1 and MTHC1 instructions. Differential Revision: http://reviews.llvm.org/D6169 llvm-svn: 222355	2014-11-19 13:37:51 +00:00
Arnaud A. de Grandmaison	fdfed29d10	Fix tail recursion elimination When the BasicBlock containing the return instrution has a PHI with 2 incoming values, FoldReturnIntoUncondBranch will remove the no longer used incoming value and remove the no longer needed phi as well. This leaves us with a BB that no longer has a PHI, but the subsequent call to FoldReturnIntoUncondBranch from FoldReturnAndProcessPred will not remove the return instruction (which still uses the result of the call instruction). This prevents EliminateRecursiveTailCall to remove the value, as it is still being used in a basicblock which has no predecessors. The basicblock can not be erased on the spot, because its iterator is still being used in runTRE. This issue was exposed when removing the threshold on size for lifetime marker insertion for named temporaries in clang. The testcase is a much reduced version of peelOffOuterExpr(const Expr, const ExplodedNode ) from clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp. llvm-svn: 222354	2014-11-19 13:32:51 +00:00
Jozef Kolek	d19675f448	[mips][microMIPS] Implement CodeGen support for 16-bit instruction ADDIUR2. Differential Revision: http://reviews.llvm.org/D5800 llvm-svn: 222352	2014-11-19 13:23:58 +00:00
Jozef Kolek	9fbf00198c	[mips][microMIPS] Implement CodeGen support for ADDIUS5 instruction. Differential Revision: http://reviews.llvm.org/D5799 llvm-svn: 222351	2014-11-19 13:11:09 +00:00
Jozef Kolek	407eb9ebdd	[mips][microMIPS] Add disassembler tests for new microMIPS 32-bit instructions: LWXS, BGEZALS, BLTZALS, BEQZC, BNEZC, JALS and JALRS. http://reviews.llvm.org/D5413 llvm-svn: 222349	2014-11-19 11:49:57 +00:00
Jozef Kolek	0de52b5b97	[mips][microMIPS] Implement LWXS instruction. Differential Revision: http://reviews.llvm.org/D5407 llvm-svn: 222348	2014-11-19 11:39:12 +00:00
Jozef Kolek	e466cd5b54	[mips][microMIPS] Implement SDBBP and RDHWR instructions. Differential Revision: http://reviews.llvm.org/D5240 llvm-svn: 222347	2014-11-19 11:25:50 +00:00
Simon Pilgrim	e5f972f1c1	[X86][SSE] pslldq/psrldq byte shifts/rotation for SSE2 This patch builds on http://reviews.llvm.org/D5598 to perform byte rotation shuffles (lowerVectorShuffleAsByteRotate) on pre-SSSE3 (palignr) targets - pre-SSSE3 is only enabled on i8 and i16 vector targets where it is a more definite performance gain. I've also added a separate byte shift shuffle (lowerVectorShuffleAsByteShift) that makes use of the ability of the SLLDQ/SRLDQ instructions to implicitly shift in zero bytes to avoid the need to create a zero register if we had used palignr. Differential Revision: http://reviews.llvm.org/D5699 llvm-svn: 222340	2014-11-19 10:06:49 +00:00
David Majnemer	074041b4ec	AliasSetTracker: UnknownInsts should contribute to the refcount AliasSetTracker::addUnknown may create an AliasSet devoid of pointers just to contain an instruction if no suitable AliasSet already exists. It will then AliasSet::addUnknownInst and we will be done. However, it's possible for addUnknown to choose an existing AliasSet to addUnknownInst. If this were to occur, we are in a bit of a pickle: removing pointers from the AliasSet can cause the entire AliasSet to become destroyed, taking our unknown instructions out with them. Instead, keep track whether or not our AliasSet has any unknown instructions. This fixes PR21582. llvm-svn: 222338	2014-11-19 09:41:05 +00:00
Hao Liu	00d285aca3	[AArch64] Enable SeparateConstOffsetFromGEP, EarlyCSE and LICM passes on AArch64 backend. SeparateConstOffsetFromGEP can gives more optimizaiton opportunities related to GEPs, which benefits EarlyCSE and LICM. By enabling these passes we can have better address calculations and generate a better addressing mode. Some SPEC 2006 benchmarks (astar, gobmk, namd) have obvious improvements on Cortex-A57. Reviewed in http://reviews.llvm.org/D5864. llvm-svn: 222331	2014-11-19 06:39:53 +00:00
Rui Ueyama	520b1e8263	llvm-readobj: fix off-by-one error in COFFDumper It printed out base relocation table header as table entry. This patch also makes llvm-readobj to not skip ABSOLUTE entries becuase it was confusing. llvm-svn: 222299	2014-11-19 02:07:10 +00:00
Weiming Zhao	c7ce2ee93f	[Aarch64] Customer lowering of CTPOP to SIMD should check for NEON availability llvm-svn: 222292	2014-11-19 00:29:14 +00:00
Kostya Serebryany	52d047bc0f	[asan] add experimental basic-block tracing to asan-coverage; also fix -fsanitize-coverage=3 which was broken by r221718 llvm-svn: 222290	2014-11-19 00:22:58 +00:00
Rui Ueyama	2b5655092a	llvm-readobj: teach it how to dump COFF base relocation table llvm-svn: 222289	2014-11-19 00:18:07 +00:00
Manman Ren	df5625ea3a	Revert r222039 because of bot failure. http://lab.llvm.org:8080/green/job/clang-Rlto_master/298/ Hopefully, bot will be green. If not, we will re-submit the commit. llvm-svn: 222287	2014-11-19 00:13:26 +00:00
Matt Arsenault	73f4bd8758	R600/SI: Implement areMemAccessesTriviallyDisjoint This partially makes up for not having address spaces used for alias analysis in some simple cases. This is not yet enabled by default so shouldn't change anything yet. llvm-svn: 222286	2014-11-19 00:01:31 +00:00
Simon Pilgrim	daabed160f	[X86][AVX] 256-bit vector stack unaligned load/stores identification Under many circumstances the stack is not 32-byte aligned, resulting in the use of the vmovups/vmovupd/vmovdqu instructions when inserting ymm reloads/spills. This minor patch adds these instructions to the isFrameLoadOpcode/isFrameStoreOpcode helpers so that they can be correctly identified and not be treated as folded reloads/spills. This has also been noticed by http://llvm.org/bugs/show_bug.cgi?id=18846 where it was causing redundant spills - I've added a reduced test case at test/CodeGen/X86/pr18846.ll Differential Revision: http://reviews.llvm.org/D6252 llvm-svn: 222281	2014-11-18 23:38:19 +00:00
Colin LeMahieu	f815ca1b0b	[Hexagon] Adding A2_and instruction. llvm-svn: 222274	2014-11-18 22:45:47 +00:00
Chad Rosier	ccf41a5c21	[FastISel][AArch64] Also allow folding of sign-/zero-extend and arithmetic shift-right for booleans (i1). Arithmetic shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. llvm-svn: 222272	2014-11-18 22:41:49 +00:00
Chad Rosier	7153154a79	[FastISel][AArch64] Also allow folding of sign-/zero-extend and logical shift-right for booleans (i1). Logical shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. llvm-svn: 222270	2014-11-18 22:38:42 +00:00
David Majnemer	e6cc1061cc	InstCombine: Fix another infinite loop caused by visitFPTrunc We would attempt to replace an frem's operand with the same operand. This would cause InstCombine to think real work was done, causing InstCombine to enter an infinite loop. This fixes the second part of PR21576. llvm-svn: 222265	2014-11-18 22:06:45 +00:00
Colin LeMahieu	1d74e8f01b	[Hexagon] Adding A2_sub instruction Renaming test files. llvm-svn: 222263	2014-11-18 21:51:51 +00:00
David Majnemer	0c67e78132	Revert "Revert r222040 because of bot failure." This reverts commit r222203, reverting r222040 didn't end up turning the bot green. llvm-svn: 222261	2014-11-18 21:30:02 +00:00
Juergen Ributzka	b3791ee3a7	[FastISel][AArch64] Follow-up fix for "Fix shift-immediate emission for "zero" shifts." Shifts also perform sign-/zero-extends to larger types, which requires us to emit an integer extend instead of a simple COPY. Related to PR21594. llvm-svn: 222257	2014-11-18 21:20:17 +00:00
Matt Arsenault	84d2214a94	R600/SI: Move SIFixSGPRCopies to inst selector passes This should expose more of the actually used VALU instructions to the machine optimization passes. This also should help getting i1 handling into a better state. For not entirly understood reasons, this fixes the split-scalar-i64-add.ll test where a 64-bit add would only partially be moved to the VALU resulting in use of undefined VCC. llvm-svn: 222256	2014-11-18 21:06:58 +00:00
Tom Stellard	962ccd7f85	R600/SI: Make sure resource descriptors are always stored in SGPRs llvm-svn: 222253	2014-11-18 20:39:39 +00:00
Chad Rosier	714cdb214a	[Reassociate] Use test cases that can actually be optimized to verify optional flags are cleared. The reassociation pass was just reordering the leaf nodes in the previous test cases. llvm-svn: 222250	2014-11-18 20:34:01 +00:00
Colin LeMahieu	f7ca7f6c70	[Hexagon] Converting from ADD_rr to A2_add which has encoding bits. Adding test to show correct instruction selection and encoding. llvm-svn: 222249	2014-11-18 20:28:11 +00:00
Juergen Ributzka	a2005be2b4	[FastISel][AArch64] Fix shift-immediate emission for "zero" shifts. This change emits a COPY for a shift-immediate with a "zero" shift value. This fixes PR21594 where we emitted a shift instruction with an incorrect immediate operand. llvm-svn: 222247	2014-11-18 19:58:59 +00:00
Philip Reames	33e827b222	Tweak EarlyCSE to recognize series of dead stores EarlyCSE is giving up on the current instruction immediately when it recognizes that the current instruction makes a previous store trivially dead. There's no reason to do this. Once the previous store has been deleted, it's perfectly legal to remember the value of the current store (for value forwarding) and the fact the store occurred (it could be dead too!). Reviewed by: Hal Differential Revision: http://reviews.llvm.org/D6301 llvm-svn: 222241	2014-11-18 17:46:32 +00:00
Manman Ren	58632b809c	Remove triple in testing case to recover an arm bot. llvm-svn: 222239	2014-11-18 16:45:34 +00:00
David Majnemer	a43009a5dc	InstCombine: Fold away tautological masked compares It is impossible for (x & INT_MAX) == 0 && x == INT_MAX to ever be true. While this sort of reasoning should normally live in InstSimplify, the machinery that derives this result is not trivial to split out. llvm-svn: 222230	2014-11-18 09:31:41 +00:00
Frederic Riss	f1bfe6e383	Allow DwarfCompileUnit::constructImportedEntityDIE to instanciate a GlobalVariable DIE. Usually global variables are in a retain list and instanciated before any call to constructImportedEntityDIE is made. This isn't true for forward declarations though. The testcase for this change is generated by a clang patched to emit such forward declarations (patch at http://reviews.llvm.org/D6173 which will land soon). The updated testcase tests more than just global variables, it now tests every type of 'using' clause we support. llvm-svn: 222217	2014-11-18 02:46:11 +00:00
David Majnemer	40a65fda59	llvm-readobj: Don't print the Characteristics field as the Subsystem We claimed that we were printing the Subystem field when we were actually printing the Characteristics field. llvm-svn: 222216	2014-11-18 02:45:28 +00:00
David Majnemer	35be2aba4c	IndVarSimplify: Allow LFTR to fire more often I added a pessimization in r217102 to prevent miscompiles when the incremented induction variable was used in a comparison; it would be poison. Try to use the incremented induction variable more often when we can be sure that the increment won't end in poison. Differential Revision: http://reviews.llvm.org/D6222 llvm-svn: 222213	2014-11-18 02:20:58 +00:00
Manman Ren	d45a3b7135	Update testing case that was accidently duplicated. llvm-svn: 222210	2014-11-18 01:49:06 +00:00
Manman Ren	3d4f707d60	Revert r222040 because of bot failure. http://lab.llvm.org:8080/green/job/clang-Rlto_master/298/ Hopefully, bot will be green. llvm-svn: 222203	2014-11-18 00:33:22 +00:00
Manman Ren	9b65b2864d	Debug Info: In DIBuilder, the context field of a global variable is updated to use DIScopeRef. A paired commit at clang will follow to show cases where we will use an identifer for the context of a global variable. rdar://18958417 llvm-svn: 222195	2014-11-18 00:29:08 +00:00
Duncan P. N. Exon Smith	41818ec794	IR: Simplify uniquing for MDNode Change uniquing from a `FoldingSet` to a `DenseSet` with custom `DenseMapInfo`. Unfortunately, this doesn't save any memory, since `DenseSet<T>` is a simple wrapper for `DenseMap<T, char>`, but I'll come back to fix that later. I used the name `GenericDenseMapInfo` to the custom `DenseMapInfo` since I'll be splitting `MDNode` into two classes soon: `MDNodeFwdDecl` for temporaries, and `GenericMDNode` for everything else. I also added a non-debug-info reduced version of a type-uniquing test that started failing on an earlier draft of this patch. Part of PR21532. llvm-svn: 222191	2014-11-17 23:28:21 +00:00
Juergen Ributzka	05cff0a244	[SimplifyCFG] Make the value type of the hole check bitmask a power-of-2. When converting a switch to a lookup table we might have to generate a bitmaks to encode and check for holes in the original switch statement. The type of this mask depends on the number of switch statements, which can result in illegal types for pretty much all architectures. To avoid unnecessary type legalization and help FastISel this commit increases the size of the bitmask to next power-of-2 value when necessary. This fixes rdar://problem/18984639. llvm-svn: 222168	2014-11-17 19:39:56 +00:00
Chad Rosier	2db8cbf601	[Reassociate] As the expression tree is rewritten make sure the operands are emitted in canonical form. llvm-svn: 222142	2014-11-17 16:33:50 +00:00
Alexey Volkov	3cc8e8e28b	[X86] Use ADD/SUB instead of INC/DEC for Haswell and Broadwell CPUs Differential Revision: http://reviews.llvm.org/D5934 llvm-svn: 222141	2014-11-17 16:17:51 +00:00
Chad Rosier	fc00bbc305	[Reassociate] Canonicalize constants to RHS operand. Fix a thinko where the RHS was already a constant. llvm-svn: 222139	2014-11-17 15:52:51 +00:00
Renato Golin	b92ad16856	Fix ARM triple parsing The triple parser should only accept existing architecture names when the triple starts with armv, armebv, thumbv or thumbebv. Patch by Gabor Ballabas. llvm-svn: 222129	2014-11-17 14:08:57 +00:00
Oliver Stannard	93a823bc7a	[Thumb1] Re-write emitThumbRegPlusImmediate This was motivated by a bug which caused code like this to be miscompiled: declare void @take_ptr(i8) define void @test() { %addr1.32 = alloca i8 %addr2.32 = alloca i32, i32 1028 call void @take_ptr(i8 %addr1) ret void } This was emitting the following assembly to get the value of %addr1: add r0, sp, #1020 add r0, r0, #8 However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this could not be assembled. The generated object file contained this, resulting in r0 holding SP+8 rather tha SP+1028: add r0, sp, #1020 add r0, sp, #8 This function looked like it could have caused miscompilations for other combinations of registers and offsets (though I don't think it is currently called with these), and the heuristic it used did not match the emitted code in all cases. llvm-svn: 222125	2014-11-17 11:18:10 +00:00
David Majnemer	903d9c0ab0	Object, COFF: Tighten the object file parser We were a little lax in a few areas: - We pretended that import libraries were like any old COFF file, they are not. In fact, they aren't really COFF files at all, we should probably grow some specialized functionality to handle them smarter. - Our symbol iterators were more than happy to attempt to go past the end of the symbol table if you had a symbol with a bad list of auxiliary symbols. llvm-svn: 222124	2014-11-17 11:17:17 +00:00
Oliver Stannard	2efad103c3	Fix optimisations of SELECT_CC which assumed result is boolean Some optimisations in DAGCombiner cause miscompilations for targets that use TargetLowering::UndefinedBooleanContent, because they assume that the results of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and XORed. These optimisations are only valid for targets that use ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent. This is a follow-up to D6210/r221693. llvm-svn: 222123	2014-11-17 10:49:31 +00:00
Erik Eckstein	49292b906e	Optimize switch lookup tables with linear mapping. This is a simple optimization for switch table lookup: It computes the output value directly with an (optional) mul and add if there is a linear mapping between index and output. Example: int f1(int x) { switch (x) { case 0: return 10; case 1: return 11; case 2: return 12; case 3: return 13; } return 0; } generates: define i32 @f1(i32 %x) #0 { entry: %0 = icmp ult i32 %x, 4 br i1 %0, label %switch.lookup, label %return switch.lookup: %switch.offset = add i32 %x, 10 ret i32 %switch.offset return: ret i32 0 } llvm-svn: 222121	2014-11-17 09:13:57 +00:00
Bob Wilson	0774bc6c56	Fix CR/LF line endings in test case. llvm-svn: 222120	2014-11-17 08:00:45 +00:00
Rafael Espindola	7382f6d0d4	Add back r222061 with a fix. This adds back r222061, but now calls initializePAEvalPass from the correct library to avoid link problems. Original message: Don't make assumptions about the name of private global variables. Private variables are can be renamed, so it is not reliable to make decisions on the name. The name is also dropped by the assembler before getting to the linker, so using the name causes a disconnect between how llvm makes a decision (var name) and how the linker makes a decision (section it is in). This patch changes one case where we were looking at the variable name to use the section instead. Test tuning by Michael Gottesman. llvm-svn: 222117	2014-11-17 02:28:27 +00:00
Frederic Riss	21adb20a7e	Implement MachODumper::printFileHeaders Patch by Chilledheart. Differential Revision: http://reviews.llvm.org/D6163 llvm-svn: 222115	2014-11-17 01:34:15 +00:00
Jingyue Wu	230717d66d	[DependenceAnalysis] Allow subscripts of different types Summary: Several places in DependenceAnalysis assumes both SCEVs in a subscript pair share the same integer type. For instance, isKnownPredicate calls SE->getMinusSCEV(X, Y) which asserts X and Y share the same type. However, DependenceAnalysis fails to ensure this assumption when producing a subscript pair, causing tests such as NonCanonicalizedSubscript to crash. With this patch, DependenceAnalysis runs unifySubscriptType before producing any subscript pair, ensuring the assumption. Test Plan: Added NonCanonicalizedSubscript.ll on which DependenceAnalysis before the fix crashed because subscripts have different types. Reviewers: spop, sebpop, jingyue Reviewed By: jingyue Subscribers: eliben, meheff, llvm-commits Differential Revision: http://reviews.llvm.org/D6289 llvm-svn: 222100	2014-11-16 16:52:44 +00:00
David Majnemer	7e29d637c6	ScalarEvolution: HowFarToZero was wrongly using signed division HowFarToZero was supposed to use unsigned division in order to calculate the backedge taken count. However, SCEVDivision::divide performs signed division. Unless I am mistaken, no users of SCEVDivision actually want signed arithmetic: switch to udiv and urem. This fixes PR21578. llvm-svn: 222093	2014-11-16 07:30:35 +00:00
Andrea Di Biagio	5475bc1d1b	[DAG] Improved target independent vector shuffle folding logic. This patch teaches the DAGCombiner how to combine shuffles according to rules: shuffle(shuffle(A, Undef, M0), B, M1) -> shuffle(B, A, M2) shuffle(shuffle(A, B, M0), B, M1) -> shuffle(B, A, M2) shuffle(shuffle(A, B, M0), A, M1) -> shuffle(B, A, M2) llvm-svn: 222090	2014-11-15 22:56:25 +00:00
Simon Pilgrim	5e170d7652	[X86][SSE] Improve legal SHUFP and PSHUFD shuffle matching Updated X86TargetLowering::isShuffleMaskLegal to match SHUFP masks with commuted inputs and PSHUFD masks that reference the second input. As part of this I've refactored isPSHUFDMask to work in a more general manner and allow it to match against either the first or second input vector. Differential Revision: http://reviews.llvm.org/D6287 llvm-svn: 222087	2014-11-15 21:13:05 +00:00
Matt Arsenault	1298bf6e9c	R600: Permute operands when selecting legacy min/max This gets the correct NaN behavior based on the compare type the hardware uses. This now passes the new piglit test I have for this on SI. Add stricter tests for the operand order. llvm-svn: 222079	2014-11-15 05:02:57 +00:00
Reid Kleckner	30a587b9ae	Revert "Don't make assumptions about the name of private global variables." This reverts commit r222061. It's causing linker errors. llvm-svn: 222077	2014-11-15 02:03:53 +00:00
Rafael Espindola	c01b31682e	Don't make assumptions about the name of private global variables. Private variables are can be renamed, so it is not reliable to make decisions on the name. The name is also dropped by the assembler before getting to the linker, so using the name causes a disconnect between how llvm makes a decision (var name) and how the linker makes a decision (section it is in). This patch changes one case where we were looking at the variable name to use the section instead. Test tuning by Michael Gottesman. llvm-svn: 222061	2014-11-14 23:17:47 +00:00
Tim Northover	f511e3a633	ARM: refactor .cfi_def_cfa_offset emission. We use to track quite a few "adjusted" offsets through the FrameLowering code to account for changes in the prologue instructions as we went and allow the emission of correct CFA annotations. However, we were missing a couple of cases and the code was almost impenetrable. It's easier to just add any stack-adjusting instruction to a list and emit them together. llvm-svn: 222057	2014-11-14 22:45:33 +00:00
Tim Northover	5eff3fb39c	ARM: correctly calculate the offset of FP in its push. When we folded the DPR alignment gap into a push, we weren't noting the extra distance from the beginning of the push to the FP, and so FP ended up pointing at an incorrect offset. The .cfi_def_cfa_offset directives are still wrong in this case, but I think that can be improved by refactoring. llvm-svn: 222056	2014-11-14 22:45:31 +00:00
Tim Northover	3327a92a74	ARM: simplify test. The test's DWARF stubs were there just to trigger the emission of .cfi directives. Fortunately, the NetBSD ABI already demands proper DWARF unwind info, so it's easier to just use that triple. llvm-svn: 222055	2014-11-14 22:45:23 +00:00
Kevin Enderby	a64b36ecab	Add the code and test cases for 64-bit ARM to llvm-objdump’s Mach-O symbolizer. FYI, removed the unused MCInstrAnalysis as it does not exist for 64-bit ARM and was causing a “couldn't initialize disassembler for target” error. llvm-svn: 222045	2014-11-14 21:52:18 +00:00
Frederic Riss	1cf11e9eff	Add a test for r222029 that doesn't rely on the default target being a COFF platform. llvm-svn: 222041	2014-11-14 21:23:26 +00:00
David Majnemer	4c69b4d32f	InstCombine: Fix infinite loop caused by visitFPTrunc We would attempt to replace a fptrunc of an frem with an identical fptrunc. This would cause the new fptrunc to be added to the worklist. Of course, this results in an infinite loop because we will keep visiting the newly created fptruncs. This fixes PR21576. llvm-svn: 222040	2014-11-14 21:21:15 +00:00
Chad Rosier	b2f1df48f3	Reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE" This commit updates the failing test in Analysis/TypeBasedAliasAnalysis/gvn-nonlocal-type-mismatch.ll The failing test is sensitive to the order in which we process loads. This version turns on the RPO traversal instead of the while DT traversal in GVN. The new test code is functionally same just the order of loads that are eliminated is swapped. This new version also fixes an issue where GVN splits a critical edge and potentially invalidate the RPO/DT iterator. llvm-svn: 222039	2014-11-14 21:09:13 +00:00
Tom Stellard	c18e457d39	R600/SI: Fix spilling of m0 register If we have spilled the value of the m0 register, then we need to restore it with v_readlane_b32 to a regular sgpr, because v_readlane_b32 can't write to m0. v_readlane_b32 can't write to m0, so llvm-svn: 222036	2014-11-14 20:43:26 +00:00
Matt Arsenault	633ce0ecd7	R600/SI: Combine min3/max3 instructions llvm-svn: 222032	2014-11-14 20:08:52 +00:00
Frederic Riss	582964b38a	[dwarfdump] Handle relocations in Dwarf accelerator tables ELF targets (and maybe COFF) use relocations when referring to strings in the .debug_str section. Handle that in the accelerator table dumper. This commit restores the test/DebugInfo/cross-cu-inlining.ll test to its expected platform independant form, validating that the fix works (this test failed on linux boxes). llvm-svn: 222029	2014-11-14 19:30:08 +00:00
Matt Arsenault	d4196f855a	R600/SI: Fix verifier error from a branch on IMPLICIT_DEF SIILowerI1Copies wasn't correctly handling this case. llvm-svn: 222020	2014-11-14 18:43:41 +00:00
Matt Arsenault	346bdd92b1	R600/SI: Match integer min / max instructions llvm-svn: 222015	2014-11-14 18:30:06 +00:00
Matt Arsenault	8ddb68217c	R600/SI: Use S_BFE_I64 for 64-bit sext_inreg llvm-svn: 222012	2014-11-14 18:18:16 +00:00
Chad Rosier	91839c8c44	[Reassociate] Canonicalize the operands of all binary operators. llvm-svn: 222008	2014-11-14 17:09:19 +00:00
Frederic Riss	bdc4bb6ad1	Tentatively appease the bots. If this workaround gets the bots green, then we have to find out why the -dwarf-accel-tables=Enable option doesn't work as expected on non-darwin platforms. llvm-svn: 222007	2014-11-14 17:08:18 +00:00
Chad Rosier	534da46914	[Reassociate] Canonicalize operands of vector binary operators. Prior to this commit fmul and fadd binary operators were being canonicalized for both scalar and vector versions. We now canonicalize add, mul, and, or, and xor vector instructions. llvm-svn: 222006	2014-11-14 17:08:15 +00:00
Chad Rosier	40ae58f215	[Reassociate] Canonicalize constants to RHS operand. llvm-svn: 222005	2014-11-14 17:05:59 +00:00
Frederic Riss	3ebefbe21f	Reapply "[dwarfdump] Add support for dumping accelerator tables." This reverts commit r221842 which was a revert of r221836 and of the test parts of r221837. This new version fixes an UB bug pointed out by David (along with addressing some other review comments), makes some dumping more resilient to broken input data and forces the accelerator tables to be dumped in the tests where we use them (this decision is platform specific otherwise). llvm-svn: 222003	2014-11-14 16:15:53 +00:00
Cameron McInally	8882e35937	[AVX512] Add 512b masked integer shift by immediate patterns. llvm-svn: 222002	2014-11-14 15:43:00 +00:00
Tom Stellard	a4aabd3acb	R600/SI: Start implementing an assembler This was done using the Sparc and PowerPC AsmParsers as guides. So far it is very simple and only supports sopp instructions. llvm-svn: 221994	2014-11-14 14:08:00 +00:00
Bill Schmidt	80cd5e5dbb	[PowerPC] Add VSX builtins for vec_div This patch adds builtin support for xvdivdp and xvdivsp, along with a test case. Straightforward stuff. There's a companion patch for Clang. llvm-svn: 221983	2014-11-14 12:10:40 +00:00
Tim Northover	3a30fb90f5	X86: use getConstant rather than getTargetConstant behind BUILD_VECTOR. getTargetConstant should only be used when you can guarantee the instruction selected will be able to cope with the raw value. BUILD_VECTOR is rather too generic for this so we should use getConstant instead. In that case, an instruction can still consume the constant, but if it doesn't it'll be materialised through its own round of ISel. Should fix PR21352. llvm-svn: 221961	2014-11-14 01:30:14 +00:00
Reid Kleckner	c90bcabd2d	Allow the use of functions as typeinfo in landingpad clauses This is one step towards supporting SEH filter functions in LLVM. llvm-svn: 221954	2014-11-14 00:35:50 +00:00
Reed Kotler	664c8f453d	First stage of call lowering for Mips fast-isel Summary: This has most of what is needed for mips fast-isel call lowering for O32. What is missing I will add on the next patch because this patch is already too large. It should not be doing anything wrong but it will punt on some cases that it is basically capable of doing. The mechanism is there for parameters to be passed on the stack but I have not enabled it because it serves as a way for now to prevent some of the strange cases of O32 register passing that I have not fully checked yet and have some issues. The Mips O32 abi rules are very complicated as far how data is passed in floating and integer registers. However there is a way to think about this all very simply and this implementation reflects that. Basically, the ABI rules are written as if everything is passed on the stack and aligned as such. Once that is conceptually done, it is nearly trivial to reassign those locations to registers and then all the complexity disappears. So I have told tablegen that all the data is passed on the stack and during the lowering I fix this by assigning to registers as per the ABI doc. This has been my approach and you can line up what I did with the ABI document and see 1 to 1 what is going on. Test Plan: callabi.ll Reviewers: dsanders Reviewed By: dsanders Subscribers: jholewinski, echristo, ahatanak, llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D5714 llvm-svn: 221948	2014-11-13 23:37:45 +00:00
Reid Kleckner	b9aef21886	Fix symbol resolution of floating point libc builtins in MCJIT Fix for LLI failure on Windows\X86: http://llvm.org/PR5053 LLI.exe crashes on Windows\X86 when single precession floating point intrinsics like the following are used: acos, asin, atan, atan2, ceil, copysign, cos, cosh, exp, floor, fmin, fmax, fmod, log, pow, sin, sinh, sqrt, tan, tanh The above intrinsics are defined as inline-expansions in math.h, and are not exported by msvcr120.dll (Win32 API GetProcAddress returns null). For an FREM instruction, the JIT compiler generates a call to a stub for the fmodf() intrinsic, and adds a relocation to fixup at load time. The loader searches the libraries for the function, but fails because the symbol is not exported. So, the call target remains NULL and the execution crashes. Since the math functions are loaded at JIT/runtime, the JIT can patch CALL instruction directly instead of the searching the libraries' exported symbols. However, this fix caused build failures due to unresolved symbols like _fmodf at link time. Therefore, the current fix defines helper functions in the Runtime link/load library to perform the above operations. The address of these helper functions are used to patch up the CALL instruction at load time. Reviewers: lhames, rnk Reviewed By: rnk Differential Revision: http://reviews.llvm.org/D5387 Patch by Swaroop Sridhar! llvm-svn: 221947	2014-11-13 23:32:52 +00:00
Reid Kleckner	97786a6d08	Relax the gcov version.ll test to check '.' instead of '\' The escaping of the '\' doesn't work with my combination of testing tools. llvm-svn: 221944	2014-11-13 23:07:55 +00:00
Matt Arsenault	08233590d4	R600/SI: Fix fmin_legacy / fmax_legacy matching for SI select_cc is expanded on SI, so this was never matched. llvm-svn: 221941	2014-11-13 23:03:09 +00:00
Chad Rosier	b78b0f87d3	Revert "[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE." This reverts commit r221924. It appears the commit was a bit premature and is causing bot failures that need further investigation. llvm-svn: 221939	2014-11-13 22:54:59 +00:00
Chandler Carruth	a6a8ccfa74	[x86] Add some tests for specific patterns of lane-flips combined with in-lane shuffles that aren't always handled well by the current vector shuffle lowering. No functionality change yet, that will follow in a subsequent commit. llvm-svn: 221938	2014-11-13 22:49:44 +00:00
Chad Rosier	9c48d1f2b8	[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE. Phabricator Revision: http://reviews.llvm.org/D6103 Patch by "Balaram Makam" <bmakam@codeaurora.org>! llvm-svn: 221924	2014-11-13 21:17:58 +00:00
Juergen Ributzka	fb67fff9cb	[FastISel][AArch64] Don't bail during simple GEP instruction selection. The generic FastISel code would bail, because it can't emit a sign-extend for AArch64. This copies the code over and uses AArch64 specific emit functions. This is not ideal and 'computeAddress' should handles this, so it can fold the address computation into the memory operation. I plan to clean up 'computeAddress' anyways, so I will add that in a future commit. Related to rdar://problem/18962471. llvm-svn: 221923	2014-11-13 20:50:44 +00:00
Matt Arsenault	d222be0dd7	R600/SI: Use s_movk_i32 llvm-svn: 221922	2014-11-13 20:44:23 +00:00
Matt Arsenault	2ef28e2234	R600: Fix assert on empty function If a function is just an unreachable, this would hit a "this is not a MachO target" assertion because of setting HasSubsectionViaSymbols. llvm-svn: 221920	2014-11-13 20:07:40 +00:00
Matt Arsenault	8f277a520f	R600: Error on initializer for LDS. Also give a proper error for other address spaces. llvm-svn: 221917	2014-11-13 19:56:13 +00:00
Matt Arsenault	5487619f0a	R600/SI: Get rid of FCLAMP_SI pseudo It's not necessary. Also use complex patterns to allow src modifier usage. llvm-svn: 221916	2014-11-13 19:49:04 +00:00
Matt Arsenault	41886925dc	R600/SI: Allow commuting with src2_modifiers llvm-svn: 221911	2014-11-13 19:26:50 +00:00
Matt Arsenault	a919d1ac8d	R600/SI: Allow commuting some 3 op instructions e.g. v_mad_f32 a, b, c -> v_mad_f32 b, a, c This simplifies matching v_madmk_f32. This looks somewhat surprising, but it appears to be OK to do this. We can commute src0 and src1 in all of these instructions, and that's all that appears to matter. llvm-svn: 221910	2014-11-13 19:26:47 +00:00
Tim Northover	8c9fca07bd	ARM: allow constpool entry to be moved to the user's block in all cases. Normally entries can only move to a lower address, but when that wasn't viable, the user's block was considered anyway. Unfortunately, it went via createNewWater which wasn't designed to handle the case where there's already an island after the block. Unfortunately, the test we have is slow and fragile, and I couldn't reduce it to anything sane even with the @llvm.arm.space intrinsic. The test change here is recreating the previous one after the change. rdar://problem/18545506 llvm-svn: 221905	2014-11-13 17:58:53 +00:00
Tim Northover	a52b6a893f	ARM: avoid duplicating branches during constant islands. We were using a naive heuristic to determine whether a basic block already had an unconditional branch at the end. This mostly corresponded to reality (assuming branches got optimised) because there's not much point in a branch to the next block, but could go wrong. llvm-svn: 221904	2014-11-13 17:58:51 +00:00
Tim Northover	5807daa475	ARM: add @llvm.arm.space intrinsic for testing ConstantIslands. Creating tests for the ConstantIslands pass is very difficult, since it depends on precise layout details. Having the ability to precisely inject a number of bytes into the stream helps greatly. llvm-svn: 221903	2014-11-13 17:58:48 +00:00
Elena Demikhovsky	9da1df2e58	AVX-512: SINT_TO_FP cost model and some bugfixes Checked some corner cases, for example translation of <8 x i1> to <8 x double> llvm-svn: 221883	2014-11-13 11:46:16 +00:00
Hal Finkel	d9aee5d51a	OCAMLFLAGS can contain =, don't use = with sed Like HOST_LDFLAGS, etc. OCAMLFLAGS can contain =, so use ! as the substitution separator instead of = (otherwise, sed might error). llvm-svn: 221879	2014-11-13 09:29:30 +00:00

... 4 5 6 7 8 ...

27549 Commits