llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00

Author	SHA1	Message	Date
Jack Carter	8fc37da48b	Mips direct object xgot support. This patch provides support for the MIPS relocations R_MIPS_GOT_HI16, R_MIPS_GOT_LO16, R_MIPS_CALL_HI16, and R_MIPS_CALL_LO16. These are used for large GOT instruction sequences. Contributor: Jack Carter llvm-svn: 168471	2012-11-21 23:38:59 +00:00
Akira Hatanaka	0f8303f1e5	[mips] Generate big GOT code. llvm-svn: 168460	2012-11-21 20:40:38 +00:00
Andrew Kaylor	4c7a9eebb0	Adding tests for the Intel JIT event listener's MCJIT support. llvm-svn: 168459	2012-11-21 20:38:26 +00:00
Anton Korobeynikov	a96a1c8e42	Add support for varargs functions for msp430. Patch by Job Noorman! llvm-svn: 168440	2012-11-21 17:28:27 +00:00
Anton Korobeynikov	1a8ff7b99a	Add support for byval args. Patch by Job Noorman! llvm-svn: 168439	2012-11-21 17:23:03 +00:00
NAKAMURA Takumi	8a294363da	llvm/test/Transforms/InstCombine/sdiv-1.ll: FileCheck-ize. "not grep '-715827882'" performed as below...bad... Usage: grep [OPTION]... PATTERN [FILE]... Try `grep --help' for more information. llvm-svn: 168430	2012-11-21 14:46:18 +00:00
Rafael Espindola	03adaab74d	Using "not grep" is brittle as the test passes if llvm-as fails. Fix the testcase to be valid IL and use FileCheck. Thanks to NAKAMURA Takumi for noticing it. llvm-svn: 168427	2012-11-21 14:17:23 +00:00
Chandler Carruth	17e363c242	PR14055: Implement support for sub-vector operations in SROA. Now if we can transform an alloca into a single vector value, but it has subvector, non-element accesses, we form the appropriate shufflevectors to allow SROA to proceed. This fixes PR14055 which pointed out a very common pattern that SROA couldn't handle -- mixed vec3 and vec4 operations on a single alloca. llvm-svn: 168418	2012-11-21 08:16:30 +00:00
Eli Bendersky	386b394a4c	Add tests for the new -no-show-raw-insn option of llvm-objdump. This also initiates a test/tools directory where tools-specific tests can be placed. llvm-svn: 168397	2012-11-20 23:44:22 +00:00
Kostya Serebryany	278702663c	[asan] don't instrument linker-initialized globals even with external linkage in -asan-initialization-order mode llvm-svn: 168367	2012-11-20 13:11:32 +00:00
Kostya Serebryany	ae2ee8e3f1	[asan] make sure that linker-initialized globals (non-extern) are not instrumented even in -asan-initialization-order mode. This time with a test llvm-svn: 168366	2012-11-20 13:00:01 +00:00
NAKAMURA Takumi	af8d10bdd1	llvm/test/ExecutionEngine/MCJIT/lit.local.cfg: ppc32-elf is not ready. llvm-svn: 168364	2012-11-20 10:49:01 +00:00
Chandler Carruth	42df021931	Fix PR14132 and handle OOB loads speculated through PHI nodes. The issue is that we may end up with newly OOB loads when speculating a load into the predecessors of a PHI node, and this confuses the new integer splitting logic in some cases, triggering an assertion failure. In fact, the branch in question must be dead code as it loads from a too-narrow alloca. Add code to handle this gracefully and leave the requisite FIXMEs for both optimizing more aggressively and doing more to aid sanitizing invalid code which triggers these patterns. llvm-svn: 168361	2012-11-20 10:02:19 +00:00
Tim Northover	3556e52a02	Fix physical register liveness calculations: + Take account of clobbers + Give outputs priority over inputs since they happen later. llvm-svn: 168360	2012-11-20 09:56:11 +00:00
Elena Demikhovsky	9f52a3ef84	Intel OCL built-ins calling conventions now support MacOS 32-bit. llvm-svn: 168359	2012-11-20 09:37:57 +00:00
Simon Atanasyan	946ee234f3	Marking remote mcjit tests as XFAIL for MIPS. llvm-svn: 168357	2012-11-20 07:25:17 +00:00
Chandler Carruth	47187cb94a	Rework the rewriting of loads and stores for vector and integer allocas to properly handle the combinations of these with split integer loads and stores. This essentially replaces Evan's r168227 by refactoring the code in a different way, and trying to mirror that refactoring in both the load and store sides of the rewriting. Generally speaking there was some really problematic duplicated code here that led to poorly founded assumptions and then subtle bugs. Now much of the code actually flows through and follows a more consistent style and logical path. There is still a tiny bit of duplication on the store side of things, but it is much less bad. This also changes the logic to never re-use a load or store instruction as that was simply too error prone in practice. I've added a few tests (one a reduction of the one in Evan's original patch, which happened to be the same as the report in PR14349). I'm going to look at adding a few more tests for things I found and fixed in passing (such as the volatile tests in the vectorizable predicate). This patch has survived bootstrap, and modulo one bugfix survived Duncan's test suite, but let me know if anything else explodes. llvm-svn: 168346	2012-11-20 01:12:50 +00:00
Anton Korobeynikov	7a285e97e2	Factor out type info emission into separate routine. It turned out that ARM wants different layout of type infos. This is yet another patch in attempt to fix PR7187 llvm-svn: 168325	2012-11-19 21:06:26 +00:00
Jakob Stoklund Olesen	42930d7c54	Handle mixed normal and early-clobber defs on inline asm. PR14376. llvm-svn: 168320	2012-11-19 19:31:10 +00:00
Ulrich Weigand	c0920bb63a	Enable MCJIT tests on PowerPC. Disable old JIT tests on PowerPC. llvm-svn: 168316	2012-11-19 17:57:07 +00:00
Duncan Sands	2d43cbfea0	Fix PR14060, an infinite loop in reassociate. The problem was that one of the operands of the expression being written was wrongly thought to be reusable as an inner node of the expression resulting in it turning up as both an inner node and a leaf, creating a cycle in the def-use graph. This would have caused the verifier to blow up if things had gotten that far, however it managed to provoke an infinite loop first. llvm-svn: 168291	2012-11-18 19:27:01 +00:00
Andrew Trick	ab75b8798c	Use a full triple for a PPC test case for asm syntax. llvm-svn: 168283	2012-11-18 06:21:03 +00:00
NAKAMURA Takumi	5948e67968	MCJIT: [cygming] Give noop to __main also in RecordingMemoryManager. It is emitted in @main(). XFAIL(s) can be removed. llvm-svn: 168282	2012-11-18 06:16:32 +00:00
NAKAMURA Takumi	0a04171e80	test/ExecutionEngine/MCJIT/stubs-remote.ll: Prune DOSish CRLF. llvm-svn: 168281	2012-11-18 06:16:21 +00:00
Nick Lewycky	570e765264	Don't try to calculate the alignment of an unsigned type. Fixes PR14371! llvm-svn: 168280	2012-11-18 05:39:39 +00:00
Andrew Trick	d4358df73b	Silence the buildbots for this test while I figure out the triple llvm-svn: 168249	2012-11-17 03:39:26 +00:00
Andrew Trick	52f84ce773	Broaden isSchedulingBoundary to check aliases of SP. On PPC the stack pointer is X1, but ADJCALLSTACK writes R1. Fixes PR14315: Register regmask dependency problem with misched. llvm-svn: 168248	2012-11-17 03:35:11 +00:00
Hal Finkel	9dc292f3c5	Phi speculation improvement for BasicAA This is a partial solution to PR14351. It removes some of the special significance of the first incoming phi value in the phi aliasing checking logic in BasicAA. In the context of a loop, the old logic assumes that the first incoming value is the interesting one (meaning that it is the one that comes from outside the loop), but this is often not the case. With this change, we now test first the incoming value that comes from a block other than the parent of the phi being tested. llvm-svn: 168245	2012-11-17 02:33:15 +00:00
Eli Friedman	d7496f6688	Mark FP_EXTEND from v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus. llvm-svn: 168240	2012-11-17 01:52:46 +00:00
Chad Rosier	7aa7c0d952	[fast-isel] Add the -verify-machineinstrs to these test cases. The remaining test cases require fixes to fast-isel before the verifier can be enabled. Part of rdar://12594152 llvm-svn: 168233	2012-11-17 00:42:06 +00:00
Nadav Rotem	6ff38dc8d2	LoopVectorizer: Add initial support for pointer induction variables (for example: dst++ = src++). At the moment we still require to have an integer induction variable (for example: i++). llvm-svn: 168231	2012-11-17 00:27:03 +00:00
Akira Hatanaka	869eb1acb9	Initial implementation of MipsTargetLowering::isLegalAddressingMode. llvm-svn: 168230	2012-11-17 00:25:41 +00:00
Evan Cheng	3fb5893b5d	Teach SROA rewriteVectorizedStoreInst to handle cases when the loaded value is narrower than the stored value. rdar://12713675 llvm-svn: 168227	2012-11-17 00:05:06 +00:00
Andrew Kaylor	734b62b6ac	Marking remote mcjit tests as XFAIL for cygwin (hopefully only temporarily). llvm-svn: 168226	2012-11-17 00:02:50 +00:00
Andrew Kaylor	f132e42fe6	Marking remote mcjit tests as XFAIL for mingw32 (hopefully only temporarily). llvm-svn: 168221	2012-11-16 23:38:16 +00:00
Andrew Kaylor	47462d3fe5	Marking remote mcjit tests as XFAIL for ARM (hopefully only temporarily). llvm-svn: 168210	2012-11-16 22:21:04 +00:00
Weiming Zhao	85dce59506	Remove hard coded registers in ARM ldrexd and strexd instructions This patch replaces the hard coded GPR pair [R0, R1] of Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with even/odd GPRPair reg class. Similar to the lowering of atomic_64 operation. llvm-svn: 168207	2012-11-16 21:55:34 +00:00
Anton Korobeynikov	3cd85d754d	Make sure FABS on v2f32 and v4f32 is legal on ARM NEON This fixes PR14359 llvm-svn: 168200	2012-11-16 21:15:20 +00:00
Richard Osborne	c8f73df738	Fix handling of aliases to functions. An alias to a function should use pc relative addressing. llvm-svn: 168199	2012-11-16 21:12:38 +00:00
Justin Holewinski	a794462d5b	[NVPTX] Order global variables in def-use order before emitting them in the final assembly llvm-svn: 168198	2012-11-16 21:03:51 +00:00
Justin Holewinski	cef6246d31	Preserve address space of forward-referenced global variables in the LL parser Before, the parser would assert on the following code: @a2 = global i8 addrspace(1)* @a @a = addrspace(1) global i8 0 because the type of @a was "i8" instead of "i8 addrspace(1)" when parsing the initializer for @a2. llvm-svn: 168197	2012-11-16 21:03:47 +00:00
Hemant Kulkarni	4fbbfcc6a6	Added program header emission llvm-svn: 168195	2012-11-16 20:51:32 +00:00
Duncan Sands	33a1bb1041	InstructionSimplify should be able to simplify A+B==B+A to 'true' but wasn't due to the same logic bug that caused PR14361. llvm-svn: 168186	2012-11-16 19:41:26 +00:00
Duncan Sands	86ca3afe2e	Fix PR14361: wrong simplification of A+B==B+A. You may think that the old logic replaced by this patch is equivalent to the new logic, but you'd be wrong, and that's exactly where the bug was. There's a similar bug in instsimplify which manifests itself as instsimplify failing to simplify this, rather than doing it wrong, see next commit. llvm-svn: 168181	2012-11-16 18:55:49 +00:00
Andrew Kaylor	3b722976b8	Adding new tests to test lli's pseudo-remote feature (-remote-mcjit). llvm-svn: 168180	2012-11-16 18:51:59 +00:00
NAKAMURA Takumi	6c26c8f4b6	llvm/test/CodeGen/X86/hipe-cc*.ll: Add explicit -mcpu, or they don't expect to pass on Atom. llvm-svn: 168171	2012-11-16 16:07:37 +00:00
Duncan Sands	98b6a4f4b5	Add the Erlang/HiPE calling convention, patch by Yiannis Tsiouris. llvm-svn: 168166	2012-11-16 12:36:39 +00:00
Amara Emerson	6ad9fb1a92	Add MCJIT test case for running global constructors. llvm-svn: 168149	2012-11-16 11:17:00 +00:00
Hans Wennborg	220b5e3636	Constant::IsThreadDependent(): Use dyn_cast<Constant> instead of cast It turns out that the operands of a Constant are not always themselves Constant. For example, one of the operands of BlockAddress is BasicBlock, which is not a Constant. This should fix the dragonegg-x86_64-linux-gcc-4.6-test build which broke in r168037. llvm-svn: 168147	2012-11-16 10:33:25 +00:00
Craig Topper	ad33f996a6	Use roundps/pd for llvm.ceil, llvm.trunc, llvm.rint, and llvm.nearbyint of vector types. llvm-svn: 168141	2012-11-16 06:37:56 +00:00
Akira Hatanaka	ba0e266eb2	[mips] Fix delay slot filler so that instructions with register operand $1 are allowed in branch delay slot. llvm-svn: 168131	2012-11-16 02:39:34 +00:00
Eli Bendersky	b6e1193a2e	Add some tests for the FileCheck utility. http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/156007.html llvm-svn: 168113	2012-11-15 23:42:51 +00:00
Eli Friedman	79932a2f77	Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing case to vector legalization so this actually works. Patch by Pete Couperus. Fixes PR12540. llvm-svn: 168107	2012-11-15 22:44:27 +00:00
Adhemerval Zanella	c159b16933	PowerPC: Lowering floor intrinsic for Altivec This patch lowers the llvm.floor, llvm.ceil, llvm.trunc, and llvm.nearbyint to Altivec instruction when using 4 single-precision float vectors. llvm-svn: 168086	2012-11-15 20:56:03 +00:00
Hans Wennborg	73c4bb7fcb	Make GlobalOpt be conservative with TLS variables (PR14309) For global variables that get the same value stored into them everywhere, GlobalOpt will replace them with a constant. The problem is that a thread-local GlobalVariable looks like one value (the address of the TLS var), but is different between threads. This patch introduces Constant::isThreadDependent() which returns true for thread-local variables and constants which depend on them (e.g. a GEP into a thread-local array), and teaches GlobalOpt not to track such values. llvm-svn: 168037	2012-11-15 11:40:00 +00:00
Duncan Sands	f05d8752a2	Fix a crash observed by Shuxin Yang. The issue here is that LinearizeExprTree, the utility for extracting a chain of operations from the IR, thought that it might as well combine any constants it came across (rather than just returning them along with everything else). On the other hand, the factorization code would like to see the individual constants (this is quite reasonable: it is much easier to pull a factor of 3 out of 2*3 than it is to pull it out of 6; you may think 6/3 isn't so hard, but due to overflow it's not as easy to undo multiplications of constants as it may at first appear). This patch therefore makes LinearizeExprTree stupider: it now leaves optimizing to the optimization part of reassociate, and sticks to just analysing the IR. llvm-svn: 168035	2012-11-15 09:58:38 +00:00
Bill Schmidt	f294eb980a	This patch is in preparation for adding medium code model support to the PPC64 target. The five tests modified herein test code generation that is sensitive to the code model selected. So I've added -code-model=small to the RUN commands for each. Since small code model is the default, this has no effect for now; but this prepares us for eventually changing the default to medium code model for PPC64. Test changes verified with small and medium code model as default on powerpc64-unknown-linux-gnu. All tests continue to pass. llvm-svn: 167999	2012-11-14 23:23:27 +00:00
Jakub Staszak	8c20275ebf	Make sure to not get AVX code on an AVX-capable host. Revealed in r167967. llvm-svn: 167989	2012-11-14 22:24:01 +00:00
NAKAMURA Takumi	8adf86a12e	test/CodeGen/Hexagon/postinc-load.ll: Suppress it for now. It triggered the failure on i686 hosts. llvm-svn: 167988	2012-11-14 22:22:37 +00:00
Eric Christopher	caf5a23d81	Remove the CellSPU port. Approved by Chris Lattner. llvm-svn: 167984	2012-11-14 22:09:20 +00:00
NAKAMURA Takumi	e40f4623cd	llvm/test/CodeGen/X86/memset.ll: FileCheck-ize, and add another case on +avx. llvm-svn: 167975	2012-11-14 21:01:40 +00:00
Jyotsna Verma	a472ef54f3	Added multiclass for post-increment load instructions. llvm-svn: 167974	2012-11-14 20:38:48 +00:00
Benjamin Kramer	27983167e3	Force CPU in test so we don't accidentally get AVX code on an AVX-capable host. llvm-svn: 167973	2012-11-14 20:31:42 +00:00
Jakub Staszak	14a889a054	Remove DOS line endings. llvm-svn: 167968	2012-11-14 20:18:34 +00:00
Benjamin Kramer	0006b33581	X86: Enable SSE memory intrinsics even when stack alignment is less than 16 bytes. The stack realignment code was fixed to work when there is stack realignment and a dynamic alloca is present so this shouldn't cause correctness issues anymore. Note that this also enables generation of AVX instructions for memset under the assumptions: - Unaligned loads/stores are always fast on CPUs supporting AVX - AVX is not slower than SSE We may need some tweaked heuristics if one of those assumptions turns out not to be true. Effectively reverts r58317. Part of PR2962. llvm-svn: 167967	2012-11-14 20:08:40 +00:00
Nadav Rotem	b339c55cd3	The code pattern "imm0_255_neg" is used for checking if an immediate value is a small negative number. This patch changes the definition of negative from -0..-255 to -1..-255. I am changing this because of a bug that we had in some of the patterns that assumed that "subs" of zero does not set the carry flag. rdar://12028498 llvm-svn: 167963	2012-11-14 19:39:15 +00:00
Justin Holewinski	3f79944ac9	[NVPTX] Implement custom lowering of loads/stores for i1 Loads from i1 become loads from i8 followed by trunc Stores to i1 become zext to i8 followed by store to i8 Fixes PR13291 llvm-svn: 167948	2012-11-14 19:19:16 +00:00
Anton Korobeynikov	c8df249529	Fix really stupid ARM EHABI info generation bug: we should not emit eh table and handler data if there are no landing pads in the function. Patch by Logan Chien with some cleanups from me. llvm-svn: 167945	2012-11-14 19:13:30 +00:00
Jim Grosbach	2742e92ea2	X86: Better diagnostics for 32-bit vs. 64-bit mode mismatches. When an instruction as written requires 32-bit mode and we're assembling in 64-bit mode, or vice-versa, issue a more specific diagnostic about what's wrong. rdar://12700702 llvm-svn: 167937	2012-11-14 18:04:47 +00:00
Alexey Samsonov	92705fb808	Emit relocations from .debug_aranges to .debug_info for asm files llvm-svn: 167926	2012-11-14 09:55:38 +00:00
Rafael Espindola	1fb628bc96	Handle DAG CSE adding new uses during ReplaceAllUsesWith. Fixes PR14333. llvm-svn: 167912	2012-11-14 05:08:56 +00:00
Anton Korobeynikov	3edf77ac04	Use TARGET2 relocation for TType references on ARM. Do some cleanup of the code while here. Inspired by patch by Logan Chien! llvm-svn: 167904	2012-11-14 01:47:00 +00:00
Eric Christopher	b3e4c78741	Revert "Use the 'count' attribute instead of the 'upper_bound' attribute." temporarily as it is breaking the gdb bots. This reverts commit r167806/e7ff4c14b157746b3e0228d2dce9f70712d1c126. llvm-svn: 167886	2012-11-13 23:30:43 +00:00
Michael J. Spencer	b158d9cd2a	[MC][COFF] Emit weak symbols to the correct section. Patch by Dmitry Puzirev! llvm-svn: 167877	2012-11-13 22:04:09 +00:00
NAKAMURA Takumi	3250911d1e	Revert r167836, "llvm/test/Other/close-stderr.ll: Mark it as XFAIL:mingw32 for now.", corresponding to r167849. llvm-svn: 167876	2012-11-13 21:57:42 +00:00
Ulrich Weigand	7e5e3a1ed0	Add test case to verify correct relocs being generated for TLS symbols on PowerPC using the integrated assembler. llvm-svn: 167875	2012-11-13 21:53:43 +00:00
Shankar Easwaran	eb6f136f28	numerically sort the symbols, so that the testcase result is uniform llvm-svn: 167872	2012-11-13 21:01:11 +00:00
Daniel Dunbar	6579221af6	llvm-nm: Make sort more stable when symbol names are equal. llvm-svn: 167866	2012-11-13 19:39:55 +00:00
Manman Ren	e98ec5dd77	X86: when constructing VZEXT_LOAD from other loads, makes sure its output chain is correctly setup. As an example, if the original load must happen before later stores, we need to make sure the constructed VZEXT_LOAD is constrained to be before the stores. rdar://12684358 llvm-svn: 167859	2012-11-13 19:13:05 +00:00
Ulrich Weigand	9c5e333c90	Do not consider a machine instruction that uses and defines the same physical register as candidate for common subexpression elimination in MachineCSE. This fixes a bug on PowerPC in MultiSource/Applications/oggenc/oggenc caused by MachineCSE invalidly merging two separate DYNALLOC insns. llvm-svn: 167855	2012-11-13 18:40:58 +00:00
Shankar Easwaran	f934185b04	Adding changes to support GNU style archive library reading llvm-svn: 167853	2012-11-13 18:38:42 +00:00
Chad Rosier	abe41c8f95	Revert 167755/167760. We don't want to emit crash diagnostics on command-line syntax errors. llvm-svn: 167849	2012-11-13 16:42:19 +00:00
NAKAMURA Takumi	cb05f68b6c	llvm/test/Other/close-stderr.ll: Mark it as XFAIL:mingw32 for now. On MSYS, 70 is not seen, but 1. r127726 should be reworked. Candidate options are; 1) Use not exit(70), but _exit(70), in report_fatal_error(). 2) Return with _exit(70) in ~raw_ostream(). llvm-svn: 167836	2012-11-13 15:03:33 +00:00
Duncan Sands	7c55936d5f	Codegen support for arbitrary vector getelementptrs. llvm-svn: 167830	2012-11-13 13:01:58 +00:00
Duncan Sands	834534bbe1	Fix the instcombine GEP index widening transform to work correctly for vector getelementptrs. llvm-svn: 167829	2012-11-13 13:01:00 +00:00
Duncan Sands	8c43343240	Relax the restrictions on vector of pointer types, and vector getelementptr. Previously in a vector of pointers, the pointer couldn't be any pointer type, it had to be a pointer to an integer or floating point type. This is a hassle for dragonegg because the GCC vectorizer happily produces vectors of pointers where the pointer is a pointer to a struct or whatever. Vector getelementptr was restricted to just one index, but now that vectors of pointers can have any pointer type it is more natural to allow arbitrary vector getelementptrs. There is however the issue of struct GEPs, where if each lane chose different struct fields then from that point on each lane will be working down into unrelated types. This seems like too much pain for too little gain, so when you have a vector struct index all the elements are required to be the same. llvm-svn: 167828	2012-11-13 12:59:33 +00:00
Benjamin Kramer	60b650f8c4	DependenceAnalysis: Print all dependency pairs when dumping. Update all testcases. Part of a patch by Preston Briggs. llvm-svn: 167827	2012-11-13 12:12:02 +00:00
Alexey Samsonov	c7d4a8bbd3	Figure out <size> argument of llvm.lifetime intrinsics at the moment they are created (during function inlining) llvm-svn: 167821	2012-11-13 07:15:32 +00:00
Meador Inge	a191db7d99	instcombine: Migrate math library call simplifications This patch migrates the math library call simplifications from the simplify-libcalls pass into the instcombine library call simplifier. I have typically migrated just one simplifier at a time, but the math simplifiers are interdependent because: 1. CosOpt, PowOpt, and Exp2Opt all depend on UnaryDoubleFPOpt. 2. CosOpt, PowOpt, Exp2Opt, and UnaryDoubleFPOpt all depend on the option -enable-double-float-shrink. These two factors made migrating each of these simplifiers individually more of a pain than it would be worth. So, I migrated them all together. llvm-svn: 167815	2012-11-13 04:16:17 +00:00
Hal Finkel	f33a9ea70d	BBVectorize: Don't vectorize vector-manipulation chains Don't choose a vectorization plan containing only shuffles and vector inserts/extracts. Due to imperfections in the cost model, these can lead to infinite recursion. llvm-svn: 167811	2012-11-13 03:12:40 +00:00
Bill Wendling	ab44d906b6	Use the 'count' attribute instead of the 'upper_bound' attribute. If we have a type 'int a[1]' and a type 'int b[0]', the generated DWARF is the same for both of them because we use the 'upper_bound' attribute. Instead use the 'count' attribute, which gives the correct number of elements in the array. <rdar://problem/12566646> llvm-svn: 167806	2012-11-13 02:31:47 +00:00
Andrew Trick	d8c621a864	Cleanup the main RegisterCoalescer loop. Block priorities still apply outside loops. llvm-svn: 167793	2012-11-13 00:34:44 +00:00
Shuxin Yang	9597b0a305	revert r167740 llvm-svn: 167787	2012-11-13 00:08:49 +00:00
Hal Finkel	47f58fe181	BBVectorize: Only some insert element operand pairs are free. This fixes another infinite recursion case when using target costs. We can only replace insert element input chains that are pure (end with inserting into an undef). llvm-svn: 167784	2012-11-12 23:55:36 +00:00
Michael Liao	c260a62eb2	Fix test case added in patch fixing PR14314 llvm-svn: 167769	2012-11-12 22:33:18 +00:00
Chad Rosier	fe86a11392	Update test case for r167754/r167755. llvm-svn: 167760	2012-11-12 21:51:08 +00:00
Hal Finkel	1c4de5a823	BBVectorize: Use a more sophisticated check for input cost The old checking code, which assumed that input shuffles and insert-elements could always be folded (and thus were free) is too simple. This can only happen in special circumstances. Using the simple check caused infinite recursion. llvm-svn: 167750	2012-11-12 21:21:02 +00:00
Hal Finkel	ff11a22f1a	BBVectorize: Check the types of compare instructions The pass would previously assert when trying to compute the cost of compare instructions with illegal vector types (like struct pointers). llvm-svn: 167743	2012-11-12 19:41:38 +00:00
Shuxin Yang	a699462f9d	This change fixes rdar://12571717, an assertion in the Reassociate pass. The assertion is triggered when the Reassociator tries to transform the expression ... + 2 * n * 3 + 2 * m + ... into ... + 2 * (n * 3 + m). In the process of the transformation, a helper routine folds the constant 2 * 3 into 6, confusing the optimizer, which is trying to eliminate the common factor 2 and cannot find 2 any more. Review is pending, but I'd like to commit first in order to help those who are waiting for this fix. llvm-svn: 167740	2012-11-12 19:34:11 +00:00
Andrew Trick	ad4b55b3d8	misched: Infrastructure for weak DAG edges. This adds support for weak DAG edges to the general scheduling infrastructure in preparation for MachineScheduler support for heuristics based on weak edges. llvm-svn: 167738	2012-11-12 19:28:57 +00:00
Hal Finkel	7cca290894	BBVectorize: Check the input types of shuffles for legality This fixes a bug where shuffles were being fused such that the resulting input types were not legal on the target. This would occur only when both inputs and dependencies were also foldable operations (such as other shuffles) and there were other connected pairs in the same block. llvm-svn: 167731	2012-11-12 14:50:59 +00:00
Meador Inge	0cf613c15a	Normalize memcmp constant folding results. The library call simplifier folds memcmp calls with all constant arguments to a constant. For example: memcmp("foo", "foo", 3) -> 0 memcmp("hel", "foo", 3) -> 1 memcmp("foo", "hel", 3) -> -1 The folding is implemented in terms of the system memcmp that LLVM gets linked with. It currently just blindly uses the value returned from the system memcmp as the folded constant. This patch normalizes the values returned from the system memcmp to (-1, 0, 1) so that we get consistent results across multiple platforms. The test cases were adjusted accordingly. llvm-svn: 167726	2012-11-12 14:00:45 +00:00
Michael Liao	5bf5c77881	Fix PR14314 - Fix operand order for atomic sub, where the minuend is the value loaded from memory and the subtrahend is the parameter specified. llvm-svn: 167718	2012-11-12 06:49:17 +00:00
Justin Holewinski	da9a98c364	[NVPTX] Add more precise PTX/SM target attributes Each SM and PTX version is modeled as a subtarget feature/CPU. Additionally, PTX 3.1 is added as the default PTX version to be out-of-the-box compatible with CUDA 5.0. Available CPUs for this target: sm_10 - Select the sm_10 processor. sm_11 - Select the sm_11 processor. sm_12 - Select the sm_12 processor. sm_13 - Select the sm_13 processor. sm_20 - Select the sm_20 processor. sm_21 - Select the sm_21 processor. sm_30 - Select the sm_30 processor. sm_35 - Select the sm_35 processor. Available features for this target: ptx30 - Use PTX version 3.0. ptx31 - Use PTX version 3.1. sm_10 - Target SM 1.0. sm_11 - Target SM 1.1. sm_12 - Target SM 1.2. sm_13 - Target SM 1.3. sm_20 - Target SM 2.0. sm_21 - Target SM 2.1. sm_30 - Target SM 3.0. sm_35 - Target SM 3.5. llvm-svn: 167699	2012-11-12 03:16:43 +00:00
Meador Inge	69e38a3d15	Remove hard-coded constant in Transforms/InstCombine/memcmp-1.ll Transforms/InstCombine/memcmp-1.ll has a test case that looks like: @foo = constant [4 x i8] c"foo\00" @hel = constant [4 x i8] c"hel\00" ... %mem1 = getelementptr [4 x i8]* @hel, i32 0, i32 0 %mem2 = getelementptr [4 x i8]* @foo, i32 0, i32 0 %ret = call i32 @memcmp(i8* %mem1, i8* %mem2, i32 3) ret i32 %ret ; CHECK: ret i32 2 The folded return value (2 above) is computed using the system memcmp that the compiler is linked with. This can return different values on different systems. The test was originally written on an OS X 10.7.5 x86-64 box and passed. However, it failed on one of the x86-64 FreeBSD buildbots because the system memcmp on that machine returned a different value (1 instead of 2). I fixed the test by checking the folding constants with regexes. llvm-svn: 167691	2012-11-11 07:10:25 +00:00
Meador Inge	ba025d5d90	instcombine: Migrate memset optimizations This patch migrates the memset optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167689	2012-11-11 06:49:03 +00:00
Meador Inge	e093f6c41e	instcombine: Migrate memmove optimizations This patch migrates the memmove optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167687	2012-11-11 06:22:40 +00:00
Meador Inge	bf03751391	instcombine: Migrate memcpy optimizations This patch migrates the memcpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167686	2012-11-11 05:54:34 +00:00
Meador Inge	13e6be2fd6	instcombine: Migrate memcmp optimizations This patch migrates the memcmp optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167683	2012-11-11 05:11:20 +00:00
Meador Inge	a062b17960	instcombine: Migrate strstr optimizations This patch migrates the strstr optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167682	2012-11-11 03:51:48 +00:00
Meador Inge	a202e0c179	instcombine: Migrate strcspn optimizations This patch migrates the strcspn optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167675	2012-11-10 15:16:48 +00:00
Evan Cheng	2599006e46	Convert an improper CodeGen test to a MC test. llvm-svn: 167663	2012-11-10 04:30:40 +00:00
Meador Inge	9b62fb8d77	instcombine: Query target library information to gate libcall simplifications Several of the simplifiers migrated from the simplify-libcalls pass to the instcombine pass were not correctly checking the target library information to gate the simplifications. This patch ensures that the check is made. llvm-svn: 167660	2012-11-10 03:11:10 +00:00
Evan Cheng	6ed26ba70c	xfail a bad test. This is a MC test but it's dependent on a codegen optimization which is now disabled. llvm-svn: 167658	2012-11-10 02:34:36 +00:00
Evan Cheng	ebe241fb9d	Disable the Thumb no-return call optimization: mov lr, pc b.w _foo The "mov" instruction doesn't set bit zero to one, it's putting incorrect value in lr. It messes up backtraces. rdar://12663632 llvm-svn: 167657	2012-11-10 02:09:05 +00:00
Craig Topper	f424da6ff9	Cleanup pcmp(e/i)str(m/i) instruction definitions and load folding support. llvm-svn: 167652	2012-11-10 01:23:36 +00:00
Justin Holewinski	be8faeed70	[NVPTX] Use ABI alignment for parameters when alignment is not specified. Affects SM 2.0+. Fixes bug 13324. llvm-svn: 167646	2012-11-09 23:50:24 +00:00
Jakob Stoklund Olesen	887571e652	Fix assertions in updateRegMaskSlots(). The RegMaskSlots contains 'r' slots while NewIdx and OldIdx are 'B' slots. This broke the checks in the assertions. This fixes PR14302. llvm-svn: 167625	2012-11-09 19:18:49 +00:00
Dmitry Vyukov	fab21a5c47	tsan: switch to new memory_order constants (ABI compatible) llvm-svn: 167615	2012-11-09 14:12:16 +00:00
Dmitry Vyukov	62df6da6a6	tsan: instrument all atomics (including fetch_add, exchange, cas, etc) llvm-svn: 167612	2012-11-09 12:55:36 +00:00
Nadav Rotem	ee232d62d1	Add support for memory runtime check. When we can, we calculate array bounds. If the arrays are found to be disjoint then we run the vectorized version of the loop. If they are not, we run the scalar code. llvm-svn: 167608	2012-11-09 07:09:44 +00:00
NAKAMURA Takumi	cda12da6b9	llvm/ConstantFolding.cpp: Make ReadDataFromGlobal() and FoldReinterpretLoadFromConstPtr() Big-endian-aware. llvm-svn: 167595	2012-11-08 20:34:25 +00:00
Amara Emerson	f7a46cedbc	Recommit modified r167540. Improve ARM build attribute emission for architecture types. This also changes the default architecture emitted for a generic CPU to "v7". llvm-svn: 167574	2012-11-08 09:51:45 +00:00
Michael Liao	59114df23b	Add support of RTM from TSX extension - Add RTM code generation support through 3 X86 intrinsics: xbegin()/xend() to start/end a transaction region, and xabort() to abort a transaction region llvm-svn: 167573	2012-11-08 07:28:54 +00:00
Meador Inge	28cefe8802	instcombine: Migrate strspn optimizations This patch migrates the strspn optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167568	2012-11-08 01:33:50 +00:00
Eric Christopher	b34bece6a8	Add a relocation visitor to lib object. This works via caching relocated values in a map that can be passed to consumers. Add a testcase that ensures this works for llvm-dwarfdump. llvm-svn: 167558	2012-11-07 23:22:07 +00:00
Hans Wennborg	d166484584	Only do switch-to-lookup table transformation when TargetTransformInfo is available. llvm-svn: 167552	2012-11-07 21:35:12 +00:00
Akira Hatanaka	b8f5a8ab0b	[mips] Custom-lower ISD::FRAME_TO_ARGS_OFFSET node. Patch by Sasa Stankovic. llvm-svn: 167548	2012-11-07 19:10:58 +00:00
Hans Wennborg	7dd7657cec	Fix bad test IR in switch_to_lookup_table.ll llvm-svn: 167543	2012-11-07 18:38:24 +00:00
Andrew Trick	8b72906a53	misched: Heuristics based on the machine model. misched is disabled by default. With -enable-misched, these heuristics balance the schedule to simultaneously avoid saturating processor resources, expose ILP, and minimize register pressure. I've been analyzing the performance of these heuristics on everything in the llvm test suite in addition to a few other benchmarks. I would like each heuristic check to be verified by a unit test, but I'm still trying to figure out the best way to do that. The heuristics are still in considerable flux, but as they are refined we should be rigorous about unit testing the improvements. llvm-svn: 167527	2012-11-07 07:05:09 +00:00
Nadav Rotem	dce9a7a599	CostModel: add another known vector trunc optimization. llvm-svn: 167488	2012-11-06 21:17:17 +00:00
Nadav Rotem	2fb5dc3a15	Cost Model: add tables for some avx type-conversion hacks. llvm-svn: 167480	2012-11-06 19:33:53 +00:00
Nadav Rotem	890d7c7f8e	CostModel: Add tables for the common x86 compares. llvm-svn: 167421	2012-11-05 23:48:20 +00:00
Nadav Rotem	8ddfd47801	Cost Model: Improve the accuracy of the zext/sext/trunc vector cost estimation. llvm-svn: 167412	2012-11-05 22:20:53 +00:00
Kevin Enderby	d988c6daf6	Fix for PR14264 caused by commit r167237 which did not take into account a possible buffer change with a .macro directive. rdar://12637628 llvm-svn: 167408	2012-11-05 21:55:41 +00:00
Nadav Rotem	04d64771f6	Cost Model: Normalize the insert/extract index when splitting types llvm-svn: 167402	2012-11-05 21:12:13 +00:00
Nadav Rotem	a504aa057e	Cost Model: teach the cost model about expanding integers. llvm-svn: 167401	2012-11-05 21:11:10 +00:00
Ulrich Weigand	5e496676d0	On PowerPC64, integer return values (as well as arguments) are supposed to be extended to a full register. This is modeled in the IR by marking the return value (or argument) with a signext or zeroext attribute. However, while these attributes are respected for function arguments, they are currently ignored for function return values by the PowerPC back-end. This patch updates PPCCallingConv.td to ask for the promotion to i64, and fixes LowerReturn and LowerCallResult to implement it. The new test case verifies that both arguments and return values are properly extended when passing them; and also that the optimizers understand incoming argument and return values are in fact guaranteed by the ABI to be extended. The patch caused a spurious breakage in CodeGen/PowerPC/coalesce-ext.ll, since the test case used a "ret" instruction to create a use of an i32 value at the end of the function (to set up data flow as required for what the test is intended to test). Since there's now an implicit promotion to i64, that data flow no longer works as expected. To fix this, this patch now adds an extra "add" to ensure we have an appropriate use of the i32 value. llvm-svn: 167396	2012-11-05 19:39:45 +00:00
Nadav Rotem	4def3aace5	Implement the cost of abnormal x86 instruction lowering as a table. llvm-svn: 167395	2012-11-05 19:32:46 +00:00
Hal Finkel	a82b79fc22	Add support for the PowerPC-specific inline asm Z constraint and y modifier. The Z constraint specifies an r+r memory address, and the y modifier expands to the "r, r" in the asm string. For this initial implementation, the base register is forced to r0 (which has the special meaning of 0 for r+r addressing on PowerPC) and the full address is taken in the second register. In the future, this should be improved. llvm-svn: 167388	2012-11-05 18:18:42 +00:00
Adhemerval Zanella	382ede5fd4	[PATCH] PowerPC: Expand load extend vector operations This patch expands the SEXTLOAD, ZEXTLOAD, and EXTLOAD operations for vector types when altivec is enabled. llvm-svn: 167386	2012-11-05 17:15:56 +00:00
Richard Osborne	258e3e70bb	Don't infer whether a value is captured in the current function from the 'nocapture' attribute. The nocapture attribute only specifies that no copies are made that outlive the function. This isn't the same as there being no copies at all. This fixes PR14045. llvm-svn: 167381	2012-11-05 10:48:24 +00:00
Duncan Sands	626552af21	Generalize the transform that boosts GEP indices to the size of a pointer to also do it for vectors of pointers. llvm-svn: 167354	2012-11-03 11:44:17 +00:00
Akira Hatanaka	1bfa522bfe	[mips] Set neverHasSideEffects flag on floating point conversion instructions. llvm-svn: 167348	2012-11-03 00:53:12 +00:00
Nadav Rotem	c9bbabd5e9	X86 CostModel: Add support for some of the common arithmetic instructions for SSE4, AVX and AVX2. llvm-svn: 167347	2012-11-03 00:39:56 +00:00
Akira Hatanaka	61434a3632	[mips] Set isAsCheapAsAMove flag on instruction LUi. llvm-svn: 167345	2012-11-03 00:26:02 +00:00
Akira Hatanaka	06b2c52edc	[mips] Stop reserving register AT and use register scavenger when a scratch register is needed. llvm-svn: 167341	2012-11-03 00:05:43 +00:00
Nadav Rotem	6f0c234b7f	Add a stub for the x86 cost model impl. Implement a basic cost rule for inserting/extracting from XMM registers. llvm-svn: 167333	2012-11-02 23:27:16 +00:00
Nadav Rotem	6edee82efa	CostModel: add support for Vector Insert and Extract. llvm-svn: 167329	2012-11-02 22:31:56 +00:00
Akira Hatanaka	109078895c	[mips] Fix disassembler test cases. llvm-svn: 167326	2012-11-02 22:20:10 +00:00
Nadav Rotem	ce21a69b9d	Add a cost model analysis that allows us to estimate the cost of IR-level instructions. llvm-svn: 167324	2012-11-02 21:48:17 +00:00
Akira Hatanaka	5a2c763cb0	[mips] Fix bug in test case. Disable machine LICM to prevent instruction from being moved out of a basic block. llvm-svn: 167322	2012-11-02 21:46:42 +00:00
Quentin Colombet	522698f693	Vext Lowering was missing opportunities llvm-svn: 167318	2012-11-02 21:32:17 +00:00
Akira Hatanaka	25aba90c6b	[mips] Use register number instead of name to print register $AT. llvm-svn: 167315	2012-11-02 21:26:03 +00:00
Akira Hatanaka	ba57d98a50	[mips] Delete MipsFunctionInfo::EmitNOAT. Unconditionally print directive ".set noat" so that the assembler doesn't issue warnings when register $AT is used. llvm-svn: 167310	2012-11-02 20:56:25 +00:00
Chandler Carruth	ca6ce473f5	Add a testcase to loop-idiom to cover PR14241 when we start handling strided loops again. llvm-svn: 167287	2012-11-02 08:40:24 +00:00
Chandler Carruth	1fc186f3bd	Revert the switch of loop-idiom to use the new dependence analysis. The new analysis is not yet ready for prime time. It has a critical flawed assumption, and some troubling shortages of testing. Until it's been hammered into better shape, let's stick with the working code. This should be easy to revert itself when the analysis is ready. Fixes PR14241, a miscompile of any memcpy-able loop which uses a pointer as the induction mechanism. If you have been seeing miscompiles in this revision range, you really want to test with this backed out. The results of this miscompile are a bit subtle as they can lead to downstream passes concluding things are impossible which are in fact possible. Thanks to David Blaikie for the majority of the reduction of this miscompile. I'll be checking in the test case in a non-revert commit. Revisions reverted here: r167045: LoopIdiom: Fix a serious missed optimization: we only turned top-level loops into memmove. r166877: LoopIdiom: Add checks to avoid turning memmove into an infinite loop. r166875: LoopIdiom: Recognize memmove loops. r166874: LoopIdiom: Replace custom dependence analysis with DependenceAnalysis. llvm-svn: 167286	2012-11-02 08:33:25 +00:00
Hal Finkel	1037cb0026	BBVectorize: Commit the rest of the test-case change. llvm-svn: 167257	2012-11-01 21:57:27 +00:00
Hal Finkel	ebd97dd2ef	BBVectorize: Use target costs for incoming and outgoing values instead of the depth heuristic. When target cost information is available, compute explicit costs of inserting and extracting values from vectors. At this point, all costs are estimated using the target information, and the chain-depth heuristic is not needed. As a result, it is now, by default, disabled when using target costs. llvm-svn: 167256	2012-11-01 21:50:12 +00:00
Kevin Enderby	efec4e2817	Add support for generating dwarf debugging info with assembly files run through the 'C' preprocessor. That is pick up the file name and line numbers from the cpp hash file line comments for the dwarf file and line numbers tables. rdar://9275556 llvm-svn: 167237	2012-11-01 17:31:35 +00:00
NAKAMURA Takumi	380d526a6c	llvm/test/lit.cfg: Don't use mcjit to ppc32 yet, not ready. Unsupported CPU type! UNREACHABLE executed at llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:553! llvm-svn: 167231	2012-11-01 14:28:51 +00:00
Kostya Serebryany	2bae7f204a	[asan] don't instrument globals that we've created ourselves (reduces the binary size a bit) llvm-svn: 167230	2012-11-01 13:42:40 +00:00
Chandler Carruth	6410ac7263	Add a test case for PR14233. llvm-svn: 167224	2012-11-01 10:26:36 +00:00
Chandler Carruth	76f7f4a33e	Revert the series of commits starting with r166578 which introduced the getIntPtrType support for multiple address spaces via a pointer type, and also introduced a crasher bug in the constant folder reported in PR14233. These commits also contained several problems that should really be addressed before they are re-committed. I have avoided reverting various cleanups to the DataLayout APIs that are reasonable to have moving forward in order to reduce the amount of churn, and minimize the number of commits that were reverted. I've also manually updated merge conflicts and manually arranged for the getIntPtrType function to stay in DataLayout and to be defined in a plausible way after this revert. Thanks to Duncan for working through this exact strategy with me, and Nick Lewycky for tracking down the really annoying crasher this triggered. (Test case to follow in its own commit.) After discussing with Duncan extensively, and based on a note from Micah, I'm going to continue to back out some more of the more problematic patches in this series in order to ensure we go into the LLVM 3.2 branch with a reasonable story here. I'll send a note to llvmdev explaining what's going on and why. Summary of reverted revisions: r166634: Fix a compiler warning with an unused variable. r166607: Add some cleanup to the DataLayout changes requested by Chandler. r166596: Revert "Back out r166591, not sure why this made it through since I cancelled the command. Bleh, sorry about this! r166591: Delete a directory that wasn't supposed to be checked in yet. r166578: Add in support for getIntPtrType to get the pointer type based on the address space. llvm-svn: 167221	2012-11-01 08:07:29 +00:00
NAKAMURA Takumi	f12f8c13ef	[CMake] Add llvm-mcmarkup to check-llvm. llvm-svn: 167208	2012-11-01 02:13:50 +00:00
NAKAMURA Takumi	e2c67c353f	test/CodeGen/X86/fp-fast.ll: Add +avx. llvm-svn: 167207	2012-11-01 02:13:45 +00:00
Owen Anderson	8f66d7107c	Add a few more simple fast-math constant propagations and cancellations. llvm-svn: 167200	2012-11-01 02:00:53 +00:00
Jim Grosbach	ca24351f26	MC: Simple example parser for MC assembly markup. Nothing fancy, just a simple demonstration parser. llvm-svn: 167181	2012-10-31 23:24:13 +00:00
Shuxin Yang	4762eb1825	(For X86) Enhancement to add-carry/sub-borrow (adc/sbb) optimization. The adc/sbb optimization is able to convert the following expressions into a single adc/sbb instruction: (ult) ... = x + 1 // where the ult is unsigned-less-than comparison (ult) ... = x - 1 This change is to flip the "x >u y" (i.e. ugt comparison) in order to expose the adc/sbb opportunity. llvm-svn: 167180	2012-10-31 23:11:48 +00:00
Nadav Rotem	0a30b41020	LoopVectorize: Preserve NSW, NUW and IsExact flags. llvm-svn: 167174	2012-10-31 21:40:39 +00:00
Nadav Rotem	e3083d1688	Fix a bug in the cost calculation of vector casts. Detect situations where bitcasts cost zero. llvm-svn: 167170	2012-10-31 20:52:26 +00:00
Akira Hatanaka	245eaafd42	[mips] Set isAsCheapAsAMove flag on ADDiu and DADDiu, which enables re-materialization of immediate loads. llvm-svn: 167153	2012-10-31 18:37:55 +00:00
Akira Hatanaka	6632837f2e	Test case for r167039. Check that tail-call optimization is disabled for mips16. llvm-svn: 167139	2012-10-31 17:25:23 +00:00
Hans Wennborg	4c6d01059c	Remove fixme about unreachable cases from SwitchToLookupTable SimplifyCFG will have removed those cases for us. llvm-svn: 167132	2012-10-31 16:15:25 +00:00
Hal Finkel	d1fc849359	BBVectorize: Choose pair ordering to minimize shuffles BBVectorize would, except for loads and stores, always fuse instructions so that the first instruction (in the current source order) would always represent the low part of the input vectors and the second instruction would always represent the high part. This led to too many shuffles being produced because sometimes the opposite order produces fewer of them. With this change, BBVectorize tracks the kind of pair connections that form the DAG of candidate pairs, and uses that information to reorder the pairs to avoid excess shuffles. Using this information, a future commit will be able to add VTTI-based shuffle costs to the pair selection procedure. Importantly, the number of remaining shuffles can now be estimated during pair selection. There are some trivial instruction reorderings in the test cases, and one simple additional test where we certainly want to do a reordering to avoid an unnecessary shuffle. llvm-svn: 167122	2012-10-31 15:17:07 +00:00
Meador Inge	ccbf761437	instcombine: Migrate strto* optimizations This patch migrates the strto* optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167119	2012-10-31 14:58:26 +00:00
Hans Wennborg	d162380e59	Do simple constant propagation in lookup table formation for switches By propagating the value for the switch condition, LLVM can now build lookup tables for code such as: switch (x) { case 1: return 5; case 2: return 42; case 3: case 4: case 5: return x - 123; default: return 123; } Given that x is known for each case, "x - 123" becomes a constant for cases 3, 4, and 5. llvm-svn: 167115	2012-10-31 13:42:45 +00:00
Benjamin Kramer	6cd4d55f5b	LCSSA: Add a workaround for another nasty SCEV cache invalidation issue. I'm not entirely happy with this solution, but I don't see a smarter way currently. Fixes PR14214. llvm-svn: 167112	2012-10-31 10:01:29 +00:00
Benjamin Kramer	f6b2bc4c9a	DependenceAnalysis: Don't crash if there is no constant operand. This makes the code match the comments. Resolves a crash in loop idiom (PR14219). llvm-svn: 167110	2012-10-31 09:20:38 +00:00
Reed Kotler	82a7f09a30	Implement ADJCALLSTACKUP and ADJCALLSTACKDOWN llvm-svn: 167107	2012-10-31 05:21:10 +00:00
Meador Inge	b6984384bf	instcombine: Migrate strpbrk optimizations This patch migrates the strpbrk optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167105	2012-10-31 04:29:58 +00:00
Meador Inge	5f906a50d3	instcombine: Migrate strlen optimizations This patch migrates the strlen optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167103	2012-10-31 03:33:06 +00:00
Meador Inge	4d309f330c	instcombine: Migrate strncpy optimizations This patch migrates the strncpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 167102	2012-10-31 03:33:00 +00:00
Nadav Rotem	9ab0e93cc1	LoopVectorize: Do not vectorize loops with tiny constant trip counts. llvm-svn: 167101	2012-10-31 03:31:07 +00:00
Bill Schmidt	f4c899f8e7	This patch addresses an ABI compatibility issue with empty aggregate parameters. Examples of these are: struct { } a; union { } b[256]; int a[0]; An empty aggregate has an address, although dereferencing that address is pointless. When passed as a parameter, an empty aggregate does not consume a protocol register, nor does it consume a doubleword in the parameter save area. Passing an empty aggregate by reference passes an address just as for any other aggregate. Returning an empty aggregate uses GPR3 as a hidden address of the return value location, just as for any other aggregate. The patch modifies PPCTargetLowering::LowerFormalArguments_64SVR4 and PPCTargetLowering::LowerCall_64SVR4 to properly skip empty aggregate parameters passed by value. The handling of return values and by-reference parameters was already correct. Built on powerpc64-unknown-linux-gnu and tested with no new regressions. A test case is included to test proper handling of empty aggregate parameters on both sides of the function call protocol. llvm-svn: 167090	2012-10-31 01:15:05 +00:00
Nadav Rotem	240ead98fd	Add support for loops that don't start with Zero. This is important for loops in the LAPACK test-suite. These loops start at 1 because they are auto-converted from fortran. llvm-svn: 167084	2012-10-31 00:45:26 +00:00
Meador Inge	261da7dfde	instcombine: Migrate stpcpy optimizations This patch migrates the stpcpy optimizations from the simplify-libcalls pass into the instcombine library call simplifier. Note that the __stpcpy_chk simplifications were migrated in a previous commit. llvm-svn: 167083	2012-10-31 00:20:56 +00:00
Meador Inge	1a88082441	instcombine: Split out the __stpcpy_chk simplifications from StrCpyChkOpt r166198 migrated the strcpy optimization to instcombine. The strcpy simplifier that was migrated from Transforms/Scalar/SimplifyLibCalls.cpp was also doing some __strcpy_chk simplifications. Those fortified simplifications were migrated as well, but introduced a bug in the __stpcpy_chk simplifier in the process. This happened because the __strcpy_chk and __stpcpy_chk simplifiers were both mapped to StrCpyChkOpt which was updated with simplifications that worked for __strcpy_chk, but not __stpcpy_chk. This patch fixes the problem by adding proper test coverage and creating a new simplifier for __stpcpy_chk (instead of sharing one with __strcpy_chk). llvm-svn: 167082	2012-10-31 00:20:51 +00:00
Manman Ren	f26bd7d8f9	X86 SSE: update rsqrtss and rcpss to use two source operands and the first source operand is tied to the destination operand. This is to accurately model the corresponding instructions where the upper bits are unmodified. rdar://12558838 PR14221 llvm-svn: 167064	2012-10-30 23:53:59 +00:00
Manman Ren	584c3daf8d	X86 MMX: optimize transfer from mmx to i32 We used to generate a store (movq) + a load. Now we use movd. rdar://9946746 llvm-svn: 167056	2012-10-30 22:15:38 +00:00
Chandler Carruth	d3b4a83c9f	Fix PR14212: For some strange reason I treated vectors differently from integers in that the code to handle split alloca-wide integer loads or stores doesn't come first. It should, for the same reasons as with integers, and the PR attests to that. Also had to fix a busted assert that this test case also covers. llvm-svn: 167051	2012-10-30 20:52:40 +00:00
Akira Hatanaka	dbea525cfc	[mips] Allow tail-call optimization for vararg functions and functions which use the caller's stack. llvm-svn: 167048	2012-10-30 20:16:31 +00:00
Benjamin Kramer	78cdbf2f16	LoopIdiom: Fix a serious missed optimization: we only turned top-level loops into memmove. Thanks to Preston Briggs for catching this! llvm-svn: 167045	2012-10-30 19:49:39 +00:00
Hal Finkel	1c116e9ec0	BBVectorize: Fix a small bug introduced in r167042. We need to make sure that we take the correct load/store alignment when the inputs are flipped. llvm-svn: 167044	2012-10-30 19:47:37 +00:00
Nadav Rotem	69e6bca813	LoopVectorize: Add support for write-only loops when the write destination is a single pointer. Speedup SciMark by 1% llvm-svn: 167035	2012-10-30 18:36:45 +00:00
Adhemerval Zanella	74fd05ff3f	PowerPC: Expand FSQRT for vector types This patch expands FSQRT for floating point vector types when altivec is used. llvm-svn: 167034	2012-10-30 18:29:42 +00:00
Nadav Rotem	4fc2912062	LoopVectorize: Fix a bug in the initialization of reduction variables. AND needs to start at all-one while XOR, and OR need to start at zero. llvm-svn: 167032	2012-10-30 18:12:36 +00:00
Ulrich Weigand	418fafa0b8	Set %defaultjit to use MCJIT for PowerPC targets. Update Transforms/LICM/2003-12-11-SinkingToPHI.ll test to use %defaultjit as well. llvm-svn: 167031	2012-10-30 18:07:58 +00:00
Quentin Colombet	dde058d386	Change ForceSizeOpt attribute into MinSize attribute llvm-svn: 167020	2012-10-30 16:32:52 +00:00
Hans Wennborg	885eff267a	switch_to_lookup_table.ll: Remove some unnecessary lines, comments, function attributes, etc. llvm-svn: 167016	2012-10-30 15:11:52 +00:00
Adhemerval Zanella	ac3ba40bc2	PowerPC: More support for Altivec compare operations This patch adds more support for vector type comparisons using altivec. It adds correct support for v16i8, v8i16, v4i32, and v4f32 vector types for comparison operators ==, !=, >, >=, <, and <=. llvm-svn: 167015	2012-10-30 13:50:19 +00:00
Ulrich Weigand	2df331332d	Enable some additional constant folding for PPCDoubleDouble. This fixes Clang :: CodeGen/complex-builtints.c on PowerPC. llvm-svn: 167013	2012-10-30 12:33:18 +00:00
Hans Wennborg	40eb1b4055	Use TargetTransformInfo to control switch-to-lookup table transformation When the switch-to-lookup tables transform landed in SimplifyCFG, it was pointed out that this could be inappropriate for some targets. Since there was no way at the time for the pass to know anything about the target, an awkward reverse-transform was added in CodeGenPrepare that turned lookup tables back into switches for some targets. This patch uses the new TargetTransformInfo to determine if a switch should be transformed, and removes CodeGenPrepare::ConvertLoadToSwitch. llvm-svn: 167011	2012-10-30 11:23:25 +00:00
Hal Finkel	1e4b354323	Remove an invalid assert in TargetTransformImpl getCastInstrCost had an assert prohibiting scalar to vector casts. Such casts, however, are allowed. This should make the vectorizer buildbot happier. llvm-svn: 166998	2012-10-30 02:41:57 +00:00
Jim Grosbach	6585037b8c	ARM: Better disassembly for pc-relative LDR. When the operand is a plain immediate rather than a label, print it as [pc, #imm] like we do for the Thumb2 wide encoding variant. rdar://12154503 llvm-svn: 166991	2012-10-30 01:04:51 +00:00
Reed Kotler	de0ea1027e	Change mips16 delay slot jumps to non delay slot forms by default. We will make them delay slot forms if there is something that can be placed in the delay slot during a separate pass. Mips16 extended instructions cannot be placed in delay slots. llvm-svn: 166990	2012-10-30 00:54:49 +00:00
Jakub Staszak	f1cddf738b	Re-commit r166971. I reverted it too quickly, when buildbots didn't have a chance to test it with chapni's fix (-mattr=+avx). llvm-svn: 166985	2012-10-30 00:01:57 +00:00
Kevin Enderby	ecb9e2620c	Fix ARM's b.w instruction for thumb 2 and the encoding T4. The branch target is 24 bits not 20 and the decoding needed to correctly handle converting the J1 and J2 bits to their I1 and I2 values to reconstruct the displacement. llvm-svn: 166982	2012-10-29 23:27:20 +00:00
Jakub Staszak	ce95e4429f	Revert r166971. It causes buildbot failure. To be investigated. llvm-svn: 166979	2012-10-29 23:13:50 +00:00
NAKAMURA Takumi	d544c8f4ca	llvm/test/CodeGen/X86/vec_shuffle-30.ll: Try to unbreak builds - assuming +avx. llvm-svn: 166974	2012-10-29 22:45:18 +00:00
Jakub Staszak	ded6f21890	Allow to fold vector load if there is more than one bitcast, so in the case: %0 = load <8 x i16>* %dest %1 = shufflevector <8 x i16> %0, <8 x i16> %in, <8 x i32> < i32 0, i32 1, i32 2, i32 3, i32 13, i32 undef, i32 14, i32 14> store <8 x i16> %1, <8 x i16>* %dest We get: vmovlpd (%eax), %xmm0, %xmm0 instead of: vmovaps (%eax), %xmm1 vmovsd %xmm1, %xmm0, %xmm0 No extra test-case is added. I just fixed the existing one (also it uses FileCheck now). llvm-svn: 166971	2012-10-29 21:56:35 +00:00
Bill Schmidt	77a8fd274b	This patch solves a problem with passing varargs parameters under the PPC64 ELF ABI. A varargs parameter consisting of a single-precision floating-point value, or of a single-element aggregate containing a single-precision floating-point value, must be passed in the low-order (rightmost) four bytes of the doubleword stack slot reserved for that parameter. If there are GPR protocol registers remaining, the parameter must also be mirrored in the low-order four bytes of the reserved GPR. Prior to this patch, such parameters were being passed in the high-order four bytes of the stack slot and the mirrored GPR. The patch adds a new test case to verify the correct code generation. llvm-svn: 166968	2012-10-29 21:18:16 +00:00
Reed Kotler	3859f5469e	Implement patterns for extloadi8 and extloadi16 llvm-svn: 166960	2012-10-29 19:39:04 +00:00
Ulrich Weigand	445bd73056	In various places throughout the code generator, there were special checks to avoid performing compile-time arithmetic on PPCDoubleDouble. Now that APFloat supports arithmetic on PPCDoubleDouble, those checks are no longer needed, and we can treat the type like any other. llvm-svn: 166958	2012-10-29 18:35:49 +00:00
Chad Rosier	3b32dec25d	Remove redundant test case from r166949, per Eli's suggestion. llvm-svn: 166953	2012-10-29 18:18:26 +00:00
Chad Rosier	651ecf255c	[ms-inline asm] Add support for the [] operator. Essentially, [expr1][expr2] is equivalent to [expr1 + expr2]. See test cases for more examples. rdar://12470392 llvm-svn: 166949	2012-10-29 18:01:54 +00:00
Michael Liao	d26c27ad35	Fix PR14204 - Add missing pattern on X86ISD::VZEXT from VR256 to VR256 when AVX2 is enabled. llvm-svn: 166947	2012-10-29 17:57:12 +00:00
Jakob Stoklund Olesen	05cec5db28	Completely disallow partial copies in adjustCopiesBackFrom(). Partial copies can show up even when CoalescerPair.isPartial() returns false. For example: %vreg24:dsub_0<def> = COPY %vreg31:dsub_0; QPR:%vreg24,%vreg31 Such a partial-partial copy is not good enough for the transformation adjustCopiesBackFrom() needs to do. llvm-svn: 166944	2012-10-29 17:51:52 +00:00
Ulrich Weigand	2daab9e4b4	Allow i32/i64 for 'f' constraint on PowerPC. This fixes PR12757. llvm-svn: 166943	2012-10-29 17:49:34 +00:00
Reed Kotler	56b8c348f2	Expand all atomic ops for mips16. llvm-svn: 166935	2012-10-29 16:16:54 +00:00
Preston Gurd	1683865eea	This patch addresses a problem with the Post RA scheduler generating an incorrect instruction sequence due to it not being aware that an inline assembly instruction may reference memory. This patch fixes the problem by causing the scheduler to always assume that any inline assembly code instruction could access memory. This is necessary because the internal representation of the inline instruction does not include any information about memory accesses. This should fix PR13504. llvm-svn: 166929	2012-10-29 15:01:23 +00:00
Bill Schmidt	2682b73f6d	This patch adds alignment information for long double to the 64-bit PowerPC ELF subtarget. The existing logic is used as a fallback to avoid any changes to the Darwin ABI. PPC64 ELF now has two possible data layout strings: one for FreeBSD, which requires 8-byte alignment, and a default string that requires 16-byte alignment. I've added a test for PPC64 Linux to verify the 16-byte alignment. If somebody wants to add a separate test for FreeBSD, that would be great. Note that there is a companion patch to update the alignment information in Clang, which I am committing now as well. llvm-svn: 166928	2012-10-29 14:59:36 +00:00
Tim Northover	a1e6e26f6e	Align the data section correctly when loading an ELF file. Patch by Amara Emerson. llvm-svn: 166920	2012-10-29 10:47:07 +00:00
Tim Northover	fdb1d1af73	Make use of common-symbol alignment info in ELF loader. Patch by Amara Emerson. llvm-svn: 166919	2012-10-29 10:47:04 +00:00
Rafael Espindola	f030658478	Add -alias and -ralias options to match what we have for functions and globals. llvm-svn: 166909	2012-10-29 02:23:07 +00:00
Rafael Espindola	c460831626	llvm-extract changes linkages so that functions on both sides of the split module can see each other. If it is keeping a symbol that already has a non local linkage, it doesn't need to change it. llvm-svn: 166908	2012-10-29 01:59:03 +00:00
Rafael Espindola	218ef13219	llvm-extract was unable to handle aliases. It would leave a copy on the output of both llvm-extract foo.ll -func=bar and llvm-extract foo.ll -func=bar -delete so the two new files could not be linked together anymore. With this change aliases are handled almost like functions and global variables. Almost because with aliases we cannot just clear the initializer/body, we have to create a new declaration and replace the alias with it. The net result is that now the output of the above commands can be linked even if foo.ll has aliases. llvm-svn: 166907	2012-10-29 00:27:55 +00:00
Reed Kotler	1b4c985fc8	Implement brind operator for mips16. llvm-svn: 166903	2012-10-28 23:08:07 +00:00
Reed Kotler	16eaf8644a	This patch is for the implementation of mips16 complex pattern addr16. Previously mips16 was sharing the pattern addr which is used for mips32 and mips64. This had a number of problems: 1) Storing and loading byte and halfword quantities for mips16 has particular problems due to the primarily non mips16 nature of SP. When we must load/store byte/halfword stack objects in a function, we must create a mips16 alias register for SP. This functionality is tested in stchar.ll. 2) We need to have an FP register under certain conditions (such as dynamically sized alloca). We use mips16 register S0 for this purpose. In this case, we also use this register when accessing frame objects so this issue also affects the complex pattern addr16. This functionality is tested in alloca16.ll. The Mips16InstrInfo.td has been updated to use addr16 instead of addr. The complex pattern C++ function for addr has been copied to addr16 and updated to reflect the above issues. llvm-svn: 166897	2012-10-28 06:02:37 +00:00
Jakob Stoklund Olesen	2efe8df169	Never attempt to join an early-clobber def with a regular kill. This fixes PR14194. llvm-svn: 166880	2012-10-27 17:41:27 +00:00
Benjamin Kramer	00df4c1b61	LoopIdiom: Add checks to avoid turning memmove into an infinite loop. I don't think this is possible with the current implementation but that may change eventually. llvm-svn: 166877	2012-10-27 15:18:28 +00:00
Benjamin Kramer	8ba71ab2ab	LoopIdiom: Recognize memmove loops. This turns loops like for (unsigned i = 0; i != n; ++i) p[i] = p[i+1]; into memmove, which has a highly optimized implementation in most libcs. This was really easy with the new DependenceAnalysis :) llvm-svn: 166875	2012-10-27 14:25:51 +00:00
Benjamin Kramer	ba967fc2f3	LoopIdiom: Replace custom dependence analysis with DependenceAnalysis. Requires a lot less code and complexity on loop-idiom's side and the more precise analysis can catch more cases, like the one I included as a test case. This also fixes the edge-case miscompilation from PR9481. Compile time performance seems to be slightly worse, but this is mostly due to an extra LCSSA run scheduled by the PassManager and should be fixed there. llvm-svn: 166874	2012-10-27 14:25:44 +00:00
Nadav Rotem	04f3086065	1. Fix a bug in getTypeConversion. When a simple type is split, we need to return the type of the split result. 2. Change the maximum vectorization width from 4 to 8. 3. A test for both. llvm-svn: 166864	2012-10-27 04:11:32 +00:00
Quentin Colombet	bcd2bc3437	[code size][ARM] Emit regular call instructions instead of the move, branch sequence llvm-svn: 166854	2012-10-27 01:10:17 +00:00
Reed Kotler	cc41464454	Implement MipsHi for mips16 llvm-svn: 166852	2012-10-27 00:57:14 +00:00
Akira Hatanaka	a2ae72bd2f	[mips] Do not tail-call optimize vararg functions or functions with byval arguments. This is rather conservative and should be fixed later to be more aggressive. llvm-svn: 166851	2012-10-27 00:56:56 +00:00
Akira Hatanaka	4d26f60ca0	[mips] Make sure FuncArg doesn't advance when OrigArgIndex is the same as in the previous iteration. llvm-svn: 166850	2012-10-27 00:44:39 +00:00
Nadav Rotem	133e437c48	Refactor the VectorTargetTransformInfo interface. Add getCostXXX calls for different families of opcodes, such as casts, arithmetic, cmp, etc. Port the LoopVectorizer to the new API. The LoopVectorizer now finds instructions which will remain uniform after vectorization. It uses this information when calculating the cost of these instructions. llvm-svn: 166836	2012-10-26 23:49:28 +00:00
Jakob Stoklund Olesen	4b4db880a3	Revert r163298 "Optimize codegen for VSETLNi{8,16,32} operating on Q registers." Keep the integer_insertelement test case, the new coalescer can handle this kind of lane insertion without help from pseudo-instructions. llvm-svn: 166835	2012-10-26 23:39:46 +00:00
Reed Kotler	8721a311b3	implement mips16 tls global addr llvm-svn: 166827	2012-10-26 22:57:32 +00:00
Jakob Stoklund Olesen	1a8c00078d	Add GPRPair Register class to ARM. Some instructions in ARM require 2 even-odd paired GPRs. This patch adds support for such register class. Patch by Weiming Zhao! llvm-svn: 166816	2012-10-26 21:29:15 +00:00
Benjamin Kramer	0f18b5e49c	Remove LoopDependenceAnalysis. It was unmaintained and not much more than a stub. The new DependenceAnalysis pass is both more general and complete. llvm-svn: 166810	2012-10-26 20:25:01 +00:00
Hal Finkel	9cb3d9d08c	Move target-specific BBVectorize tests into a separate directory. llvm-svn: 166802	2012-10-26 19:38:09 +00:00
Nadav Rotem	137991e110	Move the target-specific tests, which require specific backends, to dirs that only run if the target is present. llvm-svn: 166796	2012-10-26 18:52:01 +00:00
Rafael Espindola	4b51029c9e	Change the internalize pass to internalize all symbols when given an empty list of externals. This makes sense since a shared library with no symbols can still be useful if it has static constructors. llvm-svn: 166795	2012-10-26 18:47:48 +00:00
Benjamin Kramer	f1e6d84f01	Fix SCEV cache invalidation in LCSSA and LoopSimplify. The LoopSimplify bug is pretty harmless because the loop goes from unanalyzable to analyzable but the LCSSA bug is very nasty. It only comes into play with a specific order of the LoopPassManager worklist and can cause actual miscompilations, when a SCEV refers to a value that has been replaced with a PHI node. SCEVExpander may then insert code into the wrong place, either violating domination or randomly miscompiling stuff. Comes with an extensive test case reduced from the test-suite with bugpoint+SCEVValidator. llvm-svn: 166787	2012-10-26 17:31:43 +00:00
Nadav Rotem	a2f4e0d8a5	Fix a crash in SimplifyDemandedBits of vectors of pointers. PR14183. llvm-svn: 166785	2012-10-26 17:17:05 +00:00
Reed Kotler	6b0e65fce7	Implement carry for subtract/add for mips16 llvm-svn: 166755	2012-10-26 04:46:26 +00:00
Reed Kotler	fd22c8bfc1	implement large (>16 bit) constant loading. llvm-svn: 166749	2012-10-26 03:09:34 +00:00
Rafael Espindola	32ebb868a5	Fix unexpected passes. These tests do work with LTO on linux. I tested both a cmake and an autoconf build. llvm-svn: 166748	2012-10-26 02:19:02 +00:00
Reed Kotler	72bddf2c9a	fix test setgek.ll so that it will not give false "make check" failure in some cases llvm-svn: 166747	2012-10-26 01:29:42 +00:00
Rafael Espindola	8259775e00	Port testcase to FileCheck. llvm-svn: 166742	2012-10-26 00:14:11 +00:00
Hal Finkel	a47e6ef6e6	Disable generation of pointer vectors by BBVectorize. Once vector-of-pointer support works, then this can be reverted. llvm-svn: 166741	2012-10-26 00:05:26 +00:00
Nadav Rotem	dec4761379	Revert 166726 because it may have broken a number of SPEC tests. PR14183. llvm-svn: 166739	2012-10-25 23:51:48 +00:00
Nadav Rotem	b33b47c635	Fix a crash in ValueTracking. Add support for vectors of pointers. llvm-svn: 166726	2012-10-25 21:52:52 +00:00
Nadav Rotem	0ccb9515e1	Fix the cost-model test. llvm-svn: 166722	2012-10-25 21:42:50 +00:00
Reed Kotler	6ac4e4d880	implement mips16 patterns for select nodes llvm-svn: 166721	2012-10-25 21:33:30 +00:00
Hal Finkel	5a11a9ff23	Add CPU model to BBVectorize cost-model tests. llvm-svn: 166720	2012-10-25 21:31:51 +00:00
Nadav Rotem	58110e0478	Add the cpu model to the test. llvm-svn: 166718	2012-10-25 21:18:42 +00:00
Hal Finkel	e2184ac235	Begin incorporating target information into BBVectorize. This is the first of several steps to incorporate information from the new TargetTransformInfo infrastructure into BBVectorize. Two things are done here: 1. Target information is used to determine if it is profitable to fuse two instructions. This means that the cost of the vector operation must not be more expensive than the cost of the two original operations. Pairs that are not profitable are no longer considered (because current cost information is incomplete, for intrinsics for example, equal-cost pairs are still considered). 2. The 'cost savings' computed for the profitability check are also used to rank the DAGs that represent the potential vectorization plans. Specifically, for nodes of non-trivial depth, the cost savings is used as the node weight. The next step will be to incorporate the shuffle costs into the DAG weighting; this will give the edges of the DAG weights as well. Once that is done, when target information is available, we should be able to dispense with the depth heuristic. llvm-svn: 166716	2012-10-25 21:12:23 +00:00
Jakob Stoklund Olesen	d69b3afa22	Also optimize large switch statements. The isValueEqualityComparison() guard at the top of SimplifySwitch() only applies to some of the possible transformations. The newer transformations work just fine on large switches, and the check on predecessor count is nonsensical. llvm-svn: 166710	2012-10-25 18:51:15 +00:00
Michael Liao	e5329935bd	Add test for ATOM ISA SSSE3 - Remove SSE4.1 feature in other ATOM-based test cases llvm-svn: 166699	2012-10-25 17:50:05 +00:00
Bill Schmidt	71c462aff2	This patch addresses a PPC64 ELF issue with passing parameters consisting of structs having size 3, 5, 6, or 7. Such a struct must be passed and received as right-justified within its register or memory slot. The problem is only present for structs that are passed in registers. Previously, as part of a patch handling all structs of size less than 8, I added logic to rotate the incoming register so that the struct was left-justified prior to storing the whole register. This was incorrect because the address of the parameter had already been adjusted earlier to point to the right-adjusted value in the storage slot. Essentially I had accidentally accounted for the right-adjustment twice. In this patch, I removed the incorrect logic and reorganized the code to make the flow clearer. The removal of the rotates changes the expected code generation, so test case structsinregs.ll has been modified to reflect this. I also added a new test case, jaggedstructs.ll, to demonstrate that structs of these sizes can now be properly received and passed. I've built and tested the code on powerpc64-unknown-linux-gnu with no new regressions. I also ran the GCC compatibility test suite and verified that earlier problems with these structs are now resolved, with no new regressions. llvm-svn: 166680	2012-10-25 13:38:09 +00:00
Adhemerval Zanella	b08709a59f	Initial TOC support for PowerPC64 object creation This patch adds initial PPC64 TOC MC object creation using the small mcmodel (a single 64K TOC) adding some TOC relocations (R_PPC64_TOC, R_PPC64_TOC16, and R_PPC64_TOC16DS). The addition of 'undefinedExplicitRelSym' hook on 'MCELFObjectTargetWriter' is meant to avoid the creation of an unreferenced ".TOC." symbol (used in the .opd creation) as well as to set the R_PPC64_TOC relocation target as the temporary ".TOC." symbol. On PPC64 ABI, the R_PPC64_TOC relocation should not point to any symbol. llvm-svn: 166677	2012-10-25 12:27:42 +00:00
Elena Demikhovsky	d93b6e7cd4	The test avx-intel-ocl.ll failed. I can't reproduce on any of my machines. I added -mcpu flag, maybe it will fix the problem llvm-svn: 166669	2012-10-25 08:38:42 +00:00
Chandler Carruth	452bb4578f	Teach SROA how to split whole-alloca integer loads and stores into smaller integer loads and stores. The high-level motivation is that the frontend sometimes generates a single whole-alloca integer load or store during ABI lowering of splittable allocas. We need to be able to break this apart in order to see the underlying elements and properly promote them to SSA values. The hope is that this fixes some performance regressions on x86-32 with the new SROA pass. Unfortunately, this causes quite a bit of churn in the test cases, and bloats some IR that comes out. When we see an alloca that consists solely of bits and bytes being extracted and re-inserted, we now do some splitting first, before building widened integer "bucket of bits" representations. These are always well folded by instcombine however, so this shouldn't actually result in missed opportunities. If this splitting of all-integer allocas does cause problems (perhaps due to smaller SSA values going into the RA), we could potentially go to some extreme measures to only do this integer splitting trick when there are non-integer component accesses of an alloca, but discovering this is quite expensive: it adds yet another complete walk of the recursive use tree of the alloca. Either way, I will be watching build bots and LNT bots to see what fallout there is here. If anyone gets x86-32 numbers before & after this change, I would be very interested. llvm-svn: 166662	2012-10-25 04:37:07 +00:00
Nadav Rotem	5635a9350f	Add support for additional reduction variables: AND, OR, XOR. Patch by Paul Redmond <paul.redmond@intel.com>. llvm-svn: 166649	2012-10-25 00:08:41 +00:00
Nadav Rotem	9d7ba0ef55	Implement a basic cost model for vector and scalar instructions. llvm-svn: 166642	2012-10-24 23:47:38 +00:00
Chad Rosier	3935e5ec30	Tell llvm-mc we're using intel syntax, so we don't have to use directives. llvm-svn: 166640	2012-10-24 23:34:38 +00:00
Chad Rosier	492e58a0f7	[ms-inline asm] Add back-end test case for r166632. Make sure we emit the correct .s output as well as get the correct encoding by the integrated assembler. llvm-svn: 166638	2012-10-24 23:10:28 +00:00
Hal Finkel	39d442863c	Update GVN to support vectors of pointers. GVN will now generate ptrtoint instructions for vectors of pointers. Fixes PR14166. llvm-svn: 166624	2012-10-24 21:22:30 +00:00
Nadav Rotem	05d9e80245	LoopVectorizer: Add a basic cost model which uses the VTTI interface. llvm-svn: 166620	2012-10-24 20:36:32 +00:00
Evan Cheng	f97472cdf6	Fix a miscompilation caused by a typo. When turning an adde with negative value into a sbc with a positive number, the immediate should be complemented, not negated. Also added a missing pattern for ARM codegen. rdar://12559385 llvm-svn: 166613	2012-10-24 19:53:01 +00:00
Hal Finkel	2392da9d83	getSmallConstantTripMultiple should never return zero. When the trip count is -1, getSmallConstantTripMultiple could return zero, and this would cause runtime loop unrolling to assert. Instead of returning zero, one is now returned (consistent with the existing overflow cases). Fixes PR14167. llvm-svn: 166612	2012-10-24 19:46:44 +00:00
Micah Villmow	521311700f	Add in support for getIntPtrType to get the pointer type based on the address space. This checkin also adds in some tests that utilize these paths and updates some of the clients. llvm-svn: 166578	2012-10-24 15:52:52 +00:00
Elena Demikhovsky	b711ef1960	Special calling conventions for Intel OpenCL built-in library. llvm-svn: 166566	2012-10-24 14:46:16 +00:00
Duncan Sands	1f4e159b01	Add a testcase that would have noticed the typo fixed in commit 166475. llvm-svn: 166547	2012-10-24 07:17:20 +00:00
Michael Liao	18e40965aa	Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x)) - If more than one element is defined and the target supports the vectorized conversion, use the vectorized one instead to reduce the strength on conversion operation. llvm-svn: 166546	2012-10-24 04:14:18 +00:00
Michael Liao	70bb8004bd	Add custom conversion from v2u32 to v2f32 in 32-bit mode - As there are no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to v2f32 is added to improve the efficiency of the code generated. llvm-svn: 166545	2012-10-24 04:09:32 +00:00
Akira Hatanaka	8c77adc9ce	[mips] Make sure sret argument is returned in register V0. llvm-svn: 166539	2012-10-24 02:10:54 +00:00
Rafael Espindola	8ebde25931	Change x86_fastcallcc to require inreg markers. This allows it to know the difference between "int x" (which should go in registers) and "struct y {int x;}" (which should not). Clang will be updated in the next patches. llvm-svn: 166536	2012-10-24 01:58:48 +00:00
Michael Liao	7ca099f727	Fix PR14161 - Check index being extracted to be constant 0 before simplifying. Otherwise, retain the original sequence. llvm-svn: 166504	2012-10-23 21:40:15 +00:00
Nadav Rotem	3deae09579	Use the AliasAnalysis isIdentifiedObj because it also understands mallocs and c++ news. PR14158. llvm-svn: 166491	2012-10-23 18:44:18 +00:00
Bill Wendling	cc498d64b5	Ignore unreachable blocks when doing memory dependence analysis on non-local loads. It's not really profitable and may result in GVN going into an infinite loop when it hits constructs like this: %x = gep %some.type %x, ... Found via an LTO build of LLVM. llvm-svn: 166490	2012-10-23 18:37:11 +00:00
Michael Liao	23c890e0a8	Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1 llvm-svn: 166486	2012-10-23 17:34:00 +00:00
Duncan Sands	6ce2ce7ed1	Transform code like this %V = mul i64 %N, 4 %t = getelementptr i8* bitcast (i32* %arr to i8*), i32 %V into %t1 = getelementptr i32* %arr, i32 %N %t = bitcast i32* %t1 to i8* incorporating the multiplication into the getelementptr. This happens all the time in dragonegg, for example for int foo(int *A, int N) { return A[N]; } because gcc turns this into byte pointer arithmetic before it hits the plugin: D.1590_2 = (long unsigned int) N_1(D); D.1591_3 = D.1590_2 * 4; D.1592_5 = A_4(D) + D.1591_3; D.1589_6 = *D.1592_5; return D.1589_6; The D.1592_5 line is a POINTER_PLUS_EXPR, which is turned into a getelementptr on a bitcast of A_4 to i8*, so this becomes exactly the kind of IR that the transform fires on. An analogous transform (with no testcases!) already existed for bitcasts of arrays, so I rewrote it to share code with this one. llvm-svn: 166474	2012-10-23 08:28:26 +00:00
Reed Kotler	730040c219	implement setXX patterns llvm-svn: 166459	2012-10-23 01:35:48 +00:00
Bill Wendling	e97df2d337	When a block ends in an indirect branch, add its successors to the machine basic block. The CFG of the machine function needs to know that the targets of the indirect branch are successors to the indirect branch. <rdar://problem/12529625> llvm-svn: 166448	2012-10-22 23:30:04 +00:00
Kevin Enderby	0f6b703b72	Add support for annotated disassembly output for X86 and arm. Per the October 12, 2012 Proposal for annotated disassembly output sent out by Jim Grosbach this set of changes implements this for X86 and arm. The llvm-mc tool now has a -mdis option to produce the marked up disassembly and a couple of small example test cases have been added. rdar://11764962 llvm-svn: 166445	2012-10-22 22:31:46 +00:00
Nadav Rotem	302d4b678a	Don't crash if the load/store pointer is not a GEP. Fix by Shivarama Rao <Shivarama.Rao@amd.com> llvm-svn: 166427	2012-10-22 18:27:56 +00:00
Nadav Rotem	58a0c52168	Add a testcase for the previous commit. llvm-svn: 166425	2012-10-22 18:16:55 +00:00
Argyrios Kyrtzidis	61e2024bf0	Revert r166407 because it caused analyzer tests to crash and broke self-host bots. llvm-svn: 166424	2012-10-22 18:16:14 +00:00
Hal Finkel	7a55058abc	BBVectorize should ignore unreachable blocks. Unreachable blocks can have invalid instructions. For example, jump threading can produce self-referential instructions in unreachable blocks. Also, we should not be spending time optimizing unreachable code. Fixes PR14133. llvm-svn: 166423	2012-10-22 18:00:55 +00:00
Nadav Rotem	6b56385c1a	Vectorizer: optimize the generation of selects. If the condition is uniform, generate a scalar-cond select (i1 as selector). llvm-svn: 166409	2012-10-22 04:38:00 +00:00
Nick Lewycky	44e5136371	Reapply r166405, teaching tailcallelim to be smarter about nocapture, with a very small but very important bugfix: bool shouldExplore(Use *U) { Value *V = U->get(); if (isa<CallInst>(V) || isa<InvokeInst>(V)) [...] should have read: bool shouldExplore(Use *U) { Value *V = U->getUser(); if (isa<CallInst>(V) || isa<InvokeInst>(V)) Fixes PR14143! llvm-svn: 166407	2012-10-22 03:03:52 +00:00
NAKAMURA Takumi	32507dd070	Revert r166405, "Teach TailRecursionElimination to consider 'nocapture' when deciding whether" It broke selfhosting stage2 in several builders. llvm-svn: 166406	2012-10-22 00:48:51 +00:00
Nick Lewycky	1a50a0e414	Teach TailRecursionElimination to consider 'nocapture' when deciding whether calls can be marked tail. llvm-svn: 166405	2012-10-21 23:51:22 +00:00
Hal Finkel	502fe3cc4a	DataLayout should use itself when calculating the size of a vector. This is important for vectors of pointers because only DataLayout, not the underlying vector type, knows how to calculate the size of the pointers in the vector. Fixes PR14138. llvm-svn: 166401	2012-10-21 20:38:03 +00:00
Benjamin Kramer	d97f445bdf	Revert r166390 "LoopIdiom: Replace custom dependence analysis with LoopDependenceAnalysis." It passes all tests, produces better results than the old code but uses the wrong pass, LoopDependenceAnalysis, which is old and unmaintained. "Why is it still in tree?", you might ask. The answer is obviously: "To confuse developers." Just swapping in the new dependency pass sends the pass manager into an infinite loop; I'll try to figure out why tomorrow. llvm-svn: 166399	2012-10-21 19:31:16 +00:00
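
Note on 8ba71ab2ab (LoopIdiom: Recognize memmove loops): the following is a minimal C++ sketch, not code from the LLVM tree, of the equivalence that commit message describes. The helper names shift_left_loop and shift_left_memmove are hypothetical; only the loop body is taken from the commit message. Each iteration reads p[i+1] before any later iteration overwrites it, so copying n elements from p + 1 to p with memmove, which permits overlapping ranges, should give the same result.

```cpp
#include <cstring>

// Loop form quoted in the commit message: each element takes the value of
// its right-hand neighbour. The buffer must hold at least n + 1 elements,
// because the last iteration reads p[n].
void shift_left_loop(int *p, unsigned n) {
  for (unsigned i = 0; i != n; ++i)
    p[i] = p[i + 1];
}

// Hand-written equivalent (hypothetical helper). Source and destination
// overlap, so memmove rather than memcpy is the appropriate libc call.
void shift_left_memmove(int *p, unsigned n) {
  std::memmove(p, p + 1, n * sizeof(int));
}
```

Calling either helper on a five-element array with n = 4 shifts the last four values one slot to the left and leaves the final element unchanged.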

17859 Commits