llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 14:02:52 +02:00

Author	SHA1	Message	Date
Wesley Peck	b168ddedaa	Fix a 16-bit immediate value detection bug in the MBlaze delay slot filler. Address more hazards in the MBlaze delay slot filler. patch contributed by Jack Whitham! llvm-svn: 121037	2010-12-06 21:11:01 +00:00
Rafael Espindola	fd0cc5d13f	Another use of getSymbolOffset. llvm-svn: 121034	2010-12-06 19:55:05 +00:00
Rafael Espindola	65c25aef87	Remove the instruction fragment to data fragment lowering since it was causing freed data to be read. I will open a bug to track it being reenabled. llvm-svn: 121028	2010-12-06 19:08:48 +00:00
Owen Anderson	8e9cb84ea2	Revert r121021, which broke the buildbots. llvm-svn: 121026	2010-12-06 18:57:40 +00:00
Jim Grosbach	f2e0e808ba	Trailing whitespace. llvm-svn: 121024	2010-12-06 18:47:44 +00:00
Owen Anderson	0c51a02230	Improve handling of Thumb2 PC-relative loads by converting LDRpci (and friends) to Pseudos. llvm-svn: 121021	2010-12-06 18:35:51 +00:00
Jim Grosbach	6c27b4f3cf	Encode the register operand of ARM CondCode operands correctly. ARM::CPSR if the instruction is predicated, reg0 otherwise. llvm-svn: 121020	2010-12-06 18:30:57 +00:00
Jim Grosbach	c79c6290ee	The ARM AsmMatcher needs to know that the CCOut operand is a register value, not an immediate. It stores either ARM::CPSR or reg0. llvm-svn: 121018	2010-12-06 18:21:12 +00:00
Devang Patel	a4d6774cf8	Do not rely on luck by using the given name to create a temporary file. In parallel builds it may not work. This time for the .s file. llvm-svn: 121016	2010-12-06 18:04:39 +00:00
Rafael Espindola	3e954d16f4	Second try at making direct object emission produce the same results as llc + llvm-mc. This time ELF is not changed and I tested that llvm-gcc bootstraps on darwin10 using darwin9's assembler and linker. llvm-svn: 121006	2010-12-06 17:27:56 +00:00
Rafael Espindola	4ec917db9b	Revert previous two patches while I try to find out how to make both linux and darwin assemblers happy :-( llvm-svn: 121004	2010-12-06 15:35:15 +00:00
Rafael Espindola	ad6219b193	Update test for the extra =. llvm-svn: 121001	2010-12-06 15:05:36 +00:00
Rafael Espindola	3dc2b4cba7	Add an EmitAbsValue helper method and use it in cases where we want to be sure that no relocations are used (on MachO). Fixes llc producing different output from llc + llvm-mc. llvm-svn: 121000	2010-12-06 14:53:14 +00:00
Frits van Bommel	f7368778bc	Fix clang warning: "extra ';' inside a class [-pedantic]". llvm-svn: 120998	2010-12-06 10:48:11 +00:00
Chris Lattner	48a7310e08	Fix PR8735, a really terrible problem in the inliner's "alloca merging" optimization. Consider: static void foo() { A = alloca ... } static void bar() { B = alloca ... call foo(); } void main() { bar() } The inliner proceeds bottom up, but let's pretend it decides not to inline foo into bar. When it gets to main, it inlines bar into main(), and says "hey, I just inlined an alloca "B" into main, let's remember that." Then it keeps going and finds that it now contains a call to foo. It decides to inline foo into main, and says "hey, foo has an alloca A, and I have an alloca B from another inlined call site, let's reuse it". The problem with this, of course, is that the lifetimes of A and B are nested, not disjoint. Unfortunately I can't create a reasonable testcase for this: the one in the PR is both huge and extremely sensitive, because minor tweaks end up causing foo to get inlined into bar too early. We already have tests for the basic alloca merging optimization and this does not break them. llvm-svn: 120995	2010-12-06 07:52:42 +00:00
Chris Lattner	21587c9f65	improve comment llvm-svn: 120994	2010-12-06 07:43:04 +00:00
Chris Lattner	71a4c43942	improve -debug output and comments a little. llvm-svn: 120993	2010-12-06 07:38:40 +00:00
Michael J. Spencer	b31b7d5b4e	Support/Windows: Make MinGW happy. llvm-svn: 120991	2010-12-06 06:02:07 +00:00
Michael J. Spencer	244b426701	Support/FileSystem: Add directory_iterator implementation. llvm-svn: 120989	2010-12-06 04:28:42 +00:00
Michael J. Spencer	36a2df800d	Support/PathV2: Fix append to not add a slash to empty or root paths. llvm-svn: 120988	2010-12-06 04:28:23 +00:00
Michael J. Spencer	61043e9f3a	Support/Windows: Add ScopedHandle and move some clients over to it. llvm-svn: 120987	2010-12-06 04:28:13 +00:00
Michael J. Spencer	68b70b5024	KillTheDoctor: Cleanup error_code usage. llvm-svn: 120986	2010-12-06 04:28:01 +00:00
Michael J. Spencer	51890f109f	KillTheDoctor: Fix spelling. llvm-svn: 120985	2010-12-06 04:27:52 +00:00
Michael J. Spencer	e5298b0f07	Support/ADT: Move c_str() from SmallString to SmallVectorImpl. The Windows PathV2 implementation needs it for wchar_t and SmallVectorImpl in general. llvm-svn: 120984	2010-12-06 04:27:42 +00:00
Che-Liang Chiou	cd2878d421	ptx: add shift instructions llvm-svn: 120982	2010-12-06 04:00:03 +00:00
Rafael Espindola	0ba01a5b5c	Remove the getAddress getter, initialize Ordinal in the constructor and use that on the ELF writer to detect a section we created. llvm-svn: 120981	2010-12-06 03:48:09 +00:00
Rafael Espindola	bf001eed4c	Simplify a bit. llvm-svn: 120980	2010-12-06 03:36:43 +00:00
Rafael Espindola	d361a448af	Use getSymbolOffset on the COFF writer. llvm-svn: 120979	2010-12-06 03:24:04 +00:00
Rafael Espindola	f56c11276e	Don't use PadSectionToAlignment on windows. llvm-svn: 120978	2010-12-06 03:03:44 +00:00
Rafael Espindola	1b2090ef24	Add a getSymbolOffset method and use it in the ELF writer. llvm-svn: 120977	2010-12-06 02:57:26 +00:00
Chris Lattner	db6c348f31	Fix PR8728, a miscompilation I recently introduced. When optimizing memcpy's like: memcpy(A, B) memcpy(A, C) we cannot delete the first memcpy as dead if A and C might be aliases. If so, we actually get: memcpy(A, B) memcpy(A, A) which is not correct to transform into: memcpy(A, A) This patch was heavily influenced by Jakub Staszak's patch in PR8728, thanks Jakub! llvm-svn: 120974	2010-12-06 01:48:06 +00:00
Chris Lattner	83445a6415	add a helper method. llvm-svn: 120973	2010-12-06 01:01:28 +00:00
Evan Cheng	4d9d54e44e	Eliminate unneeded #include's. llvm-svn: 120971	2010-12-05 23:41:43 +00:00
NAKAMURA Takumi	594d4094ca	ARM/CMakeLists.txt: Add missing MLxExpansionPass.cpp since r120960. llvm-svn: 120966	2010-12-05 23:08:57 +00:00
Evan Cheng	12561e250d	Code clean up. llvm-svn: 120965	2010-12-05 23:03:45 +00:00
Evan Cheng	854ec53564	Remove an unused variable. llvm-svn: 120964	2010-12-05 23:03:35 +00:00
Cameron Zwarich	f56ba80bb2	Some cleanup before I start committing some incremental progress on StrongPHIElimination. llvm-svn: 120961	2010-12-05 22:34:08 +00:00
Evan Cheng	fc78767730	Making use of VFP / NEON floating point multiply-accumulate / subtraction is difficult on current ARM implementations for a few reasons. 1. Even though a single vmla has latency that is one cycle shorter than a pair of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause additional pipeline stall. So it's frequently better to single codegen vmul + vadd. 2. A vmla followed by a vmul, vmadd, or vsub causes the second fp instruction to stall for 4 cycles. We need to schedule them apart. 3. A vmla followed by a vmla is a special case. Obviously issuing back to back RAW vmla + vmla is very bad. But this isn't ideal either: vmul vadd vmla Instead, we want to expand the second vmla: vmla vmul vadd Even with the 4 cycle vmul stall, the second sequence is still 2 cycles faster. Up to now, isel simply avoids codegen'ing fp vmla / vmls. This works well enough but it isn't the optimal solution. This patch attempts to make it possible to use vmla / vmls in cases where it is profitable. A. Add missing isel predicates which cause vmla to be codegen'ed. B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to compute a fmul and a fmla. C. Add additional isel checks for vmla, avoid cases where vmla is feeding into fp instructions (except for the #3 exceptional case). D. Add ARM hazard recognizer to model the vmla / vmls hazards. E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the vmla / vmls will trigger one of the special hazards. Work in progress, only A+B are enabled. llvm-svn: 120960	2010-12-05 22:04:16 +00:00
Cameron Zwarich	f64c26bb9e	Remove the PHIElimination.h header, as it is no longer needed. llvm-svn: 120959	2010-12-05 21:39:42 +00:00
Frits van Bommel	96efe38470	Clarify some of the differences between indexing with getelementptr and indexing with insertvalue/extractvalue. llvm-svn: 120957	2010-12-05 20:54:38 +00:00
Frits van Bommel	e390b379ae	Fix PR 4170 by having ExtractValueInst::getIndexedType() reject out-of-bounds indexing. Also add asserts that the indices are valid in InsertValueInst::init(). ExtractValueInst already asserts when constructed with invalid indices. llvm-svn: 120956	2010-12-05 20:50:26 +00:00
Cameron Zwarich	fbe9e91d97	I forgot to actually remove the FindCopyInsertPoint() declaration from PHIElimination.h. llvm-svn: 120953	2010-12-05 19:58:57 +00:00
Cameron Zwarich	cb613dcf69	Remove the SplitCriticalEdge() method declaration from PHIElimination.h. At one time, this method existed, but now PHIElimination uses the method of the same name on MachineBasicBlock. llvm-svn: 120952	2010-12-05 19:54:23 +00:00
Cameron Zwarich	c680f44c1b	Move the FindCopyInsertPoint method of PHIElimination to a new standalone function so that it can be shared with StrongPHIElimination. llvm-svn: 120951	2010-12-05 19:51:05 +00:00
Frits van Bommel	b95594885e	Refactor jump threading. Should have no functional change other than the order of two transformations that are mutually-exclusive and the exact formatting of debug output. Internally, it now stores the ConstantInts as Constants, and actual undef values instead of nulls. llvm-svn: 120946	2010-12-05 19:06:41 +00:00
Frits van Bommel	4f39797ac2	Remove trailing whitespace. llvm-svn: 120945	2010-12-05 19:02:47 +00:00
Frits van Bommel	31cf7b99f9	Teach SimplifyCFG to turn (indirectbr (select cond, blockaddress(@fn, BlockA), blockaddress(@fn, BlockB))) into (br cond, BlockA, BlockB). llvm-svn: 120943	2010-12-05 18:29:03 +00:00
Chris Lattner	e30adfb732	Teach X86ISelLowering that the second result of X86ISD::UMUL is a flags result. This allows us to compile: void *test12(long count) { return new int[count]; } into: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx movq $-1, %rdi cmovnoq %rax, %rdi jmp __Znam ## TAILCALL instead of: test12: movl $4, %ecx movq %rdi, %rax mulq %rcx seto %cl testb %cl, %cl movq $-1, %rdi cmoveq %rax, %rdi jmp __Znam Of course it would be even better if the regalloc inverted the cmov to 'cmovoq', which would eliminate the need for the 'movq %rdi, %rax'. llvm-svn: 120936	2010-12-05 07:49:54 +00:00
Chris Lattner	76601e7a99	it turns out that when ".with.overflow" intrinsics were added to the X86 backend that they were all implemented except umul. This one fell back to the default implementation that did a hi/lo multiply and compared the top. Fix this to check the overflow flag that the 'mul' instruction sets, so we can avoid an explicit test. Now we compile: void *func(long count) { return new int[count]; } into: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] seto %cl ## encoding: [0x0f,0x90,0xc1] testb %cl, %cl ## encoding: [0x84,0xc9] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL instead of: __Z4funcl: ## @_Z4funcl movl $4, %ecx ## encoding: [0xb9,0x04,0x00,0x00,0x00] movq %rdi, %rax ## encoding: [0x48,0x89,0xf8] mulq %rcx ## encoding: [0x48,0xf7,0xe1] testq %rdx, %rdx ## encoding: [0x48,0x85,0xd2] movq $-1, %rdi ## encoding: [0x48,0xc7,0xc7,0xff,0xff,0xff,0xff] cmoveq %rax, %rdi ## encoding: [0x48,0x0f,0x44,0xf8] jmp __Znam ## TAILCALL Other than the silly seto+test, this is using the o bit directly, so it's going in the right direction. llvm-svn: 120935	2010-12-05 07:30:36 +00:00
Chris Lattner	9b4b9e751a	fix the rest of the linux miscompares :) llvm-svn: 120933	2010-12-05 02:08:07 +00:00
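
The two `new int[count]` commits above (r120935 and r120936) concern the overflow check on the array-size multiplication: on overflow the size passed to `operator new[]` becomes -1 (all bits set). As a rough source-level sketch of that check, and not LLVM's own code, the hypothetical helpers below compute the byte count in two ways. `array_bytes_flag` uses the GCC/Clang builtin `__builtin_mul_overflow` (assumed to be available), which reports the multiply's overflow result directly. `array_bytes_div` is a portable division-based check, included only for comparison. The names and the use of `std::size_t` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helpers, not LLVM code. Both return the number of bytes needed
// for `count` elements of `elem_size` bytes, or the maximum size_t value
// (all bits set, i.e. -1) if the multiplication overflows.

// Variant 1: ask the compiler for the multiply's overflow result directly.
// __builtin_mul_overflow is a GCC/Clang builtin (assumed available here).
std::size_t array_bytes_flag(std::size_t count, std::size_t elem_size) {
  std::size_t bytes = 0;
  if (__builtin_mul_overflow(count, elem_size, &bytes))
    return std::numeric_limits<std::size_t>::max();
  return bytes;
}

// Variant 2: portable check using a division instead of the overflow result.
std::size_t array_bytes_div(std::size_t count, std::size_t elem_size) {
  assert(elem_size != 0 && "element size must be non-zero");
  if (count > std::numeric_limits<std::size_t>::max() / elem_size)
    return std::numeric_limits<std::size_t>::max();
  return count * elem_size;
}
```

Both helpers should agree for any non-zero element size. For example, `array_bytes_flag(10, 4)` and `array_bytes_div(10, 4)` both give 40, and both saturate to the maximum `size_t` value once `count` exceeds `max / 4`. Whether a compiler lowers the builtin to a flag-based sequence like the one in the commit message depends on the target and compiler version.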


67831 Commits