llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00

Author	SHA1	Message	Date
Andrew Trick	c6aec1a4f7	misched: Fix LiveInterval update to better handle DebugVal. Assertion failed: (itr != mi2iMap.end() && "Instruction not found in maps.") rdar://12777252. llvm-svn: 169070	2012-12-01 01:22:41 +00:00
Andrew Trick	cc6d57195d	misched: fix RegionBegin when DebugValues get shuffled to the top. assert (RemainingInstrs == 0 && "Instruction count mismatch!") rdar://12776937. llvm-svn: 169069	2012-12-01 01:22:38 +00:00
Jakob Stoklund Olesen	4aa22e2c8d	Simplify REG_SEQUENCE lowering. The TwoAddressInstructionPass takes the machine code out of SSA form by expanding REG_SEQUENCE instructions into copies. It is no longer necessary to rewrite the registers used by a REG_SEQUENCE instruction because the new coalescer algorithm can do it now. REG_SEQUENCE is just converted to a sequence of sub-register copies now. llvm-svn: 169067	2012-12-01 01:06:44 +00:00
Eric Christopher	17e71f21a8	Add some first skeleton work for the DWARF5 Fission proposal. Emit part of the compile unit CU and start separating out information into the various sections that will be pulled out later. WIP. llvm-svn: 169061	2012-11-30 23:59:06 +00:00
Chad Rosier	05b569a7a5	test/CodeGen/PowerPC/vec_mul.ll: Add a triple. Thanks, Hal. llvm-svn: 169026	2012-11-30 19:15:10 +00:00
Sebastian Pop	f372d2334f	Codegen failure for vmull with small vectors. Codegen was failing with an assertion because of unexpected vector operands when legalizing the selection DAG for a MUL instruction. The asserting code was legalizing multiplies for vectors of size 128 bits. It uses a custom lowering to try and detect cases where it can use a VMULL instruction instead of a VMOVL + VMUL. The code was looking for input operands to the MUL that had been sign or zero extended. If it found the extended operands it would drop the sign/zero extension and use the original vector size as input to a VMULL instruction. The code assumed that the original input vector was 64 bits so that after dropping the extension it would fit directly into a D register and could be used as an operand of a VMULL instruction. The input code that triggered the failure used a vector of <4 x i8> that was sign extended to <4 x i32>. It was not safe to drop the sign extension in this case because the original vector is only 32 bits wide. The fix is to insert a sign extension for the vector to reach the required 64 bit size. In this particular example, the vector would need to be sign extended to a <4 x i16>. llvm-svn: 169024	2012-11-30 19:08:04 +00:00
Chad Rosier	bee049d9ed	test/CodeGen/PowerPC/vec_mul.ll: Fix register operands. llvm-svn: 169020	2012-11-30 18:29:01 +00:00
NAKAMURA Takumi	a95fd58fdb	test/CodeGen/PowerPC: Add explicit -march=ppc32. FIXME: Please add another RUN line if you would like to check also on ppc64. llvm-svn: 168999	2012-11-30 13:28:31 +00:00
Adhemerval Zanella	72208bbf33	This patch fixes the Altivec addend construction for the fused multiply-add instruction (vmaddfp) to conform to IEEE and ensure the correct sign of a zero result when the product is -0.0. The -0.0 vector addend to vmaddfp is generated by creating a vector with all bits set and then shifting each element left by 31 bits, resulting in a vector of 0x80000000 (-0.0 as a float). The 'buildvec_canonicalize.ll' test was adjusted to reflect this change, and 'vec_mul.ll' was extended with a float vector multiplication test. llvm-svn: 168998	2012-11-30 13:05:44 +00:00
Evgeniy Stepanov	d27ab822c9	[msan] Tests for vector manipulation instructions. llvm-svn: 168997	2012-11-30 12:12:20 +00:00
Evan Cheng	af9b73ef6f	Fix logic to determine whether to turn a switch into a lookup table. When the tables cannot fit in registers (i.e. bitmap), do not emit the table if it's using an illegal type. rdar://12779436 llvm-svn: 168970	2012-11-30 02:02:42 +00:00
Preston Briggs	52d4891df2	Modified dump() to provide a little more information for dependences between instructions that don't share a common loop. Updated the test results appropriately. llvm-svn: 168965	2012-11-30 00:44:47 +00:00
Kevin Enderby	d3ba5ff018	Fixed the arm disassembly of invalid BFI instructions to not build a bad MCInst which would then cause an assert when printed. rdar://11437956 llvm-svn: 168960	2012-11-29 23:47:11 +00:00
Eli Bendersky	73483dbfc6	Add a FileCheck test that makes sure two different CHECKs won't match the same string llvm-svn: 168942	2012-11-29 21:24:44 +00:00
Shuxin Yang	a7c032d8b5	rdar://12100355 (part 1). This revision attempts to recognize the following population-count pattern: while(a) { c++; ... ; a &= a - 1; ... }, where <c> and <a> could be used multiple times in the loop body. TODO: On X86-64 and ARM, __buildin_ctpop() is not expanded to an efficient instruction sequence, which needs to be improved in following commits. Reviewed by Nadav, much appreciated! llvm-svn: 168931	2012-11-29 19:38:54 +00:00
Bill Wendling	18531926d1	Handle the situation where CodeGenPrepare removes a reference to a BB that has the last invoke instruction in the function. This also removes the last landing pad in a function. This is fine, but with SjLj EH code, we've already placed a bunch of code in the 'entry' block, which expects the landing pad to stick around. When we get to the situation where CGP has removed the last landing pad, go ahead and nuke the SjLj instructions from the 'entry' block. <rdar://problem/12721258> llvm-svn: 168930	2012-11-29 19:38:06 +00:00
Meador Inge	3524aece42	instcombine: Migrate puts optimizations. This patch migrates the puts optimizations from the simplify-libcalls pass into the instcombine library call simplifier. All the simplifiers from simplify-libcalls have now been migrated to instcombine. Yay! Just a few other bits to migrate (prototype attribute inference and a few statistics) and simplify-libcalls can finally be put to rest. llvm-svn: 168925	2012-11-29 19:15:17 +00:00
Benjamin Kramer	0bcd999459	Follow up to 168711: It's safe to base this analysis on the found compare, just return the value for the right predicate. Thanks to Andy for catching this. llvm-svn: 168921	2012-11-29 19:07:57 +00:00
Shuxin Yang	fd7c5c30c7	fix a typo llvm-svn: 168909	2012-11-29 18:09:37 +00:00
Meador Inge	95a0f6df53	instcombine: Migrate fputs optimizations. This patch migrates the fputs optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168893	2012-11-29 15:45:43 +00:00
Meador Inge	787f51971a	instcombine: Migrate fwrite optimizations. This patch migrates the fwrite optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168892	2012-11-29 15:45:39 +00:00
Meador Inge	5553b265a0	instcombine: Migrate fprintf optimizations. This patch migrates the fprintf optimizations from the simplify-libcalls pass into the instcombine library call simplifier. llvm-svn: 168891	2012-11-29 15:45:33 +00:00
Silviu Baranga	d93d64a5fd	Added atomic 64 min/max/umin/umax intrinsics support in the ARM backend. llvm-svn: 168886	2012-11-29 14:41:25 +00:00
Justin Holewinski	9c8d5cc197	Teach the legalizer how to handle operands for VSELECT nodes. If we need to split the operand of a VSELECT, it must be the mask operand. We split the entire VSELECT operand with EXTRACT_SUBVECTOR. llvm-svn: 168883	2012-11-29 14:26:28 +00:00
Justin Holewinski	c9fa05b437	Allow targets to prefer TypeSplitVector over TypePromoteInteger when computing the legalization method for vectors. For some targets, it is desirable to prefer scalarizing <N x i1> instead of promoting to a larger legal type, such as <N x i32>. llvm-svn: 168882	2012-11-29 14:26:24 +00:00
Evgeniy Stepanov	fafa2ae4b6	[msan] Propagate shadow through (x<0) and (x>=0) comparisons. This is a special case of signed relational comparison where result only depends on the sign of x. llvm-svn: 168881	2012-11-29 14:25:47 +00:00
Evgeniy Stepanov	fc6164c985	[msan] Fix shadow & origin store & load alignment. This change ensures that shadow memory accesses have the same alignment as corresponding app memory accesses. llvm-svn: 168880	2012-11-29 14:05:53 +00:00
Evgeniy Stepanov	a193a72b44	[msan] Add a test for r168873. llvm-svn: 168877	2012-11-29 13:11:09 +00:00
Evgeniy Stepanov	ad930ee08f	[msan] Update tests (broken in r168873). llvm-svn: 168874	2012-11-29 12:43:56 +00:00
Evgeniy Stepanov	6d7e99f2ac	Initial commit of MemorySanitizer. Compiler pass only. llvm-svn: 168866	2012-11-29 09:57:20 +00:00
Kostya Serebryany	5858a1aa4c	[asan] when checking the noreturn attribute on the call, also check it on the callee llvm-svn: 168861	2012-11-29 08:57:20 +00:00
Shuxin Yang	106133b571	Instruction::isAssociative() returns true for fmul/fadd if they are tagged "unsafe" mode. Approved by: Eli and Michael. llvm-svn: 168848	2012-11-29 01:47:31 +00:00
Jakob Stoklund Olesen	9dbb7d3582	Avoid rewriting instructions twice. This could cause miscompilations in targets where sub-register composition is not always idempotent (ARM). <rdar://problem/12758887> llvm-svn: 168837	2012-11-29 00:26:11 +00:00
Nadav Rotem	3f31f2a3aa	When combining consecutive stores allow loads in between the stores, if the loads do not alias. llvm-svn: 168832	2012-11-29 00:00:08 +00:00
Benjamin Kramer	bd65c85dc1	ARM: Implement CanLowerReturn so large vectors get expanded into sret. Fixes 14337. llvm-svn: 168809	2012-11-28 20:55:10 +00:00
Ulrich Weigand	3ab1bb1fd8	Fix initial frame state on powerpc64. The createPPCMCAsmInfo routine used PPC::R1 as the initial frame pointer register, but on PPC64 the 32-bit R1 register does not have a corresponding DWARF number, causing invalid CIE initial frame state to be emitted. Fix by using PPC::X1 instead. llvm-svn: 168799	2012-11-28 18:21:03 +00:00
Patrik Hägglund	9c1279a58f	Add error handling in getInt. Accordingly, update a testcase with a broken datalayout string. Also, we never parse negative numbers, because '-' is used as a separator. Therefore, use unsigned as result type. llvm-svn: 168785	2012-11-28 12:13:12 +00:00
Kostya Serebryany	133cb3c737	[asan] Split AddressSanitizer into two passes (FunctionPass, ModulePass), LLVM part. This requires a clang part which will follow. llvm-svn: 168781	2012-11-28 10:31:36 +00:00
Andrew Trick	ceeb35bbb8	misched: Analysis that partitions the DAG into subtrees. This is a simple, cheap infrastructure for analyzing the shape of a DAG. It recognizes uniform DAGs that take the shape of bottom-up subtrees, such as the included matrix multiplication example. This is useful for heuristics that balance register pressure with ILP. Two canonical expressions of the heuristic are implemented in scheduling modes: -misched-ilpmin and -misched-ilpmax. llvm-svn: 168773	2012-11-28 05:13:28 +00:00
Andrew Trick	7ba8fe7bcd	misched: better alias analysis. This fixes a hole in the "cheap" alias analysis logic implemented within the DAG builder itself, regardless of whether proper alias analysis is enabled. It now handles this pattern produced by LSR+CodeGenPrepare. %sunkaddr1 = ptrtoint * %obj to i64 %sunkaddr2 = add i64 %sunkaddr1, %lsr.iv %sunkaddr3 = inttoptr i64 %sunkaddr2 to i32* store i32 %v, i32* %sunkaddr3 llvm-svn: 168768	2012-11-28 03:42:49 +00:00
Hal Finkel	e25b9ebee4	BBVectorize: Correctly merge SubclassOptionalData. When two instructions are combined into a vector instruction, the resulting instruction must have the most-conservative flags. llvm-svn: 168765	2012-11-28 03:04:10 +00:00
Bill Schmidt	9f4da44752	This patch makes medium code model the default for 64-bit PowerPC ELF. When the CodeGenInfo is to be created for the PPC64 target machine, a default code-model selection is converted to CodeModel::Medium provided we are not targeting the Darwin OS. Defaults for Darwin are unaffected. llvm-svn: 168747	2012-11-27 23:36:26 +00:00
Chad Rosier	9a90d62b0b	Add -verify-machineinstrs to these fast-isel test cases. llvm-svn: 168723	2012-11-27 20:49:56 +00:00
Preston Briggs	f15c406c47	Modified depends() to recognize that when all levels are "=" and there's no possible loop-independent dependence, then there's no dependence. Updated all test results appropriately. llvm-svn: 168719	2012-11-27 19:12:26 +00:00
Manman Ren	c45c0a304b	CSE: allow PerformTrivialCoalescing to check copies across basic block boundaries. Given the following case: BB0 %vreg1<def> = SUBrr %vreg0, %vreg7 %vreg2<def> = COPY %vreg7 BB1 %vreg10<def> = SUBrr %vreg0, %vreg2 We should be able to CSE between SUBrr in BB0 and SUBrr in BB1. rdar://12462006 llvm-svn: 168717	2012-11-27 18:58:41 +00:00
Meador Inge	4275530cf4	instcombine: Don't replace all uses for instructions with no uses. My commit to migrate the printf simplifiers from the simplify-libcalls pass in r168604 introduced a regression reported by Duncan [1]. The problem is that in some cases the library call simplifier can return a new value that has no uses and the new value's type is different than the old value's type (which is fine because there are no uses). The specific case that triggered the bug looked something like: declare void @printf(i8, ...) ... call void (i8, ...)* @printf(i8* %fmt), which we want to optimize into: call i32 @putchar(i32 104). However, the code was attempting to replace all uses of the printf with the putchar and the types differ, hence a crash. This is fixed by just deleting the original instruction when there are no uses. The old simplify-libcalls pass is already doing something similar. [1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-November/056338.html llvm-svn: 168716	2012-11-27 18:52:49 +00:00
Benjamin Kramer	dd7fb68c76	SCEV: Even if the latch terminator is foldable we can't deduce the result of an unrelated condition with it. Fixes PR14432. llvm-svn: 168711	2012-11-27 18:16:32 +00:00
Manman Ren	cbcf2bcc8a	X86: do not fold load instructions such as [V]MOVS[S|D] to other instructions when the destination register is wider than the memory load. These load instructions load from m32 or m64 and set the upper bits to zero, while the folded instructions may accept m128. rdar://12721174 llvm-svn: 168710	2012-11-27 18:09:26 +00:00
Bill Schmidt	0975882ed4	This patch implements medium code model support for 64-bit PowerPC.

The default for 64-bit PowerPC is small code model, in which TOC entries must be addressable using a 16-bit offset from the TOC pointer. Additionally, only TOC entries are addressed via the TOC pointer.

With medium code model, TOC entries and data sections can all be addressed via the TOC pointer using a 32-bit offset. Cooperation with the linker allows 16-bit offsets to be used when these are sufficient, reducing the number of extra instructions that need to be executed. Medium code model also does not generate explicit TOC entries in ".section toc" for variables that are wholly internal to the compilation unit.

Consider a load of an external 4-byte integer. With small code model, the compiler generates:

    ld 3, .LC1@toc(2)
    lwz 4, 0(3)

    .section .toc,"aw",@progbits
    .LC1:
    .tc ei[TC],ei

With medium model, it instead generates:

    addis 3, 2, .LC1@toc@ha
    ld 3, .LC1@toc@l(3)
    lwz 4, 0(3)

    .section .toc,"aw",@progbits
    .LC1:
    .tc ei[TC],ei

Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the 32-bit offset of ei's TOC entry from the TOC base pointer. Similarly, .LC1@toc@l is a relocation requesting the lower 16 bits. Note that if the linker determines that ei's TOC entry is within a 16-bit offset of the TOC base pointer, it will replace the "addis" with a "nop", and replace the "ld" with the identical "ld" instruction from the small code model example.

Consider next a load of a function-scope static integer. For small code model, the compiler generates:

    ld 3, .LC1@toc(2)
    lwz 4, 0(3)

    .section .toc,"aw",@progbits
    .LC1:
    .tc test_fn_static.si[TC],test_fn_static.si

    .type test_fn_static.si,@object
    .local test_fn_static.si
    .comm test_fn_static.si,4,4

For medium code model, the compiler generates:

    addis 3, 2, test_fn_static.si@toc@ha
    addi 3, 3, test_fn_static.si@toc@l
    lwz 4, 0(3)

    .type test_fn_static.si,@object
    .local test_fn_static.si
    .comm test_fn_static.si,4,4

Again, the linker may replace the "addis" with a "nop", calculating only a 16-bit offset when this is sufficient.

Note that it would be more efficient for the compiler to generate:

    addis 3, 2, test_fn_static.si@toc@ha
    lwz 4, test_fn_static.si@toc@l(3)

The current patch does not perform this optimization yet. This will be addressed as a peephole optimization in a later patch.

For the moment, the default code model for 64-bit PowerPC will remain the small code model. We plan to eventually change the default to medium code model, which matches current upstream GCC behavior. Note that the different code models are ABI-compatible, so code compiled with different models will be linked and execute correctly.

I've tested the regression suite and the application/benchmark test suite in two ways: Once with the patch as submitted here, and once with additional logic to force medium code model as the default. The tests all compile cleanly, with one exception. The mandel-2 application test fails due to an unrelated ABI incompatibility with passing complex numbers. It just so happens that small code model was incredibly lucky, in that temporary values in floating-point registers held the expected values needed by the external library routine that was called incorrectly. My current thought is to correct the ABI problems with _Complex before making medium code model the default, to avoid introducing this "regression."

Here are a few comments on how the patch works, since the selection code can be difficult to follow:

The existing logic for small code model defines three pseudo-instructions: LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for constant pool addresses. These are expanded by SelectCodeCommon(). The pseudo-instruction approach doesn't work for medium code model, because we need to generate two instructions when we match the same pattern. Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY node for medium code model, and generates an ADDIStocHA followed by either a LDtocL or an ADDItocL. These new node types correspond naturally to the sequences described above.

The addis/ld sequence is generated for the following cases:

 * Jump table addresses
 * Function addresses
 * External global variables
 * Tentative definitions of global variables (common linkage)

The addis/addi sequence is generated for the following cases:

 * Constant pool entries
 * File-scope static global variables
 * Function-scope static variables

Expanding to the two-instruction sequences at select time exposes the instructions to subsequent optimization, particularly scheduling.

The rest of the processing occurs at assembly time, in PPCAsmPrinter::EmitInstruction. Each of the instructions is converted to a "real" PowerPC instruction. When a TOC entry needs to be created, this is done here in the same manner as for the existing LDtoc, LDtocJTI, and LDtocCPT pseudo-instructions (I factored out a new routine to handle this).

I had originally thought that if a TOC entry was needed for LDtocL or ADDItocL, it would already have been generated for the previous ADDIStocHA. However, at higher optimization levels, the ADDIStocHA may appear in a different block, which may be assembled textually following the block containing the LDtocL or ADDItocL. So it is necessary to include the possibility of creating a new TOC entry for those two instructions.

Note that for LDtocL, we generate a new form of LD called LDrs. This allows specifying the @toc@l relocation for the offset field of the LD instruction (i.e., the offset is replaced by a SymbolLo relocation). When the peephole optimization described above is added, we will need to do similar things for all immediate-form load and store operations.

The seven "mcm-n.ll" test cases are kept separate because otherwise the intermingling of various TOC entries and so forth makes the tests fragile and hard to understand.

The above assumes use of an external assembler. For use of the integrated assembler, new relocations are added and used by PPCELFObjectWriter. Testing is done with "mcm-obj.ll", which tests for proper generation of the various relocations for the same sequences tested with the external assembler.

llvm-svn: 168708	2012-11-27 17:35:46 +00:00
Ulrich Weigand	d899cee68f	Never use .lcomm on platforms where it does not accept an alignment argument. Instead, use a pair of .local and .comm directives. This avoids spurious differences between binaries built by the integrated assembler vs. those built by the external assembler, since the external assembler may impose alignment requirements on .lcomm symbols where the integrated assembler does not. llvm-svn: 168704	2012-11-27 16:11:16 +00:00
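
As an illustration of the loop shape named in commit a7c032d8b5, the following is a minimal Python sketch of why `while(a) { c++; a &= a - 1; }` computes a population count. The function name `popcount_loop` is hypothetical and chosen only for this example; it is not taken from the LLVM sources, and the sketch says nothing about how the pass itself recognizes the idiom.

```python
def popcount_loop(a: int) -> int:
    """Count set bits of a non-negative integer with the a &= a - 1 loop."""
    c = 0
    while a:
        c += 1
        a &= a - 1  # clears the lowest set bit, so the loop runs once per 1-bit
    return c


if __name__ == "__main__":
    # Cross-check against a direct count of '1' digits in the binary form.
    for value in (0, 1, 0b1011, 0xFFFF, 1 << 40):
        assert popcount_loop(value) == bin(value).count("1")
    print(popcount_loop(0b1011))
```

Because each iteration removes exactly one set bit, the trip count equals the number of set bits, which is what makes the loop replaceable by a single population-count operation.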

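
The `@toc@ha` / `@toc@l` split described in commit 0975882ed4 can be sketched numerically. This is a hedged illustration, assuming the usual PowerPC ELF convention that `@ha` is the "high adjusted" half, i.e. the upper 16 bits after compensating for the sign extension of the low half (the commit message simply calls it the upper 16 bits). All function names here are hypothetical.

```python
def sign_extend16(value: int) -> int:
    """Interpret the low 16 bits of value as a signed 16-bit quantity."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def toc_ha(offset: int) -> int:
    """High-adjusted half: upper 16 bits, rounded so a signed low half adds back."""
    return ((offset + 0x8000) >> 16) & 0xFFFF


def toc_l(offset: int) -> int:
    """Low half: the bottom 16 bits of the offset."""
    return offset & 0xFFFF


def rebuild(offset: int) -> int:
    """Model addis (adds ha << 16) followed by a signed 16-bit displacement."""
    return (sign_extend16(toc_ha(offset)) << 16) + sign_extend16(toc_l(offset))


if __name__ == "__main__":
    for off in (0, 4, 0x7FFF, 0x8000, 0x12348000, -4, -0x8001):
        assert rebuild(off) == off
    # Offsets that fit in a signed 16-bit field have a zero high half,
    # which is the case where the addis contributes nothing.
    assert toc_ha(0x7FFF) == 0 and toc_ha(-0x8000) == 0
```

Under this assumption, any offset within a signed 16-bit range yields a zero high half, which is consistent with the commit's note that the linker may replace the "addis" with a "nop".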

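
The -0.0 addend from commit 72208bbf33 can be checked with a small Python sketch. It models one 32-bit vector element only, and it uses Python floats (doubles) to show the sign behaviour of zero under round-to-nearest, which is the same for single precision. The helper names are hypothetical and not part of the patch.

```python
import math
import struct


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def is_negative_zero(x: float) -> bool:
    """True only for -0.0 (zero with the sign bit set)."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0


# All bits set, shifted left by 31 within a 32-bit element.
element = (0xFFFFFFFF << 31) & 0xFFFFFFFF
neg_zero = bits_to_float(element)

product = -1.0 * 0.0  # a product that is -0.0

if __name__ == "__main__":
    assert element == 0x80000000
    assert is_negative_zero(neg_zero)
    # Adding +0.0 loses the sign of the product; adding -0.0 keeps it.
    assert not is_negative_zero(product + 0.0)
    assert is_negative_zero(product + neg_zero)
```

This is why a multiply expressed as a fused multiply-add needs a -0.0 addend rather than +0.0 to preserve the sign of a zero product.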
17698 Commits