llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00

Author	SHA1	Message	Date
Anton Korobeynikov	173f5d7e53	`MSP430InstrInfo::loadRegFromStackSlot` forgets to set register def. Summary: For instance, compiling the below results in a panic: ``` llc: ../lib/CodeGen/InlineSpiller.cpp:1140: bool (anonymous namespace)::InlineSpiller::foldMemoryOperand(ArrayRef<std::pair<MachineInstr , unsigned int> >, llvm::MachineInstr ): Assertion `MO->isDead() && "Cannot fold physreg def"' failed. #0 0x00007f50fbcf353e llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:321:15 #1 0x00007f50fbcf3929 PrintStackTraceSignalHandler(void) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:380:1 #2 0x00007f50fbcf22a3 llvm::sys::RunSignalHandlers() /home/h/3rd/llvm/build/../lib/Support/Signals.cpp:45:5 #3 0x00007f50fbcf3bb4 SignalHandler(int) /home/h/3rd/llvm/build/../lib/Support/Unix/Signals.inc:210:1 #4 0x00007f50fa87a180 (/lib/x86_64-linux-gnu/libc.so.6+0x35180) #5 0x00007f50fa87a107 gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x35107) #6 0x00007f50fa87b4e8 abort (/lib/x86_64-linux-gnu/libc.so.6+0x364e8) #7 0x00007f50fa873226 (/lib/x86_64-linux-gnu/libc.so.6+0x2e226) #8 0x00007f50fa8732d2 (/lib/x86_64-linux-gnu/libc.so.6+0x2e2d2) #9 0x00007f50fddd9287 (anonymous namespace)::InlineSpiller::foldMemoryOperand(llvm::ArrayRef<std::pair<llvm::MachineInstr, unsigned int> >, llvm::MachineInstr) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1141:21 #10 0x00007f50fddd9ee9 (anonymous namespace)::InlineSpiller::spillAroundUses(unsigned int) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1286:9 #11 0x00007f50fddd388b (anonymous namespace)::InlineSpiller::spillAll() /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1338:21 #12 0x00007f50fddd221d (anonymous namespace)::InlineSpiller::spill(llvm::LiveRangeEdit&) /home/h/3rd/llvm/build/../lib/CodeGen/InlineSpiller.cpp:1391:3 #13 0x00007f50fdfd921b (anonymous namespace)::RAGreedy::selectOrSplitImpl(llvm::LiveInterval&, llvm::SmallVectorImpl<unsigned int>&, 
llvm::SmallSet<unsigned int, 16u, std::less<unsigned int> >&, unsigned int) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2555:5 #14 0x00007f50fdfd647b (anonymous namespace)::RAGreedy::selectOrSplit(llvm::LiveInterval&, llvm::SmallVectorImpl<unsigned int>&) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2221:12 #15 0x00007f50fdfc89f9 llvm::RegAllocBase::allocatePhysRegs() /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocBase.cpp:110:14 #16 0x00007f50fdfd6337 (anonymous namespace)::RAGreedy::runOnMachineFunction(llvm::MachineFunction&) /home/h/3rd/llvm/build/../lib/CodeGen/RegAllocGreedy.cpp:2611:3 #17 0x00007f50fded33ee llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /home/h/3rd/llvm/build/../lib/CodeGen/MachineFunctionPass.cpp:43:3 #18 0x00007f50fd6cdc6f llvm::FPPassManager::runOnFunction(llvm::Function&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1550:23 #19 0x00007f50fd6cdf85 llvm::FPPassManager::runOnModule(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1571:16 #20 0x00007f50fd6ce71a (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1627:23 #21 0x00007f50fd6ce246 llvm::legacy::PassManagerImpl::run(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1730:16 #22 0x00007f50fd6cec31 llvm::legacy::PassManager::run(llvm::Module&) /home/h/3rd/llvm/build/../lib/IR/LegacyPassManager.cpp:1761:3 #23 0x0000000000415bdc compileModule(char, llvm::LLVMContext&) /home/h/3rd/llvm/build/../tools/llc/llc.cpp:405:5 #24 0x0000000000414571 main /home/h/3rd/llvm/build/../tools/llc/llc.cpp:211:13 #25 0x00007f50fa866b45 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b45) #26 0x0000000000414296 _start (/home/h/3rd/llvm/build/bin/llc+0x414296) Stack dump: 0. Program arguments: ./bin/llc -mtriple msp430 loadstore.ll 1. Running pass 'Function Pass Manager' on module 'loadstore.ll'. 2. 
Running pass 'Greedy Register Allocator' on function '@inc' ``` Original IR: ```llvm %struct.VeryLarge = type { i8, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 } ; Function Attrs: norecurse nounwind define void @inc(%struct.VeryLarge noalias nocapture sret %agg.result, %struct.VeryLarge* byval align 1 %s) #0 { entry: %p0 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 0 %0 = load i8, i8* %p0, align 1, !tbaa !1 %p1 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 1 %1 = load i32, i32* %p1, align 1, !tbaa !6 %p2 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 2 %2 = load i32, i32* %p2, align 1, !tbaa !7 %p3 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 3 %3 = load i32, i32* %p3, align 1, !tbaa !8 %p4 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 4 %4 = load i32, i32* %p4, align 1, !tbaa !9 %p5 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 5 %5 = load i32, i32* %p5, align 1, !tbaa !10 %p6 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 6 %6 = load i32, i32* %p6, align 1, !tbaa !11 %p7 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 7 %7 = load i32, i32* %p7, align 1, !tbaa !12 %p8 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 8 %8 = load i32, i32* %p8, align 1, !tbaa !13 %p9 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 9 %9 = load i32, i32* %p9, align 1, !tbaa !14 %p10 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 10 %10 = load i32, i32* %p10, align 1, !tbaa !15 %p11 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 11 %11 = load i32, i32* %p11, align 1, !tbaa !16 %p12 = getelementptr 
inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 12 %12 = load i32, i32* %p12, align 1, !tbaa !17 %p13 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 13 %13 = load i32, i32* %p13, align 1, !tbaa !18 %p14 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 14 %14 = load i32, i32* %p14, align 1, !tbaa !19 %p15 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 15 %15 = load i32, i32* %p15, align 1, !tbaa !20 %p16 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 16 %16 = load i32, i32* %p16, align 1, !tbaa !21 %p17 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 17 %17 = load i32, i32* %p17, align 1, !tbaa !22 %p18 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 18 %18 = load i32, i32* %p18, align 1, !tbaa !23 %p19 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 19 %19 = load i32, i32* %p19, align 1, !tbaa !24 %p20 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 20 %20 = load i32, i32* %p20, align 1, !tbaa !25 %p21 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 21 %21 = load i32, i32* %p21, align 1, !tbaa !26 %p22 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 22 %22 = load i32, i32* %p22, align 1, !tbaa !27 %p23 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 23 %23 = load i32, i32* %p23, align 1, !tbaa !28 %p24 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 24 %24 = load i32, i32* %p24, align 1, !tbaa !29 %p25 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 25 %25 = load i32, i32* %p25, align 1, !tbaa !30 %p26 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 26 %26 = load i32, i32* %p26, align 1, !tbaa !31 %p27 = getelementptr inbounds 
%struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 27 %27 = load i32, i32* %p27, align 1, !tbaa !32 %p28 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 28 %28 = load i32, i32* %p28, align 1, !tbaa !33 %p29 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 29 %29 = load i32, i32* %p29, align 1, !tbaa !34 %p30 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 30 %30 = load i32, i32* %p30, align 1, !tbaa !35 %p31 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 31 %31 = load i32, i32* %p31, align 1, !tbaa !36 %p32 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %s, i32 0, i32 32 %32 = load i32, i32* %p32, align 1, !tbaa !37 %add = add i8 %0, 1 store i8 %add, i8* %p0, align 1, !tbaa !1 %add2 = add i32 %1, 2 store i32 %add2, i32* %p1, align 1, !tbaa !6 %add3 = add i32 %2, 3 store i32 %add3, i32* %p2, align 1, !tbaa !7 %add4 = add i32 %3, 4 store i32 %add4, i32* %p3, align 1, !tbaa !8 %add5 = add i32 %4, 5 store i32 %add5, i32* %p4, align 1, !tbaa !9 %add6 = add i32 %5, 6 store i32 %add6, i32* %p5, align 1, !tbaa !10 %add7 = add i32 %6, 7 store i32 %add7, i32* %p6, align 1, !tbaa !11 %add8 = add i32 %7, 8 store i32 %add8, i32* %p7, align 1, !tbaa !12 %add9 = add i32 %8, 9 store i32 %add9, i32* %p8, align 1, !tbaa !13 %add10 = add i32 %9, 10 store i32 %add10, i32* %p9, align 1, !tbaa !14 %add11 = add i32 %10, 11 store i32 %add11, i32* %p10, align 1, !tbaa !15 %add12 = add i32 %11, 12 store i32 %add12, i32* %p11, align 1, !tbaa !16 %add13 = add i32 %12, 13 store i32 %add13, i32* %p12, align 1, !tbaa !17 %add14 = add i32 %13, 14 store i32 %add14, i32* %p13, align 1, !tbaa !18 %add15 = add i32 %14, 15 store i32 %add15, i32* %p14, align 1, !tbaa !19 %add16 = add i32 %15, 16 store i32 %add16, i32* %p15, align 1, !tbaa !20 %add17 = add i32 %16, 17 store i32 %add17, i32* %p16, align 1, !tbaa !21 %add18 = add i32 %17, 18 store i32 %add18, i32* %p17, 
align 1, !tbaa !22 %add19 = add i32 %18, 19 store i32 %add19, i32* %p18, align 1, !tbaa !23 %add20 = add i32 %19, 20 store i32 %add20, i32* %p19, align 1, !tbaa !24 %add21 = add i32 %20, 21 store i32 %add21, i32* %p20, align 1, !tbaa !25 %add22 = add i32 %21, 22 store i32 %add22, i32* %p21, align 1, !tbaa !26 %add23 = add i32 %22, 23 store i32 %add23, i32* %p22, align 1, !tbaa !27 %add24 = add i32 %23, 24 store i32 %add24, i32* %p23, align 1, !tbaa !28 %add25 = add i32 %24, 25 store i32 %add25, i32* %p24, align 1, !tbaa !29 %add26 = add i32 %25, 26 store i32 %add26, i32* %p25, align 1, !tbaa !30 %add27 = add i32 %26, 27 store i32 %add27, i32* %p26, align 1, !tbaa !31 %add28 = add i32 %27, 28 store i32 %add28, i32* %p27, align 1, !tbaa !32 %add29 = add i32 %28, 29 store i32 %add29, i32* %p28, align 1, !tbaa !33 %add30 = add i32 %29, 30 store i32 %add30, i32* %p29, align 1, !tbaa !34 %add31 = add i32 %30, 31 store i32 %add31, i32* %p30, align 1, !tbaa !35 %add32 = add i32 %31, 32 store i32 %add32, i32* %p31, align 1, !tbaa !36 %add33 = add i32 %32, 33 store i32 %add33, i32* %p32, align 1, !tbaa !37 %33 = getelementptr inbounds %struct.VeryLarge, %struct.VeryLarge* %agg.result, i32 0, i32 0 call void @llvm.memcpy.p0i8.p0i8.i32(i8* %33, i8* %p0, i32 129, i32 1, i1 false), !tbaa.struct !38 ret void } ; Function Attrs: argmemonly nounwind declare void @llvm.memcpy.p0i8.p0i8.i32(i8* nocapture, i8* nocapture readonly, i32, i32, i1) #1 attributes #0 = { norecurse nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "unsafe-fp-math"="false" "use-soft-float"="false" } attributes #1 = { argmemonly nounwind } !llvm.ident = !{!0} !0 = !{!"clang version 3.8.0 (git://github.com/llvm-mirror/clang 40ef2b7531472c41212c4719a9294aeb7bddebbc) (git://github.com/llvm-mirror/llvm c601eaf55606dfb9ad372b514b77aa00d1409be1)"} !1 = 
!{!2, !3, i64 0} !2 = !{!"", !3, i64 0, !5, i64 1, !5, i64 5, !5, i64 9, !5, i64 13, !5, i64 17, !5, i64 21, !5, i64 25, !5, i64 29, !5, i64 33, !5, i64 37, !5, i64 41, !5, i64 45, !5, i64 49, !5, i64 53, !5, i64 57, !5, i64 61, !5, i64 65, !5, i64 69, !5, i64 73, !5, i64 77, !5, i64 81, !5, i64 85, !5, i64 89, !5, i64 93, !5, i64 97, !5, i64 101, !5, i64 105, !5, i64 109, !5, i64 113, !5, i64 117, !5, i64 121, !5, i64 125} !3 = !{!"omnipotent char", !4, i64 0} !4 = !{!"Simple C/C++ TBAA"} !5 = !{!"int", !3, i64 0} !6 = !{!2, !5, i64 1} !7 = !{!2, !5, i64 5} !8 = !{!2, !5, i64 9} !9 = !{!2, !5, i64 13} !10 = !{!2, !5, i64 17} !11 = !{!2, !5, i64 21} !12 = !{!2, !5, i64 25} !13 = !{!2, !5, i64 29} !14 = !{!2, !5, i64 33} !15 = !{!2, !5, i64 37} !16 = !{!2, !5, i64 41} !17 = !{!2, !5, i64 45} !18 = !{!2, !5, i64 49} !19 = !{!2, !5, i64 53} !20 = !{!2, !5, i64 57} !21 = !{!2, !5, i64 61} !22 = !{!2, !5, i64 65} !23 = !{!2, !5, i64 69} !24 = !{!2, !5, i64 73} !25 = !{!2, !5, i64 77} !26 = !{!2, !5, i64 81} !27 = !{!2, !5, i64 85} !28 = !{!2, !5, i64 89} !29 = !{!2, !5, i64 93} !30 = !{!2, !5, i64 97} !31 = !{!2, !5, i64 101} !32 = !{!2, !5, i64 105} !33 = !{!2, !5, i64 109} !34 = !{!2, !5, i64 113} !35 = !{!2, !5, i64 117} !36 = !{!2, !5, i64 121} !37 = !{!2, !5, i64 125} !38 = !{i64 0, i64 1, !39, i64 1, i64 4, !40, i64 5, i64 4, !40, i64 9, i64 4, !40, i64 13, i64 4, !40, i64 17, i64 4, !40, i64 21, i64 4, !40, i64 25, i64 4, !40, i64 29, i64 4, !40, i64 33, i64 4, !40, i64 37, i64 4, !40, i64 41, i64 4, !40, i64 45, i64 4, !40, i64 49, i64 4, !40, i64 53, i64 4, !40, i64 57, i64 4, !40, i64 61, i64 4, !40, i64 65, i64 4, !40, i64 69, i64 4, !40, i64 73, i64 4, !40, i64 77, i64 4, !40, i64 81, i64 4, !40, i64 85, i64 4, !40, i64 89, i64 4, !40, i64 93, i64 4, !40, i64 97, i64 4, !40, i64 101, i64 4, !40, i64 105, i64 4, !40, i64 109, i64 4, !40, i64 113, i64 4, !40, i64 117, i64 4, !40, i64 121, i64 4, !40, i64 125, i64 4, !40} !39 = !{!3, !3, i64 0} !40 = !{!5, !5, 
i64 0} ``` Reviewers: asl Subscribers: qcolombet Differential Revision: http://reviews.llvm.org/D17441 llvm-svn: 261746	2016-02-24 15:15:02 +00:00
Simon Pilgrim	9dfc61c3a6	[X86][SSE41] Combine vector blends with zero Part 2 of 2 This patch adds support for combining target shuffles into blends-with-zero. Differential Revision: http://reviews.llvm.org/D17483 llvm-svn: 261745	2016-02-24 15:14:21 +00:00
Simon Pilgrim	21285b3ce4	[X86][SSE41] Combine insertion of zero scalars into vector blends with zero Part 1 of 2 This patch attempts to replace the insertion of zero scalars with a vector blend with zero, avoiding the use of the integer insertion instructions (which are particularly slow on many targets). (Part 2 will add support for combining multiple blends-with-zero). Differential Revision: http://reviews.llvm.org/D17483 llvm-svn: 261743	2016-02-24 14:53:27 +00:00
Nikolay Haustov	c13c21ef6e	[AMDGPU] Assembler: Simplify handling of optional operands Prepare to support DPP encodings. For DPP encodings, we want row_mask/bank_mask/bound_ctrl to be optional operands. However, this means that when parsing an instruction which has no mnemonic prefix, we cannot add both default values for VOP3 and for DPP optional operands to OperandVector - neither instruction would match. So add default values for optional operands to MCInst during conversion instead. Mark more operands as IsOptional = 1 in .td files. Do not add default values for optional operands to OperandVector in AMDGPUAsmParser. Add default values for optional operands during conversion using new helper addOptionalImmOperand. Change cvtVOP3_2_mod to check instruction flag instead of presence of modifiers. In the future, cvtVOP3* functions can be combined into one. Separate cvtFlat and cvtFlatAtomic. Fix CNDMASK_B32 definition to have no modifiers. Review: http://reviews.llvm.org/D17445 Reviewers: tstellarAMD llvm-svn: 261742	2016-02-24 14:22:47 +00:00
Artur Pilipenko	5746dd289e	NFC. Move isDereferenceable to Loads.h/cpp This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In a subsequent change I'm going to eliminate isDereferenceableAndAlignedPointer from the Loads API, leaving isSafeToLoadSpeculatively as the only function to check whether a load instruction can be speculated. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D16180 llvm-svn: 261736	2016-02-24 12:49:04 +00:00
Artur Pilipenko	e41e2d4c04	NFC. Move getAlignment helper function from ValueTracking to Value class. Reviewed By: reames, hfinkel Differential Revision: http://reviews.llvm.org/D16144 llvm-svn: 261735	2016-02-24 12:25:10 +00:00
Nikolay Haustov	0caff3d99c	[AMDGPU] fix amd_kernel_code_t bit field position as per spec (added missing reserved fields). lit tests passed before and after because they don't test the binary representation of amd_kernel_code_t. Patch by: Valery Pykhtin (Valery.Pykhtin@amd.com) Reviewers: arsenm llvm-svn: 261732	2016-02-24 10:54:25 +00:00
David Majnemer	f4887d90a6	[SimplifyCFG] Do not blindly remove unreachable blocks DeleteDeadBlock was called indiscriminately, leading to cleanuprets with undef cleanuppad references. Instead, try to drain the BB of most of its instructions if it is unreachable. We can then remove the BB if it solely consists of a terminator (and maybe some phis). llvm-svn: 261731	2016-02-24 10:02:16 +00:00
David Majnemer	83f0a59a00	[CodeView] Describe variables live in x87 registers We didn't have a mapping from LLVM's x87 floating point registers to CodeView's encoding. llvm-svn: 261730	2016-02-24 10:01:24 +00:00
Simon Pilgrim	962ee76b8b	[X86][SSE] Don't get target shuffle operands prematurely. PerformShuffleCombine should be usable by unary and binary target shuffles, but was attempting to get the first two operands whatever the instruction type. Since these are only used for VECTOR_SHUFFLE instructions for one particular combine I've moved them inside the relevant if statement. llvm-svn: 261727	2016-02-24 09:07:47 +00:00
Michael Zuckerman	d1c409a5af	[LLVM][AVX512][PSHUFHW][PSHUFLW] Change imm8 to int Differential Revision: http://reviews.llvm.org/D17538 llvm-svn: 261725	2016-02-24 08:39:05 +00:00
Igor Breger	8b9daa338d	AVX512: Add vpmovzxbw/d/q, vpmovzxw/d/q, vpmovzxbdq lowering patterns that support 256-bit inputs like AVX patterns (that are disabled in case of HasVLX, see SS41I_pmovx_avx2_patterns). Differential Revision: http://reviews.llvm.org/D17504 llvm-svn: 261724	2016-02-24 08:15:20 +00:00
Justin Bogner	c26c003b44	X86: Wrap a helper for an assert in #ifndef NDEBUG This function is used in exactly one place, and only in asserts builds. Move it a few lines up before the use and only define it when asserts are enabled. Fixes the release build under -Werror. Also remove the forward declaration and commentary that was basically identical to the code itself. llvm-svn: 261722	2016-02-24 07:58:02 +00:00
Matt Arsenault	6c6bd4573e	AMDGPU: Check cheaper condition before SignBitIsZero Don't do an expensive computeKnownBits call when we can do the cheap check for legal offsets first. llvm-svn: 261720	2016-02-24 04:55:29 +00:00
Sanjay Patel	4f966611aa	[InstCombine] refactor visitOr() to use foldCastedBitwiseLogic() Note: The 'and' case in foldCastedBitwiseLogic() is inheriting one extra check from the nearly identical 'or' case: if ((!isa<ICmpInst>(Cast0Src) || !isa<ICmpInst>(Cast1Src)) But I'm not sure how to expose that difference in a regression test. Without that check, the 'or' path will infinite loop on: test/Transforms/InstCombine/zext-or-icmp.ll because the zext-or-icmp fold is attempting a reverse transform. The refactoring should extend to the 'xor' case next to solve part of PR26702. llvm-svn: 261707	2016-02-23 23:56:23 +00:00
Derek Schuff	c81038d2b7	Revert "[WebAssembly] Stackify code emitted by eliminateFrameIndex" This reverts r261685 due to wasm test breakage. llvm-svn: 261702	2016-02-23 22:13:21 +00:00
Tim Northover	9ff0bb755b	AArch64: rename compact unwind forms back to UNWIND_ARM64_*. NFC. Looks like the global rename last year was a bit over-zealous. These things really are referred to with ARM64 elsewhere (ld64, libunwind, ...). llvm-svn: 261698	2016-02-23 21:49:05 +00:00
Derek Schuff	74080913f6	[WebAssembly] Stackify code emitted by eliminateFrameIndex llvm-svn: 261685	2016-02-23 21:25:17 +00:00
Tim Northover	ccd1c20321	ARM: fix handling of movw/movt relocations with addend. We were emitting only one half of the paired relocations needed for these instructions because we decided that an offset needed a scattered relocation. In fact, movw/movt relocations can be paired without being scattered. llvm-svn: 261679	2016-02-23 20:20:23 +00:00
Geoff Berry	5377924081	[AArch64] Generate csinv instruction more often Reviewers: t.p.northover, jmolloy Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17546 llvm-svn: 261675	2016-02-23 19:34:13 +00:00
Hans Wennborg	1c98857e6a	Revert r261633 "Supporting all entities declared in lexical scope in LLVM debug info." This and the corresponding Clang change caused PR26715. llvm-svn: 261671	2016-02-23 19:17:03 +00:00
Davide Italiano	715c1d0a06	[X86ISelLowering] Stop typing the same return over and over and over. llvm-svn: 261666	2016-02-23 18:39:38 +00:00
Weiming Zhao	df8f1f1370	Fix PR25339: ARM Constant Island Summary: Currently, the ARM Constant Island may not converge (or not converge quickly). This patch lets it move to the closest water after the user if it doesn't converge after 15 iterations. This addresses https://llvm.org/bugs/show_bug.cgi?id=25339 Reviewers: t.p.northover, srhines, kristof.beyls, aadg, rengolin Subscribers: weimingz, aemerson, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D16890 llvm-svn: 261665	2016-02-23 18:39:19 +00:00
Derek Schuff	313b544b12	[WebAssembly] Add TODO comment to revisit red zone size llvm-svn: 261664	2016-02-23 18:17:46 +00:00
Derek Schuff	7fa1e4425f	[WebAssembly] Implement red zone for user stack Implements a mostly-conventional redzone for the userspace stack. Because we have unsigned load/store offsets we continue to use a local SP subtracted from the incoming SP but do not write it back to memory. Differential Revision: http://reviews.llvm.org/D17525 llvm-svn: 261662	2016-02-23 18:13:07 +00:00
Sanjay Patel	e4ec3d519e	[InstCombine] improve readability ; NFCI Less indenting, named local variables, more descriptive names. llvm-svn: 261659	2016-02-23 17:41:34 +00:00
David Majnemer	29ae1c3d76	[WinEH] Don't inline an 'unwinds to caller' cleanupret into funclets which locally unwind It is problematic if the inlinee has a cleanupret which unwinds to caller and we inline it into a call site which doesn't unwind. If the funclet unwinds anywhere other than to the caller, then we will give the funclet two unwind destinations. This will result in a verifier failure. Seeing as how the caller wasn't an invoke (which would locally unwind) and that the funclet cannot unwind to caller, we must conclude that an 'unwind to caller' cleanupret is dynamically unreachable. This fixes PR26698. Differential Revision: http://reviews.llvm.org/D17536 llvm-svn: 261656	2016-02-23 17:11:04 +00:00
Sanjay Patel	4e682fcfe8	[InstCombine] less indenting; NFC llvm-svn: 261652	2016-02-23 16:59:21 +00:00
Geoff Berry	a37df81842	[AArch64] Fix fastcc -tailcallopt epilog code generation. Summary: Fix a bug in epilog generation where the incoming stack arguments were not being popped for fastcc functions when -tailcallopt was passed. Reviewers: t.p.northover, mcrosier, jmolloy, rengolin Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16894 llvm-svn: 261650	2016-02-23 16:54:36 +00:00
Sanjay Patel	70dd21aa9d	[InstCombine] add helper function to foldCastedBitwiseLogic() ; NFCI This is a straight cut and paste of the existing code and is intended to be the first step in solving part of PR26702: https://llvm.org/bugs/show_bug.cgi?id=26702 We should be able to reuse most of this and delete the nearly identical existing code in visitOr(). Then, we can enhance visitXor() to use the same code too. llvm-svn: 261649	2016-02-23 16:36:07 +00:00
Aaron Ballman	96f0a702b2	Silencing a signed vs unsigned mismatch. llvm-svn: 261640	2016-02-23 15:02:43 +00:00
Chad Rosier	9d95aa7f78	[AArch64] Fix comment typo in Cyclone scheduling defs. NFC. llvm-svn: 261637	2016-02-23 14:05:13 +00:00
Amjad Aboud	5e45ef3cfb	Supporting all entities declared in lexical scope in LLVM debug info. Differential Revision: http://reviews.llvm.org/D15976 llvm-svn: 261633	2016-02-23 13:36:51 +00:00
Chandler Carruth	2b87d730fb	[PM] Remove an overly aggressive assert now that I can actually test the pattern that triggers it. This essentially requires an immutable function analysis, as that will survive anything we do to invalidate it. When we have such patterns, the function analysis manager will not get cleared between runs of the proxy. If we actually need an assert about how things are queried, we can add more elaborate machinery for computing it, but so far I'm not aware of significant value provided. Thanks to Justin Lebar for noticing this when he made a (seemingly innocuous) change to FunctionAttrs that is enough to trigger it in one test there. Now it is covered by a direct test of the pass manager code. llvm-svn: 261627	2016-02-23 10:47:57 +00:00
Junmo Park	90235fdff5	[ARM] fix initialization of PredictableSelectIsExpensive Summary: If we want to classify a core as OoO or not, using getSchedModel().isOutOfOrder() could be a more proper way than using Subtarget->isLikeA9(). Reviewers: jmolloy, rengolin Differential Revision: http://reviews.llvm.org/D17433 llvm-svn: 261623	2016-02-23 09:56:58 +00:00
Nikolay Haustov	f7d5b41b7c	[AMDGPU] Fix operands of S_BFE_U64 and S_BFM_B64 src1 of s_bfe_u64 is 32-bit (same as s_bfe_i64). src0 and src1 of s_bfm_b64 are 32-bit. Update tests. Review: http://reviews.llvm.org/D17480 Reviewers: arsenm llvm-svn: 261621	2016-02-23 09:19:14 +00:00
Igor Breger	e60efa9c40	AVX512: Fix predicate of AVX pcmpeqw/b, pcmpgtb/w/d instructions. The AVX512 versions of these instructions return the result in a kmask register, so AVX patterns should not be disabled. Differential Revision: http://reviews.llvm.org/D17517 llvm-svn: 261619	2016-02-23 08:55:33 +00:00
David Majnemer	70ff357751	[WinEH] Visit 'unwind to caller' catchswitches nested in catchswitches We had the right logic for the nested cleanuppad case but omitted it for catchswitches. llvm-svn: 261615	2016-02-23 07:18:15 +00:00
Yaron Keren	672ca3153a	Assert when trying to seek un-seekable raw_fd_ostream. llvm-svn: 261614	2016-02-23 07:17:58 +00:00
Dehao Chen	ba3eb3f3a0	Add prefix based function layout when profile is available. Summary: If a function is hot, put it in text.hot section. Reviewers: davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17532 llvm-svn: 261607	2016-02-23 03:39:24 +00:00
Duncan P. N. Exon Smith	53cb4596f6	CodeGen: TII: Take MachineInstr& in predicate API, NFC Change TargetInstrInfo API to take `MachineInstr&` instead of `MachineInstr*` in the functions related to predicated instructions (I'll try to come back later and get some of the rest). All of these functions require non-null parameters already, so references are more clear. As a bonus, this happens to factor away a host of implicit iterator => pointer conversions. No functionality change intended. llvm-svn: 261605	2016-02-23 02:46:52 +00:00
Duncan P. N. Exon Smith	dbf1a64537	Revert "Add prefix based function layout when profile is available." This reverts commit r261582, since this bot has been broken for four hours: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/19399/ llvm-svn: 261604	2016-02-23 02:28:40 +00:00
Michael Zolotukhin	7219052084	Follow up for r261597: Add the * to the auto. llvm-svn: 261600	2016-02-23 00:57:48 +00:00
Michael Zolotukhin	3da31c17bb	Follow-up for r261595: use range loop. llvm-svn: 261597	2016-02-23 00:48:44 +00:00
Michael Zolotukhin	cb26e1de36	[LoopUnroll] Avoid unnecessary DT recomputation. Summary: When we completely unroll a loop, it's pretty easy to update DT in-place and thus avoid rebuilding it. DT recalculation is one of the most time-consuming tasks in loop-unroll, so avoiding it at least in case of full unroll should be beneficial. On some extreme (but still real-world) tests this patch improves compile time by ~2x. Reviewers: escha, jmolloy, hfinkel, sanjoy, chandlerc Subscribers: joker.eph, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D17473 llvm-svn: 261595	2016-02-23 00:30:50 +00:00
Chandler Carruth	37c3c56afb	[PM] Improve the API and comments around the analysis manager proxies. These are really handles that ensure the analyses get cleared at appropriate places, and as such copying doesn't really make sense. Instead, they should look more like unique ownership objects. Make that the case. Relatedly, if you create a temporary of one and move out of it its destructor shouldn't actually clear anything. I don't think there is any code that can trigger this currently, but it seems like a more robust implementation. If folks want, I can add a unittest that forces this to be exercised, but that seems somewhat pointless -- whether a temporary is ever created in the innards of AnalysisManager is not really something we should be adding a reliance on, but I didn't want to leave a timebomb in the code here. If anyone has a cleaner way to represent this, I'm all ears, but I wanted to assure myself that this wasn't in fact responsible for another bug I'm chasing down (it wasn't) and figured I'd commit that. llvm-svn: 261594	2016-02-23 00:05:00 +00:00
Krzysztof Parzyszek	5090bc6ee3	More detailed dependence test between volatile and non-volatile accesses Differential Revision: http://reviews.llvm.org/D16857 llvm-svn: 261589	2016-02-22 23:07:43 +00:00
Dehao Chen	f09ab5a032	Include ProfileData as CodeGen's required library. Summary: Fixing buildbot failure introduced by http://reviews.llvm.org/D17460 Reviewers: davidxl, hans Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17524 llvm-svn: 261588	2016-02-22 22:54:14 +00:00
Dehao Chen	755e933005	Set function entry count as 0 if sample profile is not found for the function. Summary: This change makes the sample profile's behavior consistent with instr profile. Reviewers: davidxl, eraman, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17522 llvm-svn: 261587	2016-02-22 22:46:21 +00:00
David Majnemer	f566971993	[X86] Create mergeable constant pool entries for AVX We supported creating mergeable constant pool entries for smaller constants but not for 32-byte AVX constants. llvm-svn: 261584	2016-02-22 22:23:11 +00:00
Dehao Chen	b86da71790	Add prefix based function layout when profile is available. Summary: If a function is hot, put it in text.hot section. Reviewers: davidxl Subscribers: eraman, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17460 llvm-svn: 261582	2016-02-22 22:14:14 +00:00
Matt Arsenault	8fecbc9fea	SelectionDAG: Use correct addrspace when lowering memcpy This was causing assertions later from using the wrong pointer size with LDS operations. getOptimalMemOpType should also have address space arguments later. This avoids assertions in existing tests exposed by a future commit. llvm-svn: 261580	2016-02-22 22:01:42 +00:00
Derek Schuff	5fd7c2542e	[WebAssembly] Fix writeback of stack pointer with dynamic alloca Previously the stack pointer was only written back to memory in the prolog. But this is wrong for dynamic allocas, for which target-independent codegen handles SP updates after the prolog (and possibly even in another BB). Instead update the SP global in ADJCALLSTACKDOWN which is generated after the SP update sequence. This will have further refinements when we add red zone support. llvm-svn: 261579	2016-02-22 21:57:17 +00:00
Adam Nemet	6f1d2a2687	[LoopDataPrefetch] Make it testable with opt Summary: Since this is an IR pass it's nice to be able to write tests without llc. This is the counterpart of the llc test under CodeGen/PowerPC/loop-data-prefetch.ll. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17464 llvm-svn: 261578	2016-02-22 21:41:22 +00:00
Duncan P. N. Exon Smith	429a618d84	CodeGen: Bring back MachineBasicBlock::iterator::getInstrIterator()... This is a little embarrassing. When I reverted r261504 (getIterator() => getInstrIterator()) in r261567, I did a `git grep` to see if there were new calls to `getInstrIterator()` that I needed to migrate. There were 10-20 hits, and I blindly did a `sed ...` before calling `ninja check`. However, these were `MachineInstrBundleIterator::getInstrIterator()`, which predated r261567. Perhaps coincidentally, these had an identical name and return type. This commit undoes my careless sed and restores `MachineBasicBlock::iterator::getInstrIterator()`. llvm-svn: 261577	2016-02-22 21:30:15 +00:00
Michael Zolotukhin	369872c96c	[LoopUnrolling] Fix a bug introduced in r259869 (PR26688). The issue was that we only required LCSSA rebuilding if the immediate parent-loop had values used outside of it. The fix is to enable the same logic for all outer loops, not only immediate parent. llvm-svn: 261575	2016-02-22 21:21:45 +00:00
Davide Italiano	f2c3cb292f	[X86ISelLowering] Consolidate duplicated code in a single place. llvm-svn: 261573	2016-02-22 21:06:46 +00:00
Matt Arsenault	34ccf25c19	AMDGPU/R600: Implement allowsMisalignedMemoryAccess This avoids some test regressions in a future commit when unaligned operations are expanded when they have custom lowering. llvm-svn: 261570	2016-02-22 21:04:16 +00:00
Philip Reames	f10b87b138	[RS4GC] "Constant fold" the rs4gc-split-vector-values flag This flag was part of a migration to a new means of handling vectors-of-pointers which was described in the llvm-dev thread "FYI: Relocating vector of pointers". The old code path has been off by default for a while without complaints, so time to cleanup. llvm-svn: 261569	2016-02-22 21:01:28 +00:00
Tim Northover	369e0e389f	ARM: sink atomic release barrier as far as possible into cmpxchg. DMB instructions can be expensive, so it's best to avoid them if possible. In atomicrmw operations there will always be an attempted store so a release barrier is always needed, but in the cmpxchg case we can delay the DMB until we know we'll definitely try to perform a store (and so need release semantics). In the strong cmpxchg case this isn't quite free: we must duplicate the LDREX instructions to skip the barrier on subsequent iterations. The basic outline becomes: ldrex rOld, [rAddr] cmp rOld, rDesired bne Ldone dmb Lloop: strex rRes, rNew, [rAddr] cbz rRes Ldone ldrex rOld, [rAddr] cmp rOld, rDesired beq Lloop Ldone: So we'll skip this version for strong operations in "minsize" functions. llvm-svn: 261568	2016-02-22 20:55:50 +00:00
Duncan P. N. Exon Smith	0fa6439bcd	Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC" This reverts commit r261504, since it's not obvious the new name is better: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html I'll recommit if we get consensus that it's the right direction. llvm-svn: 261567	2016-02-22 20:49:58 +00:00
Dan Gohman	eb0777680f	[WebAssembly] Re-enable the TailDuplicate pass. llvm-svn: 261566	2016-02-22 20:47:12 +00:00
Philip Reames	7715e0d9b9	[RS4GC] Revert optimization attempt due to memory corruption This change reverts "246133 [RewriteStatepointsForGC] Reduce the number of new instructions for base pointers" and a follow on bugfix 12575. As pointed out in pr25846, this code suffers from a memory corruption bug. Since I'm (empirically) not going to get back to this any time soon, simply reverting the problematic change is the right answer. llvm-svn: 261565	2016-02-22 20:45:56 +00:00
JF Bastien	f05f3f0c3e	WebAssembly: update expected failures clang r261557 lowers va_arg in clang. llvm-svn: 261564	2016-02-22 20:37:34 +00:00
Dan Gohman	c71fd3c15c	[WebAssembly] Teach address folding to fold bitwise-or nodes. LLVM converts adds into ors when it can prove that the operands don't share any non-zero bits. Teach address folding to recognize or instructions with constant operands with this property that can be folded into addresses as if they were adds. llvm-svn: 261562	2016-02-22 20:04:02 +00:00
Tom Stellard	d6c924b960	[AMDGPU][llvm-mc] Support for 32-bit inline literals Patch by: Artem Tamazov Summary: Note: Support for 64-bit inline literals TBD Added: Support of abs/neg modifiers for literals (incomplete; parsing TBD). Added: Some TODO comments. Reworked/clarity: rename isInlineImm() to isInlinableImm() Reworked/robustness: disallow BitsToFloat() with undefined value in isInlinableImm() Reworked/reuse: isSSrc32/64(), isVSrc32/64() Tests added. Reviewers: tstellarAMD, arsenm Subscribers: vpykhtin, nhaustov, SamWot, arsenm Projects: #llvm-amdgpu-spb Differential Revision: http://reviews.llvm.org/D17204 llvm-svn: 261559	2016-02-22 19:17:56 +00:00
Tom Stellard	5186536b5b	[AMDGPU] [llvm-mc] [VI] Fix encoding of LDS/GDS instructions. Patch by: Artem Tamazov Summary: Tests added. Reviewers: tstellarAMD, arsenm Subscribers: vpykhtin, SamWot, #llvm-amdgpu-spb Projects: #llvm-amdgpu-spb Differential Revision: http://reviews.llvm.org/D17271 llvm-svn: 261558	2016-02-22 19:17:53 +00:00
Justin Lebar	f84464712c	Revert "[attrs] Handle convergent CallSites." This reverts r261544, which was causing a test failure in Transforms/FunctionAttrs/readattrs.ll. llvm-svn: 261549	2016-02-22 18:24:43 +00:00
Justin Lebar	5bee0dc502	Revert "[ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv." This reverts r261543. Accidental commit (not LGTM'ed). llvm-svn: 261547	2016-02-22 18:17:27 +00:00
Nemanja Ivanovic	662ab414aa	Fix for PR26690 take 2 This is what was meant to be in the initial commit to fix this bug. The parens were missing. This commit also adds a test case for the bug and has undergone full testing on PPC and X86. llvm-svn: 261546	2016-02-22 18:04:00 +00:00
Justin Lebar	ca379cda9f	[attrs] Handle convergent CallSites. Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Reviewers: chandlerc, jingyue Subscribers: hfinkel, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17317 llvm-svn: 261544	2016-02-22 17:51:35 +00:00
Justin Lebar	5a4cdf2207	[ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv. Summary: Also add a comment briefly explaining what ifcnv is. No functional changes. Reviewers: resistor Subscribers: echristo, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17430 llvm-svn: 261543	2016-02-22 17:51:30 +00:00
Justin Lebar	4e79c03dad	[ifcnv] Use unique_ptr in IfConversion. NFC Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17466 llvm-svn: 261541	2016-02-22 17:51:28 +00:00
Justin Lebar	3251647c03	Don't tail-duplicate blocks that contain convergent instructions. Summary: Convergent instrs shouldn't be made control-dependent on other values, but this is basically the whole point of tail duplication. So just bail if we see a convergent instruction. Reviewers: iteratee Subscribers: jholewinski, jhen, hfinkel, tra, jingyue, llvm-commits Differential Revision: http://reviews.llvm.org/D17320 llvm-svn: 261540	2016-02-22 17:50:52 +00:00
Dan Gohman	6cd6f419ab	[WebAssembly] Properly ignore llvm.dbg.value instructions. llvm-svn: 261538	2016-02-22 17:45:20 +00:00
Sanjoy Das	47bea86903	[ConstantRange] Rename a method and add more doc Rename makeNoWrapRegion to a more obvious makeGuaranteedNoWrapRegion, and add a comment about the counter-intuitive aspects of the function. This is to help prevent cases like PR26628. llvm-svn: 261532	2016-02-22 16:13:02 +00:00
Zoran Jovanovic	8ae69ab849	[mips] added support for trunc macro Author: obucina Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D15745 llvm-svn: 261529	2016-02-22 16:00:23 +00:00
Nemanja Ivanovic	752e2bfba7	Revert bad fix for PR26690. llvm-svn: 261527	2016-02-22 15:06:32 +00:00
Nemanja Ivanovic	b136ac47cd	Fix for PR26690 I mistook BitVector::empty() to mean BitVector::count() == 0 and it does not. Corrected the issue with the fix for PR26500. llvm-svn: 261525	2016-02-22 14:47:49 +00:00
Benjamin Kramer	e5027dce2c	Fix some abuse of auto flagged by clang's -Wrange-loop-analysis. llvm-svn: 261524	2016-02-22 13:11:58 +00:00
Igor Breger	0f4267c518	AVX512F: Add assembler Intel syntax tests for knl, fix minor bugs. Differential Revision: http://reviews.llvm.org/D17498 llvm-svn: 261521	2016-02-22 12:37:41 +00:00
Igor Breger	2d437b4341	AVX512: Fix scalar mem operands. Differential Revision: http://reviews.llvm.org/D17500 llvm-svn: 261520	2016-02-22 11:48:27 +00:00
Elena Demikhovsky	c545950e89	Allow setting MaxRerollIterations above 16 By Ayal Zaks. Differential Revision http://reviews.llvm.org/D17258 llvm-svn: 261517	2016-02-22 09:38:28 +00:00
Craig Topper	37f137f856	[X86] Minor formatting fix. NFC llvm-svn: 261515	2016-02-22 08:00:04 +00:00
Duncan P. N. Exon Smith	3b54098f86	Reapply "CodeGen: Use references in MachineTraceMetrics::Trace, NFC" This reverts commit r261510, effectively reapplying r261509. The original commit missed a caller in AArch64ConditionalCompares. Original commit message: Pass non-null arguments by reference in MachineTraceMetrics::Trace, simplifying future work to remove implicit iterator => pointer conversions. llvm-svn: 261511	2016-02-22 03:33:28 +00:00
Duncan P. N. Exon Smith	5d1b25325e	Revert "CodeGen: Use references in MachineTraceMetrics::Trace, NFC" This reverts commit r261509. I'm not sure how this compiled locally, but something was out of whack. llvm-svn: 261510	2016-02-22 03:12:42 +00:00
Duncan P. N. Exon Smith	3cbb1cb653	CodeGen: Use references in MachineTraceMetrics::Trace, NFC Pass non-null arguments by reference in MachineTraceMetrics::Trace, simplifying future work to remove implicit iterator => pointer conversions. llvm-svn: 261509	2016-02-22 03:07:49 +00:00
Duncan P. N. Exon Smith	c548df4a3a	CodeGen: Explicitly convert from iterator to pointer, NFC llvm-svn: 261508	2016-02-22 02:53:42 +00:00
Duncan P. N. Exon Smith	052f6c7bc1	Document assumption in X86FrameLowering::inlineStackProbe() Resolve FIXME from r261504. Apparently bundled instructions are illegal here: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160215/334146.html llvm-svn: 261507	2016-02-22 02:32:35 +00:00
Duncan P. N. Exon Smith	b2dab65ba3	CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC Delete MachineInstr::getIterator(), since the term "iterator" is overloaded when talking about MachineInstr. - Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so that ilist_node::getIterator() is still available. - Add it back as MachineInstr::getInstrIterator(). This matches the naming in MachineBasicBlock. - Add MachineInstr::getBundleIterator(). This is explicitly called "bundle" (not matching MachineBasicBlock) to distinguish it clearly from ilist_node::getIterator(). - Update all calls. Some of these I switched to `auto` to remove boiler-plate, since the new name is clear about the type. There was one call I updated that looked fishy, but it wasn't clear what the right answer was. This was in X86FrameLowering::inlineStackProbe(), added in r252578 in lib/Target/X86/X86FrameLowering.cpp. I opted to leave the behaviour unchanged, but I'll reply to the original commit on the list in a moment. llvm-svn: 261504	2016-02-21 22:58:35 +00:00
Lang Hames	3749d7d9d4	[Orc] Add stack-realignment code to the i386 resolver function. The resolver uses the fxsave/fxrstor instructions, which require 16-byte alignment, to save SSE state to the stack. Since 16-byte alignment can't be assumed on all OSes (and all i386 OSes share this function) - add code to automatically bump the alignment to 16-bytes on entry to the function. llvm-svn: 261503	2016-02-21 22:50:26 +00:00
Duncan P. N. Exon Smith	d5e432aea7	ADT: Remove == and != comparisons between ilist iterators and pointers I missed == and != when I removed implicit conversions between iterators and pointers in r252380 since they were defined outside ilist_iterator. Since they depend on getNodePtrUnchecked(), they indirectly rely on UB. This commit removes all uses of these operators. (I'll delete the operators themselves in a separate commit so that it can be easily reverted if necessary.) There should be NFC here. llvm-svn: 261498	2016-02-21 20:39:50 +00:00
Duncan P. N. Exon Smith	37982bac02	TransformUtils: Avoid getNodePtrUnchecked() in integer division, NFC Stop relying on `getNodePtrUnchecked()` being useful on invalid iterators. This function is documented to be for internal use only, and the pointer type will eventually have to change to remove UB from ilist_iterator. Instead, check the iterator before it has been invalidated. llvm-svn: 261497	2016-02-21 20:14:29 +00:00
Duncan P. N. Exon Smith	9a2563de7c	ADT: Stop using getNodePtrUnchecked on end() iterators Stop using `getNodePtrUnchecked()` when building IR. Eventually a dereference will be required to get at the downcast node, since the iterator will only store an `ilist_node_base` of some sort. This should have no functionality change for now, but is a path towards removing some more UB from ilist. llvm-svn: 261495	2016-02-21 19:52:15 +00:00
Craig Topper	f1ad8f775d	[X86] Remove unused encoding types from disassembler. NFC llvm-svn: 261494	2016-02-21 19:49:16 +00:00
Duncan P. N. Exon Smith	1eee6063b4	CodeGen: Avoid getNodePtrUnchecked() where we need a Value, NFC `ilist_iterator<NodeTy>::getNodePtrUnchecked()` is documented as being for internal use only, but CodeGenPrepare was using it anyway. This code relies on pulling out the `Value` pointer even after the lifetime of the iterator is over. But having this pointer available in ilist_iterator depends on UB in the first place. Instead, safely pull out the `Value` when the iterator is alive and stop using the internal-only API. There should be no functionality change here. llvm-svn: 261493	2016-02-21 19:37:45 +00:00
Simon Pilgrim	ec0f8ea81f	[X86][AVX] Add shuffle masking support for EltsFromConsecutiveLoads Add support for the case where we have a consecutive load (which must include the first + last elements) with a mixture of undef/zero elements. We load the vector and then apply a shuffle to clear the zero'd elements. Differential Revision: http://reviews.llvm.org/D17297 llvm-svn: 261490	2016-02-21 19:15:48 +00:00
Tobias Grosser	7157394614	ScalarEvolution: Only erase temporary values if they actually have been added This addresses post-review comments from Sanjoy Das for r261485. llvm-svn: 261486	2016-02-21 18:50:09 +00:00
Tobias Grosser	e896bbb2c0	ScalarEvolution: Do not keep temporary PHI values in ValueExprMap Before this patch simplified SCEV expressions for PHI nodes were only returned the very first time getSCEV() was called, but later calls to getSCEV always returned the non-simplified value, which had "temporarily" been stored in the ValueExprMap, but was never removed and consequently blocked the caching of the simplified PHI expression. llvm-svn: 261485	2016-02-21 17:42:10 +00:00
Sanjay Patel	ff9ee24191	fix inaccurate comment; NFC llvm-svn: 261484	2016-02-21 17:33:31 +00:00
Sanjay Patel	67805931c5	[InstCombine] add getNegativeIsTrueBoolVec() helper function; NFC Originally part of: http://reviews.llvm.org/D17485 We need this when simplifying masked memory ops too. llvm-svn: 261483	2016-02-21 17:29:33 +00:00
Sanjoy Das	7fe777fc31	Fix LLVM's handling and detection of skylake and cannonlake CPUs Summary: - Rename `"skylake"` == SkylakeServerProc to `"skylake-avx512"` - Change `"skylake"` to denote SkylakeClientProc - Fix the detection of cpu family 6 and model 94 to be SkylakeClientProc instead of SkylakeServerProc - Remove the `"cnl"` for CannonLake Reviewers: craig.topper, delena Subscribers: zansari, echristo, qcolombet, RKSimon, spatel, DavidKreitzer, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17090 llvm-svn: 261482	2016-02-21 17:12:03 +00:00
Sanjoy Das	1d32ffd382	[LoopDeletion] Add an assert that verifies LCSSA This is inspired by PR24804 -- had this assert been there before, isolating the root cause for PR24804 would have been far easier. llvm-svn: 261481	2016-02-21 17:11:59 +00:00
JF Bastien	d168b044b1	WebAssembly: update expected torture test failures r261457 handles CopyToReg nodes with flag results in LowerCopyToReg, which was causing the SelectionDAGNodes assert. llvm-svn: 261479	2016-02-21 16:52:00 +00:00
Dan Gohman	c3f23ceef3	[WebAssembly] Support physical registers in the rewrite-to-discard optimization. llvm-svn: 261465	2016-02-21 03:27:22 +00:00
Duncan P. N. Exon Smith	0d6cf7b3fa	IR: Add ConstantData, for operand-less Constants Add a common parent `ConstantData` to the constants that have no operands. These are guaranteed to represent abstract data that is in no way tied to a specific Module. This is a good cleanup on its own. It also makes it simpler to disallow RAUW (and factor away use-lists) on these constants in the future. (I have some experimental patches that make RAUW illegal on ConstantData, and they seem to catch a bunch of bugs...) llvm-svn: 261464	2016-02-21 02:39:49 +00:00
David Majnemer	b56105ad7f	Unbreak non-X86 targets from fallout caused by r261462 llvm-svn: 261463	2016-02-21 01:40:04 +00:00
David Majnemer	174ea5bfc4	[X86] Use the correct alignment for COMDAT constant pool entries COFF doesn't have sections with mergeable contents. Instead, each constant pool entry ends up in a COMDAT section. The linker, when choosing between COMDAT sections, doesn't choose the max alignment of the two sections. You just get whatever alignment was on the section. If one constant needed a higher alignment in one object file from another one, then we will get into trouble if the linker chooses the lower alignment one. Instead, let's promote the alignment of the constant pool entry to make sure we don't use an under aligned constant with an instruction which assumed otherwise. This fixes PR26680. llvm-svn: 261462	2016-02-21 01:30:30 +00:00
Simon Pilgrim	f781386f81	[InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use the lowest vector element llvm-svn: 261460	2016-02-20 23:17:35 +00:00
Dan Gohman	9d09f603ef	[WebAssembly] Refine a README.txt entry. The register coloring pass may also need to be involved in order to optimally sort registers. llvm-svn: 261458	2016-02-20 23:11:14 +00:00
Dan Gohman	3f10ef519d	[WebAssembly] Handle CopyToReg nodes with flag results in LowerCopyToReg. llvm-svn: 261457	2016-02-20 23:09:44 +00:00
Derek Schuff	ee2bcbc1d2	[WebAssembly] Write stack pointer back to memory when FP is used The stack pointer is bumped when there is a frame pointer or when there are static-size objects, but was only getting written back when there were static-size objects. llvm-svn: 261453	2016-02-20 22:18:47 +00:00
Derek Schuff	5cc4608440	[WebAssembly] Stackify function prologs and epilogs The instructions are the same, but fewer locals are used. Differential Revision: http://reviews.llvm.org/D17428 llvm-svn: 261452	2016-02-20 21:46:50 +00:00
Dan Gohman	86b91ec3dc	Don't scan for SSA register operands to update when not in SSA form. TailDuplicate can run on either SSA code or non-SSA code, as indicated to it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to preserve SSA invariants when it duplicates code. This patch makes it skip some of this extra work in the case where the code is not in SSA form. llvm-svn: 261450	2016-02-20 21:28:18 +00:00
Nemanja Ivanovic	6ba5cb86b8	Fix the build bot break caused by rL261441. The patch has a necessary call to a function inside an assert. Which is fine when you have asserts turned on. Not so much when they're off. Sorry about the regression. llvm-svn: 261447	2016-02-20 20:45:37 +00:00
Nemanja Ivanovic	57bd7dee35	Fix for PR 26500 This patch corresponds to review: http://reviews.llvm.org/D17294 It ensures that whatever block we are emitting the prologue/epilogue into, we have the necessary scratch registers. It takes away the hard-coded register numbers for use as scratch registers as registers that are guaranteed to be available in the function prologue/epilogue are not guaranteed to be available within the function body. Since we shrink-wrap, the prologue/epilogue may end up in the function body. llvm-svn: 261441	2016-02-20 18:16:25 +00:00
Simon Pilgrim	1f327425d2	[DAGCombiner] Use getBitcast helper when possible. NFCI. llvm-svn: 261437	2016-02-20 15:05:29 +00:00
Simon Pilgrim	f7fbbebbc2	[X86][SSE] Fixed issue with commutation of 'faux unary' target shuffles (PR26667) Fixed a bug introduced by D16683 when a binary shuffle is simplified to a unary shuffle (with undef/zero sentinel mask indices) - if this resulted in only the second input being used combineX86ShuffleChain failed to take this into account and still referenced the first input. llvm-svn: 261434	2016-02-20 14:39:45 +00:00
Simon Pilgrim	3ddb55acee	[X86][SSE] Move all undef/zero cases before target shuffle combining. First small step towards fixing PR26667 - we need to ensure that combineX86ShuffleChain only gets called with a valid shuffle input node (a similar issue was found in D17041). llvm-svn: 261433	2016-02-20 12:57:32 +00:00
Joerg Sonnenberger	47b2d0be59	When MemoryDependenceAnalysis hits a CFG with many transparent blocks, the algorithm easily degrades into quadratic memory and time complexity. The easiest example is a long chain of BBs that don't otherwise use a location. The caching will add an entry for every intermediate block and limiting the number of results doesn't help as no results are produced until a definition is found. Introduce a limit similar to the existing instructions-per-block limit. This limit counts the total number of blocks checked. If the limit is reached, entries are considered unknown. The initial value is 1000, which avoids regressions for normal sized functions while still limiting edge cases to reasonable memory consumption and execution time. Differential Revision: http://reviews.llvm.org/D16123 llvm-svn: 261430	2016-02-20 11:24:44 +00:00
Andrey Turetskiy	0027eb9e6b	[X86] Enable the LEA optimization pass by default. Differential Revision: http://reviews.llvm.org/D16877 llvm-svn: 261429	2016-02-20 11:11:55 +00:00
Andrey Turetskiy	9b5b22afce	[X86] PR26575: Fix LEA optimization pass (Part 2). Handle address displacement operands of a type other than Immediate or Global in LEAs and load/stores. Ref: https://llvm.org/bugs/show_bug.cgi?id=26575 Differential Revision: http://reviews.llvm.org/D17374 llvm-svn: 261428	2016-02-20 10:58:28 +00:00
Benjamin Kramer	aa1f9ae3db	[SimplifyCFG] Use pointer identity to simplify predicate. No functional change intended. llvm-svn: 261427	2016-02-20 10:40:42 +00:00
Benjamin Kramer	50138d023d	[LVI] Move ConstantRanges instead of copying. No functional change intended. Copying small (<= 64 bits) APInts isn't expensive but bloats code by generating the slow path everywhere. Moving doesn't care about the size of the value. llvm-svn: 261426	2016-02-20 10:40:34 +00:00
David Majnemer	56dae9ed9d	Move some code from doInitialization to runOnFunction This has no observable behavior change, it just makes the state insertion pass look a little more like normal passes. llvm-svn: 261420	2016-02-20 07:34:21 +00:00
Craig Topper	f8c17bbd8f	[X86] Add some missing reversed forms of XOP instructions. llvm-svn: 261417	2016-02-20 06:20:17 +00:00
Chandler Carruth	c495d89da2	[PM/AA] Wire up TBAA to the new pass manager's registry and test it. llvm-svn: 261411	2016-02-20 04:04:52 +00:00
Chandler Carruth	0b3be566d6	[PM/AA] Wire up the scoped-no-alias AA to the new pass manager's registry and test it. llvm-svn: 261410	2016-02-20 04:03:06 +00:00
Chandler Carruth	6d49893aaa	[PM/AA] Wire up SCEVAA to the new pass manager's registry and test it. llvm-svn: 261409	2016-02-20 04:01:45 +00:00
Matthias Braun	64ff5abd54	MachineCopyPropagation: Introduce Reg2MIMap typedef; NFC llvm-svn: 261408	2016-02-20 03:56:41 +00:00
Matthias Braun	0f2764ab3d	MachineCopyPropagation: Move variables from function to pass This avoids unnecessarily passing them around when calling helper functions. It may also be slightly faster to call clear() on the datastructures instead of freshly initializing them for each block. llvm-svn: 261407	2016-02-20 03:56:39 +00:00
Matthias Braun	1eda62c585	MachineCopyPropagation: Use ranged for, cleanup; NFC llvm-svn: 261406	2016-02-20 03:56:36 +00:00
Matthias Braun	29aecbb64e	MachineCopyPropagation: Use assert() instead of if{report_error()} for 'impossible' condition llvm-svn: 261405	2016-02-20 03:56:33 +00:00
Chandler Carruth	a43d617870	[PM/AA] Wire up CFLAA to the new pass manager fully, and port one of its tests over to exercise this code. This uncovered a few missing bits here and there in the analysis, but nothing interesting. llvm-svn: 261404	2016-02-20 03:52:02 +00:00
Chandler Carruth	6d0392224e	[PM/AA] Port alias analysis evaluator to the new pass manager, and use it to actually test the new pass manager AA wiring. This patch was extracted from the (somewhat too large) D12357 and rebased on top of the slightly different design of the new pass manager AA wiring that I just landed. With this we can start testing the AA in a thorough way with the new pass manager. Some minor cleanups to the code in the pass were necessitated here, but otherwise it is a very minimal change. Differential Revision: http://reviews.llvm.org/D17372 llvm-svn: 261403	2016-02-20 03:46:03 +00:00
Sanjoy Das	bd0add2c21	[SCEV] Don't spell `SCEV *` variables as `Scev`; NFC It reads odd since most other places name a `SCEV *` as `S`. Pure renaming change. llvm-svn: 261393	2016-02-20 01:44:10 +00:00
Sanjoy Das	798f2fd7cb	[SCEV] Don't use std::make_pair; NFC `{A, B}` reads cleaner than `std::make_pair(A, B)`. llvm-svn: 261392	2016-02-20 01:35:56 +00:00
David Majnemer	20fef8ad53	[SimplifyCFG] Merge together cleanuppads Cleanuppads may be merged together if one is the only predecessor of the other in which case a simple transform can be performed: replace the cleanupret with a branch and remove an unnecessary cleanuppad. Differential Revision: http://reviews.llvm.org/D17459 llvm-svn: 261390	2016-02-20 01:07:45 +00:00
Davide Italiano	93a025fa59	[X86ISelLowering] Fix TLSADDR lowering when shrink-wrapping is enabled. TLSADDR nodes are lowered into actual calls inside MC. In order to prevent shrink-wrapping from pushing prologue/epilogue past them (which results in TLS variables being accessed before the stack frame is set up), we put markers, so that the stack gets adjusted properly. Thanks to Quentin Colombet for guidance/help on how to fix this problem! llvm-svn: 261387	2016-02-20 00:44:47 +00:00
Tom Stellard	060bccc1f3	AMDGPU/SI: Use v_readfirstlane to legalize SMRD with VGPR base pointer Summary: Instead of trying to replace SMRD instructions with a VGPR base pointer with an equivalent MUBUF instruction, we now copy the base pointer to SGPRs using v_readfirstlane. This is safe to do, because any load selected as an SMRD instruction has been proven to have a uniform base pointer, so each thread in the wave will have the same pointer value in VGPRs. This will fix some errors on VI from trying to replace SMRD instructions with addr64-enabled MUBUF instructions that don't exist. Reviewers: arsenm, cfang, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17305 llvm-svn: 261385	2016-02-20 00:37:25 +00:00
Quentin Colombet	d3e6821ba3	[RegAllocFast] Properly track the physical register definitions on calls. PR26485 llvm-svn: 261384	2016-02-20 00:32:29 +00:00
Reid Kleckner	a02013d969	[codeview] Fix emission of file changes in inline line tables These are supposed to be file checksum table offsets, not file ids. llvm-svn: 261379	2016-02-19 23:55:38 +00:00
Davide Italiano	f288ed2d05	[X86ISelLowering] Provide a more informative assert message. I stumbled upon this while debugging a lowering bug. llvm-svn: 261371	2016-02-19 22:18:49 +00:00
Davide Italiano	2cdf3c2ffb	[X86ISelLowering] Merge two conditions inside a single if. llvm-svn: 261370	2016-02-19 22:01:07 +00:00
Hans Wennborg	8cdb9f2953	Revert r255691 "[LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions." It caused PR26509. llvm-svn: 261368	2016-02-19 21:40:12 +00:00
Hans Wennborg	a329081c88	Revert r253557 "Alternative to long nops for X86 CPUs, by Andrey Turetsky" Turns out the new nop sequences aren't actually nops on x86_64 (PR26554). llvm-svn: 261365	2016-02-19 21:26:31 +00:00
Dimitry Andric	be984b406a	Fix incorrect selection of AVX512 sqrt when OptForSize is on Summary: When optimizing for size, sqrt calls can be incorrectly selected as AVX512 VSQRT instructions. This is because X86InstrAVX512.td has a `Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass definition. Even if the target does not support AVX512, the class can apparently still be chosen, leading to an incorrect selection of `vsqrtss`. In PR26625, this led to an assertion: Reg >= X86::FP0 && Reg <= X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction requires an XMM register, which is not available on i686 CPUs. Reviewers: grosbach, resistor, joker.eph Subscribers: spatel, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D17414 llvm-svn: 261360	2016-02-19 20:14:11 +00:00
Sanjoy Das	b661129b6d	[StatepointLowering] Minor non-semantic cleanups Use auto, bring file up to coding standards etc. llvm-svn: 261358	2016-02-19 19:37:07 +00:00
Dan Gohman	3d596becbf	[WebAssembly] Add another optimization idea to README.txt. llvm-svn: 261354	2016-02-19 19:22:44 +00:00
Geoff Berry	986cabdd71	[AArch64][ShrinkWrap] Fix bug in prolog clobbering live reg when shrink wrapping. Summary: See bug https://llvm.org/bugs/show_bug.cgi?id=26642 Reviewers: qcolombet, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17350 llvm-svn: 261349	2016-02-19 18:27:32 +00:00
Sanjoy Das	78129f5b46	[StatepointLowering] Update StatepointMaxSlotsRequired correctly Now that we don't always add an element to AllocatedStackSlots if we don't find a pre-existing unallocated stack slot, bumping StatepointMaxSlotsRequired to `NumSlots + 1` is not correct. Instead bump the statistic near the push_back, to Builder.FuncInfo.StatepointStackSlots.size(). llvm-svn: 261348	2016-02-19 18:15:56 +00:00
Sanjoy Das	e1ce176524	[StatepointLowering] Fix a mistake in rL261336 The check on MFI->getObjectSize() has to be on the FrameIndex, not on the index of the FrameIndex in AllocatedStackSlots. Weirdly, the tests I added in rL261336 didn't catch this. llvm-svn: 261347	2016-02-19 18:15:53 +00:00
Matthew Simpson	2ebb736740	[LV] Vectorize first-order recurrences This patch enables the vectorization of first-order recurrences. A first-order recurrence is a non-reduction recurrence relation in which the value of the recurrence in the current loop iteration equals a value defined in the previous iteration. The load PRE of the GVN pass often creates these recurrences by hoisting loads from within loops. In this patch, we add a new recurrence kind for first-order phi nodes and attempt to vectorize them if possible. Vectorization is performed by shuffling the values for the current and previous iterations. The vectorization cost estimate is updated to account for the added shuffle instruction. Contributed-by: Matthew Simpson and Chad Rosier <mcrosier@codeaurora.org> Differential Revision: http://reviews.llvm.org/D16197 llvm-svn: 261346	2016-02-19 17:56:08 +00:00
Sanjoy Das	5f2f187542	[StatepointLowering] Change AllocatedStackSlots to use SmallBitVector NFCI. The key motivation here is that I'd like to use SmallBitVector::all() in a later change. Also, using a bit vector here seemed better in general. The only interesting change here is that in the failure case of allocateStackSlot, we no longer (the equivalent of) push_back(true) to AllocatedStackSlots. As far as I can tell, this is fine, since we'd never re-use those slots in the same StatepointLoweringState instance. Technically there was no need to change the operator[] type accesses to set() and test(), but I thought it'd be nice to make it obvious that we're using something other than a std::vector like thing. llvm-svn: 261337	2016-02-19 17:15:26 +00:00
Sanjoy Das	77d7a161ac	[StatepointLowering] Fix bug in allocateStackSlot allocateStackSlot did not consider the size of the value to be spilled before deciding to re-use a spill slot. This was originally okay (since originally we'd only ever spill pointers), but it became not okay when we changed our scheme to directly spill vectors of pointers. While this change fixes the bug pointed out, it has two performance caveats: - It matches spill slot and spillee size exactly, while in theory we can spill, e.g., an 8 byte pointer into a 16 byte slot. This is slightly complicated to fix since in the stackmaps section, we report the size of the spill slot as the size of the "indirect value"; and if they're no longer equivalent, we'll have to keep track of the (indirect) value size separately from the stack slot size. - It will "spuriously run out" of reusable slots, since we now have a second check in the search loop in addition to the availability check (e.g. you had two free scalar slots, and you first ask for a vector slot followed by a scalar slot). I'll fix this in a later commit. llvm-svn: 261336	2016-02-19 17:15:22 +00:00
Sanjoy Das	708dd86edd	[StatepointLowering] Clean up allocateStackSlot This removes the unusual loop structure in allocateStackSlot in favor of something more straightforward. I've also removed the cautionary comment in the function, which I suspect is historical cruft now, and confuses more than it enlightens. llvm-svn: 261335	2016-02-19 17:15:17 +00:00
Silviu Baranga	7a8e19daa2	[LV] Fix PR26600: avoid out of bounds loads for interleaved access vectorization Summary: If we don't have the first and last access of an interleaved load group, the first and last wide load in the loop can do an out of bounds access. Even though we discard results from speculative loads, this can cause problems, since it can technically generate page faults (or worse). We now discard interleaved load groups that don't have the first and last load in the group. Reviewers: hfinkel, rengolin Subscribers: rengolin, llvm-commits, mzolotukhin, anemet Differential Revision: http://reviews.llvm.org/D17332 llvm-svn: 261331	2016-02-19 15:46:10 +00:00
Tom Stellard	db5af50c55	AMDGPU/SI: Fix s_waitcnt insertion for flat instructions Summary: This was broken in r260694 which swapped the address and data operands for flat store instructions. The code in SIInsertWaits assumes that the data operand always comes before the address operand, so we need to add a special case for flat. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17366 llvm-svn: 261330	2016-02-19 15:33:13 +00:00
Rafael Espindola	d0e316db0d	Add support for merging strings with alignment larger than one char. This will be used in a lld patch. llvm-svn: 261326	2016-02-19 14:13:52 +00:00
Ulrich Weigand	849b3a7299	[SystemZ] Fix ABI for i128 argument and return types According to the SystemZ ABI, 128-bit integer types should be passed and returned via implicit reference. However, this is not currently implemented at the LLVM IR level for the i128 type. This does not matter when compiling C/C++ code, since clang will implement the implicit reference itself. However, it turns out that when calling libgcc helper routines operating on 128-bit integers, LLVM will use i128 argument and return value types; the resulting code is not compatible with the ABI used in libgcc, leading to crashes (see PR26559). This should be simple to fix, except that i128 currently is not even a legal type for the SystemZ back end. Therefore, common code will already split arguments and return values into multiple parts. The bulk of this patch therefore consists of detecting such parts, and correctly handling passing via implicit reference of a value split into multiple parts. If at some time in the future, i128 becomes a legal type, this code can be removed again. This fixes PR26559. llvm-svn: 261325	2016-02-19 14:10:21 +00:00
Chandler Carruth	b42444d804	[LPM] Factor all of the loop analysis usage updates into a common helper routine. We were getting this wrong in small ways and generally being very inconsistent about it across loop passes. Instead, let's have a common place where we do this. One minor downside is that this will require some analyses like SCEV in more places than they are strictly needed. However, this seems benign as these analyses are complete no-ops, and without this consistency we can in many cases end up with the legacy pass manager scheduling deciding to split up a loop pass pipeline in order to run the function analysis half-way through. It is very, very annoying to fix these without just being very pedantic across the board. The only loop passes I've not updated here are ones that use AU.setPreservesAll() such as IVUsers (an analysis) and the pass printer. They seemed less relevant. With this patch, almost all of the problems in PR24804 around loop pass pipelines are fixed. The one remaining issue is that we run simplify-cfg and instcombine in the middle of the loop pass pipeline. We've recently added some loop variants of these passes that would seem substantially cleaner to use, but this at least gets us much closer to the previous state. Notably, the seven loop pass managers is down to three. I've not updated the loop passes using LoopAccessAnalysis because that analysis hasn't been fully wired into LoopSimplify/LCSSA, and it isn't clear that those transforms want to support those forms anyways. They all run late anyways, so this is harmless. Similarly, LSR is left alone because it already carefully manages its forms and doesn't need to get fused into a single loop pass manager with a bunch of other loop passes. LoopReroll didn't use loop simplified form previously, and I've updated the test case to match the trivially different output. Finally, I've also factored all the pass initialization for the passes that use this technique as well, so that should be done regularly and reliably. Thanks to James for the help reviewing and thinking about this stuff, and Ben for help thinking about it as well! Differential Revision: http://reviews.llvm.org/D17435 llvm-svn: 261316	2016-02-19 10:45:18 +00:00
Craig Topper	fea5656c80	[X86] Remove unused entries from the disassembler type enum. llvm-svn: 261311	2016-02-19 06:57:40 +00:00
David Majnemer	53b454e537	Shuffle header file as per the Coding Standards llvm-svn: 261308	2016-02-19 04:46:48 +00:00
David Majnemer	d85c2d0d13	[SjLjEHPrepare] Simplify/cleanup code No functional change is intended. llvm-svn: 261307	2016-02-19 04:46:06 +00:00
Matthias Braun	2bbf522cfc	LegalizeDAG: Fix ExpandFCOPYSIGN assuming the same type on both inputs llvm-svn: 261306	2016-02-19 04:44:19 +00:00
Easwaran Raman	b6f88013ea	Add profile summary support for sample profile. Differential Revision: http://reviews.llvm.org/D17178 llvm-svn: 261304	2016-02-19 03:15:33 +00:00
David Majnemer	c3565229fb	[SjLjEHPrepare] Don't grab pointers to functions in doInitialization Certain optimization passes (like globaldce) can prune function declarations that SjLjEHPrepare assumed would exist when it'd runOnFunction. This fixes PR26669. llvm-svn: 261303	2016-02-19 03:13:40 +00:00
Chandler Carruth	3d4a43dca8	[AA] Preserve the AA results wrapper pass as well as BasicAA in a few more places to prevent gratuitous re-"runs" of these passes. The passes themselves don't do any work when run, but we keep spending time scheduling and running these needlessly when we really don't need to do so. This is the first patch towards fixing the really horrible loop pass pipeline fragmentation pointed out by Sanjoy in PR24804. llvm-svn: 261302	2016-02-19 03:12:14 +00:00
Lawrence Hu	405ada2ef1	Bug fix: use dyn_cast_or_null instead of dyn_cast Differential Revision: http://reviews.llvm.org/D17154 llvm-svn: 261299	2016-02-19 02:17:07 +00:00
Junmo Park	62e71df568	Minor code cleanups. NFC. llvm-svn: 261294	2016-02-19 01:46:04 +00:00
Justin Lebar	975bf7a977	When printing MIR, output to errs() rather than outs(). Summary: Without this, this command $ llvm-run llc -stop-after machine-cp -o - <( echo '' ) outputs an error, because we close stdout twice -- once when closing the file opened for "-o", and again when closing outs(). Also clarify in the outs() definition that you can't ever call it if you want to open your own raw_fd_ostream on stdout. Reviewers: jroelofs, tstellarAMD Subscribers: jholewinski, qcolombet, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D17422 llvm-svn: 261286	2016-02-19 00:18:46 +00:00
Philip Reames	2d78d97ccc	[IR] Extend cmpxchg to allow pointer type operands Today, we do not allow cmpxchg operations with pointer arguments. We require the frontend to insert ptrtoint casts and do the cmpxchg in integers. While correct, this is problematic from a couple of perspectives: 1) It makes the IR harder to analyse (for instance, it makes capture tracking overly conservative) 2) It pushes work onto the frontend authors for no real gain This patch implements the simplest form of IR support. As we did with floating point loads and stores, we teach AtomicExpand to convert back to the old representation. This prevents us needing to change all backends in a single lock step change. Over time, we can migrate each backend to natively selecting the pointer type. In the meantime, we get the advantages of a cleaner IR representation without waiting for the backend changes. Differential Revision: http://reviews.llvm.org/D17413 llvm-svn: 261281	2016-02-19 00:06:41 +00:00
Sanjay Patel	3941775761	[x86] fix initialization of PredictableSelectIsExpensive This is effectively NFC because Atom is the only in-order x86 subtarget currently, but the predicate would have become wrong if any other in-order CPU came along. See related discussion in: http://reviews.llvm.org/D16836 llvm-svn: 261275	2016-02-18 23:08:48 +00:00
Richard Trieu	5a759985de	Remove uses of builtin comma operator. Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270	2016-02-18 22:09:30 +00:00
Kostya Serebryany	7ef7f142ff	[libFuzzer] only read MaxLen bytes from every file in the corpus to speedup loading the corpus llvm-svn: 261267	2016-02-18 21:49:10 +00:00
Adam Nemet	27b8897111	[PPCLoopDataPrefetch] Move pass to Transforms/Scalar/LoopDataPrefetch. NFC This patch is part of the work to make PPCLoopDataPrefetch target-independent (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758). Obviously the pass is still only used from PPC at this point. Subsequent patches will start driving this from ARM64 as well. Due to the previous patch most lines should show up as moved lines. llvm-svn: 261265	2016-02-18 21:38:19 +00:00
Adam Nemet	f9d4c08808	[PPCLoopDataPrefetch] Remove PPC from some of the names. NFC This is done only to make the next patch that moves the pass out of PPC to Transforms easier to read. After this most lines should show up as moved lines in that patch. This patch is part of the work to make PPCLoopDataPrefetch target-independent (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758). llvm-svn: 261264	2016-02-18 21:37:12 +00:00
David Majnemer	40015c4774	[WinEH] Hoist state stores from successors If we know that all of our successors want to be in the exact same state, it makes sense to hoist the state transition into their common predecessor. Differential Revision: http://reviews.llvm.org/D17391 llvm-svn: 261262	2016-02-18 21:13:35 +00:00
Davide Italiano	74aa817e29	[X86ISelLowering] Use isPowerof2 instead of rewriting it. NFC. llvm-svn: 261255	2016-02-18 20:43:15 +00:00
Amaury Sechet	c379e9149e	Add support for invoke/landingpad/resume in C API test Summary: As per title. There were a lot of parts missing in the C API, so I had to extend the invoke and landingpad API. Reviewers: echristo, joker.eph, Wallbraker Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17359 llvm-svn: 261254	2016-02-18 20:38:32 +00:00
Philip Reames	8040b371e2	Restrict scope of variables [NFC] llvm-svn: 261250	2016-02-18 19:45:31 +00:00
Philip Reames	b731b338c8	[CaptureTracking] Support atomicrmw and cmpxchg These atomic operations are conceptually both a load and store from the same location. As such, we can treat them as the most conservative of those two components which in practice, means we can treat them like stores. A cmpxchg or atomicrmw captures the values, but not the locations accessed. Note: We can probably be more aggressive about the comparison value in a cmpxchg since to have it be in memory, it must already be captured, but I figured it was better to avoid that for the moment. Note 2: It turns out that since we don't actually support cmpxchg of pointer type, writing a negative test is impossible. Differential Revision: http://reviews.llvm.org/D17400 llvm-svn: 261245	2016-02-18 19:23:27 +00:00
Zachary Turner	f3c152db24	[DebugInfoPDB] Add source / line number accessors for PDB. This patch adds a variety of different methods to query source and line number information from PDB files. llvm-svn: 261239	2016-02-18 18:47:29 +00:00
Matthew Simpson	4ea67eb418	[AArch64] Reduce vector insert/extract cost for Kryo Differential Revision: http://reviews.llvm.org/D17379 llvm-svn: 261237	2016-02-18 18:35:45 +00:00
Hans Wennborg	0667f54980	Revert to extend i8/i16 return values on Darwin (PR26665) In r260133, LLVM was changed to no longer extend i8/i16 return values, as it's not required by the ABI. However, code was found in the wild that relies on the old behaviour on Darwin, so this commit reverts back to that old behaviour for Darwin. On other platforms, it's less likely that code would be depending on the old behaviour, as GCC and MSVC haven't been extending such return values. llvm-svn: 261235	2016-02-18 18:17:05 +00:00
Benjamin Kramer	651ce4399b	Make header self-contained. NFC. llvm-svn: 261234	2016-02-18 18:02:48 +00:00
Chad Rosier	80e96aa3c6	[Hexagon] Remove redundant check. llvm-svn: 261232	2016-02-18 17:49:57 +00:00
Xinliang David Li	7e1b06a8d7	Stop creating covmap as note section on ELF covmap needs to be created as non allocatable, but not with SHT_NOTE. The latter was needed to workaround a problem of BFD linker with gc, which is no longer needed. (A more proper longer term fix requires changing FE driver to force referencing the section using linker script). Differential Revision: http://reviews.llvm.org/D17309 llvm-svn: 261228	2016-02-18 17:20:22 +00:00
Nicolai Haehnle	9352856846	AMDGPU/SI: add llvm.amdgcn.image.load/store[.mip] intrinsics Summary: These correspond to IMAGE_LOAD/STORE[_MIP] and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. IMAGE_LOAD is already matched by llvm.SI.image.load. That intrinsic has a legacy name and pretends not to read memory. Differential Revision: http://reviews.llvm.org/D17276 llvm-svn: 261224	2016-02-18 16:44:18 +00:00
Krzysztof Parzyszek	6457a3b226	[Hexagon] Fix compilation error with GCC 6 Compiling Hexagon target with GCC 6 produces "error: should have been declared inside" due to GCC PR c++/69657 which was merged. Properly wrapping operator<<() definitions within the namespace llvm fixes the issue. Author: domagoj.stolfa Differential Revision: http://reviews.llvm.org/D17281 llvm-svn: 261220	2016-02-18 16:10:27 +00:00
Krzysztof Parzyszek	dc9d44881a	[Hexagon] Implement TLS support Patch by Anand Kodnani. llvm-svn: 261218	2016-02-18 15:42:57 +00:00
Matthew Simpson	36f5056c0b	Reapply commit r259357 with a fix for PR26629 Commit r259357 was reverted because it caused PR26629. We were assuming all roots of a vectorizable tree could be truncated to the same width, which is not the case in general. This commit reapplies the patch along with a fix and a new test case to ensure we don't regress because of this issue again. This should fix PR26629. llvm-svn: 261212	2016-02-18 14:14:40 +00:00
Zlatko Buljan	ed9a2f8059	[mips][microMIPS] Implement TLBINV and TLBINVF instructions Differential Revision: http://reviews.llvm.org/D16849 llvm-svn: 261211	2016-02-18 14:10:52 +00:00
Krzysztof Parzyszek	64689602da	[Hexagon] Add support for __builtin_prefetch llvm-svn: 261210	2016-02-18 13:58:38 +00:00
Krzysztof Parzyszek	2fbedb5d62	[Hexagon] Update the callee-saved register set for EH-aware functions llvm-svn: 261208	2016-02-18 13:41:05 +00:00
Chandler Carruth	d8a5b5b32e	[PM] Port the PostOrderFunctionAttrs pass to the new pass manager and convert one test to use this. This is a particularly significant milestone because it required a working per-function AA framework which can be queried over each function from within a CGSCC transform pass (and additionally a module analysis to be accessible). This is essentially the point of the entire pass manager rewrite. A CGSCC transform is able to query for multiple different functions' analysis results. It works. The whole thing appears to actually work and accomplish the original goal. While we were able to hack function attrs and basic-aa to "work" in the old pass manager, this port doesn't use any of that, it directly leverages the new fundamental functionality. For this to work, the CGSCC framework also has to support SCC-based behavior analysis, etc. The only part of the CGSCC pass infrastructure not sorted out at this point are the updates in the face of inlining and running function passes that mutate the call graph. The changes are pretty boring and boiler-plate. Most of the work was factored into more focused preparatory patches. But this is what wires it all together. llvm-svn: 261203	2016-02-18 11:03:11 +00:00
Simon Pilgrim	82dcce5934	[X86][SSE] Improve PSHUFB shuffle mask decoding. In cases where the PSHUFB shuffle mask is shared it might not be bitcasted to a vXi8 byte vector. This patch adds support for decoding these wider shuffle masks from the ConstantPool. The test case in question makes use of this to recognise the shuffle mask is a unary UNPCKL pattern and simplifies accordingly. llvm-svn: 261201	2016-02-18 10:17:40 +00:00
Junmo Park	f4fd158223	Minor code cleanup. NFC. llvm-svn: 261200	2016-02-18 10:09:20 +00:00
Nikolay Haustov	190b841f70	Test commit access. llvm-svn: 261199	2016-02-18 10:02:12 +00:00
Michael Zuckerman	285264f877	[AVX512][PRORQ][PRORD] Change imm8 to int Differential Revision: http://reviews.llvm.org/D17024 llvm-svn: 261198	2016-02-18 09:52:12 +00:00
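
The "[LV] Vectorize first-order recurrences" entry (2ebb736740) above describes vectorizing a recurrence by shuffling the values for the current and previous iterations. The following is only a hand-written sketch of that idea in plain C++, not code from the patch: the function names, the fixed width `VF = 4`, and the use of `std::array` as a stand-in for a vector register are all assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Scalar loop with a first-order recurrence: `prev` carries a[i-1]
// from one iteration into the next (it is not a reduction).
std::vector<int> scalarRecurrence(const std::vector<int> &a, int init) {
  std::vector<int> out(a.size());
  int prev = init;
  for (std::size_t i = 0; i < a.size(); ++i) {
    int cur = a[i];
    out[i] = cur + prev;
    prev = cur;
  }
  return out;
}

// Hypothetical vector width, chosen only for this sketch.
constexpr std::size_t VF = 4;

// Blocked version: each block of VF elements plays the role of one
// vector iteration. The "previous iteration" vector is formed by
// combining the last lane of the previous block with the first
// VF-1 lanes of the current block (the shuffle step).
std::vector<int> blockedRecurrence(const std::vector<int> &a, int init) {
  std::vector<int> out(a.size());
  std::array<int, VF> prevVec{};
  prevVec[VF - 1] = init; // only the last lane is ever read
  std::size_t i = 0;
  for (; i + VF <= a.size(); i += VF) {
    std::array<int, VF> curVec;
    for (std::size_t lane = 0; lane < VF; ++lane)
      curVec[lane] = a[i + lane];

    // Shuffle: lane 0 comes from the previous block, the rest from
    // the current block shifted by one lane.
    std::array<int, VF> shifted;
    shifted[0] = prevVec[VF - 1];
    for (std::size_t lane = 1; lane < VF; ++lane)
      shifted[lane] = curVec[lane - 1];

    for (std::size_t lane = 0; lane < VF; ++lane)
      out[i + lane] = curVec[lane] + shifted[lane];
    prevVec = curVec;
  }

  // Scalar remainder loop for the tail that does not fill a block.
  int prev = prevVec[VF - 1];
  for (; i < a.size(); ++i) {
    int cur = a[i];
    out[i] = cur + prev;
    prev = cur;
  }
  return out;
}
```

Both functions should produce the same output for any input; the blocked form simply makes the extra per-block shuffle explicit, which is the added cost the commit message says the vectorization cost estimate now accounts for.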

87657 Commits
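
The "[StatepointLowering] Fix bug in allocateStackSlot" entry (77d7a161ac) above says a free spill slot is now re-used only when its size matches the spilled value exactly. A minimal sketch of that policy follows, assuming a toy pool type; `SlotPool`, `allocate`, and `release` are hypothetical names for illustration and are not LLVM APIs.

```cpp
#include <cstddef>
#include <vector>

// Toy model of a spill-slot pool (hypothetical, not LLVM code).
struct SlotPool {
  std::vector<unsigned> sizes;  // size in bytes of each slot
  std::vector<bool> allocated;  // whether each slot is currently in use

  // Return the index of a slot for a value of `size` bytes.
  // A free slot is re-used only if its size matches exactly;
  // otherwise a new slot is created.
  std::size_t allocate(unsigned size) {
    for (std::size_t i = 0; i < sizes.size(); ++i) {
      if (!allocated[i] && sizes[i] == size) {
        allocated[i] = true;
        return i;
      }
    }
    sizes.push_back(size);
    allocated.push_back(true);
    return sizes.size() - 1;
  }

  // Mark a slot as free so a later request of the same size can re-use it.
  void release(std::size_t index) { allocated[index] = false; }
};
```

With exact matching, a freed 8-byte slot is skipped by a 16-byte request and picked up again by the next 8-byte request. The first caveat in the commit message (an 8-byte value could in theory go into a free 16-byte slot) corresponds to relaxing `==` to `>=` in the check, which this sketch deliberately does not do.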