llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Will Schmidt	68f6e5d89e	Enable the P8Model entry This was missed last time around, for the P8 Instruction Scheduling changes (223257). This will hook the P8Model entry in so those changes will actually be used. llvm-svn: 224452	2014-12-17 19:56:29 +00:00
Matthias Braun	96127e1a14	ExecutionDepsFix: Correctly handle wide registers. The ExecutionDepsFix previously mapped each register to 1 or zero registers of the register class it was called with and therefore simulating liveness for. This was problematic for cases involving wider registers like Q0 on ARM where ExecutionDepsFix gets invoked for the Dxx registers. In these cases the wide register would get mapped to the last matching D register, while it should have been all matching D registers. This commit changes the AliasMap to use a SmallVector to map registers to potentially multiple destination regclass registers. This is required to avoid regressions with subregister liveness tracking enabled. llvm-svn: 224447	2014-12-17 19:13:47 +00:00
JF Bastien	c7a41915e8	Random Number Generator Refactoring (removing from Module) This patch removes the RNG from Module. Passes should instead create a new RNG for their use as needed. Patch by Stephen Crane @rinon. Differential revision: http://reviews.llvm.org/D4377 llvm-svn: 224444	2014-12-17 18:12:10 +00:00
Jingyue Wu	17566f954b	[NVPTX] Fix bugs related to isSingleValueType Summary: With isSingleValueType starting to treat vector types as single-value types, code that uses this interface needs to be updated. Test Plan: vector-global.ll nvcl-param-align.ll Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D6573 llvm-svn: 224440	2014-12-17 17:59:04 +00:00
Saleem Abdulrasool	3a1f685ef5	ARM: correct an off-by-one in an assert The assert was off-by-one, resulting in failures for valid input. Thanks to Asiri Rathnayake for pointing out the failure! llvm-svn: 224432	2014-12-17 16:17:44 +00:00
Michael Kuperstein	3790301d73	[DAGCombine] Slightly improve lowering of BUILD_VECTOR into a shuffle. This handles the case of a BUILD_VECTOR being constructed out of elements extracted from a vector twice the size of the result vector. Previously this was always scalarized. Now, we try to construct a shuffle node that feeds on extract_subvectors. This fixes PR15872 and provides a partial fix for PR21711. Differential Revision: http://reviews.llvm.org/D6678 llvm-svn: 224429	2014-12-17 12:32:17 +00:00
Vladimir Medic	c9f0072599	MipsABIInfo class is used in different libraries. Moving the files to MCTargetDesc folder (LLVMMipsDesc library) prevents linkage errors. There are no functional changes. llvm-svn: 224427	2014-12-17 11:49:56 +00:00
Toma Tabacu	311b69b658	[mips] Set GCC-compatible MIPS assembler options before inline asm blocks. Summary: When generating MIPS assembly, LLVM always overrides the default assembler options by emitting the '.set noreorder', '.set nomacro' and '.set noat' directives, while GCC uses the default options if an assembly-level function contains inline assembly code. This becomes a problem when the code generated by LLVM is interleaved with inline assembly which assumes GCC-like assembler options (from Linux, for example). This patch fixes these conflicts by setting the appropriate assembler options at the beginning of an inline asm block and popping them at the end. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6637 llvm-svn: 224425	2014-12-17 10:56:16 +00:00
Suyog Sarda	ea83428380	Revert 224119 "This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it." This was re-ordering floating point data types resulting in mismatch in output. llvm-svn: 224424	2014-12-17 10:34:27 +00:00
Erik Eckstein	042c032147	Strength reduce intrinsics with overflow into regular arithmetic operations if possible. Some intrinsics, like s/uadd.with.overflow and umul.with.overflow, are already strength reduced. This change adds other arithmetic intrinsics: s/usub.with.overflow, smul.with.overflow. It completes the work on PR20194. llvm-svn: 224417	2014-12-17 07:29:19 +00:00
Duncan P. N. Exon Smith	759b4ed45d	Revert "Linker: Drop superseded subprograms" This reverts commit r224389. Based on feedback from the bots, the assertion seems to be going off more often, not less (previously I was just seeing it in an internal bootstrap, now it's happening in public builds too). http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build/936/ http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/5325 Reverting in order to investigate. llvm-svn: 224416	2014-12-17 07:27:31 +00:00
Justin Hibbits	9dd5e8fee1	Add parsing of 'foo@local'. Summary: Currently, it supports generating, but not parsing, this expression. Test added as well. Test Plan: New test added, no regressions due to this. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6672 llvm-svn: 224415	2014-12-17 06:23:35 +00:00
Rafael Espindola	598eacab05	Remove a debugging assert. Sorry for the noise, I have no idea how it survived to the final version. llvm-svn: 224414	2014-12-17 03:38:04 +00:00
Rafael Espindola	d22f1dcf1a	Fix the windows build. llvm-svn: 224412	2014-12-17 02:42:20 +00:00
Rafael Espindola	28785d1703	Refactor and simplify the code reading /proc/cpuinfo. NFC. llvm-svn: 224410	2014-12-17 02:32:44 +00:00
Matthias Braun	45928b05a4	RegisterCoalescer: Sprinkle some const modifiers. llvm-svn: 224409	2014-12-17 02:18:13 +00:00
Nick Lewycky	f60b003316	Delete debugging cruft that crept in with r223802. llvm-svn: 224407	2014-12-17 01:56:51 +00:00
David Majnemer	fe299df41a	InstSimplify: shl nsw/nuw undef, %V -> undef We can always choose a value for undef which might cause %V to shift out an important bit except for one case, when %V is zero. However, shl behaves like an identity function when the right hand side is zero. llvm-svn: 224405	2014-12-17 01:54:33 +00:00
Nick Lewycky	224bcdd295	Make ValueEnumerator::print use OS for metadata too. Noticed by inspection. llvm-svn: 224404	2014-12-17 01:52:08 +00:00
Quentin Colombet	5896cdb9ff	[CodeGenPrepare] Reapply r224351 with a fix for the assertion failure: The type promotion helper does not support vector types, so make sure it does not kick in in such cases. Original commit message: [CodeGenPrepare] Move sign/zero extensions near loads using type promotion. This patch extends the optimization in CodeGenPrepare that moves a sign/zero extension near a load when the target can combine them. The optimization may promote any operations between the extension and the load to make that possible. Although this optimization may be beneficial for all targets, in particular AArch64, this is enabled for X86 only as I have not benchmarked it for other targets yet. Context Most targets feature extended loads, i.e., loads that perform a zero or sign extension for free. In that context it is interesting to expose such pattern in CodeGenPrepare so that the instruction selection pass can form such loads. Sometimes, this pattern is blocked because of instructions between the load and the extension. When those instructions are promotable to the extended type, we can expose this pattern. Motivating Example Let us consider an example: define void @foo(i8* %addr1, i32* %addr2, i8 %a, i32 %b) { %ld = load i8* %addr1 %zextld = zext i8 %ld to i32 %ld2 = load i32* %addr2 %add = add nsw i32 %ld2, %zextld %sextadd = sext i32 %add to i64 %zexta = zext i8 %a to i32 %addza = add nsw i32 %zexta, %zextld %sextaddza = sext i32 %addza to i64 %addb = add nsw i32 %b, %zextld %sextaddb = sext i32 %addb to i64 call void @dummy(i64 %sextadd, i64 %sextaddza, i64 %sextaddb) ret void } As it is, this IR generates the following assembly on x86_64: [...] 
movzbl (%rdi), %eax # zero-extended load movl (%rsi), %esi # plain load addl %eax, %esi # 32-bit add movslq %esi, %rdi # sign extend the result of add movzbl %dl, %edx # zero extend the first argument addl %eax, %edx # 32-bit add movslq %edx, %rsi # sign extend the result of add addl %eax, %ecx # 32-bit add movslq %ecx, %rdx # sign extend the result of add [...] The throughput of this sequence is 7.45 cycles on Ivy Bridge according to IACA. Now, by promoting the additions to form more extended loads we would generate: [...] movzbl (%rdi), %eax # zero-extended load movslq (%rsi), %rdi # sign-extended load addq %rax, %rdi # 64-bit add movzbl %dl, %esi # zero extend the first argument addq %rax, %rsi # 64-bit add movslq %ecx, %rdx # sign extend the second argument addq %rax, %rdx # 64-bit add [...] The throughput of this sequence is 6.15 cycles on Ivy Bridge according to IACA. This kind of sequence happens a lot in code using 32-bit indexes on 64-bit architectures. Note: The throughput numbers are similar on Sandy Bridge and Haswell. Proposed Solution To avoid the penalty of all these sign/zero extensions, we merge them in the loads at the beginning of the chain of computation by promoting all the chain of computation on the extended type. The promotion is done if and only if we do not introduce new extensions, i.e., if we do not degrade the code quality. To achieve this, we extend the existing “move ext to load” optimization with the promotion mechanism introduced to match larger patterns for addressing mode (r200947). The idea of this extension is to perform the following transformation: ext(promotableInst1(...(promotableInstN(load)))) => promotedInst1(...(promotedInstN(ext(load)))) The promotion mechanism in that optimization is enabled by a new TargetLowering switch, which is off by default. In other words, by default, the optimization performs the “move ext to load” optimization as it was before this patch. 
Performance Configuration: x86_64: Ivy Bridge fixed at 2900MHz running OS X 10.10. Tested Optimization Levels: O3/Os Tests: llvm-testsuite + externals. Results: - No regression beside noise. - Improvements: CINT2006/473.astar: ~2% Benchmarks/PAQ8p: ~2% Misc/perlin: ~3% The results are consistent for both O3 and Os. <rdar://problem/18310086> llvm-svn: 224402	2014-12-17 01:36:17 +00:00
Kevin Enderby	dc6f805541	Add printing the LC_ENCRYPTION_INFO_64 load command with llvm-objdump’s -private-headers and add tests for the two AArch64 binaries. llvm-svn: 224400	2014-12-17 01:01:30 +00:00
David Blaikie	93e50409ac	PR21875: codegen for non-type template parameters of nullptr_t type llvm-svn: 224399	2014-12-17 00:43:22 +00:00
Reid Kleckner	b4ee65bf9b	Revert "[CodeGenPrepare] Move sign/zero extensions near loads using type promotion." This reverts commit r224351. It causes assertion failures when building ICU. llvm-svn: 224397	2014-12-17 00:29:23 +00:00
Hans Wennborg	37a572f581	SelectionDAG switch lowering: use 'unsigned' to count destination popularity SwitchInst::getNumCases() returns unsigned, so using uint64_t to count cases seems unnecessary. Also fix a missing CHECK in the test case. llvm-svn: 224393	2014-12-16 23:41:59 +00:00
Colin LeMahieu	55e21b6e4f	[Hexagon] Updating doubleword shift usages to new versions. llvm-svn: 224391	2014-12-16 23:36:15 +00:00
Kevin Enderby	c4d55fc277	Add printing the LC_ENCRYPTION_INFO load command with llvm-objdump’s -private-headers. llvm-svn: 224390	2014-12-16 23:25:52 +00:00
Duncan P. N. Exon Smith	4efb9deca4	Linker: Drop superseded subprograms When a function gets replaced by `ModuleLinker`, drop superseded subprograms. This ensures that the "first" subprogram pointing at a function is the same one that `!dbg` references point at. This is a stop-gap fix for PR21910. Notably, this fixes Release+Asserts bootstraps that are currently asserting out in `LexicalScopes::initialize()` due to the explicit instantiations in `lib/IR/Dominators.cpp` eventually getting replaced by -argpromotion. llvm-svn: 224389	2014-12-16 23:23:41 +00:00
Simon Pilgrim	f9bdd6a092	[X86][SSE] Vector double -> float conversion memory folding (cvtpd2ps) Added a missing memory folding relationship for the (V)CVTPD2PS instruction - we can safely fold these for stack reloads. Differential Revision: http://reviews.llvm.org/D6663 llvm-svn: 224383	2014-12-16 22:30:10 +00:00
Rafael Espindola	2673fda3c7	Make the assert a bit stronger. We should get no declarations in here. llvm-svn: 224382	2014-12-16 22:29:43 +00:00
Colin LeMahieu	7907ae05cd	[Hexagon] Removing old XTYPE/BIT instructions and replacing usages. llvm-svn: 224381	2014-12-16 22:17:09 +00:00
Sanjay Patel	af93d5f15c	merge consecutive loads that are offset from a base address SelectionDAG::isConsecutiveLoad() was not detecting consecutive loads when the first load was offset from a base address. This patch recognizes that pattern and subtracts the offset before comparing the second load to see if it is consecutive. The codegen change in the new test case improves from: vmovsd 32(%rdi), %xmm0 vmovsd 48(%rdi), %xmm1 vmovhpd 56(%rdi), %xmm1, %xmm1 vmovhpd 40(%rdi), %xmm0, %xmm0 vinsertf128 $1, %xmm1, %ymm0, %ymm0 To: vmovups 32(%rdi), %ymm0 An existing test case is also improved from: vmovsd (%rdi), %xmm0 vmovsd 16(%rdi), %xmm1 vmovsd 24(%rdi), %xmm2 vunpcklpd %xmm2, %xmm0, %xmm0 ## xmm0 = xmm0[0],xmm2[0] vmovhpd 8(%rdi), %xmm1, %xmm3 To: vmovsd (%rdi), %xmm0 vmovsd 16(%rdi), %xmm1 vmovhpd 24(%rdi), %xmm0, %xmm0 vmovhpd 8(%rdi), %xmm1, %xmm1 This patch fixes PR21771 ( http://llvm.org/bugs/show_bug.cgi?id=21771 ). Differential Revision: http://reviews.llvm.org/D6642 llvm-svn: 224379	2014-12-16 21:57:18 +00:00
Colin LeMahieu	5cbdca29ae	[Hexagon] Adding tstbit/bitclr/bitset instructions. llvm-svn: 224374	2014-12-16 21:28:58 +00:00
Kostya Serebryany	7693e4617d	[sanitizer] prevent function call merging for sanitizer-coverage callbacks llvm-svn: 224372	2014-12-16 21:24:15 +00:00
Colin LeMahieu	bb5c698516	[Hexagon] Adding bit count and twiddling instructions. llvm-svn: 224367	2014-12-16 20:57:56 +00:00
Colin LeMahieu	28f6e273c4	[Hexagon] Adding asr/lsr/asl reg/imm, asl with saturation, asr with rounding. Doubleword abs/neg/not. Interleave and deinterleave instructions. llvm-svn: 224365	2014-12-16 20:40:23 +00:00
JF Bastien	02501293ba	x86-32: PUSHF/POPF use/def EFLAGS Summary: As a side-quest for D6629 jvoung pointed out that I should use -verify-machineinstrs and this found a bug in x86-32's handling of EFLAGS for PUSHF/POPF. This patch fixes the use/def, and adds -verify-machineinstrs to all x86 tests which contain 'EFLAGS'. One exception: this patch leaves inline-asm-fpstack.ll as-is because it fails -verify-machineinstrs in a way unrelated to EFLAGS. This patch also modifies cmpxchg-clobber-flags.ll along the lines of what D6629 already does by also testing i386. Test Plan: ninja check Reviewers: t.p.northover, jvoung Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6687 llvm-svn: 224359	2014-12-16 20:15:45 +00:00
Rafael Espindola	0e63c2a0a5	Use CastInst::castIsValid to simplify the verifier. Also delete a dead member variable. llvm-svn: 224356	2014-12-16 19:29:29 +00:00
Matt Arsenault	4e68f48bcd	NVPTX: Remove duplicate of AsmPrinter::lowerConstant llvm-svn: 224355	2014-12-16 19:16:17 +00:00
Matt Arsenault	dbdac5d39f	Move lowerConstant to AsmPrinter This was a static function before, and NVPTX duplicated it because it wasn't exposed. llvm-svn: 224354	2014-12-16 19:16:14 +00:00
Quentin Colombet	d31121348b	[CodeGenPrepare] Move sign/zero extensions near loads using type promotion. This patch extends the optimization in CodeGenPrepare that moves a sign/zero extension near a load when the target can combine them. The optimization may promote any operations between the extension and the load to make that possible. Although this optimization may be beneficial for all targets, in particular AArch64, this is enabled for X86 only as I have not benchmarked it for other targets yet. Context Most targets feature extended loads, i.e., loads that perform a zero or sign extension for free. In that context it is interesting to expose such pattern in CodeGenPrepare so that the instruction selection pass can form such loads. Sometimes, this pattern is blocked because of instructions between the load and the extension. When those instructions are promotable to the extended type, we can expose this pattern. Motivating Example Let us consider an example: define void @foo(i8* %addr1, i32* %addr2, i8 %a, i32 %b) { %ld = load i8* %addr1 %zextld = zext i8 %ld to i32 %ld2 = load i32* %addr2 %add = add nsw i32 %ld2, %zextld %sextadd = sext i32 %add to i64 %zexta = zext i8 %a to i32 %addza = add nsw i32 %zexta, %zextld %sextaddza = sext i32 %addza to i64 %addb = add nsw i32 %b, %zextld %sextaddb = sext i32 %addb to i64 call void @dummy(i64 %sextadd, i64 %sextaddza, i64 %sextaddb) ret void } As it is, this IR generates the following assembly on x86_64: [...] movzbl (%rdi), %eax # zero-extended load movl (%rsi), %esi # plain load addl %eax, %esi # 32-bit add movslq %esi, %rdi # sign extend the result of add movzbl %dl, %edx # zero extend the first argument addl %eax, %edx # 32-bit add movslq %edx, %rsi # sign extend the result of add addl %eax, %ecx # 32-bit add movslq %ecx, %rdx # sign extend the result of add [...] The throughput of this sequence is 7.45 cycles on Ivy Bridge according to IACA. 
Now, by promoting the additions to form more extended loads we would generate: [...] movzbl (%rdi), %eax # zero-extended load movslq (%rsi), %rdi # sign-extended load addq %rax, %rdi # 64-bit add movzbl %dl, %esi # zero extend the first argument addq %rax, %rsi # 64-bit add movslq %ecx, %rdx # sign extend the second argument addq %rax, %rdx # 64-bit add [...] The throughput of this sequence is 6.15 cycles on Ivy Bridge according to IACA. This kind of sequence happens a lot in code using 32-bit indexes on 64-bit architectures. Note: The throughput numbers are similar on Sandy Bridge and Haswell. Proposed Solution To avoid the penalty of all these sign/zero extensions, we merge them in the loads at the beginning of the chain of computation by promoting all the chain of computation on the extended type. The promotion is done if and only if we do not introduce new extensions, i.e., if we do not degrade the code quality. To achieve this, we extend the existing “move ext to load” optimization with the promotion mechanism introduced to match larger patterns for addressing mode (r200947). The idea of this extension is to perform the following transformation: ext(promotableInst1(...(promotableInstN(load)))) => promotedInst1(...(promotedInstN(ext(load)))) The promotion mechanism in that optimization is enabled by a new TargetLowering switch, which is off by default. In other words, by default, the optimization performs the “move ext to load” optimization as it was before this patch. Performance Configuration: x86_64: Ivy Bridge fixed at 2900MHz running OS X 10.10. Tested Optimization Levels: O3/Os Tests: llvm-testsuite + externals. Results: - No regression beside noise. - Improvements: CINT2006/473.astar: ~2% Benchmarks/PAQ8p: ~2% Misc/perlin: ~3% The results are consistent for both O3 and Os. <rdar://problem/18310086> llvm-svn: 224351	2014-12-16 19:09:03 +00:00
Robert Khasanov	104b98b388	[AVX512] Enable integer arithmetic lowering for AVX512BW/VL subsets. Added lowering tests. llvm-svn: 224349	2014-12-16 18:24:07 +00:00
Colin LeMahieu	4932546e48	[Hexagon] Adding absolute value, and negate with saturation llvm-svn: 224346	2014-12-16 17:44:49 +00:00
Sanjay Patel	8363dd3b42	combine consecutive subvector 16-byte loads into one 32-byte load This is a fix for PR21709 ( http://llvm.org/bugs/show_bug.cgi?id=21709 ). When we have 2 consecutive 16-byte loads that are merged into one 32-byte vector, we can use a single 32-byte load instead. But we don't do this for SandyBridge / IvyBridge because they have slower 32-byte memops. We also don't bother using 32-byte integer loads on a machine that only has AVX1 (btver2) because those operands would have to be split in half anyway since there is no support for 32-byte integer math ops. Differential Revision: http://reviews.llvm.org/D6492 llvm-svn: 224344	2014-12-16 16:30:01 +00:00
Colin LeMahieu	585f29d985	[Hexagon] Adding saturate and swizzle instructions. llvm-svn: 224343	2014-12-16 16:27:17 +00:00
Robert Khasanov	8231be9f66	[AVX512] Add a comment for avx512_broadcast_pat multiclass llvm-svn: 224341	2014-12-16 16:12:11 +00:00
Colin LeMahieu	c1eb9c21e5	[Hexagon] Removing old multiply defs and updating references to new versions. llvm-svn: 224340	2014-12-16 16:10:01 +00:00
Vladimir Medic	6c45970ced	The single check for N64 inside MipsDisassemblerBase's subclasses is actually wrong. It should be testing for FeatureGP64bit. There are no functional changes. llvm-svn: 224339	2014-12-16 15:29:12 +00:00
Zoran Jovanovic	d72dae73a8	[mips][microMIPS] Implement SWP and LWP instructions Differential Revision: http://reviews.llvm.org/D5667 llvm-svn: 224338	2014-12-16 14:59:10 +00:00
Aaron Ballman	d1ab012d86	Fixing -Wsign-compare warnings; NFC. llvm-svn: 224337	2014-12-16 14:04:11 +00:00
Elena Demikhovsky	fe73fcc29b	Masked Load and Store Intrinsics in loop vectorizer. The loop vectorizer optimizes loops containing conditional memory accesses by generating masked load and store intrinsics. This decision is target dependent. http://reviews.llvm.org/D6527 llvm-svn: 224334	2014-12-16 11:50:42 +00:00
Bradley Smith	5d5a40a0f8	[ARM] Prevent PerformVCVTCombine from combining a vmul/vcvt with 8 lanes This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 224332	2014-12-16 10:59:27 +00:00
Elena Demikhovsky	06e22de2d3	X86: Added FeatureVectorUAMem for all AVX architectures. According to AVX specification: "Most arithmetic and data processing instructions encoded using the VEX prefix and performing memory accesses have more flexible memory alignment requirements than instructions that are encoded without the VEX prefix. Specifically, With the exception of explicitly aligned 16 or 32 byte SIMD load/store instructions, most VEX-encoded, arithmetic and data processing instructions operate in a flexible environment regarding memory address alignment, i.e. VEX-encoded instruction with 32-byte or 16-byte load semantics will support unaligned load operation by default. Memory arguments for most instructions with VEX prefix operate normally without causing #GP(0) on any byte-granularity alignment (unlike Legacy SSE instructions)." The same for AVX-512. This change does not affect anything right now, because only the "memop pattern fragment" depends on FeatureVectorUAMem and it is not used in AVX patterns. All AVX patterns are based on the "unaligned load" anyway. llvm-svn: 224330	2014-12-16 09:10:08 +00:00
Duncan P. N. Exon Smith	58ed764767	IR: Stop printing 'metadata' in Metadata::print() Stop printing `metadata` in `Metadata::print()` and `Metadata::printAsOperand()`. llvm-svn: 224327	2014-12-16 07:40:31 +00:00
Duncan P. N. Exon Smith	1fb1f7f9a7	IR: Make MDNode::dump() useful by adding addresses It's horrible to inspect `MDNode`s in a debugger. All of their operands that are `MDNode`s get dumped as `<badref>`, since we can't assign metadata slots in the context of a `Metadata::dump()`. (Why not? Why not assign numbers lazily? Because then each time you called `dump()`, a given `MDNode` could have a different lazily assigned number.) Fortunately, the C memory model gives us perfectly good identifiers for `MDNode`. Add pointer addresses to the dumps, transforming this: (lldb) e N->dump() !{i32 662302, i32 26, <badref>, null} (lldb) e ((MDNode)N->getOperand(2))->dump() !{i32 4, !"foo"} into: (lldb) e N->dump() !{i32 662302, i32 26, <0x100706ee0>, null} (lldb) e ((MDNode)0x100706ee0)->dump() !{i32 4, !"foo"} and this: (lldb) e N->dump() 0x101200248 = !{<badref>, <badref>, <badref>, <badref>, <badref>} (lldb) e N->getOperand(0) (const llvm::MDOperand) $0 = { MD = 0x00000001012004e0 } (lldb) e N->getOperand(1) (const llvm::MDOperand) $1 = { MD = 0x00000001012004e0 } (lldb) e N->getOperand(2) (const llvm::MDOperand) $2 = { MD = 0x0000000101200058 } (lldb) e N->getOperand(3) (const llvm::MDOperand) $3 = { MD = 0x00000001012004e0 } (lldb) e N->getOperand(4) (const llvm::MDOperand) $4 = { MD = 0x0000000101200058 } (lldb) e ((MDNode)0x00000001012004e0)->dump() !{} (lldb) e ((MDNode)0x0000000101200058)->dump() !{null} into: (lldb) e N->dump() !{<0x1012004e0>, <0x1012004e0>, <0x101200058>, <0x1012004e0>, <0x101200058>} (lldb) e ((MDNode)0x1012004e0)->dump() !{} (lldb) e ((MDNode)0x101200058)->dump() !{null} llvm-svn: 224325	2014-12-16 07:09:37 +00:00
Saleem Abdulrasool	c163948b80	ARM: diagnose deprecated syntax The use of SP and PC in the register list for stores is deprecated on ARM (ARM ARM A.8.8.199): ARM deprecates the use of ARM instructions that include the SP or the PC in the list. Provide a deprecation warning from the assembler in the case that the syntax is ever seen. llvm-svn: 224319	2014-12-16 05:53:25 +00:00
Hal Finkel	04ae4c36c5	[PowerPC] Improve instruction selection bit-permuting operations (32-bit) The PowerPC backend, somewhat embarrassingly, did not generate an optimal-length sequence of instructions for a 32-bit bswap. While adding a pattern for the bswap intrinsic to fix this would not have been terribly difficult, doing so would not have addressed the real problem: we had been generating poor code for many bit-permuting operations (by which I mean things like byte-swap that permute the bits of one or more inputs around in various ways). Here are some initial steps toward solving this deficiency. Bit-permuting operations are represented, at the SDAG level, using ISD::ROTL, SHL, SRL, AND and OR (mostly with constant second operands). Looking back through these operations, we can build up a description of the bits in the resulting value in terms of bits of one or more input values (and constant zeros). For each bit, we compute the rotation amount from the original value, and then group consecutive (value, rotation factor) bits into groups. Groups sharing these attributes are then collected and sorted, and we can then instruction select the entire permutation using a combination of masked rotations (rlwinm), imm ands (andi/andis), and masked rotation inserts (rlwimi). The result is that instead of lowering an i32 bswap as: rlwinm 5, 3, 24, 16, 23 rlwinm 4, 3, 24, 0, 7 rlwimi 4, 3, 8, 8, 15 rlwimi 5, 3, 8, 24, 31 rlwimi 4, 5, 0, 16, 31 we now produce: rlwinm 4, 3, 8, 0, 31 rlwimi 4, 3, 24, 16, 23 rlwimi 4, 3, 24, 0, 7 and for the 'test6' example in the PowerPC/README.txt file: unsigned test6(unsigned x) { return ((x & 0x00FF0000) >> 16) | ((x & 0x000000FF) << 16); } we used to produce: lis 4, 255 rlwinm 3, 3, 16, 0, 31 ori 4, 4, 255 and 3, 3, 4 and now we produce: rlwinm 4, 3, 16, 24, 31 rlwimi 4, 3, 16, 8, 15 and, as a nice bonus, this fixes the FIXME in test/CodeGen/PowerPC/rlwimi-and.ll. 
This commit does not include instruction-selection for i64 operations, those will come later. llvm-svn: 224318	2014-12-16 05:51:41 +00:00
Saleem Abdulrasool	55dd0f7b1d	ARM: 80-column clang-format a function with an overly long string constant. NFC. llvm-svn: 224314	2014-12-16 04:10:10 +00:00
Matthias Braun	93f392ca19	LiveRangeCalc: Rewrite subrange calculation This changes subrange calculation to calculate subranges sequentially instead of in parallel. The code is easier to understand that way and addresses the code review issues raised about LiveOutData being hard to understand/needing more comments by removing them :) llvm-svn: 224313	2014-12-16 04:03:38 +00:00
Rafael Espindola	b79a41f44b	Remove the last unnecessary member variable of mapped_file_region. NFC. llvm-svn: 224312	2014-12-16 03:10:29 +00:00
Rafael Espindola	f9c70f7fb7	Convert a member variable to a local variable. NFC. llvm-svn: 224311	2014-12-16 02:53:35 +00:00
Rafael Espindola	a52bb2443b	Remove unused member and simplify. NFC. llvm-svn: 224309	2014-12-16 02:19:26 +00:00
Rafael Espindola	f367e92d33	Start adding thin archive support. This is just sufficient for 'ar t' to work. llvm-svn: 224307	2014-12-16 01:43:41 +00:00
Adrian Prantl	33921ffabc	ARM/AArch64: Attach the FrameSetup MIFlag to CFI instructions. Debug info marks the first instruction without the FrameSetup flag as being the end of the function prologue. Any CFI instructions in the middle of the function prologue would cause debug info to end the prologue too early and worse, attach the line number of the CFI instruction, which incidentally is often 0. llvm-svn: 224294	2014-12-16 00:20:49 +00:00
Colin LeMahieu	4c0e2a35a6	[Hexagon] Adding doubleword multiplies with and without accumulation. llvm-svn: 224293	2014-12-16 00:07:24 +00:00
Michael Ilseman	d27db299e8	Sink the isa into the assert llvm-svn: 224291	2014-12-15 23:41:21 +00:00
Colin LeMahieu	0a4e0a7b23	[Hexagon] Adding halfword to doubleword multiplies. llvm-svn: 224289	2014-12-15 23:29:37 +00:00
Colin LeMahieu	b56764d577	[Hexagon] Adding logical-logical accumulation instructions and tests. llvm-svn: 224288	2014-12-15 23:19:07 +00:00
Sanjoy Das	0cdde3ea1f	Teach ScalarEvolution to exploit min and max expressions when proving isKnownPredicate. The motivation for this change is to optimize away checks in loops like this: limit = min(t, len) for (i = 0 to limit) if (i >= len || i < 0) throw_array_of_of_bounds(); a[i] = ... Differential Revision: http://reviews.llvm.org/D6635 llvm-svn: 224285	2014-12-15 22:50:15 +00:00
JF Bastien	27a63b4d77	x86: Emit LOCK prefix after DATA16 Summary: x86 allows either ordering for the LOCK and DATA16 prefixes, but using GCC+GAS leads to different code generation than using LLVM. This change matches the order that GAS emits the x86 prefixes when a semicolon isn't used in inline assembly (see tc-i386.c comment before define LOCK_PREFIX), and helps simplify tooling that operates on the instruction's byte sequence (such as NaCl's validator). This change shouldn't have any performance impact. Test Plan: ninja check Reviewers: craig.topper, jvoung Subscribers: jfb, llvm-commits Differential Revision: http://reviews.llvm.org/D6630 llvm-svn: 224283	2014-12-15 22:34:58 +00:00
Colin LeMahieu	a6e921963f	[Hexagon] Adding a number of additional multiply forms with tests. llvm-svn: 224282	2014-12-15 22:10:37 +00:00
Michael Ilseman	4239fd4c02	Clean up warning about unused variable llvm-svn: 224281	2014-12-15 21:47:09 +00:00
Matthias Braun	0e11dc527c	Revert "LiveRangeCalc: Rewrite subrange calculation" Revert until I find out why non-subreg enabled targets break. This reverts commit 6097277eefb9c5fb35a7f493c783ee1fd1b9d6a7. llvm-svn: 224278	2014-12-15 21:36:35 +00:00
Michael Ilseman	56b4b7d789	Revert of r223763, in spirit. r223763 was made to work around a temporary issue where a user of the JIT was passing down a declaration (incorrectly). This shouldn't occur, so assert rather than silently continue. llvm-svn: 224277	2014-12-15 21:36:29 +00:00
Mark Heffernan	4271864b43	Clarify HowFarToZero computation when the step is a positive power of two. Functionally this should be identical to the existing code except for the case where Step is maximally negative (eg, INT_MIN). We now punt in that one corner case to make reasoning about the code easier. llvm-svn: 224274	2014-12-15 21:19:53 +00:00
Colin LeMahieu	cb4ac18de9	[Hexagon] Adding misc multiply encodings and tests. llvm-svn: 224273	2014-12-15 21:17:03 +00:00
Matthias Braun	6e44c21bbb	LiveRangeCalc: Rewrite subrange calculation This changes subrange calculation to calculate subranges sequentially instead of in parallel. The code is easier to understand that way and addresses the code review issues raised about LiveOutData being hard to understand/needing more comments by removing them :) llvm-svn: 224272	2014-12-15 21:16:21 +00:00
Colin LeMahieu	cfd931a5a2	[Hexagon] Adding doubleword accumulating multiplies of halfwords. llvm-svn: 224267	2014-12-15 20:17:46 +00:00
Colin LeMahieu	410a9d158e	[Hexagon] Adding accumulating half word multiplies. llvm-svn: 224266	2014-12-15 20:10:28 +00:00
Colin LeMahieu	5b550fb31a	[Hexagon] Adding multiply with rnd/sat/rndsat llvm-svn: 224265	2014-12-15 20:01:59 +00:00
Matthias Braun	4126c05045	LiveRangeCalc: use more range based for loops; NFC llvm-svn: 224263	2014-12-15 19:40:46 +00:00
Colin LeMahieu	4225ddfd4f	[Hexagon] Adding encoding bits for halfword multiplies. llvm-svn: 224261	2014-12-15 19:22:07 +00:00
Ahmed Bougacha	d521315e3e	[X86] Also pretty-print shuffle mask for INSERTPS rm variants. llvm-svn: 224260	2014-12-15 19:17:54 +00:00
Duncan P. N. Exon Smith	9c5542c040	IR: Make metadata typeless in assembly Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802. - Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`. - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics. So, assembly like this: define @foo(i32 %v) { call void @llvm.foo(metadata !{i32 %v}, metadata !0) call void @llvm.foo(metadata !{i32 7}, metadata !0) call void @llvm.foo(metadata !1, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{metadata !3}, metadata !0) ret void, !bar !2 } !0 = metadata !{metadata !2} !1 = metadata !{i32* @global} !2 = metadata !{metadata !3} !3 = metadata !{} turns into this: define @foo(i32 %v) { call void @llvm.foo(metadata i32 %v, metadata !0) call void @llvm.foo(metadata i32 7, metadata !0) call void @llvm.foo(metadata i32* @global, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{!3}, metadata !0) ret void, !bar !2 } !0 = !{!2} !1 = !{i32* @global} !2 = !{!3} !3 = !{} I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases. This is part of PR21532. llvm-svn: 224257	2014-12-15 19:07:53 +00:00
Michael Ilseman	dd56e9aa72	Silence more static analyzer warnings. Add in definedness checks for shift operators, null checks when pointers are assumed by the code to be non-null, and explicit unreachables. llvm-svn: 224255	2014-12-15 18:48:43 +00:00
Vladimir Medic	ab769bd4d1	Add disassembler tests for mips3 platform. There are no functional changes. llvm-svn: 224253	2014-12-15 16:19:34 +00:00
Aaron Ballman	4c568b1783	Changing a cast from unsigned to uint64_t, should be NFC in practice. llvm-svn: 224249	2014-12-15 14:25:12 +00:00
Elena Demikhovsky	1560acad27	Sink store based on alias analysis - by Ella Bolshinsky The alias analysis is used to define whether the given instruction is a barrier for store sinking. For 2 identical stores, following instructions are checked in both basic blocks, to determine whether they are sinking barriers. http://reviews.llvm.org/D6420 llvm-svn: 224247	2014-12-15 14:09:53 +00:00
Michael Kuperstein	cc87d705cb	[X86] Break false dependencies before partial register updates when the source operand is in memory Adds the various "rm" instruction variants into the list of instructions that have a partial register update. Also adds all variants of SQRTSD that were missing in the original list. Differential Revision: http://reviews.llvm.org/D6620 llvm-svn: 224246	2014-12-15 13:18:21 +00:00
Elena Demikhovsky	51c511a201	AVX-512: Added EXPAND instructions and intrinsics. llvm-svn: 224241	2014-12-15 10:03:52 +00:00
Alexey Bataev	f8adf5cca1	Fix line mapping information in LLVM JIT profiling with VTune The line mapping information for dynamic code is reported incorrectly. It causes VTune to map LLVM generated code to source lines incorrectly. This patch fixes this issue. Patch by Denis Pravdin. Differential Revision: http://reviews.llvm.org/D6603 llvm-svn: 224229	2014-12-15 04:45:43 +00:00
David Majnemer	ad5d90d6f2	ThreadLocal: Move Unix-specific code out of Support/ThreadLocal.cpp Just a cleanup, no functionality change is intended. llvm-svn: 224227	2014-12-15 01:19:53 +00:00
David Majnemer	5756ece9cb	ThreadLocal: Return a mutable pointer if templated with a non-const type It makes more sense for ThreadLocal<const T>::get to return a const T* and ThreadLocal<T>::get to return a T*. llvm-svn: 224225	2014-12-15 01:04:45 +00:00
Elena Demikhovsky	b5f1976682	Loop Vectorizer minor changes in the code - some comments, function names, indentation. Reviewed here: http://reviews.llvm.org/D6527 llvm-svn: 224218	2014-12-14 09:43:50 +00:00
David Majnemer	e9e3482be6	APInt: udivrem should use machine instructions for single-word APInts This mirrors the behavior of APInt::udiv and APInt::urem. Some architectures, like X86, have a single instruction which can compute both division and remainder. llvm-svn: 224217	2014-12-14 09:41:56 +00:00
David Majnemer	2c4c28163b	ScalarEvolution: Remove SCEVUDivision, it's unused This is just a code simplification, no functionality change is intended. llvm-svn: 224216	2014-12-14 09:12:33 +00:00
Hal Finkel	acf8e7a584	[PowerPC] Handle cmp op promotion for SELECT[_CC] nodes in PPCTL::DAGCombineExtBoolTrunc PPCTargetLowering::DAGCombineExtBoolTrunc contains logic to remove unwanted truncations and extensions when dealing with nodes of the form: zext(binary-ops(binary-ops(trunc(x), trunc(y)), ...) There was a FIXME in the implementation (now removed) regarding the fact that the function would abort the transformations if any of the non-output operands of a SELECT or SELECT_CC node would need to be promoted (because they were also output operands, for example). As a result, we continued to generate unnecessary zero-extends for code such as this: unsigned foo(unsigned a, unsigned b) { return (a <= b) ? a : b; } which would produce: cmplw 0, 3, 4 isel 3, 4, 3, 1 rldicl 3, 3, 0, 32 blr and now we produce: cmplw 0, 3, 4 isel 3, 4, 3, 1 blr which is better in the obvious way. llvm-svn: 224213	2014-12-14 05:53:19 +00:00
Ahmed Bougacha	88111b0889	Reapply "[ARM] Combine base-updating/post-incrementing vector load/stores." r223862 tried to also combine base-updating load/stores. r224198 reverted it, as "it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown." Reapply, with a fix to ignore non-normal load/stores. Truncstores are handled elsewhere (you can actually write a pattern for those, whereas for postinc loads you can't, since they return two values), but it should be possible to also combine extloads base updates, by checking that the memory (rather than result) type is of the same size as the addend. Original commit message: We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 224203	2014-12-13 23:22:12 +00:00
Renato Golin	3418b50014	Revert "[ARM] Combine base-updating/post-incrementing vector load/stores." This reverts commit r223862, as it created a regression on the test-suite on test MultiSource/Benchmarks/Ptrdist/anagram by scrambling the order in which the words are shown. We'll investigate the issue and re-apply when safe. llvm-svn: 224198	2014-12-13 20:23:18 +00:00
Aaron Ballman	29c803c115	Silencing a -Wsign-compare warning; NFC. llvm-svn: 224195	2014-12-13 16:55:02 +00:00
Akira Hatanaka	e6d6f49584	Rename argument strings of codegen passes to avoid collisions with command line options. This commit changes the command line arguments (PassInfo::PassArgument) of two passes, MachineFunctionPrinter and MachineScheduler, to avoid collisions with command line options that have the same argument strings. This bug manifests when the PassList construct (defined in opt.cpp) is used in a tool that links with codegen passes. To reproduce the bug, paste the following lines into llc.cpp and run llc. #include "llvm/IR/LegacyPassNameParser.h" static llvm::cl::list<const llvm::PassInfo*, bool, llvm::PassNameParser> PassList(llvm::cl::desc("Optimizations available:")); rdar://problem/19212448 llvm-svn: 224186	2014-12-13 04:52:04 +00:00
Hal Finkel	30da0a42c8	[PowerPC] Add a DAGToDAG peephole to remove unnecessary zero-exts On PPC64, we end up with lots of i32 -> i64 zero extensions, not only from all of the usual places, but also from the ABI, which specifies that values passed are zero extended. Almost all 32-bit PPC instructions in PPC64 mode are defined to do something to the higher-order bits, and for some instructions, that action clears those bits (thus providing a zero-extended result). This is especially common after rotate-and-mask instructions. Adding an additional instruction to zero-extend the results of these instructions is unnecessary. This PPCISelDAGToDAG peephole optimization examines these zero-extensions, and looks back through their operands to see if all instructions will implicitly zero extend their results. If so, we convert these instructions to their 64-bit variants (which is an internal change only, the actual encoding of these instructions is the same as the original 32-bit ones) and remove the unnecessary zero-extension (changing where the INSERT_SUBREG instructions are to make everything internally consistent). llvm-svn: 224169	2014-12-12 23:59:36 +00:00
David Majnemer	70ded5026c	ValueTracking: Don't recurse too deeply in computeKnownBitsFromAssume Respect the MaxDepth recursion limit, doing otherwise will trigger an assert in computeKnownBits. This fixes PR21891. llvm-svn: 224168	2014-12-12 23:59:29 +00:00
Chad Rosier	b5c6a89ee6	[ARMConstantIsland] Insert tbb/tbh optimization where previous jump table resided. llvm-svn: 224165	2014-12-12 23:27:40 +00:00
Yaron Keren	8537a3f54c	Pass EC by reference to MemoryBufferMMapFile to return error code. Patch by Kim Grasman! llvm-svn: 224159	2014-12-12 22:27:53 +00:00
Michael Ilseman	6d636ee500	Clean up static analyzer warnings. Clang's static analyzer found several potential cases of undefined behavior, use of un-initialized values, and potentially null pointer dereferences in tablegen, Support, MC, and ADT. This cleans them up with specific assertions on the assumptions of the code. llvm-svn: 224154	2014-12-12 21:48:03 +00:00
Colin LeMahieu	e750275948	[Hexagon] Adding double word add/min/minu/max/maxu instructions and tests. llvm-svn: 224153	2014-12-12 21:29:25 +00:00
Colin LeMahieu	980129091a	[Hexagon] Adding J class call instructions. llvm-svn: 224150	2014-12-12 21:12:27 +00:00
Duncan P. N. Exon Smith	77f50da5c2	IR: Don't track nullptr on metadata RAUW The RAUW support in `Metadata` supports going to `nullptr` specifically to handle values being deleted, causing `ValueAsMetadata` to be deleted. Fix the case where the reference is from a `TrackingMDRef` (as opposed to an `MDOperand` or a `MetadataAsValue`). This is surprisingly rare -- metadata tracked by `TrackingMDRef` going to null -- but it came up in an openSUSE bootstrap during inlining. The tracking ref was held by the `ValueMap` because it was referencing a local, the basic block containing the local became dead after it had been merged in, and when the local was deleted, the tracking ref asserted in an `isa`. llvm-svn: 224146	2014-12-12 19:24:33 +00:00
Rafael Espindola	8851da85ea	MAP_FILE is the default. We don't need to add it. llvm-svn: 224144	2014-12-12 19:12:42 +00:00
Steven Wu	17876438f0	More code format fix from r224133, NFC llvm-svn: 224140	2014-12-12 18:48:37 +00:00
Rafael Espindola	518ffae495	Remove silly left over from the Windows resize_file implementation. I didn't notice the problem first because on a non debug build the CRT was just exiting the process without any message. llvm-svn: 224139	2014-12-12 18:37:43 +00:00
Rafael Espindola	de997f41a2	Move the resize file feature from mapped_file_region to the only user. This removes a duplicated stat on every file that llvm-ar looks at. llvm-svn: 224138	2014-12-12 18:13:23 +00:00
Rafael Espindola	cb99753ea9	Pass a FD to resize_file and add a testcase. I will add a real use in another commit. llvm-svn: 224136	2014-12-12 17:55:12 +00:00
Rafael Espindola	56ca84422a	Remove unused feature. NFC. llvm-svn: 224135	2014-12-12 17:35:34 +00:00
Steven Wu	498e8ef334	Restructure code from r224097. NFC llvm-svn: 224133	2014-12-12 17:21:54 +00:00
Robert Khasanov	64bf0f6845	[AVX512] Enabling bit logic lowering Added lowering tests. llvm-svn: 224132	2014-12-12 17:02:18 +00:00
Vasileios Kalintiris	fe5bb204f0	[mips] Enable code generation for MIPS-III. Summary: This commit enables the MIPS-III target and adds support for code generation of SELECT nodes. We have to use pseudo-instructions with custom inserters for these nodes as MIPS-III CPUs do not have conditional-move instructions. Depends on D6212 Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6464 llvm-svn: 224128	2014-12-12 15:16:46 +00:00
Robert Khasanov	efae7453cb	[AVX512] Enabling MIN/MAX lowering. Added lowering tests. llvm-svn: 224127	2014-12-12 15:10:43 +00:00
Andrea Di Biagio	0e11c13141	Reapply "[MachineScheduler] Fix for PR21807: minor code difference building with/without -g." This reapplies r224118 with a fix for test 'misched-code-difference-with-debug.ll'. That test was failing on some buildbots because it was x86 specific but it was missing a target triple. Added an explicit triple to test misched-code-difference-with-debug.ll. llvm-svn: 224126	2014-12-12 15:09:58 +00:00
Chad Rosier	5670ef081f	[Reassociate] Use dbgs() instead of errs(). llvm-svn: 224125	2014-12-12 14:44:12 +00:00
Vasileios Kalintiris	40712e048d	[mips] Support SELECT nodes for targets that don't have conditional-move instructions. Summary: For Mips targets that do not have conditional-move instructions, ie. targets before MIPS32 and MIPS-IV, we have to insert a diamond control-flow pattern in order to support SELECT nodes. In order to do that, we add pseudo-instructions with a custom inserter that emits the necessary control-flow that selects the correct value. With this patch we add complete support for code generation of Mips-II targets based on the LLVM test-suite. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6212 llvm-svn: 224124	2014-12-12 14:41:37 +00:00
Robert Khasanov	634dbbea7c	[AVX512] Minor fix in lowering pattern for broadcast instructions. No functional change. llvm-svn: 224122	2014-12-12 14:21:30 +00:00
Andrea Di Biagio	c2fd26c14f	Revert: [MachineScheduler] Fix for PR21807: minor code difference building with/without -g. Test 'misched-code-difference-with-debug.ll' was failing on some buildbots. llvm-svn: 224121	2014-12-12 13:34:03 +00:00
Suyog Sarda	d1c6149e4d	This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it. Test case : float hadd(float* a) { return (a[0] + a[1]) + (a[2] + a[3]); } AArch64 assembly before patch : ldp s0, s1, [x0] ldp s2, s3, [x0, #8] fadd s0, s0, s1 fadd s1, s2, s3 fadd s0, s0, s1 ret AArch64 assembly after patch : ldp d0, d1, [x0] fadd v0.2s, v0.2s, v1.2s faddp s0, v0.2s ret Reviewed Link : http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141208/248531.html llvm-svn: 224119	2014-12-12 12:53:44 +00:00
Andrea Di Biagio	77f74f08b1	[MachineScheduler] Fix for PR21807: minor code difference building with/without -g. This patch fixes the issue reported as PR21807. There was a minor difference in the generated code depending on the -g flag. The cause was that with -g the machine scheduler used a different scheduling strategy. This decision was based on the number of instructions in a schedule region and included debug instructions in that count. This patch fixes the issue in MISched and provides a test. Patch by Russell Gallop! llvm-svn: 224118	2014-12-12 12:41:22 +00:00
Charlie Turner	8bc4c033ad	Emit Tag_ABI_FP_16bit_format build attribute. The __fp16 type is unconditionally exposed. Since -mfp16-format is not yet supported, there is not a user switch to change this behaviour. This build attribute should capture the default behaviour of the compiler, which is to expose the IEEE 754 version of __fp16. When -mfp16-format is emitted, that will be the way to control the value of this build attribute. Change-Id: I8a46641ff0fd2ef8ad0af5f482a6d1af2ac3f6b0 llvm-svn: 224115	2014-12-12 11:59:18 +00:00
Ekaterina Romanova	b2120d0c25	A fix for PR21176. DW_OP_const <const> doesn't describe a constant value, but a value at a constant address. The proper way to describe a constant value is DW_OP_constu <const>, DW_OP_stack_value. Added DW_OP_stack_value to the stack. Marked incorrect-variable-debugloc1.ll to xfail for PowerPC64, while the failure (PR21881) is being investigated. llvm-svn: 224098	2014-12-12 05:11:47 +00:00
Steven Wu	896d3dd47b	Fix another infinite loop in InstCombine Summary: InstCombine infinite-loops for the testcase added. This is because InstCombine is generating instructions that can be optimized by itself. Fix by not optimizing frem if the optimized type is the same as original type. rdar://problem/19150820 Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D6634 llvm-svn: 224097	2014-12-12 04:34:07 +00:00
Matt Arsenault	c1a6f36235	R600: Fix min/max matching problems with unordered compares The returned operand needs to be permuted for the unordered compares. Also fix incorrectly producing fmin_legacy / fmax_legacy for f64, which don't exist. llvm-svn: 224094	2014-12-12 02:30:37 +00:00
Matt Arsenault	89a384686e	R600/SI: fmin/fmax_legacy are not associative llvm-svn: 224093	2014-12-12 02:30:33 +00:00
Matt Arsenault	9c85ddbf8c	R600/SI: Don't promote f32 select to i32 This is nice for the instruction patterns, but it complicates min / max matching. The select doesn't have the correct type and would require looking through the bitcasts for the real float operands. llvm-svn: 224092	2014-12-12 02:30:29 +00:00
Duncan P. N. Exon Smith	2c88fce7b0	Bitcode: Add missing "Remove in 4.0" comments llvm-svn: 224090	2014-12-12 02:11:31 +00:00
Matthias Braun	8b4b1ae185	Document that PassManager::add() may delete the pass right away. Also remove redundant documentation: - doxygen will copy documentation to overridden methods. - Use \copydoc on PIMPL classes instead of replicating the text. llvm-svn: 224089	2014-12-12 01:27:01 +00:00
Philip Reames	5f7fe2e3ac	Comment and minor code cleanup for GCStrategy (NFC) Updating comments to reflect the current state of the world after my recent changes to ownership structure and generally better describe what a GCStrategy is and how it works. llvm-svn: 224086	2014-12-12 00:49:03 +00:00
Matt Arsenault	b0274833dd	Add target hook for whether it is profitable to reduce load widths Add an option to disable optimization to shrink truncated larger type loads to smaller type loads. On SI this prevents using scalar load instructions in some cases, since there are no scalar extloads. llvm-svn: 224084	2014-12-12 00:00:24 +00:00
Sanjay Patel	e8ec85de5e	remove function names from comments; NFC llvm-svn: 224080	2014-12-11 23:38:43 +00:00
Matt Arsenault	ea199c18ab	R600/SI: Handle physical registers in getOpRegClass llvm-svn: 224079	2014-12-11 23:37:34 +00:00
Matt Arsenault	2a8e3283e5	R600/SI: Don't verify constant bus usage of flag ops This was checking if pseudo-operands like the source modifiers were using the constant bus, which happens to work because the values these all can be happen to be valid inline immediates. This fixes a later commit which starts checking the register class of the operands. llvm-svn: 224078	2014-12-11 23:37:32 +00:00
Duncan P. N. Exon Smith	e45da1b1ef	Bitcode: Use unsigned char to record MDStrings `MDString`s can have arbitrary characters in them. Prevent an assertion that fired in `BitcodeWriter` because of sign extension by copying the characters into the record as `unsigned char`s. Based on a patch by Keno Fischer; fixes PR21882. llvm-svn: 224077	2014-12-11 23:34:30 +00:00
Sanjay Patel	4688c23cf8	return without temporary; NFC llvm-svn: 224076	2014-12-11 23:30:36 +00:00
Matthias Braun	1deb9a9e81	Enable MachineVerifier in debug mode for X86, ARM, AArch64, Mips. llvm-svn: 224075	2014-12-11 23:18:03 +00:00
Ahmed Bougacha	4b8a22ae51	[X86] Add a temporary testcase for PR21876/r223996. llvm-svn: 224074	2014-12-11 23:07:52 +00:00
Duncan P. N. Exon Smith	972205b3d9	Bitcode: Add METADATA_NODE and METADATA_VALUE This reflects the typelessness of `Metadata` in the bitcode format, removing types from all metadata operands. `METADATA_VALUE` represents a `ValueAsMetadata`, and always has two fields: the type and the value. `METADATA_NODE` represents an `MDNode`, and unlike `METADATA_OLD_NODE`, doesn't store types. It stores operands at their ID+1 so that `0` can reference `nullptr` operands. Part of PR21532. llvm-svn: 224073	2014-12-11 23:02:24 +00:00
Hal Finkel	1b92efa70e	[PowerPC] Better lowering for add/or of a FrameIndex If we have an add (or an or that is really an add), where one operand is a FrameIndex and the other operand is a small constant, we can combine the lowering of the FrameIndex (which is lowered as an add of the FI and a zero offset) with the constant operand. Amusingly, this is an old potential improvement entry from lib/Target/PowerPC/README.txt which had never been resolved. In short, we used to lower: %X = alloca { i32, i32 } %Y = getelementptr {i32,i32}* %X, i32 0, i32 1 ret i32* %Y as: addi 3, 1, -8 ori 3, 3, 4 blr and now we produce: addi 3, 1, -4 blr which is much more sensible. llvm-svn: 224071	2014-12-11 22:51:06 +00:00
Duncan P. N. Exon Smith	fe1e836701	Bitcode: Add `OLD_` prefix to metadata node records I'm about to change these, so move the old ones out of the way. Part of PR21532. llvm-svn: 224070	2014-12-11 22:30:48 +00:00
Matt Arsenault	1744b97776	R600/SI: Use unordered equal instructions llvm-svn: 224067	2014-12-11 22:15:43 +00:00
Matt Arsenault	221a1a532c	R600/SI: Make more unordered comparisons legal This saves a second compare and an and / or by using the unordered comparison instructions. llvm-svn: 224066	2014-12-11 22:15:39 +00:00
Matt Arsenault	91d14e0009	R600/SI: Use unordered not equal instructions llvm-svn: 224065	2014-12-11 22:15:35 +00:00
Alexey Samsonov	a17215eae4	[ASan] Change fake stack and local variables handling. This commit changes the way we get fake stack from ASan runtime (to find use-after-return errors) and the way we represent local variables: - __asan_stack_malloc function now returns pointer to newly allocated fake stack frame, or NULL if frame cannot be allocated. It doesn't take pointer to real stack as an input argument, it is calculated inside the runtime. - __asan_stack_free function doesn't take pointer to real stack as an input argument. Now this function is never called if fake stack frame wasn't allocated. - __asan_init version is bumped to reflect changes in the ABI. - new flag "-asan-stack-dynamic-alloca" allows to store all the function local variables in a dynamic alloca, instead of the static one. It reduces the stack space usage in use-after-return mode (dynamic alloca will not be called if the local variables are stored in a fake stack), and improves the debug info quality for local variables (they will not be described relatively to %rbp/%rsp, which are assumed to be clobbered by function calls). This flag is turned off by default for now, but I plan to turn it on after more testing. llvm-svn: 224062	2014-12-11 21:53:03 +00:00
Duncan P. N. Exon Smith	d43538fc17	CodeGen: Stop using LeakDetector for MachineInstr Since `MachineInstr` is required to have a trivial destructor, it cannot remove itself from `LeakDetection`. Remove the calls. As it happens, this requirement is because `MachineFunction` allocates all `MachineInstr`s in a custom allocator; when the `MachineFunction` is destroyed they're dropped off the edge. There's no benefit to detecting leaks. llvm-svn: 224061	2014-12-11 21:51:37 +00:00
Duncan P. N. Exon Smith	3258b2e21f	IR: Store MDNodes in a separate LeakDetector container This gives us better leak detection messages, like `Value` has. This also has the side effect of papering over a problem where `MachineInstr`s are added as garbage to the leak detector and then deleted without being removed. If `MDNode::getTemporary()` allocates an `MDNodeFwdDecl` in the same spot, the leak detector asserts. By separating `MDNode`s into their own container we lose that assertion. Since `MachineInstr` is required to have a trivial destructor, its usage of `LeakDetector` at all is pretty suspect. I'll be sending a patch soon to strip that out. llvm-svn: 224060	2014-12-11 21:39:39 +00:00
Matthias Braun	aa888a6f1e	[CodeGen] Add print and verify pass after each MachineFunctionPass by default Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnoticed when no verification is performed. To make this change practical I added the possibility to explicitly disable verification. I used this option on all places where no verification was performed previously (because a lot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. This is the 2nd attempt at this after realizing that PassManager::add() may actually delete the pass. llvm-svn: 224059	2014-12-11 21:26:47 +00:00
David Majnemer	87f7df4d2e	AsmParser: Don't crash on an ill-formed MDNodeVector llvm-svn: 224056	2014-12-11 20:51:54 +00:00
Andrea Di Biagio	6186490ec7	[InstCombine][X86] Improved folding of calls to Intrinsic::x86_sse4a_insertqi. This patch teaches the instruction combiner how to fold a call to 'insertqi' if the 'length field' (3rd operand) is set to zero, and if the sum between field 'length' and 'bit index' (4th operand) is bigger than 64. From the AMD64 Architecture Programmer's Manual: 1. If the sum of the bit index + length field is greater than 64, then the results are undefined; 2. A value of zero in the field length is defined as a length of 64. This patch improves the existing combining logic for intrinsic 'insertqi' adding extra checks to address both point 1. and point 2. Differential Revision: http://reviews.llvm.org/D6583 llvm-svn: 224054	2014-12-11 20:44:59 +00:00
David Majnemer	3705a77a71	AsmParser: Don't crash on an ill-formed MDNodeVector llvm-svn: 224053	2014-12-11 20:44:09 +00:00
Rafael Espindola	a4ea055f1a	Remove a convoluted way of calling close by moving the call to the only caller. As a bonus we can actually check the return value. llvm-svn: 224046	2014-12-11 20:12:55 +00:00
Rafael Espindola	aa48306a03	This reverts commit r224043 and r224042. check-llvm was failing. llvm-svn: 224045	2014-12-11 20:03:57 +00:00
Michael Ilseman	5ae09fdaf3	Silence static analyzer warnings in LLVMSupport. The static analyzer catches a few potential bugs in LLVMSupport. Add in asserts to silence the warnings. llvm-svn: 224044	2014-12-11 19:46:38 +00:00
Matthias Braun	bf0827b784	Enable machineverifier in debug mode for X86, ARM, AArch64, Mips llvm-svn: 224043	2014-12-11 19:42:09 +00:00
Matthias Braun	42e36608f0	[CodeGen] Add print and verify pass after each MachineFunctionPass by default Previously print+verify passes were added in a very unsystematic way, which is annoying when debugging as you miss intermediate steps and allows bugs to stay unnoticed when no verification is performed. To make this change practical I added the possibility to explicitly disable verification. I used this option on all places where no verification was performed previously (because a lot of places actually don't pass the MachineVerifier). In the long term these problems should be fixed properly and verification enabled after each pass. I'll enable some more verification in subsequent commits. llvm-svn: 224042	2014-12-11 19:42:05 +00:00
Matthias Braun	335449f68a	[CodeGen] Let MachineVerifierPass own its banner string llvm-svn: 224041	2014-12-11 19:41:51 +00:00
Colin LeMahieu	f4ec473c32	[Hexagon] Renaming classes in preparation for replacement. llvm-svn: 224036	2014-12-11 19:01:28 +00:00
Tim Northover	2e6f9cc501	ARM: convert isTargetIOS checks to isTargetDarwin. The distinction is mostly useful in the front-end. By the time we get here, there are very few situations where we actually want different behaviour for Darwin and IOS (in fact Darwin mostly just exists in a few tests). So this should reduce any surprising weirdness for anyone using it. No functional change on anything anyone actually cares about. llvm-svn: 224035	2014-12-11 18:49:37 +00:00
Hal Finkel	f4a8d09521	[PowerPC] Implement BuildSDIVPow2, lower i64 pow2 sdiv using sradi PPCISelDAGToDAG contained existing code to lower i32 sdiv by a power-of-2 using srawi/addze, but did not implement the i64 case. DAGCombine now contains a callback specifically designed for this purpose (BuildSDIVPow2), and part of the logic has been moved to an implementation of that callback. Doing this lowering using BuildSDIVPow2 likely does not matter, compared to handling everything in PPCISelDAGToDAG, for the positive divisor case, but the negative divisor case, which generates an additional negation, can potentially benefit from additional folding from DAGCombine. Now, both the i32 and the i64 cases have been implemented. Fixes PR20732. llvm-svn: 224033	2014-12-11 18:37:52 +00:00
Rafael Espindola	22caf7934c	Remove dead code. NFC. llvm-svn: 224029	2014-12-11 17:17:26 +00:00
Cameron McInally	a7f40d9986	[AVX512] Add support for 512b variable bit shift intrinsics. llvm-svn: 224028	2014-12-11 17:13:05 +00:00
Colin LeMahieu	b0c5eb965a	[Hexagon] Adding i64 <- i32, i32 sextw pattern. llvm-svn: 224027	2014-12-11 17:08:21 +00:00
Colin LeMahieu	de7232ce5b	[Hexagon] Adding encoding information for sign extend word instruction. llvm-svn: 224026	2014-12-11 16:43:06 +00:00
Elena Demikhovsky	e879b19906	AVX-512: Added all forms of COMPRESS instruction + intrinsics + tests llvm-svn: 224019	2014-12-11 15:02:24 +00:00
Jozef Kolek	3a4db003e2	[mips][microMIPS] Implement CodeGen support for LI16 instruction. Differential Revision: http://reviews.llvm.org/D5840 llvm-svn: 224017	2014-12-11 13:56:23 +00:00
Michael Kuperstein	c280bc0e29	The inliner needs to fix up debug information for llvm.dbg.declare, not only for llvm.dbg.value. Patch by Amjad Aboud Differential Revision: http://reviews.llvm.org/D6525 llvm-svn: 224015	2014-12-11 12:41:10 +00:00
Michael Kuperstein	a0c5a09356	[X86] When converting movs to pushes, don't assume MOVmi operand is an actual immediate This should fix PR21878. llvm-svn: 224010	2014-12-11 11:26:16 +00:00
Patrik Hagglund	fdf10dc04c	Bugfix in InlineSpiller::traceSiblingValue(). Properly determine whether or not a phi was added by splitting. Check against the current VNInfo of OrigLI instead of against the OrigVNI argument. Patch provided by Jonas Paulsson. Reviewed by Quentin Colombet. llvm-svn: 224009	2014-12-11 10:40:17 +00:00
Elena Demikhovsky	42a41becb2	AVX-512: Fixed a bug in lowering setcc for MVT::i1 type llvm-svn: 224008	2014-12-11 10:21:12 +00:00
Kumar Sukhani	be55fd773c	test commit (spelling correction) llvm-svn: 224007	2014-12-11 08:33:36 +00:00
Erik Eckstein	4078937e34	Refactor creation of overflow result tuples in InstCombineCalls. Extract the creation of overflow result tuples in a separate function. NFC. llvm-svn: 224006	2014-12-11 08:02:30 +00:00
Craig Topper	9a511af4f9	Use range-based for loops. NFC llvm-svn: 224005	2014-12-11 07:04:54 +00:00
Ekaterina Romanova	2d1303b0d6	Reverting commit 223981, because the test that I added (incorrect-variable-debugloc1.ll) failed for llvm-ppc64. The test is failing for llvm-ppc64 because for this platform the location list is not being generated at all (most likely because of a bug in PPC code optimization or generation). I will file a bug against the PPC compiler, but meanwhile, until the PPC bug is fixed, I will have to revert my change. llvm-svn: 224000	2014-12-11 06:22:35 +00:00
Craig Topper	09f2fc9487	Make MultiClass::DefPrototypes own their Records to fix memory leaks. llvm-svn: 223998	2014-12-11 05:25:33 +00:00
Craig Topper	4e19eb8884	Replace std::map<K, V*> with std::map<K, std::unique_ptr<V>> to handle ownership and deletion of the values. Ideally we would store the MultiClasses by value directly in the maps, but I had some trouble with that before and this at least fixes the leak. llvm-svn: 223997	2014-12-11 05:25:30 +00:00
Ahmed Bougacha	9304854896	[X86] Add back AVX2 VR256 PMOVX patterns. We can't reach those from zext, but other parts of the backend (the shuffle lowering) generate 256-bit VZEXT nodes. Fixes PR21876. llvm-svn: 223996	2014-12-11 04:32:17 +00:00
Nick Lewycky	62f8f08187	Fix LLVMContext to match what MDKind names that the LL parser permits. Fixes PR21799! llvm-svn: 223995	2014-12-11 02:10:28 +00:00
Philip Reames	e6833acc3a	GCStrategy should not own GCFunctionInfo This change moves the ownership and access of GCFunctionInfo (the object which describes the safepoints associated with a safepoint under GCRoot) to GCModuleInfo. Previously, this was owned by GCStrategy which was in turn owned by GCModuleInfo. This made GCStrategy module specific which is 'surprising' given its name and other purposes. There are a few more changes needed, but we're getting towards the point we can reuse GCStrategy for gc.statepoint as well. p.s. The style of this code ends up being a mess. I was trying to move code around without otherwise changing much. Once I get the ownership structure rearranged, I will go through and fixup spacing, naming, comments etc. Differential Revision: http://reviews.llvm.org/D6587 llvm-svn: 223994	2014-12-11 01:47:23 +00:00
Matthias Braun	675c7b6a7c	LiveInterval: Use range based for loops for subregister ranges. llvm-svn: 223991	2014-12-11 00:59:06 +00:00
Tim Northover	2e78e5c83f	ARM: correctly expand LDR-lit based globals. Quite a major error here: the expansions for the Pseudos with and without folded load were mixed up. Fortunately it only affects ARM-mode, when not using movw/movt, on Darwin. I'm guessing no-one actually uses that combination. llvm-svn: 223986	2014-12-10 23:40:50 +00:00
Ekaterina Romanova	33f856278b	A fix for PR21176. DW_OP_const <const> doesn't describe a constant value, but a value at a constant address. The proper way to describe a constant value is DW_OP_constu <const>, DW_OP_stack_value. Added DW_OP_stack_value to the stack. llvm-svn: 223981	2014-12-10 23:19:56 +00:00
Matthias Braun	549d124e5c	LiveInterval: Use more range based for loops for value numbers and segments. llvm-svn: 223978	2014-12-10 23:07:54 +00:00
Mark Heffernan	614c4b347a	Fix PR21694. r219517 added a use of SCEV divide in HowFarToZero computation. This divide can produce incorrect results as we are using an unsigned divide for what should be a modular divide. This change reverts back to a more conservative computation using trailing zeros. llvm-svn: 223974	2014-12-10 22:53:52 +00:00
Colin LeMahieu	6bafcd8eab	[Hexagon] Adding combine ri/ir instructions. llvm-svn: 223971	2014-12-10 22:23:07 +00:00
David Majnemer	fb504668cb	ConstantFold: Clean up X * undef code No functional change intended. llvm-svn: 223970	2014-12-10 21:58:17 +00:00
David Majnemer	cde1ba6638	ConstantFold, InstSimplify: undef >>a x can be either -1 or 0, choose 0 Zero is usually a nicer constant to have than -1. llvm-svn: 223969	2014-12-10 21:58:15 +00:00
David Majnemer	56c5d273bf	ConstantFold: an undef shift amount results in undef X shifted by undef results in undef because the undef value can represent values greater than the width of the operands. llvm-svn: 223968	2014-12-10 21:38:05 +00:00
Colin LeMahieu	5a093ecd78	[Hexagon] Adding encodings for JR class instructions. Updating compiler usages. llvm-svn: 223967	2014-12-10 21:24:10 +00:00
Rafael Espindola	a94308f825	Move three methods only used by MCJIT to MCJIT. These methods are only used by MCJIT and are very specific to it. In fact, they are also fairly specific to the fact that we have a dynamic linker of relocatable objects. llvm-svn: 223964	2014-12-10 20:46:55 +00:00
Juergen Ributzka	8175f5b997	[AArch64] MachO large code-model: Materialize FP constants in code. In the large code model we have to first get the address of the GOT entry, load the address of the constant, and then load the constant itself. To avoid these loads and the GOT entry altogether this commit changes the way FP constants are materialized in the large code model. The constants are now materialized in a GPR and then bitconverted/moved into the FPR. Reviewed by Tim Northover Fixes rdar://problem/16572564. llvm-svn: 223941	2014-12-10 19:43:32 +00:00
Marek Olsak	74a7e40b65	R600/SI: Use getTargetConstant in AdjustRegClass llvm-svn: 223940	2014-12-10 19:25:31 +00:00
Colin LeMahieu	490a0f9c58	[Hexagon] Adding JR class predicated call reg instructions. llvm-svn: 223933	2014-12-10 18:24:16 +00:00
Sanjay Patel	ecf92813fa	Match new shuffle codegen for MOVHPD patterns Add patterns to match SSE (shufpd) and AVX (vpermilpd) shuffle codegen when storing the high element of a v2f64. The existing patterns were only checking for an unpckh type of shuffle. http://llvm.org/bugs/show_bug.cgi?id=21791 Differential Revision: http://reviews.llvm.org/D6586 llvm-svn: 223929	2014-12-10 16:58:54 +00:00
Aaron Ballman	b249bf3fc3	Silencing a -Wsequence-point warning, and the resulting undefined behavior. NFC. llvm-svn: 223926	2014-12-10 14:14:54 +00:00
David Majnemer	fdaa66cd59	ConstantFold: div undef, 0 should fold to undef, not zero Dividing by zero yields an undefined value. llvm-svn: 223924	2014-12-10 09:14:55 +00:00
David Majnemer	59eb07e1c0	InstSimplify: [al]shr exact undef, %X -> undef Exact shifts always keep the non-zero bits of their input. This means they keep their undef bits. llvm-svn: 223923	2014-12-10 09:14:52 +00:00
Michael Kuperstein	2b0f6b010a	[X86] Make a code path in EltsFromConsecutiveLoads work only on vectors it expects EltsFromConsecutiveLoads was apparently only ever called for 128-bit vectors, and assumed this implicitly. r223518 started calling it for AVX-sized vectors, causing the code path that had this assumption to crash. This adds a check to make this path fire only for 128-bit vectors. Differential Revision: http://reviews.llvm.org/D6579 llvm-svn: 223922	2014-12-10 08:46:12 +00:00
David Majnemer	8ec2668c75	InstSimplify: div %X, 0 -> undef We already optimized rem %X, 0 to undef, we should do the same for div. llvm-svn: 223919	2014-12-10 07:52:18 +00:00
David Majnemer	df607e95a0	DataLayout: Provide nicer diagnostics for malformed strings llvm-svn: 223911	2014-12-10 02:36:41 +00:00
David Majnemer	6f52870a48	AsmParser: Don't allow null bytes in BB labels Since Value objects can't have null bytes in their name, we shouldn't allow them in the labels of basic blocks. llvm-svn: 223907	2014-12-10 02:10:35 +00:00
Duncan P. N. Exon Smith	2fceb99340	IR: Move call to dropAllReferences() to MDNode subclasses Don't call `dropAllReferences()` from `MDNode::~MDNode()`, call it directly from `~MDNodeFwdDecl()` and `~GenericMDNode()`. llvm-svn: 223904	2014-12-10 01:45:04 +00:00
David Majnemer	c64f605e39	DataLayout: Be more verbose when diagnosing problems in pointer specs llvm-svn: 223903	2014-12-10 01:38:28 +00:00
David Majnemer	6ed74720c1	DataLayout: Move asserts over to report_fatal_error As indicated by the tests, it is possible to feed the AsmParser an invalid datalayout string. We should verify the result of parsing this string regardless of whether or not we have assertions enabled. llvm-svn: 223898	2014-12-10 01:17:08 +00:00
Matthias Braun	a4879cff1f	MachineVerifier: Allow physreg use if just a subreg is defined. We can't mark partially undefined registers, so we have to allow reading a register in the machine verifier if just parts of a register are defined. llvm-svn: 223896	2014-12-10 01:13:13 +00:00
Matthias Braun	468dc2a2ae	MachineVerifier: Allow LiveInterval segments to end at a partial write. In the subregister liveness tracking case we do not create implicit reads on partial register writes anymore, still we need to produce a new SSA value for partial writes so the live segment has to end. llvm-svn: 223895	2014-12-10 01:13:11 +00:00
Matthias Braun	127bb01000	VirtRegMap: Improve block live-in info if subregister liveness is available. llvm-svn: 223894	2014-12-10 01:13:08 +00:00
Matthias Braun	dea09a72f9	VirtRegMap: No implicit defs/uses for super registers with subreg liveness tracking. Adding the implicit defs/uses to the superregisters is semantically questionable but was not dangerous before as the register allocator never assigned the same register to two overlapping LiveIntervals even when the actually live subregisters do not overlap. With subregister liveness tracking enabled this does actually happen and leads to subsequent bugs if we don't stop adding the superregister defs/uses. llvm-svn: 223892	2014-12-10 01:13:04 +00:00
Matthias Braun	96b93cde64	LiveRegMatrix: Respect subregister liveness when allocating registers. llvm-svn: 223891	2014-12-10 01:13:01 +00:00
Matthias Braun	1ace5adf9a	LiveIntervalUnion: Allow specification of liverange when unifying/extracting. This allows it to add subregister ranges into the union. llvm-svn: 223890	2014-12-10 01:12:59 +00:00
Matthias Braun	aee137b922	RegisterCoalescer: Preserve subregister liveranges. llvm-svn: 223888	2014-12-10 01:12:52 +00:00
Matthias Braun	1a64f03c3e	LiveInterval: Add removeEmptySubRanges(). llvm-svn: 223887	2014-12-10 01:12:40 +00:00
Matthias Braun	b3f3f853d1	LiveIntervalAnalysis: Add subregister aware variants pruneValue(). llvm-svn: 223886	2014-12-10 01:12:36 +00:00
Matthias Braun	2f6ca57115	Add a flag to enable/disable subregister liveness. llvm-svn: 223884	2014-12-10 01:12:30 +00:00
Matthias Braun	15bc252518	LiveIntervalAnalysis: Adapt repairIntervalsInRange() to subregister liveness. llvm-svn: 223883	2014-12-10 01:12:26 +00:00
Matthias Braun	6a29b7d65d	LiveRangeEdit: Adapt eliminateDeadDef() to subregister liveness. llvm-svn: 223882	2014-12-10 01:12:23 +00:00
Matthias Braun	e99321cb2c	LiveIntervalAnalysis: Adapt handleMove() to subregister ranges. llvm-svn: 223881	2014-12-10 01:12:20 +00:00
Matthias Braun	c0de88f8cf	LiveIntervalAnalysis: Update SubRanges in shrinkToUses(). llvm-svn: 223880	2014-12-10 01:12:18 +00:00
Matthias Braun	811d864c60	LiveIntervalAnalysis: Compute subregister ranges. llvm-svn: 223878	2014-12-10 01:12:12 +00:00
Matthias Braun	40d9c3d4f3	LiveInterval: Add support to track liveness of subregisters. This code adds the required data structures. Algorithms to compute it follow. llvm-svn: 223877	2014-12-10 01:12:10 +00:00
Matthias Braun	8d1fe23470	LiveInterval: Add a 'covers' operation to LiveRange. llvm-svn: 223876	2014-12-10 01:12:06 +00:00
David Majnemer	2d2a18adb4	AsmParser: Don't crash if a null byte is inside a quoted string We don't allow Value* to have names which contain null bytes. The AsmParser should reject .ll files that try to do this. llvm-svn: 223869	2014-12-10 00:43:17 +00:00
Ahmed Bougacha	9f7458d44b	[ARM] Combine base-updating/post-incrementing vector load/stores. We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). Differential Revision: http://reviews.llvm.org/D6585 llvm-svn: 223862	2014-12-10 00:07:37 +00:00
Philip Reames	ef92ec467d	Remove the Module pointer from GCStrategy and GCMetadataPrinter In the current implementation, GCStrategy is a part of the ownership structure for the gc metadata which describes a Module. It also contains a reference to the module in question. As a result, GCStrategy instances are essentially Module specific. I plan to transition away from this design. Instead, a GCStrategy will be owned by the LLVMContext. It will be a lightweight policy object which contains no information about the Modules or Functions involved, but can be easily reached given a Function. The first step in this transition is to remove the direct Module reference from GCStrategy. This also requires removing the single user of this reference, the GCMetadataPrinter hierarchy. In theory, this will allow the lifetime of the printers to be scoped to the LLVMContext as well, but in practice, I'm not actually changing that. (Yet?) An alternate design would have been to move the direct Module reference into the GCMetadataPrinter and change the keying of the owning maps to explicitly key off both GCStrategy and Module. I'm open to doing it that way instead, but didn't see much value in preserving the per Module association for GCMetadataPrinters. The next change in this sequence will be to start unwinding the intertwined ownership between GCStrategy, GCModuleInfo, and GCFunctionInfo. Differential Revision: http://reviews.llvm.org/D6566 llvm-svn: 223859	2014-12-09 23:57:54 +00:00
Duncan P. N. Exon Smith	c0e64c52d3	IR: Fix memory corruption in MDNode new/delete There were two major problems with `MDNode` memory management. 1. `MDNode::operator new()` called a placement array constructor for `MDOperand`. What? Each operand needs to be placed individually. 2. `MDNode::operator delete()` failed to destruct the `MDOperand`s at all. Frankly it's hard to understand how this worked locally, how this survived an LTO bootstrap, or how it worked on most of the bots. llvm-svn: 223858	2014-12-09 23:56:39 +00:00
David Majnemer	c91d74860c	AsmParser: Verify that the contents of a hex integer are hex llvm-svn: 223856	2014-12-09 23:50:38 +00:00
Kaelyn Takata	b44fdb0f19	Rename static function "map" to be more descriptive and to avoid potential confusion with the std::map type. llvm-svn: 223853	2014-12-09 23:32:46 +00:00
Duncan P. N. Exon Smith	9c554e7d55	IR: Metadata: Detect an RAUW recursion Speculatively handle a recursion in `GenericMDNode::handleChangedOperand()`. I'm hoping this fixes the failing hexagon bot [1]. [1]: http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/13434 llvm-svn: 223849	2014-12-09 23:04:59 +00:00
Michael Zolotukhin	54137f16c1	Remove redundant variable. Tested by adding assert(LoopVectorPreHeader == VecPreheader) on LLVM test suite and SPECs. llvm-svn: 223847	2014-12-09 22:45:07 +00:00
Colin LeMahieu	a380819b69	[Hexagon] [NFC] Cleaning up unused classes. llvm-svn: 223845	2014-12-09 22:33:26 +00:00
Ahmed Bougacha	b49dc58627	[ARM] Factor out base-updating VLD/VST combiner function. NFC. Move the combiner-state check into another function, add a few small comments, and use a more general type in a cast<>. In preparation for a future patch. llvm-svn: 223834	2014-12-09 21:30:00 +00:00
Ahmed Bougacha	169238fd92	[ARM] Move the store combiner function down. NFC. And flip its final condition. In preparation for a future patch. llvm-svn: 223833	2014-12-09 21:26:53 +00:00
Ahmed Bougacha	80726eea3d	[ARM] Also support v2f64 vld1/vst1. It was missing from the VLD1/VST1 handling logic, even though the corresponding instructions exist (same form as v2i64). In preparation for a future patch. llvm-svn: 223832	2014-12-09 21:25:00 +00:00
Duncan P. N. Exon Smith	2f0a43e16e	IR: Metadata/Value split: RAUW in a deterministic order RAUW in a deterministic order to try to recover the hexagon bot [1], whose tests started failing once my GCC fixes were in for r223802. Otherwise, I'm not sure why tests would fail there and not here. [1]: http://lab.llvm.org:8011/builders/llvm-hexagon-elf/builds/13426 llvm-svn: 223829	2014-12-09 21:12:56 +00:00
Rafael Espindola	2b82749cbe	Return ErrorOr<std::unique_ptr<Archive>> from getAsArchive. This is the same return type of Archive::create. llvm-svn: 223827	2014-12-09 21:05:36 +00:00
Hans Wennborg	d118238f64	Try fixing MSVC build after r223802 LLVM_EXPLICIT is only supported by recent versions of MSVC, and it seems the not-so-recent versions get confused about the operator bool() when trying to resolve operator== calls. This removed the operator bool()'s since they don't seem to be used anyway. llvm-svn: 223824	2014-12-09 20:39:15 +00:00
Colin LeMahieu	442a846f8d	[Hexagon] Fixing broken tests. llvm-svn: 223823	2014-12-09 20:36:53 +00:00
Rafael Espindola	52a8798c70	Rename createIRObjectFile to just create. It is a static method of IRObjectFile, so having to use IRObjectFile::createIRObjectFile was redundant. llvm-svn: 223822	2014-12-09 20:36:13 +00:00
Colin LeMahieu	867128021f	[Hexagon] Updating rr/ri 32/64 transfer encodings and adding tests. llvm-svn: 223821	2014-12-09 20:23:30 +00:00
Juergen Ributzka	a7f0b27412	[FastISel][AArch64] Fix a missing nullptr check in 'computeAddress'. The load/store value type is currently not available when lowering the memcpy intrinsic. Add the missing nullptr check to support this in 'computeAddress'. Fixes rdar://problem/19178947. llvm-svn: 223818	2014-12-09 19:44:38 +00:00
Colin LeMahieu	56e9b3ffa1	[Hexagon] Adding word combine dot-new form and replacing old combine opcode. llvm-svn: 223815	2014-12-09 19:23:45 +00:00
Chandler Carruth	2100e2da37	Revert r223764 which taught instcombine about integer-based element extraction patterns. This is causing Clang to miscompile itself for 32-bit x86 somehow, and likely also on ARM and PPC. I really don't know how, but reverting now that I've confirmed this is actually the culprit. I have a reproduction as well and so should be able to restore this shortly. This reverts commit r223764. Original commit log follows: Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of a std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and stores which remain, it uses integer math to recombine the values into a large integer to load or store. All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaches InstCombine to use this representation. With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library. For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work: - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer. - We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA. - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently. - We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO. llvm-svn: 223813	2014-12-09 19:21:16 +00:00
David Majnemer	e1b75899bd	AsmParser: Don't crash on short hex constants for fp128 types If we see 0xL01, treat it like 0xL00000000000000000000000000000001 instead of crashing. llvm-svn: 223811	2014-12-09 19:10:03 +00:00
Frederic Riss	325a4c3cb9	Remove unneeded curly braces. llvm-svn: 223809	2014-12-09 18:57:39 +00:00
Frederic Riss	3c7cb43338	Reorder the code to avoid inserting at the beginning of a vector. As per dblaikie's suggestion, thanks! llvm-svn: 223808	2014-12-09 18:57:34 +00:00
Duncan P. N. Exon Smith	8c28346293	Fix a GCC build failure from r223802 llvm-svn: 223806	2014-12-09 18:52:38 +00:00

75245 Commits