llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 19:52:54 +01:00

Author	SHA1	Message	Date
Max Kazantsev	90a71c8d58	[SCEV][NFC] Remove TBB, FBB parameters from exit limit computations Methods `computeExitLimitFromCondCached` and `computeExitLimitFromCondImpl` take true and false branches as parameters and only use them for asserts and for identifying whether true/false branch belongs to the loop (which can be done once earlier). This fact complicates generalization of exit limit computation logic on guards because the guards don't have blocks to which they go in case of failure explicitly. The motivation of this patch is that currently this part of SCEV knows nothing about guards and only works with explicit branches. As a result, it fails to prove that a loop for (i = 0; i < 100; i++) guard(i < 10); exits after 10th iteration, while in the equivalent example for (i = 0; i < 100; i++) if (i >= 10) break; SCEV easily proves this fact. We are going to change it in the near future, and this is why we need to make these methods operate on a more abstract level. This patch refactors this code to get rid of these parameters as meaningless and prepare ground for teaching these methods to work with guards as well as they work with explicit branching instructions. Differential Revision: https://reviews.llvm.org/D44419 llvm-svn: 327615	2018-03-15 09:38:00 +00:00
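The equivalence claimed in the commit above can be checked by brute force. The sketch below is only a model in Python, not SCEV itself: `guard` and `GuardFailed` are hypothetical stand-ins that treat a failed guard as leaving the loop, and both functions count how many iterations complete.

```python
class GuardFailed(Exception):
    """Hypothetical stand-in for leaving compiled code when a guard fails."""


def guard(cond: bool) -> None:
    # Model assumption: a failed guard abandons the loop entirely.
    if not cond:
        raise GuardFailed


def iterations_with_guard() -> int:
    # for (i = 0; i < 100; i++) guard(i < 10);
    done = 0
    try:
        for i in range(100):
            guard(i < 10)
            done += 1
    except GuardFailed:
        pass
    return done


def iterations_with_break() -> int:
    # for (i = 0; i < 100; i++) if (i >= 10) break;
    done = 0
    for i in range(100):
        if i >= 10:
            break
        done += 1
    return done
```

Under this model both loops complete the same number of iterations, which is the fact the commit says SCEV can currently prove only for the explicit-branch form.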
Craig Topper	1a9d589ad9	[X86] Add test cases for 512-bit addsub from build_vector. There is no 512 bit addsub instruction, but we partially match it to handle fmaddsub matching. We explicitly bail out for 512 bit vectors after failing the fmaddsub match, but we had no test coverage for that bail out. We might want to consider splitting and using 256 bit instructions instead of the long sequence seen here. llvm-svn: 327605	2018-03-15 06:49:01 +00:00
Craig Topper	a1d520cde1	[X86] Add support for matching FMSUBADD from build_vector. llvm-svn: 327604	2018-03-15 06:14:55 +00:00
Craig Topper	a2cacb318c	[X86] Remove old TODO. We have coverage for this now. Coverage was added in r320950. llvm-svn: 327603	2018-03-15 06:14:53 +00:00
Craig Topper	1f3d1c4f17	[X86] Use MVT in a couple places where we know the type is legal. llvm-svn: 327602	2018-03-15 06:14:51 +00:00
Aaron Smith	82d81c0715	[DebugInfo] Add a new method IPDBSession::findLineNumbersBySectOffset Summary: Some PDB symbols do not have a valid VA or RVA but have Addr by Section and Offset. For example, a variable in thread-local storage has the following properties: get_addressOffset: 0 get_addressSection: 5 get_lexicalParentId: 2 get_name: g_tls get_symIndexId: 12 get_typeId: 4 get_dataKind: 6 get_symTag: 7 get_locationType: 2 This change provides a new method to locate line numbers by Section and Offset from those symbols. Reviewers: zturner, rnk, llvm-commits Subscribers: asmith, JDevlieghere Differential Revision: https://reviews.llvm.org/D44407 llvm-svn: 327601	2018-03-15 06:04:51 +00:00
Lei Huang	3d55e8ff4a	[PowerPC][NFC] formatting-only fix llvm-svn: 327599	2018-03-15 03:06:44 +00:00
George Burgess IV	43c5d9b599	Remove unused variable; NFC llvm-svn: 327597	2018-03-15 02:58:36 +00:00
Lang Hames	1e6a98aff9	[ORC] Re-apply r327566 with a fix for test-global-ctors.ll. Also clang-formats the patch, which I should have done the first time around. llvm-svn: 327594	2018-03-15 00:30:14 +00:00
Matt Davis	d26ba2892b	[CleanUp] Remove NumInstructions field from LoopVectorizer's RegisterUsage struct. Summary: This variable is largely unused, aside from reporting the number of instructions in DEBUG builds. The only use of NumInstructions is in debug output to represent the LoopSize. That value can be misleading as it also includes metadata instructions (e.g., DBG_VALUE) which have no real impact. If we do choose to keep this around, we probably should guard it by a DEBUG macro, as it's not used in production builds. Reviewers: majnemer, congh, rengolin Reviewed By: rengolin Subscribers: llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D44495 llvm-svn: 327589	2018-03-14 23:30:31 +00:00
Simon Pilgrim	e2248ef2c0	[X86][Btver2] Add support for multiple pipelines stages for fpu schedules. NFCI. This allows us to use JWriteResFpuPair for complex schedule classes as well as single pipe instructions. llvm-svn: 327588	2018-03-14 23:12:09 +00:00
Sanjay Patel	fd4ca60b1a	[InstSimplify] add tests for frem and vectors with undef; NFC These should all be folded. The vector tests need to have m_AnyZero updated to ignore undef elements, but we need to be careful not to return the existing value in that case and unintentionally propagate undef. llvm-svn: 327585	2018-03-14 22:45:58 +00:00
Mark Searles	06abb9a57c	[AMDGPU] Waitcnt pass: Modify the waitcnt pass to propagate info in the case of a single basic block loop. mergeInputScoreBrackets() does this for us; update it so that it processes the single bb's score bracket when processing the single bb's preds. It is, after all, a pred of itself, so its score bracket is needed. Differential Revision: https://reviews.llvm.org/D44434 llvm-svn: 327583	2018-03-14 22:04:32 +00:00
Simon Pilgrim	42c74c7697	[X86][Btver2] Add ResourceCycles and NumMicroOps overrides to scalar instructions. NFCI. Currently still use default values - this is setup for a future patch. llvm-svn: 327582	2018-03-14 21:55:54 +00:00
Reid Kleckner	20a3d2184f	[FastISel] Sink local value materializations to first use Summary: Local values are constants, global addresses, and stack addresses that can't be folded into the instruction that uses them. For example, when storing the address of a global variable into memory, we need to materialize that address into a register. FastISel doesn't want to materialize any given local value more than once, so it generates all local value materialization code at EmitStartPt, which always dominates the current insertion point. This allows it to maintain a map of local value registers, and it knows that the local value area will always dominate the current insertion point. The downside is that local value instructions are always emitted without a source location. This is done to prevent jumpy line tables, but it means that the local value area will be considered part of the previous statement. Consider this C code: call1(); // line 1 ++global; // line 2 ++global; // line 3 call2(&global, &local); // line 4 Today we end up with assembly and line tables like this: .loc 1 1 callq call1 leaq global(%rip), %rdi leaq local(%rsp), %rsi .loc 1 2 addq $1, global(%rip) .loc 1 3 addq $1, global(%rip) .loc 1 4 callq call2 The LEA instructions in the local value area have no source location and are treated as being on line 1. Stepping through the code in a debugger and correlating it with the assembly won't make much sense, because these materializations are only required for line 4. This is actually problematic for the VS debugger "set next statement" feature, which effectively assumes that there are no registers live across statement boundaries. By sinking the local value code into the statement and fixing up the source location, we can make that feature work. This was filed as https://bugs.llvm.org/show_bug.cgi?id=35975 and https://crbug.com/793819. 
This change is obviously not enough to make this feature work reliably in all cases, but I felt that it was worth doing anyway because it usually generates smaller, more comprehensible -O0 code. I measured a 0.12% regression in code generation time with LLC on the sqlite3 amalgamation, so I think this is worth doing. There are some special cases worth calling out in the commit message: 1. local values materialized for phis 2. local values used by no-op casts 3. dead local value code Local values can be materialized for phis, and this does not show up as a vreg use in MachineRegisterInfo. In this case, if there are no other uses, this patch sinks the value to the first terminator, EH label, or the end of the BB if nothing else exists. Local values may also be used by no-op casts, which adds the register to the RegFixups table. Without reversing the RegFixups map direction, we don't have enough information to sink these instructions. Lastly, if the local value register has no other uses, we can delete it. This comes up when fastisel tries two instruction selection approaches and the first materializes the value but fails and the second succeeds without using the local value. Reviewers: aprantl, dblaikie, qcolombet, MatzeB, vsk, echristo Subscribers: dotdash, chandlerc, hans, sdardis, amccarth, javed.absar, zturner, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D43093 llvm-svn: 327581	2018-03-14 21:54:21 +00:00
Francis Visoiu Mistrih	ef0d5db760	[CodeGen] Use MIR syntax for MachineMemOperand printing Get rid of the "; mem:" suffix and use the one we use in MIR: ":: (load 2)". rdar://38163529 Differential Revision: https://reviews.llvm.org/D42377 llvm-svn: 327580	2018-03-14 21:52:13 +00:00
Philip Reames	72785c07ee	[EarlyCSE] Exploit open ended invariant.start scopes If we have an invariant.start with no corresponding invariant.end, then the memory location becomes invariant indefinitely after the invariant.start. As a result, anything dominated by the start is guaranteed to see the value the memory location had when the invariant.start executed. This patch adds an AvailableInvariants table which tracks the generation a particular memory location became invariant and then uses that information to allow value forwarding that would otherwise be disallowed by potentially aliasing stores. (Reminder: In EarlyCSE everything clobbers everything by default.) This should be compatible with the MemorySSA variant, but design is generational. We can and should add first class support for invariant.start within MemorySSA at a later time. I took a quick look at doing so, but probably need some input from a MemorySSA expert. Differential Revision: https://reviews.llvm.org/D43716 llvm-svn: 327577	2018-03-14 21:35:06 +00:00
Reid Kleckner	f02357933a	Revert "[ORC] Switch from shared_ptr to unique_ptr for addModule methods." This reverts commit r327566, it breaks test/ExecutionEngine/OrcMCJIT/test-global-ctors.ll. The test doesn't crash with a stack trace, unfortunately. It merely returns 1 as the exit code. ASan didn't produce a report, and I reproduced this on my Linux machine and Windows box. llvm-svn: 327576	2018-03-14 21:32:34 +00:00
Sanjay Patel	7475763bc6	[InstSimplify] fix folds for (0.0 - X) + X --> 0 (PR27151) As shown in: https://bugs.llvm.org/show_bug.cgi?id=27151 ...the existing fold could miscompile when X is NaN. The fold was also dependent on 'ninf' but that's not necessary. From IEEE-754 (with default rounding which we can assume for these opcodes): "When the sum of two operands with opposite signs (or the difference of two operands with like signs) is exactly zero, the sign of that sum (or difference) shall be +0...However, x + x = x − (−x) retains the same sign as x even when x is zero." llvm-svn: 327575	2018-03-14 21:23:27 +00:00
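The miscompile described in the commit above can be reproduced with ordinary IEEE-754 doubles. This is a small sketch in Python (whose `float` is a double under default rounding), not LLVM code; the helper name `neg_plus` is made up for illustration.

```python
import math


def neg_plus(x: float) -> float:
    """Evaluate (0.0 - X) + X exactly as written, with no folding."""
    return (0.0 - x) + x


# For finite inputs the sum is exactly zero, and the quoted IEEE-754 rule
# makes that zero positive, so folding to +0.0 is safe here.
finite_inputs = [1.5, -2.0, 0.0, -0.0, 1e308]
finite_results = [neg_plus(x) for x in finite_inputs]

# NaN propagates through both operations, so folding to 0 would be wrong.
nan_result = neg_plus(math.nan)

# An infinite input gives (-inf) + inf, which is also NaN, not 0.
inf_result = neg_plus(math.inf)
```

One reading of the remark about 'ninf' is that an infinite input already produces NaN, so a no-NaNs assumption covers that case without a separate no-infinities requirement.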
Simon Pilgrim	8c7fd1029e	[X86] Add haswell testing for PR35635 as well. To improve complete model testing for schedulers for instructions with multiple results. llvm-svn: 327572	2018-03-14 21:03:09 +00:00
Francis Visoiu Mistrih	648e1e248b	[AArch64] Emit CSR loads in the same order as stores Optionally allow the order of restoring the callee-saved registers in the epilogue to be reversed. The flag -reverse-csr-restore-seq generates the following code: ``` stp x26, x25, [sp, #-64]! stp x24, x23, [sp, #16] stp x22, x21, [sp, #32] stp x20, x19, [sp, #48] ; [..] ldp x24, x23, [sp, #16] ldp x22, x21, [sp, #32] ldp x20, x19, [sp, #48] ldp x26, x25, [sp], #64 ret ``` Note how the CSRs are restored in the same order as they are saved. One exception to this rule is the last `ldp`, which allows us to merge the stack adjustment and the ldp into a post-index ldp. This is done by first generating: ldp x26, x27, [sp] add sp, sp, #64 which gets merged by the arm64 load store optimizer into ldp x26, x25, [sp], #64 The flag is disabled by default. llvm-svn: 327569	2018-03-14 20:34:03 +00:00
Lang Hames	481402ffd7	[ORC] Switch from shared_ptr to unique_ptr for addModule methods. Layer implementations typically mutate module state, and this is better reflected by having layers own the Module they are operating on. llvm-svn: 327566	2018-03-14 20:29:45 +00:00
Alexander Richardson	e7d60b130e	[UpdateTestChecks] Handle IR variables with a '-' in the name Summary: I noticed that clang will emit variables such as %indirect-arg-temp when running update_cc1_test_checks.py and therefore update_cc1_test_checks.py wasn't adding FileCheck captures for those variables. Reviewers: MaskRay Reviewed By: MaskRay Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44459 llvm-svn: 327564	2018-03-14 20:28:53 +00:00
Reid Kleckner	990a8a41e0	[MC] Always emit relocations for same-section function references Summary: We already emit relocations in this case when the "incremental linker compatible" flag is set, but it turns out these relocations are also required for /guard:cf. Now that we have two use cases for this behavior, let's make it unconditional to try to keep things simple. We never hit this problem in Clang because it always sets the "incremental linker compatible" flag when targeting MSVC. However, LLD LTO doesn't set this flag, so we'd get CFG failures at runtime when using ThinLTO and /guard:cf. We probably don't want LLD LTO to set the "incremental linker compatible" assembler flag, since this has nothing to do with incremental linking, and we don't need to timestamp LTO temporary objects. Fixes PR36624. Reviewers: inglorion, espindola, majnemer Subscribers: mehdi_amini, llvm-commits, hiraditya Differential Revision: https://reviews.llvm.org/D44485 llvm-svn: 327557	2018-03-14 19:24:32 +00:00
Sanjay Patel	7d97f3022e	[InstSimplify] add tests to show missing/broken fadd folds (PR27151, PR26958); NFC llvm-svn: 327554	2018-03-14 18:52:40 +00:00
Sanjay Patel	47d2ec99f6	[InstSimplify] regenerate checks; NFC llvm-svn: 327553	2018-03-14 18:49:57 +00:00
Reid Kleckner	e307bb0d2f	[LLVM-C] [bindings/go] Add C and Golang bindings for COMDAT Patch by Ben Clayton Differential Revision: https://reviews.llvm.org/D44086 llvm-svn: 327551	2018-03-14 18:33:53 +00:00
Roman Lebedev	903f8b41de	[InstSimplify] [NFC] cast-unsigned-icmp-cmp-0.ll - don't run instcombine As discussed in post-commit review of D44421, there is simply no reason to run instcombine on this testcase. llvm-svn: 327541	2018-03-14 17:59:12 +00:00
Craig Topper	6caa3d4f85	[X86] Add back fast-isel code for handling i8 shifts. I removed this in r316797 because the coverage report showed no coverage and I thought it should have been handled by the auto generated table. I now see that there is code that bypasses the table if the shift amount is out of bounds. This adds back the code. We'll codegen out of bounds i8 shifts to effectively (amount & 0x1f). The 0x1f is a strange quirk of x86 that shift amounts are always masked to 5-bits (except 64-bits). So if the masked value is still out of bounds the result will be 0. Fixes PR36731. llvm-svn: 327540	2018-03-14 17:57:19 +00:00
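The masking behaviour described in the commit above can be modelled in a few lines. This is a sketch of the arithmetic only, assuming a left shift on an 8-bit value; `x86_shl_i8` is a hypothetical helper, not an LLVM or hardware API.

```python
def x86_shl_i8(value: int, amount: int) -> int:
    """Model an 8-bit left shift whose count is masked to 5 bits."""
    masked = amount & 0x1F          # the shift count keeps only its low 5 bits
    return (value << masked) & 0xFF  # truncate the result back to 8 bits


in_range = x86_shl_i8(0x01, 3)       # ordinary shift, count below 8
still_too_big = x86_shl_i8(0xFF, 8)  # masked count is 8, every bit shifts out
wraps_to_zero = x86_shl_i8(0xAB, 32) # masked count is 0, value is unchanged
```

A masked count between 8 and 31 clears all eight bits, which matches the commit's note that the result is 0 when the masked value is still out of bounds. A count of 32 masks to 0 and leaves the value untouched.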
Fangrui Song	1480f2111a	Fix LLVM IR check lines in utils/update_cc_test_checks.py Reviewers: arichardson Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D44400 llvm-svn: 327538	2018-03-14 17:47:07 +00:00
Roman Lebedev	6ca91d0711	[InstSimplify] [NFC] Add tests for peeking through unsigned FP casts for sign compares (PR36682) Summary: This pattern came up in PR36682 / D44390 https://bugs.llvm.org/show_bug.cgi?id=36682 https://reviews.llvm.org/D44390 https://godbolt.org/g/oKvT5H Looking at the IR pattern in question, as per [[ https://github.com/rutgers-apl/alive-nj \| alive-nj ]], for all the type combinations i checked (input: `i16`, `i32`, `i64`; intermediate: `half`/`i16`, `float`/`i32`, `double`/`i64`) for the following `icmp` comparisons the `uitofp`+`bitcast`+`icmp` can be evaluated to a boolean: * `slt 0` * `sgt -1` I did not check vectors, but i'm guessing it's the same there. {F5889242} Thus all these cases are in the testcase (along with the vector variant with additional `undef` element in the middle). There are no negative patterns here (unless alive-nj lied/is broken), all of these should be optimized. Reviewers: spatel, majnemer, efriedma, arsenm Reviewed By: spatel Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D44421 llvm-svn: 327535	2018-03-14 17:31:08 +00:00
Roman Lebedev	aa77573831	[InstCombine] [NFC] Add tests for peeking through unsigned FP casts for zero-equality compares (PR36682) Summary: This pattern came up in PR36682 / D44390 https://bugs.llvm.org/show_bug.cgi?id=36682 https://reviews.llvm.org/D44390 https://godbolt.org/g/oKvT5H Looking at the IR pattern in question, as per [[ https://github.com/rutgers-apl/alive-nj \| alive-nj ]], for all the type combinations i checked (input: `i16`, `i32`, `i64`; intermediate: `half`/`i16`, `float`/`i32`, `double`/`i64`) for the following `icmp` comparisons the `uitofp`+`bitcast` can be dropped: * `eq 0` * `ne 0` I did not check vectors, but i'm guessing it's the same there. {F5889189} Thus all these cases are in the testcase (along with the vector variant with additional `undef` element in the middle). There are no negative patterns here (unless alive-nj lied/is broken), all of these should be optimized. Generated with {F5889196} Reviewers: spatel, majnemer, efriedma, arsenm Reviewed By: spatel Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D44416 llvm-svn: 327534	2018-03-14 17:31:03 +00:00
Francis Visoiu Mistrih	319b50aebd	[AArch64] Keep track of MIFlags in the LoadStoreOptimizer Merging: * $x26, $x25 = frame-setup LDPXi $sp, 0 * $sp = frame-destroy ADDXri $sp, 64, 0 into an LDPXpost should preserve the flags from both instructions as following: * frame-setup frame-destroy LDPXpost Differential Revision: https://reviews.llvm.org/D44446 llvm-svn: 327533	2018-03-14 17:10:58 +00:00
Craig Topper	d6a1996dad	[X86] Teach X86TargetLowering::targetShrinkDemandedConstant to set non-demanded bits if it helps create an and mask that can be matched as a zero extend. I had to modify the bswap recognition to allow unshrunk masks to make this work. Fixes PR36689. Differential Revision: https://reviews.llvm.org/D44442 llvm-svn: 327530	2018-03-14 16:55:15 +00:00
Nicholas Wilson	9e012a14df	[WebAssembly] Add DenseMap traits and operator== for Wasm type structs Differential Revision: https://reviews.llvm.org/D44303 llvm-svn: 327526	2018-03-14 15:58:03 +00:00
Simon Pilgrim	3488806962	[X86][AVX] Use WriteFShuffleLd for broadcast reg-mem instructions They shouldn't be treated as pure loads. Found while investigating D44428 llvm-svn: 327524	2018-03-14 15:47:08 +00:00
Nicholas Wilson	ca5faab1ba	[WebAssembly] Identify COMDATs by index rather than string. NFC This will enable an optimisation in LLD. Differential Revision: https://reviews.llvm.org/D44343 llvm-svn: 327522	2018-03-14 15:44:45 +00:00
Arnold Schwaighofer	4a17b3df7a	SjLjEHPrepare: Don't reg-to-mem swifterror values swifterror llvm values model the swifterror register as memory at the LLVM IR level. ISel will perform ad hoc mem-to-reg on them. swifterror values are constrained in how they can be used. Spilling them to memory is not allowed. SjLjEHPrepare tried to lower swifterror values to memory which is unnecessary since the back-end will spill and reload the register as necessary (as long as clobbering calls are marked as such which is the case here) and further leads to invalid IR because swifterror values can't be stored to memory. rdar://38164004 llvm-svn: 327521	2018-03-14 15:44:07 +00:00
Alexander Ivchenko	4661c1b4e5	[GlobalIsel][X86] Support for G_SDIV instruction Reviewed By: igorb Differential Revision: https://reviews.llvm.org/D44430 llvm-svn: 327520	2018-03-14 15:41:11 +00:00
Sanjay Patel	bfd9d1ac1a	[CodeGen] allow printing of zero latency in sched comments I don't know how to expose this in a test. There are ARM / AArch64 sched classes that include zero latency instructions, but I'm not seeing sched info printed for those targets. X86 will almost certainly have these soon (see PR36671), but no model has 'let Latency = 0' currently. llvm-svn: 327518	2018-03-14 15:28:48 +00:00
Andrea Di Biagio	77ab4ed0a3	[llvm-mca] Remove unused variable from InstrBuilder.cpp. NFC This was causing a buildbot failure. llvm-svn: 327517	2018-03-14 15:19:47 +00:00
Andrea Di Biagio	29d48bb255	[llvm-mca] Move the logic that updates the register files from InstrBuilder to DispatchUnit. NFCI Before this patch, the register file was always updated at instruction creation time. That means new read-after-write dependencies and new temporary registers were allocated at instruction creation time. This patch refactors the code in InstrBuilder, and moves all the logic that updates the register file into the dispatch unit. We only want to update the register file when instructions are effectively dispatched (not before). This refactoring also helps remove a bad dependency between the InstrBuilder and the DispatchUnit. No functional change intended. llvm-svn: 327514	2018-03-14 14:57:23 +00:00
Petar Jovanovic	0510f66dad	[mips] Add support for CRC ASE This includes Instructions: crc32b, crc32h, crc32w, crc32d, crc32cb, crc32ch, crc32cw, crc32cd Assembler directives: .set crc, .set nocrc, .module crc, .module nocrc Attribute: crc .MIPS.abiflags: CRC (0x8000) Patch by Vladimir Stefanovic. Differential Revision: https://reviews.llvm.org/D44176 llvm-svn: 327511	2018-03-14 14:13:31 +00:00
Simon Pilgrim	2dcfe7cbde	[X86][Btver2] Fix YMM shuffle, permute and permutevar scheduler costs Account for ymm double pumping and add proper pshufb/permutevar support llvm-svn: 327510	2018-03-14 14:05:19 +00:00
Teresa Johnson	487d5b7d62	[LTO/gold] Fix workaround for old plugin-api.h in --wrap support The workaround for older plugin-api.h in r327506 unfortunately used another union member that is also fairly new and not available in the plugin-api.h on some of the bots, leading to: http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/9121/steps/build-stage2-LLVMgold.so/logs/stdio Change to use a different member that we will definitely have (as it is used elsewhere in gold-plugin.cpp already). llvm-svn: 327509	2018-03-14 14:00:57 +00:00
Teresa Johnson	1ab7e0a34c	[LTO/gold] Support --wrap Summary: (Restores r327459 with handling for old plugin-api.h) Utilize new gold plugin api interface for obtaining --wrap option arguments, and LTO API handling (added for --wrap support in lld LTO), to mark symbols so that LTO does not optimize them inappropriately. Note the test cases will be in a new gold test subdirectory that is dependent on the next release of gold which will contain the new interfaces. Reviewers: pcc, tmsriram Subscribers: mehdi_amini, llvm-commits, inglorion Differential Revision: https://reviews.llvm.org/D44235 llvm-svn: 327506	2018-03-14 13:26:18 +00:00
Simon Pilgrim	b11ab1c3fb	[X86][SSE] Use WriteFShuffleLd for MOVDDUP/MOVSHDUP/MOVSLDUP reg-mem instructions They shouldn't be treated as pure loads. Found while investigating D44428 llvm-svn: 327505	2018-03-14 13:22:56 +00:00
Martin Storsjo	ee100c3673	[AArch64] Don't produce R_AARCH64_TLSLE_LDST32_TPREL_LO12_NC Support for this relocation is missing in both LLD and GNU binutils at the moment. This reverts the ELF parts of SVN r327316. llvm-svn: 327503	2018-03-14 13:09:10 +00:00
Simon Pilgrim	141985c26b	Fix 'not all control paths return a value' MSVC warning. NFCI. llvm-svn: 327502	2018-03-14 12:04:51 +00:00
Pavel Labath	bafa38cd94	Fix msvc compiler error in r327498 msvc reports an "illegal indirection" error here. Attempt to appease it with a different initialization syntax. llvm-svn: 327500	2018-03-14 11:31:17 +00:00

161413 Commits