llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 19:12:56 +02:00

Author	SHA1	Message	Date
Craig Topper	aa42014a97	[X86] Add some comments about when some X86 intrinsic autoupgrade code was added. Someday we'd like to remove old autoupgrade code so it helps to annotate how long it's been there so we don't have to go digging through commit history. llvm-svn: 348728	2018-12-09 18:02:40 +00:00
Craig Topper	dfe2ba9df9	[X86] If the carry input to an addcarry/subborrow intrinsic is known to be 0, emit a flag setting ADD/SUB instead of ADC/SBB. Previously we had to take the carry in and add -1 to it to set the carry flag so we could use it with ADC/SBB. But if we know it's 0 then we don't need to bother. This should go a long way towards fixing PR24545. llvm-svn: 348727	2018-12-09 18:02:37 +00:00
Nico Weber	cb61e3e1b4	Remove unneeded dependency from lib/Target/X86/Utils/ to lib/IR (aka Core). The dependency was added in r213995 in response to r213986 which did make X86/Utils depend on IR, but r256680 later removed that dependency again. llvm-svn: 348724	2018-12-09 15:15:13 +00:00
Sanjay Patel	9b0f938a41	[x86] regenerate test checks; NFC llvm-svn: 348723	2018-12-09 14:47:53 +00:00
Sanjay Patel	6fae56f82e	[x86] don't try to convert add with undef operands to LEA The existing code tries to handle an undef operand while transforming an add to an LEA, but it's incomplete because we will crash on the i16 test with the debug output shown below. It's better to just give up instead. Really, GlobalIsel should have folded these before we could get into trouble. # Machine code for function add_undef_i16: NoPHIs, TracksLiveness, Legalized, RegBankSelected, Selected bb.0 (%ir-block.0): liveins: $edi %1:gr32 = COPY killed $edi %0:gr16 = COPY %1.sub_16bit:gr32 %5:gr64_nosp = IMPLICIT_DEF %5.sub_16bit:gr64_nosp = COPY %0:gr16 %6:gr64_nosp = IMPLICIT_DEF %6.sub_16bit:gr64_nosp = COPY %2:gr16 %4:gr32 = LEA64_32r killed %5:gr64_nosp, 1, killed %6:gr64_nosp, 0, $noreg %3:gr16 = COPY killed %4.sub_16bit:gr32 $ax = COPY killed %3:gr16 RET 0, implicit killed $ax # End machine code for function add_undef_i16. * Bad machine code: Reading virtual register without a def * - function: add_undef_i16 - basic block: %bb.0 (0x7fe6cd83d940) - instruction: %6.sub_16bit:gr64_nosp = COPY %2:gr16 - operand 1: %2:gr16 LLVM ERROR: Found 1 machine code errors. Differential Revision: https://reviews.llvm.org/D54710 llvm-svn: 348722	2018-12-09 14:40:37 +00:00
Simon Pilgrim	5d88c564ae	[X86] Extend pfm counter coverage for llvm-exegesis Extension to rL348617, turns out llvm-exegesis doesn't need to match the perf counter name against a scheduler model resource name - so I've added a few more counters that I could find in the libpfm4 source code (and fix a typo in the knl/knm retired_uops counter - which uses 'all' instead of 'any'). llvm-svn: 348721	2018-12-09 13:45:15 +00:00
Nikita Popov	aa966f7a7a	[X86] Add test for PR39926; NFC The test file shows a case where the avoid store forwarding block pass misses to copy a range (-1..1) when the load displacement changes sign. Baseline test for D55485. llvm-svn: 348712	2018-12-09 12:02:56 +00:00
Martin Storsjo	8328ba8d5f	[COFF] Map truncated .eh_frame section name PE/COFF sections can have section names truncated to 8 chars, in order to have the name available at runtime. (The string table, where long untruncated names are stored, isn't loaded at runtime.) This allows various llvm tools to dump the .eh_frame section from such executables. Patch by Peiyuan Song! Differential Revision: https://reviews.llvm.org/D55407 llvm-svn: 348708	2018-12-08 18:15:41 +00:00
Sanjay Patel	8720d89aac	[DAGCombiner] re-enable truncation of binops This is effectively re-committing the changes from: rL347917 (D54640) rL348195 (D55126) ...which were effectively reverted here: rL348604 ...because the code had a bug that could induce infinite looping or eventual out-of-memory compilation. The bug was that this code did not guard against transforming opaque constants. More details are in the post-commit mailing list thread for r347917. A reduced test for that is included in the x86 bool-math.ll file. (I wasn't able to reduce a PPC backend test for this, but it was almost the same pattern.) Original commit message for r347917: The motivating case for this is shown in: https://bugs.llvm.org/show_bug.cgi?id=32023 and the corresponding rot16.ll regression tests. Because x86 scalar shift amounts are i8 values, we can end up with trunc-binop-trunc sequences that don't get folded in IR. As the TODO comments suggest, there will be regressions if we extend this (for x86, we mostly seem to be missing LEA opportunities, but there are likely vector folds missing too). I think those should be considered existing bugs because this is the same transform that we do as an IR canonicalization in instcombine. We just need more tests to make those visible independent of this patch. llvm-svn: 348706	2018-12-08 16:07:38 +00:00
Sanjay Patel	d3bc67fe6d	[x86] add 32-bit RUN for tests and test with opaque constants; NFC The opaque constant test is reduced from a Chrome file that infinite-looped with rL347917. llvm-svn: 348705	2018-12-08 15:34:09 +00:00
Nico Weber	0f070d70e6	[gn build] Add build files for CodeGen subfolders AsmPrinter, GlobalISel, SelectionDAG. Differential Revision: https://reviews.llvm.org/D55462 llvm-svn: 348704	2018-12-08 10:53:10 +00:00
Heejin Ahn	a6ebf898de	[WebAssembly] Make WasmSymbol's signature usable for events (NFC) Summary: WasmSignature used to use its `WasmSignature` member variable only for function types, but now it also can be used for events as well. Reviewers: sbc100 Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55247 llvm-svn: 348702	2018-12-08 06:16:13 +00:00
Xing GUO	b0341aa182	[llvm-readobj] Little clean up inside `parseDynamicTable` Summary: This anonymous function actually has the same logic as `Obj->toMappedAddr`. Besides, I have a question on resolving illegal value. `gnu-readelf`, `gnu-objdump` and `llvm-objdump` could parse the test file 'test/tools/llvm-objdump/Inputs/private-headers-x86_64.elf', but `llvm-readobj` will fail when parsing the `DT_RELR` segment, because the value is 0x87654321, which is illegal. So, shall we do this clean up rather than remove the checking statements inside the anonymous function? ``` if (Delta >= Phdr.p_filesz) return createError("Virtual address is not in any segment"); ``` Reviewers: rupprecht, jhenderson Reviewed By: jhenderson Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55329 llvm-svn: 348701	2018-12-08 05:32:28 +00:00
Nico Weber	f16ea2b84e	[gn build] Merge r348593 llvm-svn: 348671	2018-12-08 00:37:14 +00:00
Craig Topper	fae9ce72c0	[SelectionDAG] Remove ISD::ADDC/ADDE from some undef handling code in getNode. NFCI These nodes should have two results. A real VT and a Glue. But this code would have returned Undef which would only be a single result. But we're in the single result version of getNode so these opcodes should never be seen by this function anyway. llvm-svn: 348670	2018-12-08 00:27:34 +00:00
Nico Weber	59094317e4	[gn build] Add build files for lib/CodeGen, lib/Transforms/..., and lib/Bitcode/Writer Differential Revision: https://reviews.llvm.org/D55454 llvm-svn: 348667	2018-12-08 00:09:56 +00:00
Craig Topper	4045d8c8d6	[X86] Remove the XFAILed test added in r348620 It seems to be unexpectedly passing on some bots probably because it requires asserts to fail, but doesn't say that. But we already have a patch in review to make it not xfail so I'd rather just focus on getting it passing rather than trying to figure out an unexpected pass. llvm-svn: 348661	2018-12-07 22:16:40 +00:00
Matt Arsenault	9d0b0e531a	AMDGPU: Fix offsets for < 4-byte aggregate kernel arguments We were still using the rounded down offset and alignment even though they aren't handled because you can't trivially bitcast the loaded value. llvm-svn: 348658	2018-12-07 22:12:17 +00:00
Jessica Paquette	6ddf71d313	[GlobalISel] Add IR translation support for the @llvm.log10 intrinsic This adds IR translation support for @llvm.log10 and updates relevant tests. https://reviews.llvm.org/D55392 llvm-svn: 348657	2018-12-07 22:08:02 +00:00
Krzysztof Parzyszek	9407b60244	[Hexagon] Fix post-ra expansion of PS_wselect llvm-svn: 348655	2018-12-07 22:00:53 +00:00
George Burgess IV	764c8768bd	[ModuleSummary] use StringRefs to avoid a redundant copy; NFC `Saver` is a StringSaver, which has a few overloads of `save` that all ultimately just call `StringRef save(StringRef)`. Just take a StringRef here instead of building up a std::string to convert it to a StringRef. llvm-svn: 348650	2018-12-07 21:47:32 +00:00
Simon Pilgrim	c31efa0fbe	Fix unused variable warning. NFCI. llvm-svn: 348649	2018-12-07 21:44:25 +00:00
Heejin Ahn	b118993d60	[WebAssembly] clang-format/clang-tidy AsmParser (NFC) Summary: - LLVM clang-format style doesn't allow one-line ifs. - LLVM clang-tidy style says method names should start with a lowercase letter. But currently WebAssemblyAsmParser's parent class MCTargetAsmParser is mixing lowercase and uppercase method names itself so overridden methods cannot be renamed now. - Changed else ifs after returns to ifs. - Added some newlines for readability. Reviewers: aardappel, sbc100 Subscribers: dschuff, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55350 llvm-svn: 348648	2018-12-07 21:35:37 +00:00
Heejin Ahn	d6762e6368	Delete registerScope function `unregisterScope()` is not currently used, so removing it. llvm-svn: 348647	2018-12-07 21:31:14 +00:00
Pete Cooper	5d6a65eda1	Follow-up from r348441 to add the rest of the objc ARC intrinsics. This adds the other intrinsics used by ARC and codegen's them to their respective runtime methods. llvm-svn: 348646	2018-12-07 21:28:47 +00:00
Nikita Popov	31fa5fa3a2	[MemCpyOpt] memset->memcpy forwarding with undef tail Currently memcpyopt optimizes cases like memset(a, byte, N); memcpy(b, a, M); to memset(a, byte, N); memset(b, byte, M); if M <= N. Often this allows further simplifications down the line, which drop the first memset entirely. This patch extends this optimization for the case where M > N, but we know that the bytes a[N..M] are undef due to alloca/lifetime.start. This situation arises relatively often for Rust code, because Rust does not initialize trailing structure padding and loves to insert redundant memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844. For the implementation, I'm reusing a bit of code for a similar existing optimization (direct memcpy of undef). I've also added memset support to MemDepAnalysis GetLocation -- Instead, getPointerDependencyFrom could be used, but it seems to make more sense to add this to GetLocation and thus make the computation cachable. Differential Revision: https://reviews.llvm.org/D55120 llvm-svn: 348645	2018-12-07 21:16:58 +00:00
Nikita Popov	6fc1e6bf13	[MemCpyOpt] Add tests for memset->memcpy forwarding with undef tail; NFC These are baseline tests for D55120. llvm-svn: 348644	2018-12-07 21:16:52 +00:00
Matt Arsenault	e7b289f0e4	AMDGPU: Use gfx9 instead of gfx8 in a test They are the same for the purposes of the tests, but it's much easier to write check lines for the memory instructions with offsets. llvm-svn: 348643	2018-12-07 20:57:43 +00:00
Vedant Kumar	2b29fba381	[HotColdSplitting] Refine definition of unlikelyExecuted The splitting pass uses its 'unlikelyExecuted' predicate to statically decide which blocks are cold. - Do not treat noreturn calls as if they are cold unless they are actually marked cold. This is motivated by functions like exit() and longjmp(), which are not beneficial to outline. - Do not treat inline asm as an outlining barrier. In practice asm("") is frequently used to inhibit basic block merging; enabling outlining in this case results in substantial memory savings. - Treat invokes of cold functions as cold. As a drive-by, remove the 'exceptionHandlingFunctions' predicate, because it's no longer needed. The pass can identify & outline blocks dominated by EH pads, so there's no need to special-case __cxa_begin_catch etc. Differential Revision: https://reviews.llvm.org/D54244 llvm-svn: 348640	2018-12-07 20:24:04 +00:00
Vedant Kumar	2c542e21a7	[HotColdSplitting] Outline more than once per function Algorithm: Identify maximal cold regions and put them in a worklist. If a candidate region overlaps with another, discard it. While the worklist is full, remove a single-entry sub-region from the worklist and attempt to outline it. By the non-overlap property, this should not invalidate parts of the domtree pertaining to other outlining regions. Testing: LNT results on X86 are clean. With test-suite + externals, llvm outlines 134KB pre-patch, and 352KB post-patch (+ ~2.6x). The file 483.xalancbmk/src/Constants.cpp stands out as an extreme case where llvm outlines over 100 times in some functions (mostly EH paths). There was not a significant performance impact pre vs. post-patch. Differential Revision: https://reviews.llvm.org/D53887 llvm-svn: 348639	2018-12-07 20:23:52 +00:00
Michael Trent	bc00ad23c1	Update the Swift version numbers reported by objdump Summary: Add Swift 4.1, Swift 4.2, and Swift 5 version numbers to objdump's MachODump's print_imae_info routines. rdar://46548425 Reviewers: pete, lhames, bob.wilson Reviewed By: pete, bob.wilson Subscribers: bob.wilson, llvm-commits Differential Revision: https://reviews.llvm.org/D55442 llvm-svn: 348632	2018-12-07 19:55:03 +00:00
Zachary Turner	8fcc0a84a9	[NativePDB] Reconstruct function declarations from debug info. Previously we would create an lldb::Function object for each function parsed, but we would not add these to the clang AST. This is a first step towards getting local variable support working, as we first need an AST decl so that when we create local variable entries, they have the proper DeclContext. Differential Revision: https://reviews.llvm.org/D55384 llvm-svn: 348631	2018-12-07 19:34:02 +00:00
Sam Clegg	b6e4e22ceb	[llvm-tapi] Don't try to override SequenceTraits for std::string For some reason this doesn't seem to work with LLVM_LINK_LLVM_DYLIB build. See https://logs.chromium.org/logs/chromium/bb/client.wasm.llvm/linux/37764/+/recipes/steps/LLVM_regression_tests/0/stdout What is more it seems that overriding these traits for core types (including std::string) is not supported/recommended by YAMLTraits.h. See line 1918 which has the assertion: "only use LLVM_YAML_IS_SEQUENCE_VECTOR for types you control" Differential Revision: https://reviews.llvm.org/D55381 llvm-svn: 348630	2018-12-07 19:29:00 +00:00
Sanjay Patel	09eb5f94fc	[DAGCombiner] split trunc from extend in hoistLogicOpWithSameOpcodeHands; NFC This duplicates several shared checks, but we need to split this up to fix underlying bugs in smaller steps. llvm-svn: 348627	2018-12-07 18:51:08 +00:00
Simon Pilgrim	06f9d40ab4	[X86] Replace instregex with instrs list. NFCI. llvm-svn: 348626	2018-12-07 18:47:05 +00:00
Matt Arsenault	ad317823df	AMDGPU: Allow f32 types for llvm.amdgcn.s.buffer.load llvm-svn: 348625	2018-12-07 18:41:39 +00:00
Simon Pilgrim	8595034d33	[llvm-mca][x86] Add RDSEED instruction resource tests for GLM llvm-svn: 348624	2018-12-07 18:37:40 +00:00
Simon Pilgrim	c1cd4c9422	[llvm-mca][x86] Add missing AES instruction resource tests Add missing non-VEX instructions llvm-svn: 348623	2018-12-07 18:35:54 +00:00
Simon Pilgrim	a29cf2ff39	[llvm-mca][x86] Add RDRAND/RDSEED instruction resource tests llvm-svn: 348622	2018-12-07 18:29:47 +00:00
Craig Topper	456baa52f2	[CostModel][X86] Fix overcounting arithmetic cost in illegal types in getArithmeticReductionCost/getMinMaxReductionCost We were overcounting the number of arithmetic operations needed at each level before we reach a legal type. We were using the full vector type for that level, but we are going to split the input vector at that level in half. So the effective arithmetic operation cost at that level is half the width. So for example on 8i32 on an sse target. We were calculating the cost of an 8i32 op which is likely 2 for basic integer. Then after the loop we count 2 more v4i32 ops. For a total arith cost of 4. But if you look at the assembly there would only be 3 arithmetic ops. There are still more bugs in this code that I'm going to work on next. The non pairwise code shouldn't count extract subvectors in the loop. There are no extracts, the types are split in registers. For pairwise we need to use 2 two src permute shuffles. Differential Revision: https://reviews.llvm.org/D55397 llvm-svn: 348621	2018-12-07 18:20:56 +00:00
Craig Topper	bccb7f13ae	[X86] Initialize and Register X86CondBrFoldingPass This allows X86CondBrFoldingPass to be run with the --run-pass option, which makes it possible to test a wrong assertion in the analyzeCompare function for SUB32ri when its operand is not an immediate. Patch by Jianping Chen Differential Revision: https://reviews.llvm.org/D55412 llvm-svn: 348620	2018-12-07 18:10:34 +00:00
Matt Arsenault	e910613f08	AMDGPU: Remove llvm.SI.tbuffer.store llvm-svn: 348619	2018-12-07 18:03:47 +00:00
Simon Pilgrim	80dd2fdab9	[X86] Improve pfm counter coverage for llvm-exegesis This patch attempts to improve pfm perf counter coverage for all the x86 CPUs that libpfm4 supports. Intel/AMD CPU families tend to share names for cycle/uops counters so even if they don't have a scheduler model yet they can at least use the default values (checked against the libpfm4 source code). The remaining CPUs (where their port/pipe resource counters are known) I've tried to add to the existing model mappings. These are untested but don't represent a regression to current llvm-exegesis behaviour for these CPUs. Differential Revision: https://reviews.llvm.org/D55432 llvm-svn: 348617	2018-12-07 17:48:40 +00:00
Matt Arsenault	a3f67ed4ba	AMDGPU: Remove llvm.SI.buffer.load.dword llvm-svn: 348616	2018-12-07 17:46:20 +00:00
Matt Arsenault	f4454532b0	AMDGPU: Remove llvm.AMDGPU.kill This is the last of the old AMDGPU intrinsics. llvm-svn: 348615	2018-12-07 17:46:16 +00:00
Sanjay Patel	d4c88a8dd2	[DAGCombiner] disable truncation of binops by default As discussed in the post-commit thread of r347917, this transform is fighting with an existing transform causing an infinite loop or out-of-memory, so this is effectively reverting r347917 and its follow-up r348195 while we investigate the bug. llvm-svn: 348604	2018-12-07 15:47:52 +00:00
Nikita Popov	04e5b3198b	Reapply "[DemandedBits][BDCE] Support vectors of integers" DemandedBits and BDCE currently only support scalar integers. This patch extends them to also handle vector integer operations. In this case bits are not tracked for individual vector elements, instead a bit is demanded if it is demanded for any of the elements. This matches the behavior of computeKnownBits in ValueTracking and SimplifyDemandedBits in InstCombine. Unlike the previous iteration of this patch, getDemandedBits() can now again be called on arbitrary (sized) instructions, even if they don't have integer or vector of integer type. (For vector types the size of the returned mask will now be the scalar size in bits though.) The added LoopVectorize test case shows a case which triggered an assertion failure with the previous attempt, because getDemandedBits() was called on a pointer-typed instruction. Differential Revision: https://reviews.llvm.org/D55297 llvm-svn: 348602	2018-12-07 15:38:13 +00:00
Graham Sellers	141144803d	[AMDGPU] Shrink scalar AND, OR, XOR instructions This change attempts to shrink scalar AND, OR and XOR instructions which take an immediate that isn't inlineable. It performs: AND s0, s0, ~(1 << n) -> BITSET0 s0, n OR s0, s0, (1 << n) -> BITSET1 s0, n AND s0, s1, x -> ANDN2 s0, s1, ~x OR s0, s1, x -> ORN2 s0, s1, ~x XOR s0, s1, x -> XNOR s0, s1, ~x In particular, this catches setting and clearing the sign bit for fabs (and x, 0x7fffffff -> bitset0 x, 31 and or x, 0x80000000 -> bitset1 x, 31). llvm-svn: 348601	2018-12-07 15:33:21 +00:00
Sanjay Patel	bf412c9070	[DAGCombiner] remove explicit calls to AddToWorkList; NFCI As noted in the post-commit thread for rL347917: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181203/608936.html ...we don't need to repeat these calls because the combiner does it automatically. llvm-svn: 348597	2018-12-07 15:00:56 +00:00
Max Kazantsev	be11b6ca1a	Introduce llvm.experimental.widenable_condition intrinsic This patch introduces a new intrinsic `@llvm.experimental.widenable_condition` that allows explicit representation for guards. It is an alternative to using `@llvm.experimental.guard` intrinsic that does not contain implicit control flow. We keep finding places where `@llvm.experimental.guard` is not supported or treated too conservatively, and there are 2 reasons for that: - `@llvm.experimental.guard` has memory write side effect to model implicit control flow, and this sometimes confuses passes and analyses that work with memory; - Not all passes and analyses are aware of the semantics of guards. These passes treat them as regular throwing call and have no idea that the condition of guard may be used to prove something. One well-known place which had caused us troubles in the past is explicit loop iteration count calculation in SCEV. Another example is new loop unswitching which is not aware of guards. Whenever a new pass appears, we potentially have this problem there. Rather than go and fix all these places (and commit to keep track of them and add support in future), it seems more reasonable to leverage the existing optimizer's logic as much as possible. The only significant difference between guards and regular explicit branches is that guard's condition can be widened. It means that a guard contains (explicitly or implicitly) a `deopt` block successor, and it is always legal to go there no matter what the guard condition is. The other successor is a guarded block, and it is only legal to go there if the condition is true. This patch introduces a new explicit form of guards alternative to `@llvm.experimental.guard` intrinsic. 
Now a widenable guard can be represented in the CFG explicitly like this: %widenable_condition = call i1 @llvm.experimental.widenable.condition() %new_condition = and i1 %cond, %widenable_condition br i1 %new_condition, label %guarded, label %deopt guarded: ; Guarded instructions deopt: call type @llvm.experimental.deoptimize(<args...>) [ "deopt"(<deopt_args...>) ] The new intrinsic `@llvm.experimental.widenable.condition` has semantics of an `undef`, but the intrinsic prevents the optimizer from folding it early. This form should exploit all optimization boons provided to `br` instruction, and it still can be widened by replacing the result of `@llvm.experimental.widenable.condition()` with `and` with any arbitrary boolean value (as long as the branch that is taken when it is `false` has a deopt and has no side-effects). For more motivation, please check llvm-dev discussion "[llvm-dev] Giving up using implicit control flow in guards". This patch introduces this new intrinsic with respective LangRef changes and a pass that converts old-style guards (expressed as intrinsics) into the new form. The naming discussion is still ongoing. Merging this to unblock further items. We can later change the name of this intrinsic. Reviewed By: reames, fedor.sergeev, sanjoy Differential Revision: https://reviews.llvm.org/D51207 llvm-svn: 348593	2018-12-07 14:39:46 +00:00
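
The rewrites listed in the "[AMDGPU] Shrink scalar AND, OR, XOR instructions" commit (141144803d) rest on a handful of bit identities. The sketch below is only a way to check those identities on 32-bit values. It is not the LLVM implementation. The helper names (`s_not`, `s_bitset0`, `s_bitset1`, `s_andn2`, `s_orn2`, `s_xnor`, `check_identities`) are hypothetical, and the instruction semantics they model (ANDN2 as `a & ~b`, ORN2 as `a | ~b`, XNOR as `~(a ^ b)`) are an assumption inferred from the operand forms in the commit message.

```python
MASK32 = 0xFFFFFFFF  # model 32-bit scalar registers


def s_not(x):
    """Bitwise NOT truncated to 32 bits."""
    return ~x & MASK32


# Assumed semantics of the shrunk instruction forms (hypothetical helpers).
def s_bitset0(s, n):
    """Clear bit n of s."""
    return s & s_not(1 << n)


def s_bitset1(s, n):
    """Set bit n of s."""
    return (s | (1 << n)) & MASK32


def s_andn2(a, b):
    """a AND (NOT b)."""
    return a & s_not(b)


def s_orn2(a, b):
    """a OR (NOT b)."""
    return (a | s_not(b)) & MASK32


def s_xnor(a, b):
    """NOT (a XOR b)."""
    return s_not(a ^ b)


def check_identities(s1, x, n):
    """Return True if every rewrite from the commit message preserves the result."""
    s1 &= MASK32
    x &= MASK32
    return all([
        (s1 & s_not(1 << n)) == s_bitset0(s1, n),   # AND with ~(1 << n) -> BITSET0
        (s1 | (1 << n)) == s_bitset1(s1, n),        # OR with (1 << n)   -> BITSET1
        (s1 & x) == s_andn2(s1, s_not(x)),          # AND s1, x -> ANDN2 s1, ~x
        (s1 | x) == s_orn2(s1, s_not(x)),           # OR  s1, x -> ORN2  s1, ~x
        (s1 ^ x) == s_xnor(s1, s_not(x)),           # XOR s1, x -> XNOR  s1, ~x
        (s1 & 0x7FFFFFFF) == s_bitset0(s1, 31),     # clear the sign bit (fabs)
        (s1 | 0x80000000) == s_bitset1(s1, 31),     # set the sign bit
    ])
```

Calling `check_identities` over a range of values for `s1`, `x` and `n` (0 to 31) exercises each rewrite. Whether a given rewrite actually pays off depends on whether the complemented immediate is inlineable, which this sketch does not model.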

172466 Commits