llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

Author	SHA1	Message	Date
Davide Italiano	063a880856	[SimplifyLibCalls] Add a new transformation: pow(exp(x), y) -> exp(x*y) This one is enabled only under -ffast-math (due to rounding/overflows) but allows us to emit shorter code. Before (on FreeBSD x86-64): 4007f0: 50 push %rax 4007f1: f2 0f 11 0c 24 movsd %xmm1,(%rsp) 4007f6: e8 75 fd ff ff callq 400570 <exp2@plt> 4007fb: f2 0f 10 0c 24 movsd (%rsp),%xmm1 400800: 58 pop %rax 400801: e9 7a fd ff ff jmpq 400580 <pow@plt> 400806: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1) 40080d: 00 00 00 After: 4007b0: f2 0f 59 c1 mulsd %xmm1,%xmm0 4007b4: e9 87 fd ff ff jmpq 400540 <exp2@plt> 4007b9: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Differential Revision: http://reviews.llvm.org/D14045 llvm-svn: 251976	2015-11-03 20:32:23 +00:00
Simon Pilgrim	ac4c196247	[X86][XOP] Add support for the matching of the VPCMOV bit select instruction XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) ) This patch adds tablegen pattern matching for this instruction. Differential Revision: http://reviews.llvm.org/D8841 llvm-svn: 251975	2015-11-03 20:27:01 +00:00
Rui Ueyama	2aae8dc2fb	llvm-pdbdump: Make BuiltinDumper shorter. NFC. llvm-svn: 251974	2015-11-03 20:16:18 +00:00
Adam Nemet	b9c59b29d9	[LAA] LLE 2/6: Fix a NoDep case that should be a Forward dependence Summary: When the dependence distance is zero then we have a loop-independent dependence from the earlier to the later access. No current client of LAA uses forward dependences so other than potentially hitting the MaxDependences threshold earlier, this change shouldn't affect anything right now. This and the previous patch were tested together for compile-time regression. None found in LNT/SPEC. Reviewers: hfinkel Subscribers: rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13255 llvm-svn: 251973	2015-11-03 20:13:43 +00:00
Adam Nemet	c168f14a53	[LAA] LLE 1/6: Expose Forward dependences Summary: Before this change, we didn't use to collect forward dependences since none of the current clients (LV, LDist) required them. The motivation to also collect forward dependences is a new pass LoopLoadElimination (LLE) which discovers store-to-load forwarding opportunities across the loop's backedge. The pass uses both lexically forward or backward loop-carried dependences to detect these opportunities. The new pass also analyzes loop-independent (forward) dependences since they can conflict with the loop-carried dependences in terms of how the data flows through memory. The newly added test only covers loop-carried forward dependences because loop-independent ones are currently categorized as NoDep. The next patch will fix this. The two patches were tested together for compile-time regression. None found in LNT/SPEC. Note that with this change LAA provides all dependences rather than just "interesting" ones. A subsequent NFC patch will remove the now trivial isInterestingDependence and rename the APIs. Reviewers: hfinkel Subscribers: jmolloy, rengolin, llvm-commits Differential Revision: http://reviews.llvm.org/D13254 llvm-svn: 251972	2015-11-03 20:13:23 +00:00
Rafael Espindola	dc3ad4835d	Don't create empty sections just to look like gas. We are long past the time when this much bug for bug compatibility was useful. llvm-svn: 251970	2015-11-03 20:02:22 +00:00
Rafael Espindola	454fc24e96	Relax a few more overspecified tests. llvm-svn: 251967	2015-11-03 19:38:19 +00:00
Teresa Johnson	93fae75d76	Revert "Move metadata linking after lazy global materialization/linking." This reverts commit r251926. I believe this is causing an LTO bootstrapping bot failure (http://lab.llvm.org:8080/green/job/llvm-stage2-cmake-RgLTO_build/3669/). Haven't been able to repro it yet, but after looking at the metadata I am pretty sure I know what is going on. llvm-svn: 251965	2015-11-03 19:36:04 +00:00
Rafael Espindola	7d3b847175	Remove unnecessary dependency on section and string positions. llvm-svn: 251964	2015-11-03 19:24:17 +00:00
Kostya Serebryany	6ba411ce7a	[libFuzzer] make -test_single_input more reliable: make sure the input's size is equal to its capacity llvm-svn: 251961	2015-11-03 18:57:25 +00:00
Rafael Espindola	5583e816fe	Delete dead code. llvm-svn: 251960	2015-11-03 18:55:58 +00:00
Rafael Espindola	7953bd5d91	Simplify local common output. We now create them as they are found and use higher level APIs. This is a step in avoiding creating unnecessary sections. llvm-svn: 251958	2015-11-03 18:50:51 +00:00
Igor Laevsky	691dbb68d2	[CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks Differential Revision: http://reviews.llvm.org/D14258 llvm-svn: 251957	2015-11-03 18:37:40 +00:00
Rafael Espindola	9103955acf	Move code out of a loop and use a range loop. llvm-svn: 251952	2015-11-03 18:04:07 +00:00
Rafael Espindola	7ec60e8686	Revert "Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines."" This reverts commit r251937. The test was updated to the new API, bring the API back. llvm-svn: 251944	2015-11-03 16:40:37 +00:00
Lang Hames	a08608417e	[Kaleidoscope][Orc] Fix the fully_lazy Orc Kaleidoscope example. r251933 changed the Orc compile callbacks API, which broke this. llvm-svn: 251942	2015-11-03 16:35:10 +00:00
Silviu Baranga	be20c3f9e9	Fix PR25372 - teach replaceCongruentPHIs to handle cases where SE evaluates a PHI to a SCEVConstant Summary: Since now Scalar Evolution can create non-add rec expressions for PHI nodes, it can also create SCEVConstant expressions. This will confuse replaceCongruentPHIs, which previously relied on the fact that SCEV could not produce constants in this case. We will now replace the node with a constant in these cases - or avoid processing the Phi in case of a type mismatch. Reviewers: sanjoy Subscribers: llvm-commits, majnemer Differential Revision: http://reviews.llvm.org/D14230 llvm-svn: 251938	2015-11-03 16:27:04 +00:00
Rafael Espindola	c04851c1d4	Revert "[Orc] Directly emit machine code for the x86 resolver block and trampolines." This reverts commit r251933. It broke the build of examples/Kaleidoscope/Orc/fully_lazy/toy.cpp. llvm-svn: 251937	2015-11-03 16:25:20 +00:00
David Blaikie	7dbba55a46	Kaleidoscope-ch2: Remove the dependence on LLVM by cloning make_unique into this project llvm-svn: 251936	2015-11-03 16:23:21 +00:00
Lang Hames	f8e78a5c05	[Orc] Directly emit machine code for the x86 resolver block and trampolines. Bypassing LLVM for this has a number of benefits: 1) Laziness support becomes asm-syntax agnostic (previously lazy jitting didn't work on Windows as the resolver block was in Darwin asm). 2) For cross-process JITs, it allows resolver blocks and trampolines to be emitted directly in the target process, reducing cross process traffic. 3) It should be marginally faster. llvm-svn: 251933	2015-11-03 16:10:18 +00:00
Teresa Johnson	9cd7a891ec	Move metadata linking after lazy global materialization/linking. Summary: Currently, named metadata is linked before the LazilyLinkGlobalValues list is walked and materialized/linked. As a result, references from DISubprogram and DIGlobalVariable metadata to yet unmaterialized functions and variables cause them to be added to the lazy linking list and their definitions are materialized and linked. This makes the llvm-link -only-needed option not have the intended effect when debug information is present, as the otherwise unneeded functions/variables are still linked in. Additionally, for ThinLTO I have implemented a mechanism to only link in debug metadata needed by imported functions. Moving named metadata linking after lazy GV linking will facilitate applying this mechanism to the LTO and "llvm-link -only-needed" cases as well. Reviewers: dexonsmith, tra, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14195 llvm-svn: 251926	2015-11-03 15:11:27 +00:00
Teresa Johnson	c2f01dd7ba	Pass enum instead of bool to new linkInModule call in llvm-link A new call I added to linkInModule from llvm-link in r251866 was still passing in a boolean for an argument that was changed to an enum in r246561. I didn't catch this in my merge since the bool false matched the flag value it mapped to. llvm-svn: 251925	2015-11-03 15:10:50 +00:00
Filipe Cabecinhas	e813424af3	Don't assert if materializing before seeing any function bodies This assert was reachable from user input. A minimized test case (no FUNCTION_BLOCK_ID record) is attached. Bug found with afl-fuzz llvm-svn: 251910	2015-11-03 13:48:26 +00:00
Filipe Cabecinhas	89688e0631	Don't use Twine objects after their lifetimes end. No test, since it would depend on what the compiler can optimize/reuse. My next commit made this bug visible on Linux Release compiles with some versions of gcc. llvm-svn: 251909	2015-11-03 13:48:21 +00:00
Elena Demikhovsky	45c8421c3f	LoopVectorizer - skip 'bitcast' between GEP and load. Skipping 'bitcast' in this case allows the load to be vectorized: %arrayidx = getelementptr inbounds double, double* %in, i64 %indvars.iv %tmp53 = bitcast double** %arrayidx to i64* %tmp54 = load i64, i64* %tmp53, align 8 Differential Revision: http://reviews.llvm.org/D14112 llvm-svn: 251907	2015-11-03 10:29:34 +00:00
Michael Kuperstein	5991145dd5	[X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments When push instructions are being used to pass function arguments on the stack, and either EH or debugging are enabled, we need to generate .cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is enough for the CFA offset to be correct at every call site, while for debugging we want to be correct after every push. Darwin does not support this well, so don't use pushes whenever it would be required. Differential Revision: http://reviews.llvm.org/D13767 llvm-svn: 251904	2015-11-03 08:17:25 +00:00
Igor Breger	207c14b67f	AVX512: add encoding tests for vmovq/d instructions. llvm-svn: 251903	2015-11-03 07:30:17 +00:00
Tobias Grosser	b12c4fa7a9	Revert "[IndVarSimplify] Rewrite loop exit values with their initial values from loop preheader" Commit 251839 triggers miscompiles on some bots: http://lab.llvm.org:8011/builders/perf-x86_64-penryn-O3-polly-fast/builds/13723 (The commit is listed in 13722, but due to an existing failure introduced in 13721 and reverted in 13723 the failure is only visible in 13723) To verify r251839 is indeed the only change that triggered the buildbot failures and to ensure the buildbots remain green while investigating I temporarily revert this commit. At the current state it is unclear if this commit introduced some miscompile or if it only exposed code to Polly that is subsequently miscompiled by Polly. llvm-svn: 251901	2015-11-03 07:14:39 +00:00
Matthias Braun	d6b0ddadb7	Fix build problem introduced in r251883 llvm-svn: 251888	2015-11-03 02:19:07 +00:00
Matthias Braun	3795aef91e	RegisterPressure: Improve assert message llvm-svn: 251885	2015-11-03 01:53:36 +00:00
Matthias Braun	6a0bab40ee	RegisterPressure: Slightly nicer pressure diff dumping llvm-svn: 251884	2015-11-03 01:53:33 +00:00
Matthias Braun	a75cf73a70	ScheduleDAGInstrs: Remove IsPostRA flag; NFC ScheduleDAGInstrs doesn't behave differently before or after register allocation. It was only used in a method of MachineSchedulerBase which behaved differently in MachineScheduler/PostMachineScheduler. Change this to let MachineScheduler/PostMachineScheduler just pass in a parameter to that function. The order of the LiveIntervals* and bool RemoveKillFlags parameters has been switched to make out-of-tree code fail instead of unintentionally passing a value intended for the IsPostRA flag to the (previously following and default initialized) RemoveKillFlags. Differential Revision: http://reviews.llvm.org/D14245 llvm-svn: 251883	2015-11-03 01:53:29 +00:00
Rafael Espindola	2faf05c4dc	Don't implicitly construct a Archive::child_iterator. llvm-svn: 251878	2015-11-03 01:32:40 +00:00
Rafael Espindola	703dd2fb10	This never returns end(), simplify to use Child instead of iterator. NFC. llvm-svn: 251876	2015-11-03 01:20:44 +00:00
Rui Ueyama	c7d5a46f58	llvm-pdbdump: Simplify. NFC. llvm-svn: 251873	2015-11-03 01:04:44 +00:00
Colin LeMahieu	d7cec86ab4	[Hexagon] Fixing mistaken case fallthrough. llvm-svn: 251867	2015-11-03 00:21:19 +00:00
Teresa Johnson	90b0eac682	Restore "Support for ThinLTO function importing and symbol linking." This restores commit r251837, with the new library dependence added to llvm-link/Makefile to address bot failures. llvm-svn: 251866	2015-11-03 00:14:15 +00:00
Kevin Enderby	234f8e1f0f	Allow llvm-nm’s single letter command line flags to be grouped. Which is needed if we want to replace darwin’s nm(1) with llvm-nm as there are many uses of grouped flags. The added test case is one specific case that is in real use. rdar://23337419 llvm-svn: 251864	2015-11-02 23:42:05 +00:00
Matt Arsenault	456805768c	AMDGPU: Stop assuming vreg for build_vector This was causing a variety of test failures when v2i64 is added as a legal type. SIFixSGPRCopies should correctly handle the case of vector inputs to a scalar reg_sequence, so this isn't necessary anymore. This was hiding some deficiencies in how reg_sequence is handled later, but this shouldn't be a problem anymore since the register class copy of a reg_sequence is now done before the reg_sequence. llvm-svn: 251860	2015-11-02 23:30:48 +00:00
Derek Schuff	2e77ccebb9	[WebAssembly] Make WebAssemblyCodeGen depend on WebAssemblyAsmPrinter llvm-svn: 251859	2015-11-02 23:23:16 +00:00
Matt Arsenault	4d9ca98b18	AMDGPU: Error on graphics shaders with HSA I've found myself pointlessly debugging problems from running graphics tests with an HSA triple a few times, so stop this from happening again. llvm-svn: 251858	2015-11-02 23:23:02 +00:00
Sanjay Patel	2ea054f685	[CGP] widen switch condition and case constants to target's register width (2nd try) This is a redo of r251849 except the tests have been split into arch-specific folders to hopefully make the bots happy. This is a follow-up from the discussion in D12965. The block-at-a-time limitation of SelectionDAG also came up in D13297. Without the InstCombine change from D12965, I don't expect this patch to make any difference in the real world because InstCombine does not shrink cases like this in visitSwitchInst(). But we need to have this CGP safety harness in place before proceeding with any shrinkage in D12965, so we won't generate extra extends for compares. I've opted for IR regression tests in the patch because that seems like a clearer way to test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86 will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473 Before: BB#0: mr 4, 3 extsh. 3, 4 ble 0, .LBB0_5 BB#1: cmpwi 3, 99 bgt 0, .LBB0_9 BB#2: rlwinm 4, 4, 0, 16, 31 <--- 32-bit mask/extend li 3, 0 cmplwi 4, 1 beqlr 0 BB#3: cmplwi 4, 10 bne 0, .LBB0_12 BB#4: li 3, 1 blr .LBB0_5: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 65436 beq 0, .LBB0_13 BB#6: cmplwi 3, 65526 beq 0, .LBB0_15 BB#7: cmplwi 3, 65535 bne 0, .LBB0_12 BB#8: li 3, 4 blr .LBB0_9: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 100 beq 0, .LBB0_14 ... After: BB#0: rlwinm 4, 3, 0, 16, 31 <--- mask/extend to 32-bit and then use that for comparisons cmpwi 4, 999 ble 0, .LBB0_5 BB#1: lis 3, 0 ori 3, 3, 65525 cmpw 4, 3 bgt 0, .LBB0_9 BB#2: cmplwi 4, 1000 beq 0, .LBB0_14 BB#3: cmplwi 4, 65436 bne 0, .LBB0_13 BB#4: li 3, 6 blr .LBB0_5: li 3, 0 cmplwi 4, 1 beqlr 0 BB#6: cmplwi 4, 10 beq 0, .LBB0_12 BB#7: cmplwi 4, 100 bne 0, .LBB0_13 BB#8: li 3, 2 blr .LBB0_9: cmplwi 4, 65526 beq 0, .LBB0_15 BB#10: cmplwi 4, 65535 bne 0, .LBB0_13 ... Differential Revision: http://reviews.llvm.org/D13532 llvm-svn: 251857	2015-11-02 23:22:49 +00:00
Matt Arsenault	6d6f62b066	AMDGPU: Un XFAIL a test This should probably be merged with one of the other private memory tests, but it fails on r600. llvm-svn: 251856	2015-11-02 23:15:46 +00:00
Matt Arsenault	6d010fa207	AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCE Make the REG_SEQUENCE be a VGPR, and do the register class copy first. llvm-svn: 251855	2015-11-02 23:15:42 +00:00
David Blaikie	0eb4964368	Fix the build I just broke llvm-svn: 251854	2015-11-02 23:10:52 +00:00
David Blaikie	aec8d8648f	Orc: Drop some else-after-return, reflow a few spots, and avoid use of pointee types llvm-svn: 251853	2015-11-02 23:09:38 +00:00
Davide Italiano	6b8a532d6f	[SimplifyLibCalls] Remove variables that are not used. NFC. llvm-svn: 251852	2015-11-02 23:07:14 +00:00
Sanjay Patel	860c632d57	revert r251849; need to move tests to arch-specific folders llvm-svn: 251851	2015-11-02 23:05:20 +00:00
Cong Hou	0b6d5e284f	Add a flag vectorizer-maximize-bandwidth in loop vectorizer to enable using larger vectorization factor. To be able to maximize the bandwidth during vectorization, this patch provides a new flag vectorizer-maximize-bandwidth. When it is turned on, the vectorizer will determine the vectorization factor (VF) using the smallest instead of widest type in the loop. To avoid increasing register pressure too much, estimates of the register usage for different VFs are calculated so that we only choose a VF when its register usage doesn't exceed the number of available registers. This is the second attempt to submit this patch. The first attempt got a test failure on ARM. This patch is updated to try to fix the failure (more specifically, by handling the case when VF=1). Differential revision: http://reviews.llvm.org/D8943 llvm-svn: 251850	2015-11-02 22:53:48 +00:00
Sanjay Patel	8c2ddfb9bd	[CGP] widen switch condition and case constants to target's register width This is a follow-up from the discussion in D12965. The block-at-a-time limitation of SelectionDAG also came up in D13297. Without the InstCombine change from D12965, I don't expect this patch to make any difference in the real world because InstCombine does not shrink cases like this in visitSwitchInst(). But we need to have this CGP safety harness in place before proceeding with any shrinkage in D12965, so we won't generate extra extends for compares. I've opted for IR regression tests in the patch because that seems like a clearer way to test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86 will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473 Before: BB#0: mr 4, 3 extsh. 3, 4 ble 0, .LBB0_5 BB#1: cmpwi 3, 99 bgt 0, .LBB0_9 BB#2: rlwinm 4, 4, 0, 16, 31 <--- 32-bit mask/extend li 3, 0 cmplwi 4, 1 beqlr 0 BB#3: cmplwi 4, 10 bne 0, .LBB0_12 BB#4: li 3, 1 blr .LBB0_5: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 65436 beq 0, .LBB0_13 BB#6: cmplwi 3, 65526 beq 0, .LBB0_15 BB#7: cmplwi 3, 65535 bne 0, .LBB0_12 BB#8: li 3, 4 blr .LBB0_9: rlwinm 3, 4, 0, 16, 31 <--- 32-bit mask/extend cmplwi 3, 100 beq 0, .LBB0_14 ... After: BB#0: rlwinm 4, 3, 0, 16, 31 <--- mask/extend to 32-bit and then use that for comparisons cmpwi 4, 999 ble 0, .LBB0_5 BB#1: lis 3, 0 ori 3, 3, 65525 cmpw 4, 3 bgt 0, .LBB0_9 BB#2: cmplwi 4, 1000 beq 0, .LBB0_14 BB#3: cmplwi 4, 65436 bne 0, .LBB0_13 BB#4: li 3, 6 blr .LBB0_5: li 3, 0 cmplwi 4, 1 beqlr 0 BB#6: cmplwi 4, 10 beq 0, .LBB0_12 BB#7: cmplwi 4, 100 bne 0, .LBB0_13 BB#8: li 3, 2 blr .LBB0_9: cmplwi 4, 65526 beq 0, .LBB0_15 BB#10: cmplwi 4, 65535 bne 0, .LBB0_13 ... Differential Revision: http://reviews.llvm.org/D13532 llvm-svn: 251849	2015-11-02 22:46:24 +00:00
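
Note on commit 063a880856 (pow(exp(x), y) -> exp(x*y)): the message says the rewrite is enabled only under -ffast-math because of rounding and overflow. The Python sketch below is an illustration of that reasoning only, not LLVM code. The helper names (pow_of_exp, exp_of_product, or_inf) are hypothetical, and Python's math module stands in for the C library calls. The two forms agree closely for moderate inputs, but the original form can overflow in the intermediate exp(x) where the rewritten form stays finite.

```python
import math


def pow_of_exp(x, y):
    # Original form: two library calls, exp followed by pow.
    return math.pow(math.exp(x), y)


def exp_of_product(x, y):
    # Rewritten form: one multiply and one exp call.
    return math.exp(x * y)


def or_inf(f, x, y):
    # Python raises OverflowError where C would return +inf;
    # map that to math.inf so both forms can be compared.
    try:
        return f(x, y)
    except OverflowError:
        return math.inf


# Moderate inputs: the two forms agree up to rounding.
a = or_inf(pow_of_exp, 1.5, 2.0)
b = or_inf(exp_of_product, 1.5, 2.0)
print(math.isclose(a, b, rel_tol=1e-12))

# Large x: exp(710) overflows a double, but exp(710 * 0.5) does not,
# so the rewrite changes the observable result.
print(or_inf(pow_of_exp, 710.0, 0.5))
print(math.isfinite(or_inf(exp_of_product, 710.0, 0.5)))
```

Because the results can differ in cases like the last one, a compiler that must preserve exact floating-point semantics cannot apply the rewrite by default, which is consistent with the commit restricting it to fast-math mode.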

123275 Commits