llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 13:02:52 +02:00

Nirav Dave afe2eccae3	2016-12-14 15:44:26 +00:00

In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.

Retrying after fixing after removing load-store factoring through token factors in favor of improved token factor operand pruning

Simplify Consecutive Merge Store Candidate Search

Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic.

When merging stores, search up the chain through a single load, and find all possible stores by looking down through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions).

Additional Minor Changes:

1. Finishes removing unused AliasLoad code
2. Unifies the chain aggregation in the merged stores across code paths
3. Re-add the Store node to the worklist after calling SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests.

This finishes the change Matt Arsenault started in r246307 and jyknight's original patch.

Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations.

Noteworthy tests:

CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right.

CodeGen/AArch64/arm64-memset-inline.lli -
CodeGen/AArch64/arm64-stur.ll -
CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently.

CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself?

CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding.

CodeGen/X86/dag-merge-fast-accesses.ll -
CodeGen/X86/MergeConsecutiveStores.ll -
CodeGen/X86/stores-merging.ll -
CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores

CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls

CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores

CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are.

CodeGen/X86/vector-idiv.ll -
CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing.

CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

llvm-svn: 289659
AsmParser	Replace APFloatBase static fltSemantics data members with getter functions	2016-12-14 11:57:17 +00:00
Disassembler	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
InstPrinter	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-12 22:23:53 +00:00
MCTargetDesc	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-12 22:23:53 +00:00
TargetInfo	Move the global variables representing each Target behind accessor function	2016-10-09 23:00:34 +00:00
Utils	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
AMDGPU.h	[AMDGPU] Add amdgpu-unify-metadata pass	2016-12-08 19:46:04 +00:00
AMDGPU.td	AMDGPU/SI: Remove XNACK feature from CI	2016-12-09 19:49:58 +00:00
AMDGPUAlwaysInlinePass.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
AMDGPUAnnotateKernelFeatures.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
AMDGPUAnnotateUniformValues.cpp	[AMDGPU] Scalarization of global uniform loads.	2016-12-08 17:28:47 +00:00
AMDGPUAsmPrinter.cpp	AMDGPU/SI: Don't reserve FLAT_SCR on non-HSA targets & without stack objects	2016-12-09 19:49:48 +00:00
AMDGPUAsmPrinter.h	AMDGPU: Emit runtime metadata as a note element in .note section	2016-11-10 21:18:49 +00:00
AMDGPUCallingConv.td
AMDGPUCallLowering.cpp	GlobalISel: pass Function to lowerFormalArguments directly (NFC).	2016-09-21 12:57:35 +00:00
AMDGPUCallLowering.h	GlobalISel: pass Function to lowerFormalArguments directly (NFC).	2016-09-21 12:57:35 +00:00
AMDGPUCodeGenPrepare.cpp	AMDGPU: Fix crash on i16 constant expression	2016-12-06 23:18:06 +00:00
AMDGPUFrameLowering.cpp	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
AMDGPUFrameLowering.h	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
AMDGPUInstrInfo.cpp	MachineScheduler: Export function to construct "default" scheduler.	2016-11-28 20:11:54 +00:00
AMDGPUInstrInfo.h	MachineScheduler: Export function to construct "default" scheduler.	2016-11-28 20:11:54 +00:00
AMDGPUInstrInfo.td	AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.	2016-12-07 02:42:15 +00:00
AMDGPUInstructions.td	[AMDGPU] Add f16 support (VI+)	2016-11-13 07:01:11 +00:00
AMDGPUIntrinsicInfo.cpp
AMDGPUIntrinsicInfo.h
AMDGPUIntrinsics.td
AMDGPUISelDAGToDAG.cpp	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
AMDGPUISelLowering.cpp	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.	2016-12-14 15:44:26 +00:00
AMDGPUISelLowering.h	AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.	2016-12-07 02:42:15 +00:00
AMDGPUMachineFunction.cpp
AMDGPUMachineFunction.h	AMDGPU/SI: Set correct value for amd_kernel_code_t::kernarg_segment_alignment	2016-12-06 21:53:10 +00:00
AMDGPUMCInstLower.cpp	[AMDGPU] Add wave barrier builtin	2016-11-15 19:00:15 +00:00
AMDGPUMCInstLower.h	Reapply "AMDGPU: Support using tablegened MC pseudo expansions"	2016-10-06 17:19:11 +00:00
AMDGPUOpenCLImageTypeLoweringPass.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
AMDGPUPromoteAlloca.cpp	AMDGPU: Fix AMDGPUPromoteAlloca breaking addrspacecasts	2016-12-10 00:52:50 +00:00
AMDGPUPTNote.h	AMDGPU: Emit runtime metadata as a note element in .note section	2016-11-10 21:18:49 +00:00
AMDGPURegisterInfo.cpp
AMDGPURegisterInfo.h
AMDGPURegisterInfo.td
AMDGPURuntimeMetadata.h	AMDGPU: Attempt to fix build failure on x86-64 selfhost build	2016-11-11 02:48:50 +00:00
AMDGPUSubtarget.cpp	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-12 22:23:53 +00:00
AMDGPUSubtarget.h	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
AMDGPUTargetMachine.cpp	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-12 22:23:53 +00:00
AMDGPUTargetMachine.h	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
AMDGPUTargetObjectFile.cpp	Target: Change various section classifiers in TargetLoweringObjectFile to take a GlobalObject.	2016-10-24 19:23:39 +00:00
AMDGPUTargetObjectFile.h	Target: Change various section classifiers in TargetLoweringObjectFile to take a GlobalObject.	2016-10-24 19:23:39 +00:00
AMDGPUTargetTransformInfo.cpp	AMDGPU: llvm.amdgcn.interp.mov is a source of divergence	2016-12-12 16:52:19 +00:00
AMDGPUTargetTransformInfo.h	Do a sweep over move ctors and remove those that are identical to the default.	2016-10-20 12:20:28 +00:00
AMDGPUUnifyMetadata.cpp	[AMDGPU] Add amdgpu-unify-metadata pass	2016-12-08 19:46:04 +00:00
AMDILCFGStructurizer.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
AMDKernelCodeT.h
BUFInstructions.td	AMDGPU: Add VI i16 support	2016-11-10 16:02:37 +00:00
CaymanInstructions.td
CIInstructions.td	[AMDGPU] Refactor VOP1 and VOP2 instruction TD definitions	2016-09-23 09:08:07 +00:00
CMakeLists.txt	[AMDGPU] Add amdgpu-unify-metadata pass	2016-12-08 19:46:04 +00:00
DSInstructions.td	AMDGPU: Add VI i16 support	2016-11-10 16:02:37 +00:00
EvergreenInstructions.td
FLATInstructions.td	AMDGPU: Rename flat operands to match mubuf	2016-11-29 19:30:44 +00:00
GCNHazardRecognizer.cpp	AMDGPU: Rename flat operands to match mubuf	2016-11-29 19:30:44 +00:00
GCNHazardRecognizer.h	AMDGPU/SI: Handle hazard with s_rfe_b64	2016-10-27 23:50:21 +00:00
GCNSchedStrategy.cpp	AMDGPU/SI: Allow using SGPRs 96-101 on VI	2016-12-09 19:49:40 +00:00
GCNSchedStrategy.h
LLVMBuild.txt
MIMGInstructions.td	[AMDGPU] TableGen: change individual instruction flags to bit type from bits<1>	2016-11-15 13:39:07 +00:00
Processors.td	AMDGPU: Refactor processor definition to use ISA version features	2016-10-26 16:37:56 +00:00
R600ClauseMergePass.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
R600ControlFlowFinalizer.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
R600Defines.h
R600EmitClauseMarkers.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
R600ExpandSpecialInstrs.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
R600FrameLowering.cpp	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-12 22:23:53 +00:00
R600FrameLowering.h	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
R600InstrFormats.td
R600InstrInfo.cpp	Finish renaming remaining analyzeBranch functions	2016-09-14 20:43:16 +00:00
R600InstrInfo.h	Finish renaming remaining analyzeBranch functions	2016-09-14 20:43:16 +00:00
R600Instructions.td	AMDGPU: Refactor exp instructions	2016-12-05 20:23:10 +00:00
R600Intrinsics.td
R600ISelLowering.cpp	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
R600ISelLowering.h
R600MachineFunctionInfo.cpp
R600MachineFunctionInfo.h
R600MachineScheduler.cpp
R600MachineScheduler.h	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
R600OptimizeVectorRegisters.cpp	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-12 22:23:53 +00:00
R600Packetizer.cpp	Fix spelling mistakes in AMDGPU target comments. NFC.	2016-11-18 11:04:02 +00:00
R600RegisterInfo.cpp
R600RegisterInfo.h
R600RegisterInfo.td
R600Schedule.td
R700Instructions.td
SIAnnotateControlFlow.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
SIDebuggerInsertNops.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
SIDefines.h	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
SIFixControlFlowLiveIntervals.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
SIFixSGPRCopies.cpp	AMDGPU/SI: Don't move copies of immediates to the VALU	2016-12-06 21:13:30 +00:00
SIFoldOperands.cpp	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
SIFrameLowering.cpp	AMDGPU: Fix using incorrect private resource with no allocation	2016-10-28 19:43:31 +00:00
SIFrameLowering.h	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
SIInsertSkips.cpp	AMDGPU: Refactor exp instructions	2016-12-05 20:23:10 +00:00
SIInsertWaits.cpp	AMDGPU: Refactor exp instructions	2016-12-05 20:23:10 +00:00
SIInstrFormats.td	AMDGPU: Fix vintrp disassembly	2016-12-10 00:29:55 +00:00
SIInstrInfo.cpp	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
SIInstrInfo.h	AMDGPU: Fix asan errors when folding operands	2016-12-10 19:58:00 +00:00
SIInstrInfo.td	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
SIInstructions.td	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
SIIntrinsics.td	AMDGPU: Refactor exp instructions	2016-12-05 20:23:10 +00:00
SIISelLowering.cpp	AMDGPU: Fix isTypeDesirableForOp for i16	2016-12-09 17:57:43 +00:00
SIISelLowering.h	AMDGPU: Make f16 ConstantFP legal	2016-12-08 20:14:46 +00:00
SILoadStoreOptimizer.cpp	[AMDGPU][CodeGen] To improve CGEMM performance: combine LDS reads.	2016-11-03 14:37:13 +00:00
SILowerControlFlow.cpp	[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies	2016-11-28 18:58:49 +00:00
SILowerI1Copies.cpp	[AMDGPU] Allow hoisting of comparisons out of a loop and eliminate condition copies	2016-11-28 18:58:49 +00:00
SIMachineFunctionInfo.cpp	AMDGPU/SI: Add support for triples with the mesa3d operating system	2016-09-16 21:34:26 +00:00
SIMachineFunctionInfo.h	[AMDGPU] Wave and register controls	2016-09-06 20:22:28 +00:00
SIMachineScheduler.cpp	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-12 22:23:53 +00:00
SIMachineScheduler.h	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
SIOptimizeExecMasking.cpp	AMDGPU: Fix use-after-free in SIOptimizeExecMasking	2016-10-07 08:40:14 +00:00
SIRegisterInfo.cpp	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
SIRegisterInfo.h	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
SIRegisterInfo.td	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
SISchedule.td
SIShrinkInstructions.cpp	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
SITypeRewriter.cpp	Use StringRef in Pass/PassManager APIs (NFC)	2016-10-01 02:56:57 +00:00
SIWholeQuadMode.cpp	[AMDGPU, PowerPC, TableGen] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).	2016-12-09 22:06:55 +00:00
SMInstructions.td	[AMDGPU] Scalarization of global uniform loads.	2016-12-08 17:28:47 +00:00
SOPInstructions.td	AMDGPU : Add S_SETREG instructions to fix fdiv precision issues.	2016-12-07 02:42:15 +00:00
VIInstrFormats.td	[AMDGPU] Refactor VOP1 and VOP2 instruction TD definitions	2016-09-23 09:08:07 +00:00
VIInstructions.td	AMDGPU: Add VI i16 support	2016-11-10 16:02:37 +00:00
VOP1Instructions.td	Check that emitted instructions meet their predicates on all targets except ARM, Mips, and X86.	2016-11-19 13:05:44 +00:00
VOP2Instructions.td	AMDGPU: Fix handling of 16-bit immediates	2016-12-10 00:39:12 +00:00
VOP3Instructions.td	[AMDGPU] Handle f16 select{_cc}	2016-11-16 03:16:26 +00:00
VOPCInstructions.td	[AMDGPU] Add f16 support (VI+)	2016-11-13 07:01:11 +00:00
VOPInstructions.td	Fix spelling mistakes in AMDGPU target comments. NFC.	2016-11-18 11:04:02 +00:00