llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-01-31 12:41:49 +01:00

Author	SHA1	Message	Date
Chandler Carruth	be6bf6de1d	[PM] [cleanup] Rearrange the public and private sections of this class to be a bit more sensible. The public interface now is first followed by the implementation details. This also resolves a FIXME to make something private -- it was already possible as the one special caller was already a friend. No functionality changed. llvm-svn: 196095	2013-12-02 12:35:56 +00:00
NAKAMURA Takumi	5aa98ebb35	XCoreFrameLowering.cpp: Use [in,out] instead of [in] [out]. [-Wdocumentation] llvm-svn: 196094	2013-12-02 11:31:25 +00:00
NAKAMURA Takumi	ba8c4a443d	[CMake] add_lit_target: Tests should be excluded from "Build Solution". llvm-svn: 196093	2013-12-02 11:31:19 +00:00
Robert Lytton	aec919de4b	XCore target: Make handling of large frames not dependent upon an FP. eliminateFrameIndex() has been reworked to handle both small & large frames with either a FP or SP. An additional Slot is required for Scavenging spills when not using FP for large frames. Reworked the handling of Register Scavenging. Whether we are using an FP or not, whether it is a large frame or not, and whether we are using a large code model or not are now independent. llvm-svn: 196091	2013-12-02 11:05:28 +00:00
Tim Northover	46df9f449d	ARM: add pseudo-instructions for lit-pool global materialisation These are used by MachO only at the moment, and (much like the existing MOVW/MOVT set) work around the fact that the labels used in the actual instructions often contain PC-dependent components, which means that repeatedly materialising the same global can't be CSEed. With small modifications, it could be adapted to how ELF finds the address of _GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there. llvm-svn: 196090	2013-12-02 10:35:41 +00:00
Benjamin Kramer	d153673433	XCore: Unbreak C++11 build. llvm-svn: 196089	2013-12-02 10:29:26 +00:00
Robert Lytton	7a58a4e90d	XCore target: fix large code model 'select' indirect address handling. llvm-svn: 196088	2013-12-02 10:18:37 +00:00
Robert Lytton	3eb24d0e61	XCore target: Add large code model When using large code model: Global objects larger than 'CodeModelLargeSize' bytes are placed in sections named with a trailing ".large" The folded global address of such objects are lowered into the const pool. During inspection it was noted that LowerConstantPool() was using a default offset of zero. A fix was made, but due to only offsets of zero being generated, testing only verifies the change is not detrimental. Correct the flags emitted for explicitly specified sections. We assume the size of the object queried by getSectionForConstant() is never greater than CodeModelLargeSize. To handle greater than CodeModelLargeSize, changes to AsmPrinter would be required. llvm-svn: 196087	2013-12-02 10:18:31 +00:00
Robert Lytton	c3b700cb09	XCore target: extend tests in preparation llvm-svn: 196086	2013-12-02 10:18:24 +00:00
Robert Lytton	75d72dfcd2	XCore target: Fix eliminateFrameIndex() to handle large frames Large frame offsets are loaded from the ConstantPool. Where possible, offsets are encoded using the smaller MKMSK instruction. Large frame offsets can only be used when there is a frame-pointer. llvm-svn: 196085	2013-12-02 10:18:19 +00:00
Robert Lytton	9c8a9af745	XCore target: Enable frames larger than 65535 to be lowered llvm-svn: 196084	2013-12-02 10:18:14 +00:00
Kostya Serebryany	e71e08d007	[tsan] fix instrumentation of vector vptr updates (https://code.google.com/p/thread-sanitizer/issues/detail?id=43 ) llvm-svn: 196079	2013-12-02 08:07:15 +00:00
Alp Toker	29a5122909	Update the LTO GoldPlugin documentation * Update build instructions to reflect the current source tree layout. * Don't inflict CVS on readers; there's a perfectly good git mirror. * configure with --disable-werror making it possible to build using clang. * ar and nm-new now support the -plugin option. llvm-svn: 196069	2013-12-02 07:15:33 +00:00
Rafael Espindola	29768368be	Remove leftovers from a non-MC asm printer. llvm-svn: 196068	2013-12-02 05:42:16 +00:00
Rafael Espindola	b121e6ba41	Remove #if 0 declarations. llvm-svn: 196067	2013-12-02 05:24:28 +00:00
Rafael Espindola	299ef825a5	Remove dead code. llvm-svn: 196066	2013-12-02 05:10:04 +00:00
Rafael Espindola	427ca8d886	Change the default of AsmWriterClassName and isMCAsmWriter. llvm-svn: 196065	2013-12-02 04:55:42 +00:00
Alp Toker	fcc4ea594d	Rename test with misspelt filename llvm-svn: 196064	2013-12-02 04:31:36 +00:00
Rafael Espindola	2f18f751ff	Remove dead declarations. llvm-svn: 196063	2013-12-02 04:18:19 +00:00
Rafael Espindola	3192965fa3	Refactor for clarity and efficiency. The PPC GetSymbolFromOperand already prefixed stubs of MO_ExternalSymbol, so this should be a nop. llvm-svn: 196059	2013-12-02 03:26:43 +00:00
Rafael Espindola	cf111ab2af	Also test the created stubs on 32 bits. llvm-svn: 196052	2013-12-01 21:24:30 +00:00
Andrew Trick	26c262f3a7	Add -mcpu to stackmap.ll llvm-svn: 196051	2013-12-01 18:17:05 +00:00
Tim Northover	bcd72d7348	ARM: fix bug in -Oz stack adjustment folding Previously, we clobbered callee-saved registers when folding an "add sp, #N" into a "pop {rD, ...}" instruction. This change checks whether a register we're going to add to the "pop" could actually be live outside the function before doing so and should fix the issue. This should fix PR18081. llvm-svn: 196046	2013-12-01 14:16:24 +00:00
Benjamin Kramer	68c312e788	Revamp error checking in the ms inline asm parser. - Actually abort when an error occurred. - Check that the frontend lookup worked when parsing length/size/type operators. Tested by a clang test. PR18096. llvm-svn: 196044	2013-12-01 11:47:42 +00:00
Michael Kuperstein	356b61c610	Ensure bitcode encoding of linkage types stays stable. Patch by Boaz Ouriel llvm-svn: 196042	2013-12-01 10:16:35 +00:00
Bill Wendling	0f97d98496	Use accessor methods instead. llvm-svn: 196006	2013-12-01 03:40:42 +00:00
Bill Wendling	178e2b5358	Use 'unsigned char' to get this past gcc error message: error: invalid conversion from 'unsigned char' to '{anonymous}::Sequence' llvm-svn: 196004	2013-12-01 03:36:07 +00:00
Hal Finkel	725757ccc8	Add a scheduling model (with itinerary) for the PPC POWER7 This adds a scheduling model for the POWER7 (P7) core, and enables the machine-instruction scheduler when targeting the P7. Scheduling for the P7, like earlier ooo PPC cores, requires considering both dispatch group hazards, and functional unit resources and latencies. These are both modeled in a combined itinerary. Dispatch group formation is still handled by the post-RA scheduler (which still needs to be updated for the P7, but nevertheless does a pretty good job). One interesting aspect of this change is that I've also enabled the use of AA during CodeGen for the P7 (just as it is for the embedded cores). The benchmark results seem to support this decision (see below), and while this is normally useful for in-order cores, and not for ooo cores like the P7, I think that the dispatch slot hazards are enough like in-order resources to make the AA useful. Test suite significant performance differences (where negative is a speedup, and positive is a regression) vs.
the current situation: MultiSource/Benchmarks/BitBench/drop3/drop3 with AA: N/A without AA: -28.7614% +/- 19.8356% (significantly against AA) MultiSource/Benchmarks/FreeBench/neural/neural with AA: -17.7406% +/- 11.2712% without AA: N/A (significantly in favor of AA) MultiSource/Benchmarks/SciMark2-C/scimark2 with AA: -11.2079% +/- 1.80543% without AA: -11.3263% +/- 2.79651% MultiSource/Benchmarks/TSVC/Symbolics-flt/Symbolics-flt with AA: -41.8649% +/- 17.0053% without AA: -34.5256% +/- 23.7072% MultiSource/Benchmarks/mafft/pairlocalalign with AA: 25.3016% +/- 17.8614% without AA: 38.6629% +/- 14.9391% (significantly in favor of AA) MultiSource/Benchmarks/sim/sim with AA: N/A without AA: 13.4844% +/- 7.18195% (significantly in favor of AA) SingleSource/Benchmarks/BenchmarkGame/Large/fasta with AA: 15.0664% +/- 6.70216% without AA: 12.7747% +/- 8.43043% SingleSource/Benchmarks/BenchmarkGame/puzzle with AA: 82.2713% +/- 26.3567% without AA: 75.7525% +/- 41.1842% SingleSource/Benchmarks/Misc/flops-2 with AA: -37.1621% +/- 20.7964% without AA: -35.2342% +/- 20.2999% (significantly in favor of AA) These are 99.5% confidence intervals from 5 runs per configuration. Regarding the choice to turn on AA during CodeGen, of these results, four seem significantly in favor of using AA, and one seems significantly against. I'm not making this decision based on these numbers alone, but these results seem consistent with results I have from other tests, and so I think that, on balance, using AA is a win. llvm-svn: 195981	2013-11-30 20:55:12 +00:00
Hal Finkel	fa2b249f38	Split some PPC itinerary classes In preparation for adding scheduling definitions for the POWER7, split some PPC itinerary classes so that the P7's latencies and hazards can be better described. For the most part, this means differentiating indexed from non-index pre-increment loads and stores. Also, differentiate single from double-precision sqrt. No functionality change intended (except for a more-specific latency for single-precision sqrt on the A2). llvm-svn: 195980	2013-11-30 20:41:13 +00:00
Hal Finkel	14673817db	Convert a PPC test from grep to FileCheck Convert this test to FileCheck, and improve it to check for the instructions it is trying to exclude instead of checking for register use (especially because grepping for r1 can be thrown off, for example, by a use of r12). llvm-svn: 195979	2013-11-30 20:04:33 +00:00
Hal Finkel	ded988ca4c	Desensitize a couple of PPC regression tests Use CHECK-DAG to make these regression tests more resilient against changes in instruction scheduling. llvm-svn: 195978	2013-11-30 19:52:28 +00:00
Hal Finkel	1cdcead814	Update the cpu specified on some PPC regression tests Some of these tests did not specify a cpu but were also sensitive to instruction scheduling and/or register assignment choices. A few other similarly-sensitive tests specified a cpu (often the POWER7), and while the P7 currently uses the default model for PPC64, this will soon change. For those tests which should not really be cpu-dependent anyway, the cpu is set to the generic 'ppc64'. llvm-svn: 195977	2013-11-30 19:39:27 +00:00
Zoran Jovanovic	335dc8689e	Test case for issue with microMIPS long branch. llvm-svn: 195976	2013-11-30 19:13:15 +00:00
Zoran Jovanovic	b3e74abf46	Fixed issue with microMIPS long branch. llvm-svn: 195975	2013-11-30 19:12:28 +00:00
Daniel Sanders	65ab9582ba	[mips][msa] MSA loads and stores have a 10-bit offset. Account for this when lowering FrameIndex. This prevents the compiler from emitting invalid ld.[bhwd]'s and st.[bhwd]'s when the stack frame is between 512 and 32,768 bytes in size. llvm-svn: 195973	2013-11-30 13:47:57 +00:00
Daniel Sanders	f397466fb3	[mips][msa] A small refactor to reduce patch noise in my next commit No functional change. An if-statement has been split into two nested if-statements. llvm-svn: 195972	2013-11-30 13:15:21 +00:00
Juergen Ributzka	7150312963	Force CPU type to unbreak unit tests on Haswell machines. llvm-svn: 195971	2013-11-30 03:07:16 +00:00
Andrew Trick	b7e697ed41	Reverse the order of eviction checks for possible compile time savings. No functionality. llvm-svn: 195969	2013-11-29 23:49:38 +00:00
Reed Kotler	95269c69db	Part 1 of 3 patches that completes very long conditional branches in constant islands for Mips16. We introduce JalB16 as a synonym for Jal16. It makes it easier to read and is also necessary because Jal16 is a call instruction but JalB16 is being used as a branch. Various parts of LLVM will not work properly even in this late stage of the backend if we use what was declared as a call instruction to function as a branch. For one, basic block labels may not get emitted in some situations. llvm-svn: 195968	2013-11-29 22:32:56 +00:00
Zoran Jovanovic	b8cffe14c6	Revert revision 195965. llvm-svn: 195967	2013-11-29 22:10:02 +00:00
Petar Jovanovic	f12c338160	mips: XFAIL llvm-cov test XFAIL llvm-cov.test for MIPS until big-endian issues are fixed for llvm-cov. The test does pass on MIPS little-endian. llvm-svn: 195966	2013-11-29 21:59:09 +00:00
Zoran Jovanovic	797919cb22	Fixed issue with microMIPS long branch. llvm-svn: 195965	2013-11-29 21:41:24 +00:00
Hal Finkel	2d7cc00415	Adjust PPC A2 input operand latencies On the PPC A2, instructions are only issued after their input operands are ready. Model this by specifying that input operands are read at dispatch (0 cycles after issue). This changes all input operand latencies from 1 to 0. Significant test-suite performance changes (these are 99.5% confidence intervals on 6 runs for both before and after): speedups: MultiSource/Benchmarks/sim/sim -1.21915% +/- 0.175063% MultiSource/Benchmarks/TSVC/LinearDependence-flt/LinearDependence-flt -1.23946% +/- 1.05133% SingleSource/Benchmarks/Misc/flops-2 -1.24237% +/- 0.681362% MultiSource/Applications/JM/lencod/lencod -1.33992% +/- 0.757498% MultiSource/Benchmarks/TSVC/InductionVariable-flt/InductionVariable-flt -1.51802% +/- 1.21468% MultiSource/Benchmarks/TSVC/GlobalDataFlow-flt/GlobalDataFlow-flt -2.18818% +/- 1.28605% MultiSource/Benchmarks/TSVC/Packing-flt/Packing-flt -2.21977% +/- 1.19499% SingleSource/Benchmarks/BenchmarkGame/spectral-norm -2.29822% +/- 0.671871% MultiSource/Benchmarks/TSVC/Packing-dbl/Packing-dbl -2.40975% +/- 0.355931% SingleSource/Benchmarks/Misc/fp-convert -2.41899% +/- 1.04751% MultiSource/Benchmarks/TSVC/Searching-dbl/Searching-dbl -2.50349% +/- 0.126765% SingleSource/Benchmarks/Misc/flops-3 -3.00214% +/- 0.700795% MultiSource/Benchmarks/TSVC/LoopRestructuring-flt/LoopRestructuring-flt -3.56995% +/- 3.2929% MultiSource/Applications/sgefa/sgefa -4.24908% +/- 2.00413% MultiSource/Benchmarks/ASC_Sequoia/IRSmk/IRSmk -18.1294% +/- 3.96489% regressions: MultiSource/Benchmarks/TSVC/Reductions-dbl/Reductions-dbl 1.03249% +/- 0.178547% MultiSource/Applications/hexxagon/hexxagon 1.16597% +/- 0.285235% MultiSource/Benchmarks/TSVC/IndirectAddressing-flt/IndirectAddressing-flt 1.39576% +/- 1.07855% SingleSource/Benchmarks/Misc-C++/stepanov_v1p2 1.71539% +/- 0.173182% MultiSource/Benchmarks/Fhourstones-3.1/fhourstones3.1 1.90013% +/- 0.866472% MultiSource/Benchmarks/TSVC/Recurrences-dbl/Recurrences-dbl 
2.39854% +/- 1.05914% MultiSource/Benchmarks/TSVC/ControlFlow-dbl/ControlFlow-dbl 2.4402% +/- 0.817904% MultiSource/Benchmarks/TSVC/LoopRestructuring-dbl/LoopRestructuring-dbl 5.87997% +/- 3.3172% MultiSource/Benchmarks/Trimaran/netbench-crc/netbench-crc 9.02643% +/- 5.79591% MultiSource/Benchmarks/VersaBench/bmm/bmm 10.3517% +/- 1.227% Obviously, there are data points on both sides of this; but I think, overall, this supports making the change. llvm-svn: 195951	2013-11-29 07:04:59 +00:00
Lang Hames	7883c4d5ae	Teach LocalStackSlotAllocation that stackmaps/patchpoints don't have range constraints on their frame offsets. llvm-svn: 195950	2013-11-29 06:35:30 +00:00
Hal Finkel	69f21285ed	Create a PPC440 SchedMachineModel Some of the older PPC processor definitions don't have associated SchedMachineModels; correct this for the PPC440. llvm-svn: 195949	2013-11-29 06:32:17 +00:00
Hal Finkel	c5a38fd3e6	Fixup PPC440 load/store operand latencies The operand latencies for loads and stores in the PPC440 itinerary were wrong (the store operands are all inputs, and the "with update" (pre-increment) instructions need a latency for the additional output). llvm-svn: 195948	2013-11-29 06:19:43 +00:00
Hal Finkel	a9d93b1740	Adjust PPC440 operand latencies The operand latencies for the PPC440 should be specified relative to dispatch, not relative to the initial fetch-and-decode stages. Because most instructions (ignoring bypass) wait in dispatch until their operands are ready, this is modeled as reading input operands "at dispatch" (0 cycles after issue), and so every input and output operand has 4 cycles subtracted from it. This could alter scheduling slightly, but I don't expect a large effect. llvm-svn: 195947	2013-11-29 05:59:00 +00:00
Hal Finkel	f086fc01ab	Don't model the fetch and decode units for the PPC440 Modeling the fetch and decode units in the PPC440 itinerary does not add anything to the hazard detection capability (and so modeling them just wastes compile time). No functionality change intended. llvm-svn: 195946	2013-11-29 05:58:38 +00:00
Lang Hames	82e8d4faa9	Remove unused variable from r195944. llvm-svn: 195945	2013-11-29 03:36:53 +00:00
Lang Hames	067c025250	Refactor a lot of patchpoint/stackmap related code to simplify and make it target independent. Most of the x86 specific stackmap/patchpoint handling was necessitated by the use of the native address-mode format for frame index operands. PEI has now been modified to treat stackmap/patchpoint similarly to DEBUG_INFO, allowing us to use a simple, platform independent register/offset pair for frame indexes on stackmap/patchpoints. Notes: - Folding is now platform independent and automatically supported. - Emitting patchpoints with direct memory references now just involves calling the TargetLoweringBase::emitPatchPoint utility method from the target's XXXTargetLowering::EmitInstrWithCustomInserter method. (See X86TargetLowering for an example). - No more ugly platform-specific operand parsers. This patch shouldn't change the generated output for X86. llvm-svn: 195944	2013-11-29 03:07:54 +00:00


98002 Commits