llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 04:52:54 +02:00

Author	SHA1	Message	Date
Eric Christopher	da9e98d98a	Fix formatting of comment. llvm-svn: 200420	2014-01-29 22:06:21 +00:00
David Majnemer	c3f8d074b2	MC: Better management of macro arguments The linux kernel makes uses of a GAS `feature' which substitutes nothing for macro arguments which aren't specified. Proper support for these kind of macro arguments necessitated a cleanup of differences between `GAS' and `Darwin' dialect macro processing. Differential Revision: http://llvm-reviews.chandlerc.com/D2634 llvm-svn: 200409	2014-01-29 18:57:46 +00:00
Jordan Rose	990729fc2a	[CommandLine] Aliases require an value if their target requires a value. This can still be overridden by explicitly setting a value requirement on the alias option, but by default it should be the same. PR18649 llvm-svn: 200407	2014-01-29 18:54:17 +00:00
Lang Hames	fbd5c97fba	Add support for PC-relative non-extern relocations to RuntimeDyldMachO. Also replaces testcase for r180790 (support for absolute non-externs relocs) with a more robust version. <rdar://problem/15864721> llvm-svn: 200404	2014-01-29 18:31:35 +00:00
Quentin Colombet	99cdbaf711	[X86][SchedModel] Fix typos in the definitions of the ports for Haswell. llvm-svn: 200403	2014-01-29 18:26:59 +00:00
Oliver Stannard	6bfcc7f53d	Test commit llvm-svn: 200401	2014-01-29 16:01:24 +00:00
Matheus Almeida	67244395fb	[mips][msa] Add fill.d instruction. This instruction is only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200400	2014-01-29 15:12:02 +00:00
Matheus Almeida	3e07e293c7	[mips][msa] Add copy_{u,s}.d. These instructions are only available on Mips64 cores that implement the MSA ASE. llvm-svn: 200398	2014-01-29 14:05:28 +00:00
Chandler Carruth	6ba48b6c38	[LPM] Fix PR18643, another scary place where loop transforms failed to preserve loop simplify of enclosing loops. The problem here starts with LoopRotation which ends up cloning code out of the latch into the new preheader it is buidling. This can create a new edge from the preheader into the exit block of the loop which breaks LoopSimplify form. The code tries to fix this by splitting the critical edge between the latch and the exit block to get a new exit block that only the latch dominates. This sadly isn't sufficient. The exit block may be an exit block for multiple nested loops. When we clone an edge from the latch of the inner loop to the new preheader being built in the outer loop, we create an exiting edge from the outer loop to this exit block. Despite breaking the LoopSimplify form for the inner loop, this is fine for the outer loop. However, when we split the edge from the inner loop to the exit block, we create a new block which is in neither the inner nor outer loop as the new exit block. This is a predecessor to the old exit block, and so the split itself takes the outer loop out of LoopSimplify form. We need to split every edge entering the exit block from inside a loop nested more deeply than the exit block in order to preserve all of the loop simplify constraints. Once we try to do that, a problem with splitting critical edges surfaces. Previously, we tried a very brute force to update LoopSimplify form by re-computing it for all exit blocks. We don't need to do this, and doing this much will sometimes but not always overlap with the LoopRotate bug fix. Instead, the code needs to specifically handle the cases which can start to violate LoopSimplify -- they aren't that common. We need to see if the destination of the split edge was a loop exit block in simplified form for the loop of the source of the edge. For this to be true, all the predecessors need to be in the exact same loop as the source of the edge being split. If the dest block was originally in this form, we have to split all of the deges back into this loop to recover it. The old mechanism of doing this was conservatively correct because at least one of the exiting blocks it rewrote was the DestBB and so the DestBB's predecessors were fixed. But this is a much more targeted way of doing it. Making it targeted is important, because ballooning the set of edges touched prevents LoopRotate from being able to split edges it needs to split to preserve loop simplify in a coherent way -- the critical edge splitting would sometimes find the other edges in need of splitting but not others. Many, many thanks for help from Nick reducing these test cases mightily. And helping lots with the analysis here as this one was quite tricky to track down. llvm-svn: 200393	2014-01-29 13:16:53 +00:00
Renato Golin	6ca0034624	Enable EHABI by default After all hard work to implement the EHABI and with the test-suite passing, it's time to turn it on by default and allow users to disable it as a work-around while we fix the eventual bugs that show up. This commit also remove the -arm-enable-ehabi-descriptors, since we want the tables to be printed every time the EHABI is turned on for non-Darwin ARM targets. Although MCJIT EHABI is not working yet (needs linking with the right libraries), this commit also fixes some relocations on MCJIT regarding the EH tables/lib calls, and update some tests to avoid using EH tables when none are needed. The EH tests in the test-suite that were previously disabled on ARM now pass with these changes, so a follow-up commit on the test-suite will re-enable them. llvm-svn: 200388	2014-01-29 11:50:56 +00:00
Venkatraman Govindaraju	a50ca1f645	[Sparc] Use %r_disp32 for pc_rel entries in FDE as well. This makes MCAsmInfo::getExprForFDESymbol() a virtual function and overrides it in SparcMCAsmInfo. llvm-svn: 200376	2014-01-29 06:59:20 +00:00
NAKAMURA Takumi	782750fa03	Revert r200340, "Add line table debug info to COFF files when using a win32 triple." It was incompatible with --target=i686-win32. llvm-svn: 200375	2014-01-29 06:05:38 +00:00
Venkatraman Govindaraju	1541483555	[Sparc] Use %r_disp32 for pc_rel entries in gcc_except_table and eh_frame. Otherwise, assembler (gas) fails to assemble them with error message "operation combines symbols in different segments". This is because MC computes pc_rel entries with subtract expression between labels from different sections. llvm-svn: 200373	2014-01-29 04:51:35 +00:00
Chandler Carruth	ed726e1be7	[LPM] Fix PR18642, a pretty nasty bug in IndVars that "never mattered" because of the inside-out run of LoopSimplify in the LoopPassManager and the fact that LoopSimplify couldn't be "preserved" across two independent LoopPassManagers. Anyways, in that case, IndVars wasn't correctly preserving an LCSSA PHI node because it thought it was rewriting (via SCEV) the incoming value to a loop invariant value. While it may well be invariant for the current loop, it may be rewritten in terms of an enclosing loop's values. This in and of itself is fine, as the LCSSA PHI node in the enclosing loop for the inner loop value we're rewriting will have its own LCSSA PHI node if used outside of the enclosing loop. With me so far? Well, the current loop and the enclosing loop may share an exiting block and exit block, and when they do they also share LCSSA PHI nodes. In this case, its not valid to RAUW through the LCSSA PHI node. Expected crazy test included. llvm-svn: 200372	2014-01-29 04:40:19 +00:00
Arnold Schwaighofer	5b96c24a7a	LoopVectorizer: Don't count the induction variable multiple times When estimating register pressure, don't count the induction variable mulitple times. It is unlikely to be unrolled. This is currently disabled and hidden behind a flag ("enable-ind-var-reg-heur"). llvm-svn: 200371	2014-01-29 04:36:12 +00:00
Venkatraman Govindaraju	416f0c2389	[SparcV9] Use correct register class (I64RegClass) to hold the address of _GLOBAL_OFFSET_TABLE_ in sparcv9. llvm-svn: 200368	2014-01-29 03:35:08 +00:00
Rafael Espindola	f6087fc40c	Use a raw_stream to implement the mangler. This is a bit more convenient for some callers, but more importantly, it is easier to implement correctly. Doing this removes the patching of already printed data that was used for fastcall, fixing a crash with private fastcall symbols. llvm-svn: 200367	2014-01-29 02:30:38 +00:00
Kevin Qin	379441a4e6	[AArch64 NEON] Lower SELECT_CC with vector operand. When the scalar compare is between floating point and operands are vector, we custom lower SELECT_CC to use NEON SIMD compare for generating less instructions. llvm-svn: 200365	2014-01-29 01:57:30 +00:00
Mark Seaborn	c53fc2913b	Remove unnecessary call to pthread_mutexattr_setpshared() The default value of this attribute is PTHREAD_PROCESS_PRIVATE, so there's no point in calling pthread_mutexattr_setpshared() to set that. See: http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutexattr_getpshared.html This removes some ifdefs that tend to need to be extended for other platforms (e.g. for NaCl). Note that this call was in the first implementation of Mutex, added in r22403, so it doesn't appear to have been added in response to a performance problem. Differential Revision: http://llvm-reviews.chandlerc.com/D2633 llvm-svn: 200360	2014-01-29 00:20:44 +00:00
David Majnemer	e40cc64844	MC: Clean up error paths in AsmParser::parseMacroArgument Use an RAII object Instead of inserting a call to AsmLexer::setSkipSpace(true) in all error paths. No functional change. llvm-svn: 200358	2014-01-29 00:07:39 +00:00
Rafael Espindola	9ade36c920	Make createObjectFile's signature a bit less error prone. This will be better with c++11, but right now file_magic converts to bool, which makes the api really easy to misuse. llvm-svn: 200357	2014-01-29 00:02:26 +00:00
David Woodhouse	0cc2f6368a	[Sparc] Fix breakage in r200345 Oops. Don't do build tests on patches like that with --enable-targets=x86_64 llvm-svn: 200355	2014-01-28 23:38:16 +00:00
David Woodhouse	6c8fefd999	Delete MCSubtargetInfo data members from target MCCodeEmitter classes The subtarget info is explicitly passed to the EncodeInstruction method and we should use that subtarget info to influence any encoding decisions. llvm-svn: 200350	2014-01-28 23:13:25 +00:00
David Woodhouse	a79a37b435	Propagate MCSubtargetInfo through TableGen's getBinaryCodeForInstr() llvm-svn: 200349	2014-01-28 23:13:18 +00:00
David Woodhouse	4a4c611e36	Explictly pass MCSubtargetInfo to MCCodeEmitter::EncodeInstruction() llvm-svn: 200348	2014-01-28 23:13:07 +00:00
David Woodhouse	b9294892a1	Keep the MCSubtargetInfo in the MCRelxableFragment class. Needed to fix PR18303 to correctly re-encode the instruction if it is relaxed. We keep a copy of the MCSubtargetInfo to make sure that we are not effected by future changes to the subtarget info coming from the assembler (e.g. when parsing .code 16 directived). llvm-svn: 200347	2014-01-28 23:12:53 +00:00
David Woodhouse	d7e33ceb85	Modify MCObjectStreamer EmitInstTo* interface Add MCSubtargetInfo parameter virtual void EmitInstToFragment(const MCInst &Inst, const MCSubtargetInfo &); virtual void EmitInstToData(const MCInst &Inst, const MCSubtargetInfo &); llvm-svn: 200346	2014-01-28 23:12:49 +00:00
David Woodhouse	5d0b529d58	Change MCStreamer EmitInstruction interface to take subtarget info llvm-svn: 200345	2014-01-28 23:12:42 +00:00
Timur Iskhodzhanov	53b4a3ded1	Add line table debug info to COFF files when using a win32 triple. Reviewed at http://llvm-reviews.chandlerc.com/D2232 llvm-svn: 200340	2014-01-28 21:33:27 +00:00
Matheus Almeida	6447ece792	[mips] Fix ELF header flags. As opposed to GCC/GAS the default ABI for Mips64 is n64. Compatibility bit should be set if o32 ABI is used when targeting Mips64. llvm-svn: 200332	2014-01-28 19:24:11 +00:00
Gautam Chakrabarti	dcd42d2079	[NVPTX] Fix emitting aggregate parameters The code was missing the case for aggregate parameters and hence was emitting them as .b0 type. Also fixed a couple of comments. llvm-svn: 200325	2014-01-28 18:35:29 +00:00
Andrea Di Biagio	47b83fb85b	[X86] Add extra rules for combining vselect dag nodes into movsd. This improves the fix committed at revision 199683 adding the following new target specific combine rules: 1) fold (v4i32: vselect <0,0,-1,-1>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast A)), (v2i64 (bitcast B))) )) 2) fold (v4f32: vselect <0,0,-1,-1>, A, B) -> (v4f32 (bitcast (movsd (v2f64 (bitcast A)), (v2f64 (bitcast B))) )) 3) fold (v4i32: vselect <-1,-1,0,0>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) 4) fold (v4f32: vselect <-1,-1,0,0>, A, B) -> (v4f32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) llvm-svn: 200324	2014-01-28 18:14:21 +00:00
Adrian Prantl	5d80581891	typo llvm-svn: 200323	2014-01-28 18:13:47 +00:00
Rafael Espindola	e8856107f0	Fix pr14893. When simplifycfg moves an instruction, it must drop metadata it doesn't know is still valid with the preconditions changes. In particular, it must drop the range and tbaa metadata. The patch implements this with an utility function to drop all metadata not in a white list. llvm-svn: 200322	2014-01-28 16:56:46 +00:00
Andrea Di Biagio	9e72586184	[DAGCombiner] Avoid introducing an illegal build_vector when folding a sign_extend. Make sure that we don't introduce illegal build_vector dag nodes when trying to fold a sign_extend of a build_vector. This fixes a regression introduced by r200234. Added test CodeGen/X86/fold-vector-sext-crash.ll to verify that llc no longer crashes with an assertion failure due to an illegal build_vector of type MVT::v4i64. Thanks to Ilia Filippov for spotting this regression and for providing a reproducible test case. llvm-svn: 200313	2014-01-28 12:53:56 +00:00
Iain Sandoe	f16c0c72d6	Provide a stub Target Streamer implementation for PPC MachO At present, this handles .tc (error) and needs to be expanded to deal properly with .machine llvm-svn: 200309	2014-01-28 11:03:17 +00:00
Chandler Carruth	6a45efab46	[vectorizer] Completely disable the block frequency guidance of the loop vectorizer, placing it behind an off-by-default flag. It turns out that block frequency isn't what we want at all, here or elsewhere. This has been I think a nagging feeling for several of us working with it, but Arnold has given some really nice simple examples where the results are so comprehensively wrong that they aren't useful. I'm planning to email the dev list with a summary of why its not really useful and a couple of ideas about how to better structure these types of heuristics. llvm-svn: 200294	2014-01-28 09:10:41 +00:00
Hal Finkel	5c72f63cb7	Handle spilling the PPC GPRC_NOR0 register class GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO pseudo register). As a result, we also need to check for it in the spilling code. llvm-svn: 200288	2014-01-28 05:32:58 +00:00
Timur Iskhodzhanov	88dd8b3837	MC: Add a .debug section that we'll soon use to emit debug info into COFF files llvm-svn: 200285	2014-01-28 03:48:44 +00:00
Michel Danzer	71542b5f92	R600/SI: Add pattern for truncating i32 to i1 Fixes half a dozen piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200283	2014-01-28 03:01:16 +00:00
Jakob Stoklund Olesen	8a060724b3	Fix the DWARF EH encodings for Sparc PIC code. Also emit the stubs that were generated for references to typeinfo symbols. llvm-svn: 200282	2014-01-28 02:52:26 +00:00
Reid Kleckner	c9ab4a9a3b	Update optimization passes to handle inalloca arguments Summary: I searched Transforms/ and Analysis/ for 'ByVal' and updated those call sites to check for inalloca if appropriate. I added tests for any change that would allow an optimization to fire on inalloca. Reviewers: nlewycky Differential Revision: http://llvm-reviews.chandlerc.com/D2449 llvm-svn: 200281	2014-01-28 02:38:36 +00:00
Reid Kleckner	3e9723ef68	x86: add implicit defs for cpuid This avoids miscompiling MS inline asm in LLVM where we have to infer clobbers. Test case forthcoming in Clang. llvm-svn: 200279	2014-01-28 02:08:22 +00:00
Chandler Carruth	b19a7319a9	[LPM] Fix PR18616 where the shifts to the loop pass manager to extract LCSSA from it caused a crasher with the LoopUnroll pass. This crasher is really nasty. We destroy LCSSA form in a suprising way. When unrolling a loop into an outer loop, we not only need to restore LCSSA form for the outer loop, but for all children of the outer loop. This is somewhat obvious in retrospect, but hey! While this seems pretty heavy-handed, it's not that bad. Fundamentally, we only do this when we unroll a loop, which is already a heavyweight operation. We're unrolling all of these hypothetical inner loops as well, so their size and complexity is already on the critical path. This is just adding another pass over them to re-canonicalize. I have a test case from PR18616 that is great for reproducing this, but pretty useless to check in as it relies on many 10s of nested empty loops that get unrolled and deleted in just the right order. =/ What's worse is that investigating this has exposed another source of failure that is likely to be even harder to test. I'll try to come up with test cases for these fixes, but I want to get the fixes into the tree first as they're causing crashes in the wild. llvm-svn: 200273	2014-01-28 01:25:38 +00:00
Juergen Ributzka	8a4f2500be	[TLI] Add a new hook to TargetLowering to query the target if a load of a constant should be converted to simply the constant itself. Before this patch we used getIntImmCost from TargetTransformInfo to determine if a load of a constant should be converted to just a constant, but the threshold for this was set to an arbitrary value. This value works well for the two targets (X86 and ARM) that implement this target-hook, but it isn't target-independent at all. Now targets have the possibility to decide directly if this optimization should be performed. The default value is set to false to preserve the current behavior. The target hook has been moved to TargetLowering, which removed the last use and need of TargetTransformInfo in SelectionDAG. llvm-svn: 200271	2014-01-28 01:20:14 +00:00
Arnold Schwaighofer	8f596e2047	LoopVectorize: Support conditional stores by scalarizing The vectorizer takes a loop like this and widens all instructions except for the store. The stores are scalarized/unrolled and hidden behind an "if" block. for (i = 0; i < 128; ++i) { if (a[i] < 10) a[i] += val; } for (i = 0; i < 128; i+=2) { v = a[i:i+1]; v0 = (extract v, 0) + 10; v1 = (extract v, 1) + 10; if (v0 < 10) a[i] = v0; if (v1 < 10) a[i] = v1; } The vectorizer relies on subsequent optimizations to sink instructions into the conditional block where they are anticipated. The flag "vectorize-num-stores-pred" controls whether and how many stores to handle this way. Vectorization of conditional stores is disabled per default for now. This patch also adds a change to the heuristic when the flag "enable-loadstore-runtime-unroll" is enabled (off by default). It unrolls small loops until load/store ports are saturated. This heuristic uses TTI's getMaxUnrollFactor as a measure for load/store ports. I also added a second flag -enable-cond-stores-vec. It will enable vectorization of conditional stores. But there is no cost model for vectorization of conditional stores in place yet so this will not do good at the moment. rdar://15892953 Results for x86-64 -O3 -mavx +/- -mllvm -enable-loadstore-runtime-unroll -vectorize-num-stores-pred=1 (before the BFI change): Performance Regressions: Benchmarks/Ptrdist/yacr2/yacr2 7.35% (maze3() is identical but 10% slower) Applications/siod/siod 2.18% Performance improvements: mesa -4.42% libquantum -4.15% With a patch that slightly changes the register heuristics (by subtracting the induction variable on both sides of the register pressure equation, as the induction variable is probably not really unrolled): Performance Regressions: Benchmarks/Ptrdist/yacr2/yacr2 7.73% Applications/siod/siod 1.97% Performance Improvements: libquantum -13.05% (we now also unroll quantum_toffoli) mesa -4.27% llvm-svn: 200270	2014-01-28 01:01:53 +00:00
Eric Christopher	2b6e161fce	Revert r199871 and replace it with a simple check in the debug info code to see if we're emitting a function into a non-default text section. This is still a less-than-ideal solution, but more contained than r199871 to determine whether or not we're emitting code into an array of comdat sections. llvm-svn: 200269	2014-01-28 00:49:26 +00:00
Eric Christopher	346de7b82f	Reformat slightly. llvm-svn: 200264	2014-01-27 23:50:03 +00:00
Manman Ren	c3f51e8e54	PGO branch weight: keep halving the weights until they can fit into uint32. When folding branches to common destination, the updated branch weights can exceed uint32 by more than factor of 2. We should keep halving the weights until they can fit into uint32. llvm-svn: 200262	2014-01-27 23:39:03 +00:00
Mark Seaborn	e8973b2061	Fix the "#ifndef HAVE_SYS_WAIT_H" code path in Program.inc to compile Without this fix, WaitResult is not defined. llvm-svn: 200259	2014-01-27 22:53:07 +00:00

1 2 3 4 5 ...

66785 Commits