llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 04:52:54 +02:00

Author	SHA1	Message	Date
NAKAMURA Takumi	782750fa03	Revert r200340, "Add line table debug info to COFF files when using a win32 triple." It was incompatible with --target=i686-win32. llvm-svn: 200375	2014-01-29 06:05:38 +00:00
Venkatraman Govindaraju	1541483555	[Sparc] Use %r_disp32 for pc_rel entries in gcc_except_table and eh_frame. Otherwise, assembler (gas) fails to assemble them with error message "operation combines symbols in different segments". This is because MC computes pc_rel entries with subtract expression between labels from different sections. llvm-svn: 200373	2014-01-29 04:51:35 +00:00
Chandler Carruth	ed726e1be7	[LPM] Fix PR18642, a pretty nasty bug in IndVars that "never mattered" because of the inside-out run of LoopSimplify in the LoopPassManager and the fact that LoopSimplify couldn't be "preserved" across two independent LoopPassManagers. Anyways, in that case, IndVars wasn't correctly preserving an LCSSA PHI node because it thought it was rewriting (via SCEV) the incoming value to a loop invariant value. While it may well be invariant for the current loop, it may be rewritten in terms of an enclosing loop's values. This in and of itself is fine, as the LCSSA PHI node in the enclosing loop for the inner loop value we're rewriting will have its own LCSSA PHI node if used outside of the enclosing loop. With me so far? Well, the current loop and the enclosing loop may share an exiting block and exit block, and when they do they also share LCSSA PHI nodes. In this case, it's not valid to RAUW through the LCSSA PHI node. Expected crazy test included. llvm-svn: 200372	2014-01-29 04:40:19 +00:00
Arnold Schwaighofer	5b96c24a7a	LoopVectorizer: Don't count the induction variable multiple times When estimating register pressure, don't count the induction variable multiple times. It is unlikely to be unrolled. This is currently disabled and hidden behind a flag ("enable-ind-var-reg-heur"). llvm-svn: 200371	2014-01-29 04:36:12 +00:00
Venkatraman Govindaraju	416f0c2389	[SparcV9] Use correct register class (I64RegClass) to hold the address of _GLOBAL_OFFSET_TABLE_ in sparcv9. llvm-svn: 200368	2014-01-29 03:35:08 +00:00
Rafael Espindola	f6087fc40c	Use a raw_stream to implement the mangler. This is a bit more convenient for some callers, but more importantly, it is easier to implement correctly. Doing this removes the patching of already printed data that was used for fastcall, fixing a crash with private fastcall symbols. llvm-svn: 200367	2014-01-29 02:30:38 +00:00
Kevin Qin	379441a4e6	[AArch64 NEON] Lower SELECT_CC with vector operand. When the scalar compare is between floating point and operands are vector, we custom lower SELECT_CC to use NEON SIMD compare for generating less instructions. llvm-svn: 200365	2014-01-29 01:57:30 +00:00
Mark Seaborn	c53fc2913b	Remove unnecessary call to pthread_mutexattr_setpshared() The default value of this attribute is PTHREAD_PROCESS_PRIVATE, so there's no point in calling pthread_mutexattr_setpshared() to set that. See: http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutexattr_getpshared.html This removes some ifdefs that tend to need to be extended for other platforms (e.g. for NaCl). Note that this call was in the first implementation of Mutex, added in r22403, so it doesn't appear to have been added in response to a performance problem. Differential Revision: http://llvm-reviews.chandlerc.com/D2633 llvm-svn: 200360	2014-01-29 00:20:44 +00:00
David Majnemer	e40cc64844	MC: Clean up error paths in AsmParser::parseMacroArgument Use an RAII object instead of inserting a call to AsmLexer::setSkipSpace(true) in all error paths. No functional change. llvm-svn: 200358	2014-01-29 00:07:39 +00:00
Rafael Espindola	9ade36c920	Make createObjectFile's signature a bit less error prone. This will be better with c++11, but right now file_magic converts to bool, which makes the api really easy to misuse. llvm-svn: 200357	2014-01-29 00:02:26 +00:00
David Woodhouse	0cc2f6368a	[Sparc] Fix breakage in r200345 Oops. Don't do build tests on patches like that with --enable-targets=x86_64 llvm-svn: 200355	2014-01-28 23:38:16 +00:00
David Woodhouse	6c8fefd999	Delete MCSubtargetInfo data members from target MCCodeEmitter classes The subtarget info is explicitly passed to the EncodeInstruction method and we should use that subtarget info to influence any encoding decisions. llvm-svn: 200350	2014-01-28 23:13:25 +00:00
David Woodhouse	a79a37b435	Propagate MCSubtargetInfo through TableGen's getBinaryCodeForInstr() llvm-svn: 200349	2014-01-28 23:13:18 +00:00
David Woodhouse	4a4c611e36	Explicitly pass MCSubtargetInfo to MCCodeEmitter::EncodeInstruction() llvm-svn: 200348	2014-01-28 23:13:07 +00:00
David Woodhouse	b9294892a1	Keep the MCSubtargetInfo in the MCRelaxableFragment class. Needed to fix PR18303 to correctly re-encode the instruction if it is relaxed. We keep a copy of the MCSubtargetInfo to make sure that we are not affected by future changes to the subtarget info coming from the assembler (e.g. when parsing .code 16 directives). llvm-svn: 200347	2014-01-28 23:12:53 +00:00
David Woodhouse	d7e33ceb85	Modify MCObjectStreamer EmitInstTo* interface Add MCSubtargetInfo parameter virtual void EmitInstToFragment(const MCInst &Inst, const MCSubtargetInfo &); virtual void EmitInstToData(const MCInst &Inst, const MCSubtargetInfo &); llvm-svn: 200346	2014-01-28 23:12:49 +00:00
David Woodhouse	5d0b529d58	Change MCStreamer EmitInstruction interface to take subtarget info llvm-svn: 200345	2014-01-28 23:12:42 +00:00
Timur Iskhodzhanov	53b4a3ded1	Add line table debug info to COFF files when using a win32 triple. Reviewed at http://llvm-reviews.chandlerc.com/D2232 llvm-svn: 200340	2014-01-28 21:33:27 +00:00
Matheus Almeida	6447ece792	[mips] Fix ELF header flags. As opposed to GCC/GAS the default ABI for Mips64 is n64. Compatibility bit should be set if o32 ABI is used when targeting Mips64. llvm-svn: 200332	2014-01-28 19:24:11 +00:00
Gautam Chakrabarti	dcd42d2079	[NVPTX] Fix emitting aggregate parameters The code was missing the case for aggregate parameters and hence was emitting them as .b0 type. Also fixed a couple of comments. llvm-svn: 200325	2014-01-28 18:35:29 +00:00
Andrea Di Biagio	47b83fb85b	[X86] Add extra rules for combining vselect dag nodes into movsd. This improves the fix committed at revision 199683 adding the following new target specific combine rules: 1) fold (v4i32: vselect <0,0,-1,-1>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast A)), (v2i64 (bitcast B))) )) 2) fold (v4f32: vselect <0,0,-1,-1>, A, B) -> (v4f32 (bitcast (movsd (v2f64 (bitcast A)), (v2f64 (bitcast B))) )) 3) fold (v4i32: vselect <-1,-1,0,0>, A, B) -> (v4i32 (bitcast (movsd (v2i64 (bitcast B)), (v2i64 (bitcast A))) )) 4) fold (v4f32: vselect <-1,-1,0,0>, A, B) -> (v4f32 (bitcast (movsd (v2f64 (bitcast B)), (v2f64 (bitcast A))) )) llvm-svn: 200324	2014-01-28 18:14:21 +00:00
Adrian Prantl	5d80581891	typo llvm-svn: 200323	2014-01-28 18:13:47 +00:00
Rafael Espindola	e8856107f0	Fix pr14893. When simplifycfg moves an instruction, it must drop metadata it doesn't know is still valid with the precondition changes. In particular, it must drop the range and tbaa metadata. The patch implements this with a utility function to drop all metadata not in a white list. llvm-svn: 200322	2014-01-28 16:56:46 +00:00
Andrea Di Biagio	9e72586184	[DAGCombiner] Avoid introducing an illegal build_vector when folding a sign_extend. Make sure that we don't introduce illegal build_vector dag nodes when trying to fold a sign_extend of a build_vector. This fixes a regression introduced by r200234. Added test CodeGen/X86/fold-vector-sext-crash.ll to verify that llc no longer crashes with an assertion failure due to an illegal build_vector of type MVT::v4i64. Thanks to Ilia Filippov for spotting this regression and for providing a reproducible test case. llvm-svn: 200313	2014-01-28 12:53:56 +00:00
Iain Sandoe	f16c0c72d6	Provide a stub Target Streamer implementation for PPC MachO At present, this handles .tc (error) and needs to be expanded to deal properly with .machine llvm-svn: 200309	2014-01-28 11:03:17 +00:00
Chandler Carruth	6a45efab46	[vectorizer] Completely disable the block frequency guidance of the loop vectorizer, placing it behind an off-by-default flag. It turns out that block frequency isn't what we want at all, here or elsewhere. This has been I think a nagging feeling for several of us working with it, but Arnold has given some really nice simple examples where the results are so comprehensively wrong that they aren't useful. I'm planning to email the dev list with a summary of why it's not really useful and a couple of ideas about how to better structure these types of heuristics. llvm-svn: 200294	2014-01-28 09:10:41 +00:00
Hal Finkel	5c72f63cb7	Handle spilling the PPC GPRC_NOR0 register class GPRC_NOR0 is not a subclass of GPRC (because it also contains the ZERO pseudo register). As a result, we also need to check for it in the spilling code. llvm-svn: 200288	2014-01-28 05:32:58 +00:00
Timur Iskhodzhanov	88dd8b3837	MC: Add a .debug section that we'll soon use to emit debug info into COFF files llvm-svn: 200285	2014-01-28 03:48:44 +00:00
Michel Danzer	71542b5f92	R600/SI: Add pattern for truncating i32 to i1 Fixes half a dozen piglit tests with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 200283	2014-01-28 03:01:16 +00:00
Jakob Stoklund Olesen	8a060724b3	Fix the DWARF EH encodings for Sparc PIC code. Also emit the stubs that were generated for references to typeinfo symbols. llvm-svn: 200282	2014-01-28 02:52:26 +00:00
Reid Kleckner	c9ab4a9a3b	Update optimization passes to handle inalloca arguments Summary: I searched Transforms/ and Analysis/ for 'ByVal' and updated those call sites to check for inalloca if appropriate. I added tests for any change that would allow an optimization to fire on inalloca. Reviewers: nlewycky Differential Revision: http://llvm-reviews.chandlerc.com/D2449 llvm-svn: 200281	2014-01-28 02:38:36 +00:00
Reid Kleckner	3e9723ef68	x86: add implicit defs for cpuid This avoids miscompiling MS inline asm in LLVM where we have to infer clobbers. Test case forthcoming in Clang. llvm-svn: 200279	2014-01-28 02:08:22 +00:00
Chandler Carruth	b19a7319a9	[LPM] Fix PR18616 where the shifts to the loop pass manager to extract LCSSA from it caused a crasher with the LoopUnroll pass. This crasher is really nasty. We destroy LCSSA form in a surprising way. When unrolling a loop into an outer loop, we not only need to restore LCSSA form for the outer loop, but for all children of the outer loop. This is somewhat obvious in retrospect, but hey! While this seems pretty heavy-handed, it's not that bad. Fundamentally, we only do this when we unroll a loop, which is already a heavyweight operation. We're unrolling all of these hypothetical inner loops as well, so their size and complexity is already on the critical path. This is just adding another pass over them to re-canonicalize. I have a test case from PR18616 that is great for reproducing this, but pretty useless to check in as it relies on many 10s of nested empty loops that get unrolled and deleted in just the right order. =/ What's worse is that investigating this has exposed another source of failure that is likely to be even harder to test. I'll try to come up with test cases for these fixes, but I want to get the fixes into the tree first as they're causing crashes in the wild. llvm-svn: 200273	2014-01-28 01:25:38 +00:00
Juergen Ributzka	8a4f2500be	[TLI] Add a new hook to TargetLowering to query the target if a load of a constant should be converted to simply the constant itself. Before this patch we used getIntImmCost from TargetTransformInfo to determine if a load of a constant should be converted to just a constant, but the threshold for this was set to an arbitrary value. This value works well for the two targets (X86 and ARM) that implement this target-hook, but it isn't target-independent at all. Now targets have the possibility to decide directly if this optimization should be performed. The default value is set to false to preserve the current behavior. The target hook has been moved to TargetLowering, which removed the last use and need of TargetTransformInfo in SelectionDAG. llvm-svn: 200271	2014-01-28 01:20:14 +00:00
Arnold Schwaighofer	8f596e2047	LoopVectorize: Support conditional stores by scalarizing The vectorizer takes a loop like this and widens all instructions except for the store. The stores are scalarized/unrolled and hidden behind an "if" block. for (i = 0; i < 128; ++i) { if (a[i] < 10) a[i] += val; } for (i = 0; i < 128; i+=2) { v = a[i:i+1]; v0 = (extract v, 0) + val; v1 = (extract v, 1) + val; if ((extract v, 0) < 10) a[i] = v0; if ((extract v, 1) < 10) a[i+1] = v1; } The vectorizer relies on subsequent optimizations to sink instructions into the conditional block where they are anticipated. The flag "vectorize-num-stores-pred" controls whether and how many stores to handle this way. Vectorization of conditional stores is disabled by default for now. This patch also adds a change to the heuristic when the flag "enable-loadstore-runtime-unroll" is enabled (off by default). It unrolls small loops until load/store ports are saturated. This heuristic uses TTI's getMaxUnrollFactor as a measure for load/store ports. I also added a second flag -enable-cond-stores-vec. It will enable vectorization of conditional stores. But there is no cost model for vectorization of conditional stores in place yet so this will not do good at the moment. rdar://15892953 Results for x86-64 -O3 -mavx +/- -mllvm -enable-loadstore-runtime-unroll -vectorize-num-stores-pred=1 (before the BFI change): Performance Regressions: Benchmarks/Ptrdist/yacr2/yacr2 7.35% (maze3() is identical but 10% slower) Applications/siod/siod 2.18% Performance improvements: mesa -4.42% libquantum -4.15% With a patch that slightly changes the register heuristics (by subtracting the induction variable on both sides of the register pressure equation, as the induction variable is probably not really unrolled): Performance Regressions: Benchmarks/Ptrdist/yacr2/yacr2 7.73% Applications/siod/siod 1.97% Performance Improvements: libquantum -13.05% (we now also unroll quantum_toffoli) mesa -4.27% llvm-svn: 200270	2014-01-28 01:01:53 +00:00
Eric Christopher	2b6e161fce	Revert r199871 and replace it with a simple check in the debug info code to see if we're emitting a function into a non-default text section. This is still a less-than-ideal solution, but more contained than r199871 to determine whether or not we're emitting code into an array of comdat sections. llvm-svn: 200269	2014-01-28 00:49:26 +00:00
Eric Christopher	346de7b82f	Reformat slightly. llvm-svn: 200264	2014-01-27 23:50:03 +00:00
Manman Ren	c3f51e8e54	PGO branch weight: keep halving the weights until they can fit into uint32. When folding branches to common destination, the updated branch weights can exceed uint32 by more than factor of 2. We should keep halving the weights until they can fit into uint32. llvm-svn: 200262	2014-01-27 23:39:03 +00:00
Mark Seaborn	e8973b2061	Fix the "#ifndef HAVE_SYS_WAIT_H" code path in Program.inc to compile Without this fix, WaitResult is not defined. llvm-svn: 200259	2014-01-27 22:53:07 +00:00
Mark Seaborn	1b9cb5e4a2	ARM MC: Fix the initial DWARF CFI unwind info at the start of a function This brings MC into line with GNU 'as' on ARM, and it brings the ARM target into line with most other LLVM targets, which declare the initial CFI state with addInitialFrameState(). Without this, functions generated with .cfi_startproc/endproc on ARM will tend to cause GDB to abort with: gdb/dwarf2-frame.c:1132: internal-error: Unknown CFA rule. I've also tested this by comparing the output of "readelf -w" on the object files produced by llvm-mc and gas when given the .s file added here. This change is part of addressing PR18636. Differential Revision: http://llvm-reviews.chandlerc.com/D2597 llvm-svn: 200255	2014-01-27 22:38:14 +00:00
Matt Arsenault	0719e41557	Fix sext(setcc) -> select_cc using wrong type for setcc. Also update the comment, since it actually produces a select (setcc) instead of select_cc. It was checking and using the setcc result type for the type of the sext, instead of the type of the compared items. In my problem case, the sext was to i32 and was used as the setcc type, but the expected type was i64. No test since I haven't been able to hit the problem with this on any in-tree targets. llvm-svn: 200249	2014-01-27 21:41:54 +00:00
David Peixotto	84910c53e6	Fix unsupported addressing mode assertion for pld Summary: This commit gives an address mode to the PLD instruction. We were getting an assertion failure in the frame lowering code because we had code that was doing a pld of a stack allocated address. The frame lowering was checking the address mode and then asserting because pld had none defined. This commit fixes pld for arm mode. There was a previous fix for thumb mode in a separate commit. The commit for thumb mode added a test in a separate file because it would otherwise fail for arm. This commit moves the thumb test back into the prefetch.ll file and adds the corresponding arm test. Differential Revision: http://llvm-reviews.chandlerc.com/D2622 llvm-svn: 200248	2014-01-27 21:39:04 +00:00
Gautam Chakrabarti	4fc1ad55a3	test commit: add minor comment llvm-svn: 200244	2014-01-27 20:03:35 +00:00
Andrea Di Biagio	e962698410	[DAGCombiner] Teach how to fold sext/aext/zext of constant build vectors. This patch teaches the DAGCombiner how to fold a sext/aext/zext dag node when the operand in input is a build vector of constants (or UNDEFs). The inability to fold a sext/zext of a constant build_vector was the root cause of some pcg bugs affecting vselect expansion on x86-64 with AVX support. Before this change, the DAGCombiner only knew how to fold a sext/zext/aext of a ConstantSDNode. llvm-svn: 200234	2014-01-27 18:45:30 +00:00
David Majnemer	68ba6d5a5b	MC: Add support for .cfi_startproc simple This commit allows LLVM MC to process .cfi_startproc directives when they are followed by an additional `simple' identifier. This signals to elide the emission of target specific CFI instructions that would normally occur initially. This fixes PR16587. Differential Revision: http://llvm-reviews.chandlerc.com/D2624 llvm-svn: 200227	2014-01-27 17:20:25 +00:00
Chandler Carruth	f70ef7ae29	[vectorize] Initial version of respecting PGO in the vectorizer: treat cold loops as-if they were being optimized for size. Nothing fancy here. Simple test case included. The nice thing is that we can now incrementally build on top of this to drive other heuristics. All of the infrastructure work is done to get the profile information into this layer. The remaining work necessary to make this a fully general purpose loop unroller for very hot loops is to make it a fully general purpose loop unroller. Things I know of but am not going to have time to benchmark and fix in the immediate future: 1) Don't disable the entire pass when the target is lacking vector registers. This really doesn't make any sense any more. 2) Teach the unroller at least and the vectorizer potentially to handle non-if-converted loops. This is trivial for the unroller but hard for the vectorizer. 3) Compute the relative hotness of the loop and thread that down to the various places that make cost tradeoffs (very likely only the unroller makes sense here, and then only when dealing with loops that are small enough for unrolling to not completely blow out the LSD). I'm still dubious how useful hotness information will be. So far, my experiments show that if we can get the correct logic for determining when unrolling actually helps performance, the code size impact is completely unimportant and we can unroll in all cases. But at least we'll no longer burn code size on cold code. One somewhat unrelated idea that I've had forever but not had time to implement: mark all functions which are only reachable via the global constructors rigging in the module as optsize. This would also decrease the impact of any more aggressive heuristics here on code size. llvm-svn: 200219	2014-01-27 13:11:50 +00:00
Benjamin Kramer	65df2371a8	ConstantHoisting: We can't insert instructions directly in front of a PHI node. Insert before the terminating instruction of the dominating block instead. llvm-svn: 200218	2014-01-27 13:11:43 +00:00
Benjamin Kramer	ce3ca2ba83	XCore: Fix typo in function name. llvm-svn: 200216	2014-01-27 11:50:13 +00:00
Chandler Carruth	88d92716dd	[vectorizer] Add an override for the target instruction cost and use it to stabilize a test that really is trying to test generic behavior and not a specific target's behavior. llvm-svn: 200215	2014-01-27 11:41:50 +00:00
Chandler Carruth	eb82628ff7	[vectorizer] Simplify code to use existing helpers on the Function object and fewer pointless variables. Also, add a clarifying comment and a FIXME because the code which disables all vectorization if we can't use implicit floating point instructions just makes no sense at all. llvm-svn: 200214	2014-01-27 11:27:37 +00:00
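
The weight-scaling rule described in c3f51e8e54 (llvm-svn: 200262) can be illustrated with a minimal sketch. This is Python written for illustration, not LLVM's C++ code: the function name `fit_weights_to_uint32` is hypothetical, and the sketch assumes all weights of a branch are halved together so that their ratio is roughly preserved.

```python
UINT32_MAX = 2**32 - 1


def fit_weights_to_uint32(weights):
    """Halve every weight together until the largest one fits in 32 bits.

    Hypothetical helper: a single halving is not enough when the largest
    weight exceeds uint32 by more than a factor of 2, so the loop repeats.
    """
    scaled = list(weights)
    while scaled and max(scaled) > UINT32_MAX:
        scaled = [w // 2 for w in scaled]  # integer halving of all weights
    return scaled


if __name__ == "__main__":
    # 2**34 is four times too large, so one halving would still overflow.
    print(fit_weights_to_uint32([2**34, 2**33]))
```

Halving in a loop rather than once is the point of the commit: a weight that is more than twice the uint32 limit still overflows after a single halving.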
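
The conditional-store scalarization described in 8f596e2047 (llvm-svn: 200270) can be sketched in the same way. This is a Python model of the transformation, not the vectorizer's implementation: the function names and the `vf` parameter (vector width, 2 as in the commit message) are assumptions made for illustration. The load, add, and compare are done on a whole chunk, while each store stays scalar and is guarded by its own lane's condition.

```python
def scalar_loop(a, val):
    """Reference loop: if (a[i] < 10) a[i] += val."""
    out = list(a)
    for i in range(len(out)):
        if out[i] < 10:
            out[i] += val
    return out


def widened_loop(a, val, vf=2):
    """Same loop with wide load/add/compare and scalarized, predicated stores."""
    out = list(a)
    if len(out) % vf != 0:
        raise ValueError("sketch assumes the trip count is a multiple of vf")
    for i in range(0, len(out), vf):
        v = out[i:i + vf]               # wide load of vf elements
        summed = [x + val for x in v]   # wide add
        mask = [x < 10 for x in v]      # wide compare on the loaded values
        for lane in range(vf):          # one guarded scalar store per lane
            if mask[lane]:
                out[i + lane] = summed[lane]
    return out


if __name__ == "__main__":
    data = [3, 12, 9, 10]
    print(scalar_loop(data, 5), widened_loop(data, 5))
```

In this sketch the comparison uses the loaded element and lane 1 is stored to `a[i+1]`, which is what the scalar loop requires for the two versions to agree.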

66774 Commits