llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Derek Schuff	7fe1fbbe81	Revert r155745 llvm-svn: 155746	2012-04-27 23:37:41 +00:00
Derek Schuff	80bd01f406	Fix fastcc structure return with fast-isel on x86-32 On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention in X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. llvm-svn: 155745	2012-04-27 23:27:17 +00:00
Jakob Stoklund Olesen	2cb81f69d6	Track worst case alignment padding more accurately. Previously, ARMConstantIslandPass would conservatively compute the address of an aligned basic block as: RoundUpToAlignment(Offset + UnknownPadding) This worked fine for the layout algorithm itself, but it could fool the verify() function because it accounts for alignment padding twice: Once when adding the worst case UnknownPadding, and again by rounding up the fictional block offset. This meant that when optimizeThumb2Instructions would shrink an instruction, the conservative distance estimate could grow. That shouldn't be possible since the worst case alignment padding was already included. This patch drops the use of RoundUpToAlignment, and depends only on worst case padding to compute conservative block offsets. This has the weird effect that the computed offset for an aligned block may not be aligned. The important difference is that shrinking an instruction can never cause the estimated distance between two instructions to grow. The estimated distance is always larger than the real distance that only the assembler knows. <rdar://problem/11339352> llvm-svn: 155744	2012-04-27 22:58:38 +00:00
Andrew Trick	cbe7b03dbe	Temporarily revert r155668: Fix the SD scheduler to avoid gluing. This definitely caused regression with ARM -mno-thumb. llvm-svn: 155743	2012-04-27 22:55:59 +00:00
Craig Topper	5270dd7a71	Use 'unsigned' instead of 'int' in several places when retrieving number of vector elements. llvm-svn: 155742	2012-04-27 22:54:43 +00:00
Chad Rosier	d627fcbf2a	Add x86-specific DAG combine to simplify: x == -y --> x+y == 0; x != -y --> x+y != 0. On x86, the generated code goes from: negl %esi; cmpl %esi, %edi; je .LBB0_2 to: addl %esi, %edi; je .L4. This case is correctly handled for ARM with "cmn". Patch by Manman Ren. rdar://11245199 PR12545 llvm-svn: 155739	2012-04-27 22:33:25 +00:00
Michael J. Spencer	ff5aec2d9d	[Support/YAMLParser] Fix ASan found bugs. llvm-svn: 155735	2012-04-27 21:12:20 +00:00
Craig Topper	b06100424e	Tidy up spacing. llvm-svn: 155733	2012-04-27 21:05:09 +00:00
Hal Finkel	a565d03d78	Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.). Target specific types should not be vectorized. As a practical matter, these types are already register matched (at least in the x86 case), and codegen does not always work correctly (at least in the ppc case, and this is not worth fixing because ppc_fp128 is currently broken and will probably go away soon). llvm-svn: 155729	2012-04-27 19:34:00 +00:00
David Blaikie	296c942e88	Change recurse depth limit to uint32 to fix warning. llvm-svn: 155727	2012-04-27 19:30:32 +00:00
Dan Gohman	1bc0d2e1bc	Miscellaneous accumulated cleanups. llvm-svn: 155725	2012-04-27 18:56:31 +00:00
Lang Hames	7d83af4ed0	Fix the order of the operands in the llvm.fma intrinsic patterns for ARM, <rdar://problem/11325085>. llvm-svn: 155724	2012-04-27 18:51:24 +00:00
Mon P Wang	85af068593	Add an early bailout to IsValueFullyAvailableInBlock from deeply nested blocks. The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow issues. <rdar://problem/11286839>. llvm-svn: 155722	2012-04-27 18:09:28 +00:00
Dan Gohman	25a863dcf7	Reapply r155682, making constant folding more consistent, with a fix to work properly with how the code handles all-undef PHI nodes. llvm-svn: 155721	2012-04-27 17:50:22 +00:00
Richard Barton	f9237b25e6	Fix ARM assembly parsing for upper case condition codes on IT instructions. llvm-svn: 155720	2012-04-27 17:34:01 +00:00
Benjamin Kramer	1380494168	X86: Don't emit conditional floating point moves when targeting pre-pentiumpro architectures. * Model FPSW (the FPU status word) as a register. * Add ISel patterns for the FUCOM, FNSTSW and SAHF instructions. During Legalize/Lowering, build a node sequence to transfer the comparison result from FPSW into EFLAGS. If you're wondering about the right-shift: That's an implicit sub-register extraction (%ax -> %ah) which is handled later on by the instruction selector. Fixes PR6679. Patch by Christoph Erhardt! llvm-svn: 155704	2012-04-27 12:07:43 +00:00
Kostya Serebryany	c855c442c8	[asan] small optimization: do not emit "x+0" instructions llvm-svn: 155701	2012-04-27 10:04:53 +00:00
Richard Barton	ca70156ab3	Refactor IT handling not to store the bottom bit of the condition code in the mask operand in the MCInst. llvm-svn: 155700	2012-04-27 08:42:59 +00:00
NAKAMURA Takumi	a28147f072	Revert r155682, "Use ConstantExpr::getExtractElement when constant-folding vectors" It broke stage2 build. stage1/clang sometimes crashed. llvm-svn: 155699	2012-04-27 07:59:20 +00:00
Kostya Serebryany	0cd695bd39	[tsan] Atomic support for ThreadSanitizer, patch by Dmitry Vyukov llvm-svn: 155698	2012-04-27 07:31:53 +00:00
Evan Cheng	f35523d08a	Implement a bastardized ABI. llvm-svn: 155686	2012-04-27 02:11:10 +00:00
Evan Cheng	594fb11f12	- thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't support 32-bit Thumb2 instructions. - However, it does support dmb, dsb, isb, mrs, and msr. rdar://11331541 llvm-svn: 155685	2012-04-27 01:27:19 +00:00
Dan Gohman	a72b2f97a6	Use ConstantExpr::getExtractElement when constant-folding vectors instead of getAggregateElement. This has the advantage of being more consistent and allowing higher-level constant folding to proceed even if an inner extract element cannot be folded. Make ConstantFoldInstruction call ConstantFoldConstantExpression on the instruction's operands, making it more consistent with ConstantFoldConstantExpression itself. This makes sure that ConstantExprs get TargetData-aware folding before being handed off as operands for further folding. This causes more expressions to be folded, but due to a known shortcoming in constant folding, this currently has the side effect of stripping a few more nuw and inbounds flags in the non-targetdata side of constant-fold-gep.ll. This is mostly harmless. This fixes rdar://11324230. llvm-svn: 155682	2012-04-27 00:54:36 +00:00
Jakob Stoklund Olesen	185c3797be	Break up getProfitableChainIncrement(). The required checks are moved to ChainInstruction() itself and the policy decisions are moved to IVChain::isProfitableInc(). Also cache the ExprBase in IVChain to avoid frequent recomputations. No functional change intended. llvm-svn: 155676	2012-04-26 23:33:11 +00:00
Jakob Stoklund Olesen	613b89fecd	Turn IVChain into a struct. No functional change intended. llvm-svn: 155675	2012-04-26 23:33:09 +00:00
Chad Rosier	f3d4646377	Add instcombine patterns for the following transformations: (x & y) | (x ^ y) -> x | y; (x & y) + (x ^ y) -> x | y. Patch by Manman Ren. rdar://10770603 llvm-svn: 155674	2012-04-26 23:29:14 +00:00
Andrew Trick	1aa00c0baa	Fix the SD scheduler to avoid gluing the same node twice. DAGCombine strangeness may result in multiple loads from the same offset. They both may try to glue themselves to another load. We could insist that the redundant loads glue themselves to each other, but the better fix is to bail out from bad gluing at the time we detect it. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155668	2012-04-26 21:48:25 +00:00
Jim Grosbach	bf9adf2ab5	ARM: Thumb ldr(literal) base address alignment is 32-bits. The base address for the PC-relative load is Align(PC,4), so it's the address of the word containing the 16-bit instruction, not the address of the instruction itself. Ugh. rdar://11314619 llvm-svn: 155659	2012-04-26 20:48:12 +00:00
Preston Gurd	fb1760744d	Trivial change to set UseLeaForSP flag in addition to toggling the FeatureLeaForSP feature bit when llvm auto detects Intel Atom. Patch by Andy Zhang llvm-svn: 155655	2012-04-26 19:52:27 +00:00
Michael J. Spencer	0dd0f2c8f0	[Support/YAML] Properly fix uninitialized variable warning by inserting a 'REPLACEMENT CHARACTER' (U+FFFD) when getAsInteger fails. llvm-svn: 155653	2012-04-26 19:27:11 +00:00
Tim Northover	876c151146	Use VLD1 in NEON extending-load patterns instead of VLDR. On some cores it's a bad idea for performance to mix VFP and NEON instructions and since these patterns are NEON anyway, the NEON load should be used. llvm-svn: 155630	2012-04-26 08:46:29 +00:00
Tim Northover	b83dc53c3a	Test commit. llvm-svn: 155626	2012-04-26 08:24:07 +00:00
Craig Topper	f883096ff7	Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to corei7-avx, core-avx-i, and core-avx2 cpu names. llvm-svn: 155618	2012-04-26 06:40:15 +00:00
Chandler Carruth	587c136c31	Teach the reassociate pass to fold chains of multiplies with repeated elements to minimize the number of multiplies required to compute the final result. This uses a heuristic to attempt to form near-optimal binary exponentiation-style multiply chains. While there are some cases it misses, it seems to do at least a decent job on a very diverse range of inputs. Initial benchmarks show no interesting regressions, and an 8% improvement on SPASS. Let me know if any other interesting results (in either direction) crop up! Credit to Richard Smith for the core algorithm, and helping code the patch itself. llvm-svn: 155616	2012-04-26 05:30:30 +00:00
Evan Cheng	4d570a3f0e	If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume the feature set of v7a. This comes about if the user specifies something like -arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as uxtab in this case. rdar://11318438 llvm-svn: 155601	2012-04-26 01:13:36 +00:00
Bill Wendling	d9d9230b83	Don't forget to reset 'first operand' flag when we're setting the MDNodeOperand value. llvm-svn: 155599	2012-04-26 00:38:42 +00:00
Jakob Stoklund Olesen	e2913e1ad5	Print IV chain numbers while collecting them. llvm-svn: 155567	2012-04-25 18:01:32 +00:00
Jakob Stoklund Olesen	24c99d2966	Remove more dead code. llvm-svn: 155566	2012-04-25 18:01:30 +00:00
Richard Barton	e9a972bbe3	Unify internal representation of ARM instructions with a register right-shifted by #32. These are stored as shifts by #0 in the MCInst and correctly marshalled when transforming from or to assembly representation. llvm-svn: 155565	2012-04-25 18:00:18 +00:00
Jakob Stoklund Olesen	7f1be74a4a	Remove the -disable-cross-class-join option. Cross-class joins have been normal and fully supported for a while now. With TableGen generating the getMatchingSuperRegClass() hook, they are unlikely to cause problems again. llvm-svn: 155552	2012-04-25 16:17:50 +00:00
Jakob Stoklund Olesen	b8d98c5060	Cross-class joining is winning. Remove the heuristic for disabling cross-class joins. The greedy register allocator can handle the narrow register classes, and when it splits a live range, it can pick a larger register class. Benchmarks were unaffected by this change. <rdar://problem/11302212> llvm-svn: 155551	2012-04-25 16:17:47 +00:00
Craig Topper	5828c654b9	Add ifdef around getSubtargetFeatureName in tablegen output file so that only targets that want the function get it. This prevents other targets from getting an unused function warning. llvm-svn: 155538	2012-04-25 06:56:34 +00:00
Craig Topper	1a016fd95d	Use vector_shuffles instead of target specific unpack nodes for AVX ZERO_EXTEND/ANY_EXTEND combine. These will be converted to target specific nodes during lowering. This is more consistent with other code. llvm-svn: 155537	2012-04-25 06:39:39 +00:00
Lang Hames	7f69fbca29	Reverting r155468. Chris and Chandler have convinced me that it's dangerous and in poor taste. Talking through some alternate solutions with Chandler. llvm-svn: 155530	2012-04-25 02:16:54 +00:00
Akira Hatanaka	b3ecf903f1	Do not use $gp as a dedicated global register if the target ABI is not O32. llvm-svn: 155522	2012-04-25 01:24:52 +00:00
Dan Gohman	64171a0b3b	Simplify the known retain count tracking; use a boolean state instead of a precise count. Also, move RRInfo's Partial field into PtrState, now that it won't increase the size. llvm-svn: 155513	2012-04-25 00:50:46 +00:00
Dan Gohman	3a24d34041	Build custom predecessor and successor lists for each basic block. These lists exclude invoke unwind edges and loop backedges which are being ignored. This makes it easier to ignore them consistently. llvm-svn: 155500	2012-04-24 22:53:18 +00:00
Jim Grosbach	7ac2ac85a8	ARM: improved assembler diagnostics for missing CPU features. When an instruction match is found, but the subtarget features it requires are not available (missing floating point unit, or thumb vs arm mode, for example), issue a diagnostic that identifies what the feature mismatch is. rdar://11257547 llvm-svn: 155499	2012-04-24 22:40:08 +00:00
Andrew Trick	47f01c373e	Fix a naughty header include that breaks "installed" builds. llvm-svn: 155486	2012-04-24 20:36:19 +00:00
Nadav Rotem	3c817bb807	ConstantFoldSelectInstruction swapped the operands of the select. Fix 12592. Patch by Matt Pharr. llvm-svn: 155480	2012-04-24 20:18:49 +00:00
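
The algebraic rewrites quoted in the d627fcbf2a and f3d4646377 entries above can be sanity-checked by brute force. The following is a minimal sketch, not LLVM code: it assumes 8-bit wraparound arithmetic as a stand-in for machine integers, and the names BITS, MASK, neg, add and check_rewrites are hypothetical helpers introduced only for this illustration.

```python
BITS = 8                      # assumed width; small enough to enumerate
MASK = (1 << BITS) - 1


def neg(v):
    """Two's-complement negation modulo 2**BITS."""
    return (-v) & MASK


def add(a, b):
    """Wraparound addition modulo 2**BITS."""
    return (a + b) & MASK


def check_rewrites():
    """Return True if all three rewrites hold for every pair (x, y)."""
    for x in range(MASK + 1):
        for y in range(MASK + 1):
            # (x & y) | (x ^ y) -> x | y
            if ((x & y) | (x ^ y)) != (x | y):
                return False
            # (x & y) + (x ^ y) -> x | y
            # The two operands share no set bits, so the add cannot carry.
            if add(x & y, x ^ y) != (x | y):
                return False
            # x == -y --> x + y == 0 (and therefore also the != form)
            if (x == neg(y)) != (add(x, y) == 0):
                return False
    return True


if __name__ == "__main__":
    print(check_rewrites())
```

An exhaustive check at one small width is only evidence, not a proof for wider types, but the bitwise arguments in the comments do not depend on the width.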

54109 Commits
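
The 587c136c31 entry mentions "binary exponentiation-style multiply chains". As a rough illustration of that idea only, and not of the reassociate pass's actual heuristic, the sketch below computes x**n by repeated squaring and counts the multiplies used. The function name pow_by_squaring is hypothetical.

```python
def pow_by_squaring(x, n):
    """Return (x**n, number_of_multiplies) for an integer n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    result = None      # product of the selected powers so far
    square = x         # x**(2**k) for the current bit position k
    multiplies = 0
    while True:
        if n & 1:
            if result is None:
                result = square          # first factor costs no multiply
            else:
                result = result * square
                multiplies += 1
        n >>= 1
        if n == 0:
            break
        square = square * square         # advance to the next power of two
        multiplies += 1
    return result, multiplies


if __name__ == "__main__":
    value, count = pow_by_squaring(3, 8)
    print(value, count)
```

Under this scheme x**8 takes 3 multiplies where a left-to-right chain x*x*...*x takes 7, which is the kind of saving the commit message describes.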