llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 04:52:54 +02:00

Author	SHA1	Message	Date
Daniel Dunbar	f2b4982344	MC/AsmParser: Push the burden of emitting diagnostics about unmatched instructions onto the target specific parser, which can do a better job. llvm-svn: 110889	2010-08-12 00:55:38 +00:00
Daniel Dunbar	0a98bc5619	tblgen/AsmMatcher: Always emit the match function as 'MatchInstructionImpl', target specific parsers can adapt the TargetAsmParser to this. llvm-svn: 110888	2010-08-12 00:55:32 +00:00
Johnny Chen	9a37d16281	Changed the format of DMBsy, DSBsy, and friends from Pseudo to MiscFrm. Added two test cases to arm-tests.txt. llvm-svn: 110880	2010-08-11 23:35:12 +00:00
Bob Wilson	3582107cf8	Move the ARM SSAT and USAT optional shift amount operand out of the instruction opcode. This also fixes part of PR7792. llvm-svn: 110875	2010-08-11 23:10:46 +00:00
Jakob Stoklund Olesen	5a62f10abc	Fix <rdar://problem/8282498> even if it doesn't reproduce on trunk. When a register is defined by a partial load: %reg1234:sub_32 = MOV32mr <fi#-1>; GR64:%reg1234 That load cannot be folded into an instruction using the full 64-bit register. It would become a 64-bit load. This is related to the recent change to have isLoadFromStackSlot return false on a sub-register load. llvm-svn: 110874	2010-08-11 23:08:22 +00:00
Dan Gohman	54027cf446	Don't use unsigned char for alignments in TargetData. There aren't that many of these things, so the memory savings isn't significant, and there are now situations where there can be alignments greater than 128. llvm-svn: 110836	2010-08-11 18:15:01 +00:00
Dan Gohman	d91d51116b	Use ISD::ADD instead of ISD::SUB with a negated constant. This avoids trouble if the return type of TD->getPointerSize() is changed to something which doesn't promote to a signed type, and is simpler anyway. Also, use getCopyFromReg instead of getRegister to read a physical register's value. llvm-svn: 110835	2010-08-11 18:14:00 +00:00
Jim Grosbach	1128a47289	cortex m4 has floating point support, but only single precision. llvm-svn: 110810	2010-08-11 15:44:15 +00:00
Bill Wendling	f10d5c00fc	Consider this code snippet: float t1(int argc) { return (argc == 1123) ? 1.234f : 2.38213f; } We would generate truly awful code on ARM (those with a weak stomach should look away): _t1: movw r1, #1123 movs r2, #1 movs r3, #0 cmp r0, r1 mov.w r0, #0 it eq moveq r0, r2 movs r1, #4 cmp r0, #0 it ne movne r3, r1 adr r0, #LCPI1_0 ldr r0, [r0, r3] bx lr The problem was that legalization was creating a cascade of SELECT_CC nodes, for the comparison of "argc == 1123" which was fed into a SELECT node for the ?: statement which was itself converted to a SELECT_CC node. This is because the ARM back-end doesn't have custom lowering for SELECT nodes, so it used the default "Expand". I added a fairly simple "LowerSELECT" to the ARM back-end. It takes care of this testcase, but can obviously be expanded to include more cases. Now we generate this, which looks optimal to me: _t1: movw r1, #1123 movs r2, #0 cmp r0, r1 adr r0, #LCPI0_0 it eq moveq r2, #4 ldr r0, [r0, r2] bx lr .align 2 LCPI0_0: .long 1075344593 @ float 2.382130e+00 .long 1067316150 @ float 1.234000e+00 llvm-svn: 110799	2010-08-11 08:43:16 +00:00
Evan Cheng	f8604b772e	Report error if codegen tries to instantiate an ARM target when the cpu does not support it, e.g. cortex-m* processors. llvm-svn: 110798	2010-08-11 07:17:46 +00:00
Evan Cheng	4929ba9d20	ArchV7M implies HW division instructions. llvm-svn: 110797	2010-08-11 07:00:16 +00:00
Evan Cheng	31e15214c6	ArchV6T2, V7A, and V7M imply Thumb2; ArchV7A implies NEON. llvm-svn: 110796	2010-08-11 06:57:53 +00:00
Evan Cheng	273160895e	Add ARM ArchV6M and let it imply FeatureDB (having dmb, etc.) llvm-svn: 110795	2010-08-11 06:51:54 +00:00
Daniel Dunbar	bc7c0a60da	MC/ARM: Add basic support for handling predication by parsing it out of the mnemonic into a separate operand form. llvm-svn: 110794	2010-08-11 06:37:20 +00:00
Daniel Dunbar	63628f1443	MC/ARM: Split mnemonic on '.' characters. llvm-svn: 110793	2010-08-11 06:37:16 +00:00
Daniel Dunbar	bbaa88a848	MC/ARM: Fill in ARMOperand::dump a bit. llvm-svn: 110792	2010-08-11 06:37:12 +00:00
Daniel Dunbar	ee80a239ed	MCAsmParser: Add dump() hook to MCParsedAsmOperand. llvm-svn: 110790	2010-08-11 06:37:04 +00:00
Daniel Dunbar	74ed9321a3	MC/ARM: Add an ARMOperand class for condition codes. llvm-svn: 110788	2010-08-11 06:36:53 +00:00
Evan Cheng	e67c4c3723	Really control isel of barrier instructions with cpu feature. llvm-svn: 110787	2010-08-11 06:36:31 +00:00
Evan Cheng	e5bab36c75	Add Cortex-M0 support. It's an ARMv6m device (no ARM mode) with some 32-bit instructions: dmb, dsb, isb, msr, and mrs. llvm-svn: 110786	2010-08-11 06:30:38 +00:00
Evan Cheng	5fca4ca5f9	- Add subtarget feature -mattr=+db which determines whether an ARM cpu has the memory and synchronization barrier dmb and dsb instructions. - Change instruction names to something more sensible (matching name of actual instructions). - Added tests for memory barrier codegen. llvm-svn: 110785	2010-08-11 06:22:01 +00:00
Daniel Dunbar	89a64ee590	MC/ARM: Switch to using the generated match functions instead of stub implementations. llvm-svn: 110783	2010-08-11 05:24:50 +00:00
Daniel Dunbar	0d725e0080	MC/ARM: Enable generation of the ARM asm matcher, not that it can do much. llvm-svn: 110782	2010-08-11 05:09:20 +00:00
Daniel Dunbar	8311cf950b	ARM: Mark some disassembler only instructions as not available for matching -- for some reason they have a very odd MCInst form where the operands overlap, but I haven't dug in to find out why yet. llvm-svn: 110781	2010-08-11 04:46:13 +00:00
Daniel Dunbar	a77e3fc8d8	ARM: Quote $p in an asm string. llvm-svn: 110780	2010-08-11 04:46:10 +00:00
Bill Wendling	615aad17f7	Handle ARM compares as well as converting for ARM adds, subs, and thumb2's adds. llvm-svn: 110762	2010-08-11 00:23:00 +00:00
Bill Wendling	735305d4d8	Mark ARM compare instructions as isCompare. llvm-svn: 110761	2010-08-11 00:22:27 +00:00
Bob Wilson	0650cceb38	Add a separate ARM instruction format for Saturate instructions. (I discovered 2 more copies of the ARM instruction format list, bringing the total to 4!! Two of them were already out of sync. I haven't yet gotten into the disassembler enough to know the best way to fix this, but something needs to be done.) Add support for encoding these instructions. llvm-svn: 110754	2010-08-11 00:01:18 +00:00
Evan Cheng	966ed540a6	CBZ and CBNZ are implemented. llvm-svn: 110745	2010-08-10 23:27:11 +00:00
Bruno Cardoso Lopes	6eb24fd744	Add AVX matching patterns to Packed Bit Test intrinsics. Apply the same approach of SSE4.1 ptest intrinsics but create a new x86 node "testp" since AVX introduces vtest{ps}{pd} instructions which set ZF and CF depending on sign bit AND and ANDN of packed floating-point sources. This is slightly different from what the "ptest" does. Tests coming with the other 256 intrinsics tests. llvm-svn: 110744	2010-08-10 23:25:42 +00:00
Bill Wendling	c8117e507d	Turn optimize compares back on with fix. We needed to test that a machine op was a register before checking if it was defined. llvm-svn: 110733	2010-08-10 21:38:11 +00:00
Evan Cheng	784a286b92	Delete some unused instructions. llvm-svn: 110710	2010-08-10 19:36:22 +00:00
Evan Cheng	d9a1b0d046	Re-apply r110655 with fixes. Epilogue must restore sp from fp if the function stack frame has a var-sized object. Also added a test case to check for the added benefit of this patch: it's optimizing away the unnecessary restore of sp from fp for some non-leaf functions. llvm-svn: 110707	2010-08-10 19:30:19 +00:00
Daniel Dunbar	872e84afb5	Revert r110655, "Fix ARM hasFP() semantics. It should return true whenever FP register is", it breaks a couple test-suite tests. llvm-svn: 110701	2010-08-10 18:32:02 +00:00
Evan Cheng	3d47dbe761	Fix ARM hasFP() semantics. It should return true whenever FP register is reserved, not available for general allocation. This eliminates all the extra checks for Darwin. This change also fixes the use of FP to access frame indices in leaf functions and cleaned up some confusing code in epilogue emission. llvm-svn: 110655	2010-08-10 06:26:49 +00:00
Bruno Cardoso Lopes	f1928b60c0	Add AVX movnt{pd,ps,dq} 256-bit intrinsics llvm-svn: 110650	2010-08-10 02:49:24 +00:00
Bruno Cardoso Lopes	f5884c6791	Add AVX movmsk 256-bit intrinsics llvm-svn: 110648	2010-08-10 02:34:56 +00:00
Bruno Cardoso Lopes	2a7ed4b5c9	Support AVX 256-bit load and store intrinsics llvm-svn: 110645	2010-08-10 01:43:16 +00:00
Bruno Cardoso Lopes	1ea37cfa7b	Patterns to match AVX cmp instructions llvm-svn: 110633	2010-08-10 00:13:20 +00:00
Bruno Cardoso Lopes	4e8d77892c	Add matching patterns for vblend AVX intrinsics llvm-svn: 110630	2010-08-10 00:02:05 +00:00
Eric Christopher	a79ff725ab	Wording. llvm-svn: 110618	2010-08-09 22:52:47 +00:00
Evan Cheng	fa0406ae10	ARMBaseRegisterInfo::hasFP() has been broken for a while now. :-( This will always be false before PEI: (DisableFramePointerElim(MF) && MFI->adjustsStack()) Which means it's going to make r11 available as a general purpose register even if -disable-fp-elim is specified. It's working on Darwin only because r7 is always reserved. But it's obviously broken for other targets. llvm-svn: 110614	2010-08-09 22:32:45 +00:00
Bruno Cardoso Lopes	e58d077846	Add VCVTPD2PS, VCVTPS2DQ, VCVTPS2PDY, VCVTTPD2DQY, VCVTTPS2DQ and VCVTPD2DQ 256-bit conversion intrinsics llvm-svn: 110608	2010-08-09 21:51:56 +00:00
Bruno Cardoso Lopes	e7ceec4edf	Add patterns to AVX conversion instructions. Do that instead of declaring more instructions whenever possible; more coming. llvm-svn: 110605	2010-08-09 21:24:59 +00:00
Oscar Fuentes	633432d46c	CMake: eliminated unnecessary target_link_libraries. Next time the build is broken due to wrong library dependencies, just try building again (if you are on some Unix and are building all LLVM targets) or ask someone to commit the regenerated LLVMLibDeps.cmake. llvm-svn: 110593	2010-08-09 20:33:08 +00:00
Evan Cheng	b6b08dfca1	Explicitly initialize SlowFPBrcc and Pref32BitThumb to false. llvm-svn: 110587	2010-08-09 19:19:36 +00:00
Evan Cheng	15d23d4966	Change -prefer-32bit-thumb to attribute -mattr=+32bit instead to disable more 32-bit to 16-bit optimizations. llvm-svn: 110584	2010-08-09 18:35:19 +00:00
Bruno Cardoso Lopes	6a92e01d05	Memory version of vcvtdq2pd intrinsic llvm-svn: 110582	2010-08-09 18:20:14 +00:00
Bruno Cardoso Lopes	0794b8ab3f	Patterns to match vinsert, vbroadcast, vmovmask and vcvtdq2pd AVX intrinsics llvm-svn: 110580	2010-08-09 18:03:43 +00:00
Evan Cheng	a04ba7588a	Add an option to disable 32 -> 16-bit Thumb2 size reduction pass for experimentation. llvm-svn: 110579	2010-08-09 17:16:10 +00:00
Kalle Raiskila	e2c0e66ff1	Have SPU handle halfvec stores aligned by 8 bytes. llvm-svn: 110576	2010-08-09 16:33:00 +00:00
Nick Lewycky	3a15ba4d5e	Add optimization to Target/README.txt. llvm-svn: 110543	2010-08-08 07:04:25 +00:00
Bill Wendling	39c49e3e17	Use the "isCompare" machine instruction attribute instead of calling the relatively expensive comparison analyzer on each instruction. Also rename the comparison analyzer method to something more in line with what it actually does. This pass will eventually be folded into the Machine CSE pass. llvm-svn: 110539	2010-08-08 05:04:59 +00:00
Dale Johannesen	23f9086dd3	Use sdmem and sse_load_f64 (etc.) for the vector form of CMPSD (etc.) Matching a 128-bit memory operand is wrong, the instruction uses only 64 bits (same as ADDSD etc.) 8193553. llvm-svn: 110491	2010-08-07 00:33:42 +00:00
Bruno Cardoso Lopes	5b602f8822	Patterns to match AVX 256-bit vzero intrinsics llvm-svn: 110480	2010-08-06 22:10:01 +00:00
Bruno Cardoso Lopes	821eebf946	Patterns to match AVX 256-bit permutation intrinsics llvm-svn: 110468	2010-08-06 20:03:27 +00:00
Jim Grosbach	9ef6362af1	Remove empty processFunctionBeforeFrameFinalized(). The default implementation of the function is equivalent, so no need to provide the target-specific version until/unless it needs to do something. llvm-svn: 110465	2010-08-06 18:57:24 +00:00
Owen Anderson	f2fea95f2f	Reapply r110396, with fixes to appease the Linux buildbot gods. llvm-svn: 110460	2010-08-06 18:33:48 +00:00
Rafael Espindola	6d53fded19	Fix eabi calling convention when a 64 bit value shadows r3. Without this what was happening was: * R3 is not marked as "used" * ARM backend thinks it has to save it to the stack because of vaarg * Offset computation correctly ignores it * Offsets are wrong llvm-svn: 110446	2010-08-06 15:35:32 +00:00
Bruno Cardoso Lopes	d186fba555	Patterns to match AVX 256-bit horizontal arithmetic intrinsics llvm-svn: 110427	2010-08-06 02:10:30 +00:00
Bruno Cardoso Lopes	5e9f9c921e	Patterns to match AVX 256-bit arithmetic intrinsics llvm-svn: 110425	2010-08-06 01:52:29 +00:00
Bill Wendling	0cd2ae5158	Add the Optimize Compares pass (disabled by default). This pass tries to remove comparison instructions when possible. For instance, if you have this code: sub r1, 1 cmp r1, 0 bz L1 and "sub" either sets the same flag as the "cmp" instruction or could be converted to set the same flag, then we can eliminate the "cmp" instruction altogether. This is important for ARM where the ALU instructions could set the CPSR flag, but need a special suffix ('s') to do so. llvm-svn: 110423	2010-08-06 01:32:48 +00:00
Owen Anderson	aadd8a89ca	Revert r110396 to fix buildbots. llvm-svn: 110410	2010-08-06 00:23:35 +00:00
Eric Christopher	cf17d8dfa7	Add an option to always emit realignment code for a particular module. llvm-svn: 110404	2010-08-05 23:57:43 +00:00
Owen Anderson	b9762c07cb	Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static ID member as the sole unique type identifier. Clean up APIs related to this change. llvm-svn: 110396	2010-08-05 23:42:04 +00:00
Dan Gohman	8a813c4ded	Remove IntrWriteMem, as it's the default. Rename IntrWriteArgMem to IntrReadWriteArgMem, as it's for reading as well as writing. llvm-svn: 110395	2010-08-05 23:36:21 +00:00
Bruno Cardoso Lopes	a26a97510a	Support very basic (doesn't include ABI support in the front-end, varargs, ...) 256-bit argument passing and return for AVX llvm-svn: 110394	2010-08-05 23:35:51 +00:00
Eric Christopher	b10ca25085	Handle the memory barrier pseudo that goes to nothing for the JIT. llvm-svn: 110371	2010-08-05 20:04:36 +00:00
Eric Christopher	bc14450d15	Set hasSideEffects on the 64-bit no-sse memory barrier. llvm-svn: 110369	2010-08-05 19:54:59 +00:00
Jim Grosbach	fb6af5329d	For local variables in functions with a frame pointer, use FP as a base register for local access when it's closer to the stack slot being referenced than the stack pointer. Make sure to take into account any argument frame SP adjustments that are in effect at the time. rdar://8256090 llvm-svn: 110366	2010-08-05 19:27:37 +00:00
Bob Wilson	fbce203f20	Fix indentation. llvm-svn: 110363	2010-08-05 19:00:21 +00:00
Bob Wilson	4ba3c0a5e1	Add an ARM RSCrr instruction for disassembly only. Partial fix for PR7792. llvm-svn: 110361	2010-08-05 18:59:36 +00:00
Eric Christopher	61f3059ee1	Be a little bit more specific about target for the memory barrier instructions. llvm-svn: 110360	2010-08-05 18:36:20 +00:00
Eric Christopher	904ec3a392	Handle the pseudo in MCInstLower. llvm-svn: 110359	2010-08-05 18:34:30 +00:00
Bob Wilson	9fbaea3765	Add an ARM RSBrr instruction for disassembly only. Partial fix for PR7792. llvm-svn: 110358	2010-08-05 18:23:43 +00:00
Chandler Carruth	01c83a8512	Silence a GCC warning about && and || without explicit parentheses. This preserves the existing behavior, as it seems a conscious choice to allow RS to be null and BigStack marked true. llvm-svn: 110307	2010-08-05 03:04:21 +00:00
Bob Wilson	214b004717	ARM "rrx" shift operands do not have an immediate. PR7790. llvm-svn: 110292	2010-08-05 00:34:42 +00:00
Eric Christopher	0e09eb9f77	Make x86-64 membarriers work without sse and clean up some of the uses. llvm-svn: 110274	2010-08-04 23:03:04 +00:00
Jim Grosbach	511dbe9c8e	and back in. false alarm on the tests from another unrelated local change. llvm-svn: 110269	2010-08-04 22:46:09 +00:00
Eli Friedman	401dbe036d	PR7814: Truncates cannot be ignored for signed comparisons. llvm-svn: 110268	2010-08-04 22:40:58 +00:00
Devang Patel	53e2e4feae	Implement target specific getDebugValueLocation(). llvm-svn: 110267	2010-08-04 22:39:39 +00:00
Jim Grosbach	497c60502c	oops. revert for a moment to clean up tests first. llvm-svn: 110259	2010-08-04 22:12:43 +00:00
Jim Grosbach	ece51f94db	Reserve a stack slot if the function adjusts the stack but doesn't simplify the call frame pseudo instructions. In that situation, the calculations for estimating the stack size will be way off, leading to not having an emergency spill slot when we need one. It should be possible to be more precise about tracking the adjustment values, but not really necessary for correctness. Upcoming cleanups for PEI in general will render that moot. llvm-svn: 110258	2010-08-04 22:10:15 +00:00
Devang Patel	97a93285f5	Implement target specific getDebugValueLocation(). llvm-svn: 110256	2010-08-04 22:07:50 +00:00
Torok Edwin	319c3f56c8	Use indirect calls in PowerPC JIT. See PR5201. There is no way to know if direct calls will be within the allowed range for BL. Hence emit all calls as indirect when in JIT mode. Without this long-running applications will fail to JIT on PowerPC with a relocation failure. llvm-svn: 110246	2010-08-04 20:47:44 +00:00
Dale Johannesen	53bc276b33	Remove switch for disabling ARM tail calls. They seem to be working correctly. No functional change. llvm-svn: 110226	2010-08-04 18:07:17 +00:00
Devang Patel	e48431509c	Add DEBUG message. llvm-svn: 110224	2010-08-04 18:06:05 +00:00
Benjamin Kramer	8bce8e326c	Enable COFF writer on mingw32 and cygwin. llvm-svn: 110200	2010-08-04 15:32:40 +00:00
Kalle Raiskila	ce1e4d80cb	Make SPU backend handle insertelement and store for "half vectors" llvm-svn: 110198	2010-08-04 13:59:48 +00:00
Benjamin Kramer	968cc0119f	Print an error message when someone tries -integrated-as on an unsupported target. - The COFF backend doesn't support MingW/Cygwin at the moment, it'll report an error, but it's still much better than random assertions from the MachO backend. - We want to make ELF the default eventually, it's what the majority of targets use. llvm-svn: 110197	2010-08-04 13:16:30 +00:00
Gabor Greif	50fb0419ea	by Alexander Herz: "The CWriter::GetValueName() method does not check if a value as an alias and emits the alias name which will never be defined in the output .c file (so the output file fails to compile). This can happen if you have multiple inheritance with several destructors defined by clang (...D0Ev, ...D1Ev, ...D2Ev)." -- applied with minor tweaks. Thanks! llvm-svn: 110194	2010-08-04 10:00:52 +00:00
Bob Wilson	6a2437480a	Combine NEON VABD (absolute difference) intrinsics with ADDs to make VABA (absolute difference with accumulate) intrinsics. Radar 8228576. llvm-svn: 110170	2010-08-04 00:12:08 +00:00
Chris Lattner	a2e36c6b18	fix a win64 encoding problem, patch by Cameron Esfahani! llvm-svn: 110164	2010-08-03 22:49:22 +00:00
Nate Begeman	b506e13a32	Add support for getting & setting the FPSCR application register on ARM when VFP is enabled. Add support for using the FPSCR in conjunction with the vcvtr instruction, for controlling fp to int rounding. Add support for the FLT_ROUNDS_ node now that the FPSCR is exposed. llvm-svn: 110152	2010-08-03 21:31:55 +00:00
Oscar Fuentes	7186be986b	CMake: Change some target library names: XCore->XCoreGen PIC16->PIC16CodeGen After updating your working copy, the first build will fail because it is using the old library dependencies. Start the build again and it will work fine. llvm-svn: 110127	2010-08-03 17:40:31 +00:00
Kalle Raiskila	014c93befb	More SPU v2f32 stuff added: insertelement and shuffle. llvm-svn: 110038	2010-08-02 11:22:10 +00:00
Kalle Raiskila	766fd434df	Add preliminary v2f32 support for SPU. Like with v2i32, we just duplicate the instructions and operate on half vectors. Also reorder code in SPUInstrInfo.td for better coherency. llvm-svn: 110037	2010-08-02 10:25:47 +00:00
Kalle Raiskila	21615cb06e	Add preliminary v2i32 support for SPU backend. As there are no such registers in SPU, this support boils down to "emulating" them by duplicating instructions on the general purpose registers. This adds the most basic operations on v2i32: passing parameters, addition, subtraction, multiplication and a few others. llvm-svn: 110035	2010-08-02 08:54:39 +00:00
Eli Friedman	40cb7d9994	PR7781: Fix incorrect shifting in PPCTargetLowering::LowerBUILD_VECTOR. llvm-svn: 109998	2010-08-02 00:18:19 +00:00
Eli Friedman	3c5289c381	PR7774: Fix undefined shifts in Alpha backend. As a bonus, this actually improves the generated code in some cases. llvm-svn: 109985	2010-08-01 21:13:28 +00:00
Daniel Dunbar	e0737ebae3	Silence some -Asserts uninitialized variable warnings. llvm-svn: 109956	2010-07-31 21:08:54 +00:00
Michael J. Spencer	1f89cc1fe5	MC: Remove HasAbsolutizedSet from WindowsX86AsmBackend. llvm-svn: 109949	2010-07-31 07:21:44 +00:00
Bob Wilson	58c8a5da9e	Move newlines before inline jumptables from the asm strings in .td files to the jtblock_operand print methods. This avoids extra newlines in the disassembler's output. PR7757. llvm-svn: 109948	2010-07-31 06:28:10 +00:00
Michael J. Spencer	b52ff1ba41	Add relax all support to the COFF object streamer. llvm-svn: 109947	2010-07-31 06:22:29 +00:00
Bob Wilson	439e7b1d73	Add support for disassembling VMVN (immediate) instructions. PR7747. llvm-svn: 109946	2010-07-31 05:57:44 +00:00
Evan Cheng	ee59acf6dd	Add -disable-shifter-op to disable isel of shifter ops. On Cortex-a9 the shifts cost extra instructions so it might be better to emit them separately to take advantage of dual-issues. llvm-svn: 109934	2010-07-30 23:33:54 +00:00
Bob Wilson	6ce71251cc	Add a check in the ARM disassembler for NEON instructions that would reference registers past the end of the NEON register file, and report them as invalid instead of asserting when trying to print them. PR7746. llvm-svn: 109933	2010-07-30 23:27:59 +00:00
Dale Johannesen	eb251be031	PPC doesn't support VLA with large alignment. This was formerly rejected by the FE, so asserted in the BE; now the FE only warns, so we treat it as a legitimate fatal error in PPC BE. This means the test for the feature won't pass, so it's xfail'd. llvm-svn: 109892	2010-07-30 21:09:48 +00:00
Bob Wilson	bd1dc153a5	Add the __TEXT,__StaticInit section to the list of sections emitted at the beginning on ARM Darwin assembly files so that it won't be placed after debug sections. Radar 8252813. llvm-svn: 109879	2010-07-30 19:55:47 +00:00
Bruno Cardoso Lopes	0c0dd2173c	Support all 128-bit AVX vector intrinsics. Most part of them I already declared during the addition of the assembler support, the additional changes are: - Add missing intrinsics - Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file. - Duplicate some patterns to AVX mode. - Step into PCMPEST/PCMPIST custom inserter and add AVX versions. llvm-svn: 109878	2010-07-30 19:54:33 +00:00
Bruno Cardoso Lopes	5d4afd0cb9	Fix typo! llvm-svn: 109877	2010-07-30 19:41:24 +00:00
Jim Grosbach	1718345a30	Many Thumb2 instructions can reference the full ARM register set (i.e., have 4 bits per register in the operand encoding), but have undefined behavior when the operand value is 13 or 15 (SP and PC, respectively). The trivial coalescer in linear scan sometimes will merge a copy from SP into a subsequent instruction which uses the copy, and if that instruction cannot legally reference SP, we get bad code such as: mls r0,r9,r0,sp instead of: mov r2, sp mls r0, r9, r0, r2 This patch adds a new register class for use by Thumb2 that excludes the problematic registers (SP and PC) and is used instead of GPR for those operands which cannot legally reference PC or SP. The trivial coalescer explicitly requires that the register class of the destination for the COPY instruction contain the source register for the COPY to be considered for coalescing. This prevents errant instructions like that above. PR7499 llvm-svn: 109842	2010-07-30 02:41:01 +00:00
Nate Begeman	0b0f838c32	Add builtins for ssat/usat, similar to RealView's __ssat and __usat intrinsics. llvm-svn: 109813	2010-07-29 22:48:09 +00:00
Bob Wilson	d70ec880ea	Refactor ARM-specific DAG combining in preparation for adding some more transformations. llvm-svn: 109800	2010-07-29 20:34:14 +00:00
Dale Johannesen	717fbb2b32	Implement vector constants which are splat of integers with mov + vdup. 8003375. This is currently disabled by default because LICM will not hoist a VDUP, so it pessimizes the code if the construct occurs inside a loop (8248029). llvm-svn: 109799	2010-07-29 20:10:08 +00:00
Bob Wilson	823182c3e5	Don't assert on an unrecognized BrMiscFrm instruction. PR7745. llvm-svn: 109788	2010-07-29 18:29:28 +00:00
Nate Begeman	b24fa8b8ae	Add intrinsics __builtin_arm_qadd & __builtin_arm_qsub to allow access to the QADD & QSUB instructions. Behave identically to __qadd & __qsub RealView instruction intrinsics. llvm-svn: 109770	2010-07-29 17:56:55 +00:00
Jakob Stoklund Olesen	1dee3913d6	Revert r109652, and remove the offending assert in loadRegFromStackSlot instead. We do sometimes load from a too small stack slot when dealing with x86 arguments (varargs and smaller-than-32-bit args). It looks like we know what we are doing in those cases, so I am going to remove the assert instead of artificially enlarging stack slot sizes. The assert in storeRegToStackSlot stays in. We don't want to write beyond the bounds of a stack slot. llvm-svn: 109764	2010-07-29 17:42:27 +00:00
Jim Grosbach	8764c0127d	ARM mode version of r109693. Remove incorrect substitution pattern for UXTB16. It wrongly assumed the input shift was actually a rotate. rdar://8240138 llvm-svn: 109696	2010-07-28 23:25:44 +00:00
Jim Grosbach	17bec0f609	Remove incorrect substitution pattern for UXTB16. It wrongly assumed the input shift was actually a rotate. rdar://8240138 llvm-svn: 109693	2010-07-28 23:17:45 +00:00
Jim Grosbach	03b130774b	Remove dead prototype llvm-svn: 109691	2010-07-28 23:16:12 +00:00
Jakob Stoklund Olesen	d4c60eed5e	Create a fixed stack object for varargs that is as large as any register. The size of this object isn't used for anything - technically it is of variable size. This avoids a false positive from the assert in X86InstrInfo::loadRegFromStackSlot, and fixes PR7735. llvm-svn: 109652	2010-07-28 20:55:38 +00:00
Dan Gohman	a4186ab5f0	Fix this code to avoid decrementing an iterator past the beginning of a std::vector. llvm-svn: 109597	2010-07-28 17:15:36 +00:00
Dan Gohman	041bd99662	Do GEP offset calculations with unsigned math rather than signed math to avoid undefined behavior on overflow, noticed by John Regehr. llvm-svn: 109594	2010-07-28 17:11:36 +00:00
Nate Begeman	133820e806	Implement a vectorized algorithm for <16 x i8> << <16 x i8> This is about 4x faster and smaller than the existing scalarization. llvm-svn: 109566	2010-07-28 00:21:48 +00:00
Nate Begeman	068e932975	~40% faster vector shl <4 x i32> on SSE 4.1 Larger improvements for smaller types coming in future patches. For: define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp { entry: %shl = shl <4 x i32> %r, %a ; <<4 x i32>> [#uses=1] %tmp2 = bitcast <4 x i32> %shl to <2 x i64> ; <<2 x i64>> [#uses=1] ret <2 x i64> %tmp2 } We get: _shl: ## @shl pslld $23, %xmm1 paddd LCPI0_0, %xmm1 cvttps2dq %xmm1, %xmm1 pmulld %xmm1, %xmm0 ret Instead of: _shl: ## @shl pshufd $3, %xmm0, %xmm2 movd %xmm2, %eax pshufd $3, %xmm1, %xmm2 movd %xmm2, %ecx shll %cl, %eax movd %eax, %xmm2 pshufd $1, %xmm0, %xmm3 movd %xmm3, %eax pshufd $1, %xmm1, %xmm3 movd %xmm3, %ecx shll %cl, %eax movd %eax, %xmm3 punpckldq %xmm2, %xmm3 movd %xmm0, %eax movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm2 movhlps %xmm0, %xmm0 movd %xmm0, %eax movhlps %xmm1, %xmm1 movd %xmm1, %ecx shll %cl, %eax movd %eax, %xmm0 punpckldq %xmm0, %xmm2 movdqa %xmm2, %xmm0 punpckldq %xmm3, %xmm0 ret llvm-svn: 109549	2010-07-27 22:37:06 +00:00
Michael J. Spencer	33ac353ce4	Make MC use Windows COFF on Windows and add tests. llvm-svn: 109494	2010-07-27 06:46:15 +00:00
Jakob Stoklund Olesen	b056152ccf	The isLoadFromStackSlot and isStoreToStackSlot have no way of reporting subregister operands like this: %reg1040:sub_32bit<def> = MOV32rm <fi#-2>, 1, %reg0, 0, %reg0, %reg1040<imp-def>; mem:LD4[FixedStack-2](align=8) Make them return false when subreg operands are present. VirtRegRewriter is making bad assumptions otherwise. This fixes PR7713. llvm-svn: 109489	2010-07-27 04:17:01 +00:00
Jakob Stoklund Olesen	fa4bcde9d9	Add assertions that expose the PR7713 miscompilation: Accessing a stack slot with a too-big register class. llvm-svn: 109488	2010-07-27 04:16:58 +00:00
Eli Friedman	94c9f00dd6	And a bit more non-ASCII stuff. llvm-svn: 109458	2010-07-26 22:28:18 +00:00
Anton Korobeynikov	b71ee4ebbb	Drop some non-ascii stuff llvm-svn: 109456	2010-07-26 22:23:07 +00:00
Evan Cheng	4cee8136a7	On x86, f32 / f64 nodes share the same registers as 128-bit vector values. llvm-svn: 109450	2010-07-26 21:50:05 +00:00
Anton Korobeynikov	5e1e95aed9	Add a note llvm-svn: 109448	2010-07-26 21:48:35 +00:00
Bruno Cardoso Lopes	cb0f921ca4	Temporary hack to let codegen assert or generate poor code in case we are using AVX and no AVX version of the desired instruction is present, this is better for incremental dev (without fallbacks it's easier to spot what's missing). Not sure this is the best hack though (we can also disable all HasSSE* predicates by dynamically marking them 'false' if AVX is present) llvm-svn: 109434	2010-07-26 21:01:18 +00:00
Anton Korobeynikov	5e3d50ec58	Currently EH lowering code expects typeinfo to be global only. This assumption is not satisfied due to global merging. Work around the issue by temporarily disabling merging of const globals. Also, ignore LLVM "special" globals. This fixes PR7716 llvm-svn: 109423	2010-07-26 18:45:39 +00:00
Evan Cheng	1566171daa	ARM fastisel isn't ready. llvm-svn: 109421	2010-07-26 18:32:55 +00:00
Douglas Gregor	caa8768635	Remove extraneous semicolon llvm-svn: 109373	2010-07-25 17:34:42 +00:00
Douglas Gregor	dc72ace097	Unbreak CMake build llvm-svn: 109372	2010-07-25 17:10:14 +00:00
Anton Korobeynikov	7ae895e007	Hook in GlobalMerge pass llvm-svn: 109359	2010-07-24 21:52:08 +00:00
Evan Cheng	a0b74d8804	Add an ILP scheduler. This is a register pressure aware scheduler that's appropriate for targets without detailed instruction itineraries. The scheduler schedules for increased instruction level parallelism in low register pressure situations; it schedules to reduce register pressure when the register pressure becomes high. On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2 by 16%. llvm-svn: 109300	2010-07-24 00:39:05 +00:00
Bruno Cardoso Lopes	632295a03c	Support x86 "eiz" and "riz" pseudo index registers in the assembler. llvm-svn: 109295	2010-07-24 00:06:39 +00:00
Jim Grosbach	4b7545413d	Use the appropriate register class for an i32 when adding ARM::LR to the function live in set. This will give us tGPR for Thumb1 and GPR otherwise, so the copy will be spillable. rdar://8224931 llvm-svn: 109293	2010-07-23 23:50:35 +00:00
Dale Johannesen	50d2bc2942	Revert 109076. It is wrong and was causing regressions. Add some comments explaining why it was wrong. 8225024. Fix the real problem in 8213383: the code that splits very large blocks when no other place to put constants can be found was not considering the case that the block contained a Thumb tablejump. llvm-svn: 109282	2010-07-23 22:50:23 +00:00
Evan Cheng	f215e55d5f	- Allow target to specify when is register pressure "too high". In most cases, it's too late to start backing off aggressive latency scheduling when most of the registers are in use so the threshold should be a bit tighter. - Correctly handle live out's and extract_subreg etc. - Enable register pressure aware scheduling by default for hybrid scheduler. For ARM, this is almost always a win on # of instructions. It's runtime neutral for most of the tests. But for some kernels with high register pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by 54 and sped up by 20%. llvm-svn: 109279	2010-07-23 22:39:59 +00:00
Bruno Cardoso Lopes	06fcdd6563	Remove trailing whitespace llvm-svn: 109276	2010-07-23 22:15:26 +00:00
Bruno Cardoso Lopes	a80b57e5fc	Add AVX version of CLMUL instructions llvm-svn: 109248	2010-07-23 18:41:12 +00:00
Gabor Greif	30f8e2112c	fix constness warnings llvm-svn: 109224	2010-07-23 13:28:47 +00:00
Gabor Greif	a04ffe0391	do not (implicitly) dereference iterator many times, cache it instead llvm-svn: 109222	2010-07-23 10:23:01 +00:00
Bruno Cardoso Lopes	b5374c4b69	Declare CLMUL as a subtarget feature llvm-svn: 109207	2010-07-23 01:22:45 +00:00
Bruno Cardoso Lopes	b034ffa291	Add x86 CLMUL (Carry-less multiplication) cpu feature llvm-svn: 109206	2010-07-23 01:17:51 +00:00
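
Note on commit 068e932975 (llvm-svn: 109549): the four-instruction sequence in that message (pslld $23, paddd, cvttps2dq, pmulld) appears to compute each lane's shift as a multiplication by 2**a, where 2**a is built by writing the shift amount into the exponent field of an IEEE-754 single and converting back to an integer. The sketch below models that idea per lane in Python. It is an illustration only, not the LLVM lowering code: the function names are hypothetical, and the value of the LCPI0_0 constant (taken here to be 0x3F800000, the bit pattern of 1.0f) is an assumption, since the commit message does not show it.

```python
import struct

MASK32 = 0xFFFFFFFF
ONE_F32_BITS = 0x3F800000  # assumed value of LCPI0_0: bit pattern of 1.0f


def pow2_via_float(a: int) -> int:
    """Return 2**a for 0 <= a <= 31 by building a float whose exponent is a."""
    if not 0 <= a <= 31:
        raise ValueError("shift amount must be in 0..31")
    # pslld $23: move the shift amount into the float exponent field.
    # paddd LCPI0_0: add the exponent bias (127 << 23), giving the float 2.0**a.
    bits = ((a << 23) + ONE_F32_BITS) & MASK32
    (value,) = struct.unpack("<f", struct.pack("<I", bits))
    # cvttps2dq: truncating float-to-int conversion.
    return int(value)


def shl_v4i32(r, a):
    """Lane-wise r << a on 32-bit lanes, computed as r * 2**a (pmulld)."""
    return [(x * pow2_via_float(s)) & MASK32 for x, s in zip(r, a)]


if __name__ == "__main__":
    print(shl_v4i32([1, 2, 3, 0xFFFFFFFF], [0, 1, 4, 31]))
```

The multiplication keeps only the low 32 bits of each product, which matches a left shift for shift amounts in the range 0..31; amounts outside that range are rejected here rather than modelled.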

15007 Commits