llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00

Author	SHA1	Message	Date
Preston Gurd	66b9c4fcf9	Bypass Slow Divides * Only apply divide bypass optimization when not optimizing for size. * Fixed bug caused by constant for 0 value of type Int32, used dividend type to generate the constant instead. * For atom x86-64 apply the divide bypass to use 16-bit divides instead of 64-bit divides when operand values are small enough. * Added lit tests for 64-bit divide bypass. Patch by Tyler Nowicki! llvm-svn: 176442	2013-03-04 18:13:57 +00:00
Arnold Schwaighofer	e60e6fc70f	X86 cost model: Adjust cost for custom lowered vector multiplies. This matters, for example, in the following matrix multiply: int **mmult(int rows, int cols, int **m1, int **m2, int **m3) { int i, j, k, val; for (i=0; i<rows; i++) { for (j=0; j<cols; j++) { val = 0; for (k=0; k<cols; k++) { val += m1[i][k] * m2[k][j]; } m3[i][j] = val; } } return(m3); } Taken from the test-suite benchmark Shootout. We estimate the cost of the multiply to be 2 while we generate 9 instructions for it and end up being quite a bit slower than the scalar version (48% on my machine). Also, properly differentiate between avx1 and avx2. On avx-1 we still split the vector into 2 128bits and handle the subvector muls like above with 9 instructions. Only on avx-2 will we have a cost of 9 for v4i64. I changed the test case in test/Transforms/LoopVectorize/X86/avx1.ll to use an add instead of a mul because with a mul we now no longer vectorize. I did verify that the mul would be indeed more expensive when vectorized with 3 kernels: for (i ...) r += a[i] * 3; for (i ...) m1[i] = m1[i] * 3; // This matches the test case in avx1.ll and a matrix multiply. In each case the vectorized version was considerably slower. radar://13304919 llvm-svn: 176403	2013-03-02 04:02:52 +00:00
Michael Liao	1e621fbd2f	Fix PR10475 - ISD::SHL/SRL/SRA must have either both scalar or both vector operands, but TLI.getShiftAmountTy() so far only returns a scalar type. As a result, backend logic assuming that breaks. - Rename the original TLI.getShiftAmountTy() to TLI.getScalarShiftAmountTy() and re-define TLI.getShiftAmountTy() to return the target-specified scalar type or the same vector type as the 1st operand. - Fix most TICG logic assuming TLI.getShiftAmountTy() a simple scalar type. llvm-svn: 176364	2013-03-01 18:40:30 +00:00
Duncan Sands	268f52a52f	GCC thinks that this variable might be used uninitialized (it isn't). llvm-svn: 176341	2013-03-01 09:46:03 +00:00
Yiannis Tsiouris	b2a123a008	Re-format comments (and check commit access) llvm-svn: 176270	2013-02-28 16:59:10 +00:00
Nadav Rotem	6489b3c2b7	Revert r176166 because it broke one of the lit tests. llvm-svn: 176171	2013-02-27 05:56:20 +00:00
Nadav Rotem	c64a5a0435	std::string to StringRef. llvm-svn: 176166	2013-02-27 05:23:56 +00:00
Chad Rosier	5d41dc7229	[fast-isel] Make sure the FastLowerArguments function checks that each argument's type is a simple type. rdar://13290455 llvm-svn: 176066	2013-02-26 01:05:31 +00:00
Michael Liao	ad0b9ecc47	Refine fix to PR10499, no functionality change - Put expensive checking after simple one llvm-svn: 176060	2013-02-25 23:16:36 +00:00
Michael Liao	ff7d7ec88b	Fix PR10499 - Check whether SSE is available before lowering all 1s vector building with PCMPEQD, which is only available from SSE2 llvm-svn: 176058	2013-02-25 23:01:03 +00:00
Chad Rosier	37142b6930	[fast-isel] Add X86FastIsel::FastLowerArguments to handle functions with 6 or fewer scalar integer (i32 or i64) arguments. It completely eliminates the need for SDISel for trivial functions. Also, add the new llc -fast-isel-abort-args option, which is similar to -fast-isel-abort option, but for formal argument lowering. llvm-svn: 176052	2013-02-25 21:59:35 +00:00
Chad Rosier	77e46d6eb6	[ms-inline asm] Add support for the pushad/popad mnemonics. rdar://13254235 llvm-svn: 176036	2013-02-25 19:06:27 +00:00
Nadav Rotem	0740239f87	Revert r169638 because it broke Mesa llvmpipe tests. Fix PR15239. llvm-svn: 175985	2013-02-24 07:09:35 +00:00
Benjamin Kramer	bdb1d9aad3	X86: Disable cmov-memory patterns on subtargets without cmov. Fixes PR15115. llvm-svn: 175962	2013-02-23 10:40:58 +00:00
Peter Collingbourne	7dc1ee08f5	x86_64: designate most general purpose and SSE registers as callee save under coldcc llvm-svn: 175911	2013-02-22 19:19:44 +00:00
Eli Bendersky	37f247b8d8	Move the eliminateCallFramePseudoInstr method from TargetRegisterInfo to TargetFrameLowering, where it belongs. Incidentally, this allows us to delete some duplicated (and slightly different!) code in TRI. There are potentially other layering problems that can be cleaned up as a result, or in a similar manner. The refactoring was OK'd by Anton Korobeynikov on llvmdev. Note: this touches the target interfaces, so out-of-tree targets may be affected. llvm-svn: 175788	2013-02-21 20:05:00 +00:00
Eli Bendersky	79457c8f90	getX86SubSuperRegister has a special mode with High=true for i64 which exists solely to enable it to call itself for i8 with some registers. The proposed patch simplifies the function somewhat to make the High bit only meaningful for the i8 mode, which makes sense. No functional difference (getX86SubSuperRegister is not getting called from anywhere outside with i64 and High=true). llvm-svn: 175762	2013-02-21 16:40:18 +00:00
Jim Grosbach	89c0252c2a	MCParser: Update method names per coding guidelines. s/AddDirectiveHandler/addDirectiveHandler/ s/ParseMSInlineAsm/parseMSInlineAsm/ s/ParseIdentifier/parseIdentifier/ s/ParseStringToEndOfStatement/parseStringToEndOfStatement/ s/ParseEscapedString/parseEscapedString/ s/EatToEndOfStatement/eatToEndOfStatement/ s/ParseExpression/parseExpression/ s/ParseParenExpression/parseParenExpression/ s/ParseAbsoluteExpression/parseAbsoluteExpression/ s/CheckForValidSection/checkForValidSection/ http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly No functional change intended. llvm-svn: 175675	2013-02-20 22:21:35 +00:00
Jim Grosbach	233487d8a2	Update TargetLowering ivars for name policy. http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly ivars should be camel-case and start with an upper-case letter. A few in TargetLowering were starting with a lower-case letter. No functional change intended. llvm-svn: 175667	2013-02-20 21:13:59 +00:00
Chad Rosier	f46d46cb82	[ms-inline asm] Make the comment a bit more verbose. llvm-svn: 175641	2013-02-20 18:03:44 +00:00
Elena Demikhovsky	0886fb4d55	I optimized the following patterns: sext <4 x i1> to <4 x i64>, sext <4 x i8> to <4 x i64>, sext <4 x i16> to <4 x i64>. I'm running Combine on SIGN_EXTEND_IN_REG and reverting SEXT patterns: (sext_in_reg (v4i64 anyext (v4i32 x )), ExtraVT) -> (v4i64 sext (v4i32 sext_in_reg (v4i32 x , ExtraVT))) The sext_in_reg (v4i32 x) may be lowered to shl+sar operations. The "sar" does not exist for 64-bit operations, so lowering sext_in_reg (v4i64 x) has no vector solution. I also added the cost of these operations to the AVX costs table. llvm-svn: 175619	2013-02-20 12:42:54 +00:00
Chad Rosier	2be41be7b9	[ms-inline asm] Force the use of a base pointer if the MachineFunction includes MS-style inline assembly. This is a follow-on to r175334. Forcing a FP to be emitted doesn't ensure it will be used. Therefore, force the base pointer as well. We now treat MS inline assembly in the same way we treat functions with dynamic stack realignment and VLAs. This guarantees the BP will be used to reference parameters and locals. rdar://13218191 llvm-svn: 175576	2013-02-19 23:50:45 +00:00
Jakub Staszak	8aa8fcbbf4	Add obvious constantness. llvm-svn: 175560	2013-02-19 21:54:59 +00:00
Benjamin Kramer	e94a9e6d14	Clean up HiPE prologue emission a bit and avoid signed arithmetic tricks. No intended functionality change. llvm-svn: 175536	2013-02-19 17:32:57 +00:00
Rafael Espindola	2e18dbb394	Move LLVM_LIBRARY_VISIBILITY for consistency with what was done to PPCJITInfo.cpp in r175394. llvm-svn: 175531	2013-02-19 17:14:33 +00:00
Eli Bendersky	cf980445ac	Make pass name more precise and fix comment. llvm-svn: 175525	2013-02-19 16:38:32 +00:00
Craig Topper	6b083f50d1	Fix capitalization in comment to match function name. llvm-svn: 175497	2013-02-19 07:43:59 +00:00
Jakub Staszak	201dd59b34	Use array_pod_sort instead of std::sort. llvm-svn: 175472	2013-02-18 23:18:22 +00:00
NAKAMURA Takumi	fd77684daa	X86FrameLowering.cpp: Fixup. Sorry for the breakage. llvm-svn: 175467	2013-02-18 23:15:21 +00:00
NAKAMURA Takumi	ea78a0dbe1	X86FrameLowering.cpp: Fix a warning in -Asserts. [-Wunused-variable] llvm-svn: 175464	2013-02-18 23:08:49 +00:00
Chad Rosier	0914ae8db3	Remove a useless assert. llvm-svn: 175463	2013-02-18 22:20:16 +00:00
Benjamin Kramer	f776243f47	Fix a 32/64 bit incompatibility in the HiPE prologue generation. llvm-svn: 175458	2013-02-18 21:45:01 +00:00
Benjamin Kramer	462b555ebe	Support for HiPE-compatible code emission, patch by Yiannis Tsiouris. llvm-svn: 175457	2013-02-18 20:55:12 +00:00
Benjamin Kramer	4ee689feed	X86: Add a note. llvm-svn: 175408	2013-02-17 23:34:14 +00:00
Jakub Staszak	933b3fda5f	Return false instead of 0. llvm-svn: 175402	2013-02-17 18:35:25 +00:00
NAKAMURA Takumi	355a81a32e	[msvc x64] Update X86CompilationCallback_Win64.asm corresponding to r175267. llvm-svn: 175363	2013-02-16 16:04:29 +00:00
Jakub Staszak	0b70c7ab00	Minor cleanups. No functionality change. llvm-svn: 175359	2013-02-16 13:34:26 +00:00
Bill Wendling	fb87157cc8	Reinitialize the ivars in the subtarget so that they can be reset with the new features. llvm-svn: 175336	2013-02-16 01:36:26 +00:00
Chad Rosier	1062ec80b5	[ms-inline asm] Do not omit the frame pointer if we have ms-inline assembly. If the frame pointer is omitted, and any stack changes occur in the inline assembly, e.g.: "pusha", then any C local variable or C argument references will be incorrect. I pass no judgement on anyone who would do such a thing. ;) rdar://13218191 llvm-svn: 175334	2013-02-16 01:25:28 +00:00
Bill Wendling	3e17fc6664	Temporary revert of 175320. llvm-svn: 175322	2013-02-15 23:22:32 +00:00
Bill Wendling	ecc7822c1e	Reinitialize the ivars in the subtarget. When we're recalculating the feature set of the subtarget, we need to have the ivars in their initial state. llvm-svn: 175320	2013-02-15 23:18:01 +00:00
Bill Wendling	cc20bdf27b	Use the 'target-features' and 'target-cpu' attributes to reset the subtarget features. If two functions require different features (e.g., `-mno-sse' vs. `-msse') then we want to honor that, especially during LTO. We can do that by resetting the subtarget's features depending upon the 'target-features' attribute. llvm-svn: 175314	2013-02-15 22:31:27 +00:00
Chad Rosier	f2e3e4b3af	[ms-inline asm] Adjust the EndLoc to account for the ']'. llvm-svn: 175312	2013-02-15 21:58:13 +00:00
Rafael Espindola	64454b7056	Give these callbacks hidden visibility. It is better to not export them more than we need to and some ELF linkers complain about directly accessing symbols with default visibility. llvm-svn: 175268	2013-02-15 14:15:59 +00:00
Rafael Espindola	50b6c2a0e2	Don't make assumptions about the mangling of static functions in extern "C" blocks. We still don't have consensus if we should try to change clang or the standard, but llvm should work with compilers that implement the current standard and mangle those functions. llvm-svn: 175267	2013-02-15 14:08:43 +00:00
Benjamin Kramer	e11f88e804	Make helpers static. Add missing include so LLVMInitializeObjCARCOpts gets C linkage. llvm-svn: 175264	2013-02-15 12:30:38 +00:00
Eli Bendersky	3c50981d2a	The operand listing is very much outdated. llvm-svn: 175220	2013-02-14 23:17:03 +00:00
Jakub Staszak	9ef0442cde	Simplify code. Remove "else after return". llvm-svn: 175212	2013-02-14 21:50:09 +00:00
Kay Tiong Khoo	45b3d90921	added basic support for Intel ADX instructions: feature flag, instruction definitions, test cases llvm-svn: 175196	2013-02-14 19:08:21 +00:00
Nadav Rotem	402093b121	80-col llvm-svn: 175189	2013-02-14 18:20:48 +00:00
Elena Demikhovsky	3a155506e7	Fixed a bug in X86TargetLowering::LowerVectorIntExtend() (assertion failure). Added a test. llvm-svn: 175144	2013-02-14 08:20:26 +00:00
Rafael Espindola	11ec830693	Revert r175120 and r175121. Clang is producing the expected asm names again. llvm-svn: 175133	2013-02-14 03:33:34 +00:00
Rafael Espindola	d214fef222	Don't assume the mangling of static functions. llvm-svn: 175121	2013-02-14 02:49:18 +00:00
Nick Lewycky	9a61e050d5	Don't build tail calls to functions with three inreg arguments on x86-32 PIC. Fixes PR15250! llvm-svn: 175092	2013-02-13 21:59:15 +00:00
Chad Rosier	ed40f84fdc	[ms-inline-asm] Add support for memory references that have non-immediate displacements. rdar://12974533 llvm-svn: 175083	2013-02-13 21:33:44 +00:00
Benjamin Kramer	34ab81b7fa	X86: Disable generation of rep;movsl when %esi is used as a base pointer. This happens when there is both stack realignment and a dynamic alloca in the function. If we overwrite %esi (rep;movsl uses fixed registers) we'll lose the base pointer and the next register spill will write into oblivion. Fixes PR15249 and unbreaks firefox on i386/freebsd. Mozilla uses dynamic allocas and freebsd a 4 byte stack alignment. llvm-svn: 175057	2013-02-13 13:40:35 +00:00
Elena Demikhovsky	a4a4bded4d	Prevent insertion of "vzeroupper" before call that preserves YMM registers, since a caller uses preserved registers across the call. llvm-svn: 175043	2013-02-13 08:02:04 +00:00
Eric Christopher	a2c85e433f	Check i1 as well as i8 variables for 8 bit registers for x86 inline assembly. llvm-svn: 175036	2013-02-13 06:01:05 +00:00
Kay Tiong Khoo	d299c572f3	Added 0x0D to 2-byte opcode extension table for prefetch* variants. Fixed decode of existing 3dNow prefetchw instruction. Intel is scheduled to add a compatible prefetchw (same encoding) to future CPUs. llvm-svn: 174920	2013-02-12 00:19:12 +00:00
Kay Tiong Khoo	09400e6c4a	fixed disassembly of some i386 system insts with intel syntax; added file for test cases for i386 intel syntax llvm-svn: 174900	2013-02-11 19:46:36 +00:00
Eli Bendersky	1854305220	This is a follow-up on r174446, now taking Atom processors into account. Atoms use LEA for updating SP in prologs/epilogs, and the exact LEA opcode depends on the data model. Also reapplying the test case which was added and then reverted (because of Atom failures), this time specifying explicitly the CPU in addition to the triple. The test case now checks all variations (data mode, cpu Atom vs. Core). llvm-svn: 174542	2013-02-06 20:43:57 +00:00
Eli Bendersky	63b704faa4	Make sure the correct opcodes are used to SUB and ADD the stack pointer in function prologs/epilogs. The opcodes should depend on the data model (LP64 vs. ILP32) rather than the architecture bit-ness. llvm-svn: 174446	2013-02-05 21:53:29 +00:00
Jakob Stoklund Olesen	83ad73208a	Move MRI liveouts to X86 return instructions. llvm-svn: 174402	2013-02-05 17:59:48 +00:00
Eli Bendersky	89664e61b6	Fix comments llvm-svn: 174390	2013-02-05 16:53:11 +00:00
Benjamin Kramer	ae05ca2d32	X86: Open up some opportunities for constant folding by postponing shift lowering. Fixes PR15141. llvm-svn: 174327	2013-02-04 15:19:33 +00:00
Benjamin Kramer	ab649797e0	X86: Simplify code. No functionality change. llvm-svn: 174326	2013-02-04 15:19:25 +00:00
Evgeniy Stepanov	389e9dd213	More MSan/ASan annotations. This change lets us bootstrap LLVM/Clang under ASan and MSan. It contains fixes for 2 issues: - X86JIT reads return address from stack, which MSan does not know is initialized. - bugpoint tests run binaries with RLIMIT_AS. This does not work with certain Sanitizers. We are no longer including config.h in Compiler.h with this change. llvm-svn: 174306	2013-02-04 07:03:24 +00:00
David Sehr	59597001bc	Two changes relevant to LEA and x32: 1) allows the use of RIP-relative addressing in 32-bit LEA instructions under x86-64 (ILP32 and LP64) 2) separates the size of address registers in 64-bit LEA instructions from control by ILP32/LP64. llvm-svn: 174208	2013-02-01 19:28:09 +00:00
Chad Rosier	ebbd4433e6	[PEI] Pass the frame index operand number to the eliminateFrameIndex function. Each target implementation was needlessly recomputing the index. Part of rdar://13076458 llvm-svn: 174083	2013-01-31 20:02:54 +00:00
Eric Christopher	44ea43314a	Whitespace. llvm-svn: 174009	2013-01-31 00:50:48 +00:00
Eric Christopher	ae708feb79	Check and allow floating point registers to select the size of the register for inline asm. This conforms to how gcc allows for effective casting of inputs into gprs (fprs is already handled). llvm-svn: 174008	2013-01-31 00:50:46 +00:00
Evan Cheng	4d1a496923	Restrict sin/cos optimization to 64-bit only for now. 32-bit is a bit messy and less critical. llvm-svn: 173987	2013-01-30 22:56:35 +00:00
Evan Cheng	3d095b1549	Remove dead code. llvm-svn: 173812	2013-01-29 18:08:22 +00:00
Hans Wennborg	4df1f32131	Fix typo in X86BaseInfo.h that I introduced in r157818. llvm-svn: 173798	2013-01-29 14:05:57 +00:00
Craig Topper	a8c721bc88	Merge SSE and AVX shuffle instructions in the comment printer. llvm-svn: 173777	2013-01-29 07:54:31 +00:00
Evan Cheng	2e2cde560f	Teach SDISel to combine fsin / fcos into a fsincos node if the following conditions are met: 1. They share the same operand and are in the same BB. 2. Both outputs are used. 3. The target has a native instruction that maps to ISD::FSINCOS node or the target provides a sincos library call. Implemented the generic optimization in sdisel and enabled it for Mac OSX. Also added an additional optimization for x86_64 Mac OSX by using an alternative entry point __sincos_stret which returns the two results in xmm0 / xmm1. rdar://13087969 PR13204 llvm-svn: 173755	2013-01-29 02:32:37 +00:00
Craig Topper	8c275cea97	Fix 256-bit PALIGNR comment decoding to understand that it works on independent 256-bit lanes. llvm-svn: 173674	2013-01-28 07:41:18 +00:00
Craig Topper	e60d6ab3f8	Add missing break in 256-bit palignr comment printing. No test case yet because the comment itself is still wrong. llvm-svn: 173669	2013-01-28 07:19:11 +00:00
Craig Topper	97391f52d3	Fix inconsistent usage of PALIGN and PALIGNR when referring to the same instruction. llvm-svn: 173667	2013-01-28 06:48:25 +00:00
Benjamin Kramer	b7b4734d8b	X86: Decode PALIGN operands so I don't have to do it in my head. llvm-svn: 173572	2013-01-26 13:31:37 +00:00
Benjamin Kramer	f6126f19f4	X86: Do splat promotion later, so the optimizer can chew on it first. This catches many cases where we can emit a more efficient shuffle for a specific mask or when the mask contains undefs. Once the splat is lowered to unpacks we can't do that anymore. There is a possibility of moving the promotion after pshufb matching, but I'm not sure if pshufb with a mask loaded from memory is faster than 3 shuffles, so I avoided that for now. llvm-svn: 173569	2013-01-26 11:44:21 +00:00
Eli Bendersky	391ff99738	In this patch, we teach X86_64TargetMachine that it has an ILP32 (defined by the x32 ABI) mode, in which case its pointers are 32 bits in size. This knowledge is also added to X86RegisterInfo that now returns the appropriate registers in getPointerRegClass. There are many outcomes to this change. In order to keep the patches separate and manageable, we start by focusing on some simple testable cases. The patch adds a test with passing a pointer to a function - focusing on the difference between the two data models for x86-64. Another test is added for handling of 'sret' arguments (and functionality is added in X86ISelLowering to make it work). A note on naming: the "x32 ABI" document refers to the AMD64 architecture (in LLVM it's distinguished by being is64Bits() in the x86 subtarget) with two variations: the LP64 (default) data model, and the ILP32 data model. This patch adds predicates to the subtarget which are consistent with this naming scheme. llvm-svn: 173503	2013-01-25 22:07:43 +00:00
Renato Golin	efde585fc3	Moving Cost Tables up to share with other targets llvm-svn: 173382	2013-01-24 23:01:00 +00:00
Michael Liao	f1ce1e547c	Fix an issue of pseudo atomic instruction DAG schedule - Add list of physical registers clobbered in pseudo atomic insts. Physical registers are clobbered when pseudo atomic instructions are expanded. Add them to the clobber list to prevent the DAG scheduler from mis-scheduling them after these insns are declared side-effect free. - Add test case from Michael Kuperstein <michael.m.kuperstein@intel.com> llvm-svn: 173200	2013-01-22 21:47:38 +00:00
Benjamin Kramer	f2fc27452c	X86: Make sure we account for the FMA4 register immediate value, otherwise rip-rel relocations will be off by one byte. PR15040. llvm-svn: 173176	2013-01-22 18:05:59 +00:00
Eli Bendersky	36076df691	Initial patch for x32 ABI support. Add the x32 environment kind to the triple, and separate the concept of pointer size and callee save stack slot size, since they're not equal on x32. llvm-svn: 173175	2013-01-22 18:02:49 +00:00
Tim Northover	52ba1e77cb	Make APFloat constructor require explicit semantics. Previously we tried to infer it from the bit width size, with an added IsIEEE argument for the PPC/IEEE 128-bit case, which had a default value. This default value allowed bugs to creep in, where it was inappropriate. llvm-svn: 173138	2013-01-22 09:46:31 +00:00
Craig Topper	c227faa439	Use <0 checks in place of ==-1 because it results in simpler code. llvm-svn: 173010	2013-01-21 07:25:16 +00:00
Craig Topper	8c9d15eee7	Use MVT instead of EVT in LowerVECTOR_SHUFFLEtoBlend. llvm-svn: 173009	2013-01-21 07:19:54 +00:00
Craig Topper	f0715ea5bb	Remove trailing whitespace. llvm-svn: 173008	2013-01-21 06:57:59 +00:00
Craig Topper	364a4f7a27	Fix some 80 column violations. llvm-svn: 173006	2013-01-21 06:21:54 +00:00
Craig Topper	636472593a	Make helper method static. llvm-svn: 173005	2013-01-21 06:13:28 +00:00
Craig Topper	27f55b0886	Convert more EVT's to MVT's in the lowering methods. llvm-svn: 172995	2013-01-20 21:50:27 +00:00
Craig Topper	31bc22abcd	Capitalize lowerTRUNCATE so that it matches the other lower functions in this file despite it not matching coding standards. llvm-svn: 172994	2013-01-20 21:34:37 +00:00
Renato Golin	5260b180e5	Revert CostTable algorithm, will re-write llvm-svn: 172992	2013-01-20 20:57:20 +00:00
Craig Topper	ca029d2150	Make LowerVSETCC a static function and use MVT instead of EVT. llvm-svn: 172969	2013-01-20 09:02:22 +00:00
Nadav Rotem	94213533f7	Revert 172708. The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends. This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical. Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume that there is only one SEXT node. The AVX mask optimization is one example. Additionally this optimization does not update the cost model. llvm-svn: 172968	2013-01-20 08:35:56 +00:00
Craig Topper	33f4f75f64	Make some helper methods static. llvm-svn: 172936	2013-01-20 00:50:58 +00:00
Craig Topper	c36bc959d8	Remove DebugLoc argument from static function. It can easily be obtained from the SVOp passed in. llvm-svn: 172935	2013-01-20 00:43:42 +00:00
Craig Topper	cae5aa7eae	Use MVT instead of EVT in more instruction lowering code. llvm-svn: 172933	2013-01-20 00:38:18 +00:00
Craig Topper	33af4801d0	Use MVT instead of EVT in more of the shuffle lowering code. llvm-svn: 172930	2013-01-19 23:36:09 +00:00
Craig Topper	c7903010f6	Capitalize LowerVectorIntExtend to be consistent with all the other lower functions in this file. llvm-svn: 172927	2013-01-19 23:14:09 +00:00
Nadav Rotem	6d7bb8551d	On Sandybridge split unaligned 256bit stores into two xmm-sized stores. llvm-svn: 172894	2013-01-19 08:38:41 +00:00
Craig Topper	a19b081754	Use MVT instead of EVT when computing shuffle immediates since they can only be for legal types. Keeps compiler from generating unneeded checks and handling for extended types. llvm-svn: 172893	2013-01-19 08:27:45 +00:00
Nadav Rotem	de06852537	On Sandybridge loading unaligned 256bits using two XMM loads (vmovups and vinsertf128) is faster than using a single vmovups instruction. llvm-svn: 172868	2013-01-18 23:10:30 +00:00
Craig Topper	a7891c8da9	Calculate vector element size more directly for VINSERTF128/VEXTRACTF128 immediate handling. Also use MVT since this only called on legal types during pattern matching. llvm-svn: 172797	2013-01-18 08:41:28 +00:00
Craig Topper	4389c60096	Minor formatting fix. No functional change. llvm-svn: 172795	2013-01-18 07:27:20 +00:00
Craig Topper	c06d31bb06	Spelling fix: extened->extended. Trailing whitespace in same function. llvm-svn: 172793	2013-01-18 06:50:59 +00:00
Craig Topper	a819bc40e0	Make more use of is128BitVector/is256BitVector in place of getSizeInBits() == 128/256. llvm-svn: 172792	2013-01-18 06:44:29 +00:00
Chad Rosier	66c139114d	[ms-inline asm] Make the error message more generic now that we support the 'SIZE' and 'LENGTH' operators. llvm-svn: 172773	2013-01-18 00:50:59 +00:00
Chad Rosier	bb513e22fa	[ms-inline asm] Add support for the 'SIZE' and 'LENGTH' operators. Part of rdar://12576868 llvm-svn: 172743	2013-01-17 19:21:48 +00:00
Elena Demikhovsky	461c2bd18c	Optimization for the following SIGN_EXTEND pairs: v8i8 -> v8i64, v8i8 -> v8i32, v4i8 -> v4i64, v4i16 -> v4i64 for AVX and AVX2. Bug 14865. llvm-svn: 172708	2013-01-17 09:59:53 +00:00
Craig Topper	c5444baf77	Combine AVX and SSE forms of MOVSS and MOVSD into the same multiclasses so they get instantiated together. llvm-svn: 172704	2013-01-17 06:59:42 +00:00
Jakob Stoklund Olesen	4cc85cb304	Provide a place for targets to insert ILP optimization passes. Move the early if-conversion pass into this group. ILP optimizations usually need to find the right balance between register pressure and ILP using the MachineTraceMetrics analysis to identify critical paths and estimate other costs. Such passes should run together so they can share dominator tree and loop info analyses. Besides if-conversion, future passes to run here could include expression height reduction and ARM's MLxExpansion pass. llvm-svn: 172687	2013-01-17 00:58:38 +00:00
Renato Golin	1487c2a7ac	Change CostTable model to be global to all targets Moving the X86CostTable to a common place, so that other back-ends can share the code. Also simplifying it a bit and commoning up tables with one and two types on operations. llvm-svn: 172658	2013-01-16 21:29:55 +00:00
Chad Rosier	1f23c079a7	[ms-inline asm] Extend support for parsing Intel bracketed memory operands that have an arbitrary ordering of the base register, index register and displacement. rdar://12527141 llvm-svn: 172484	2013-01-14 22:31:35 +00:00
Craig Topper	58b9662000	Simplify nested strconcats in X86 td files since strconcat can take more than 2 arguments. llvm-svn: 172379	2013-01-14 07:46:34 +00:00
Craig Topper	7dac5e7e3d	Create a single multiclass for SSE and AVX version of MOVL/MOVH. Prevents needing to specify everything twice. No functional change intended llvm-svn: 172378	2013-01-14 07:26:58 +00:00
Nick Lewycky	07a4cc5052	Fix typo in comment. llvm-svn: 172364	2013-01-13 19:03:55 +00:00
Benjamin Kramer	26eae94ea6	X86: Add patterns for X86ISD::VSEXT in registers. Those can occur when something between the sextload and the store is on the same chain and blocks isel. Fixes PR14887. llvm-svn: 172353	2013-01-13 11:37:04 +00:00
Preston Gurd	7affdf3bdd	Update patch for the pad short functions pass for Intel Atom (only). Adds a check for -Oz, changes the code to not re-visit BBs, and skips over DBG_VALUE instrs. Patch by Andy Zhang. llvm-svn: 172258	2013-01-11 22:06:56 +00:00
NAKAMURA Takumi	da4d0cbcc1	X86AsmParser.cpp: Fix up r172148, to add initializer in another CreateMem(). llvm-svn: 172157	2013-01-11 01:13:54 +00:00
Jakub Staszak	4beed9fd38	Remove heavy and unused #includes from X86TargetObjectFile.cpp. llvm-svn: 172151	2013-01-10 23:43:56 +00:00
Chad Rosier	217f7fad13	[ms-inline asm] Make sure we set a default value for AddressOf. Follow on to r172121. llvm-svn: 172148	2013-01-10 23:39:07 +00:00
Chad Rosier	f66d08be5c	[ms-inline asm] Add support for calling functions from inline assembly. Part of rdar://12991541 llvm-svn: 172121	2013-01-10 22:10:27 +00:00
Nadav Rotem	436dc952aa	ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor. llvm-svn: 172010	2013-01-09 22:29:00 +00:00
Nadav Rotem	18e176ccaa	Efficient lowering of vector sdiv when the divisor is a splatted power of two constant. PR 14848. The lowered sequence is based on the existing sequence the target-independent DAG Combiner creates for the scalar case. Patch by Zvi Rackover. llvm-svn: 171953	2013-01-09 05:14:33 +00:00
Eric Christopher	44e3142d09	Last in the series of removing unnecessary '0' arguments for address space. Reordered the EmitULEB128IntValue arguments to make this easier. llvm-svn: 171949	2013-01-09 03:52:05 +00:00
Andrew Trick	c15e94c204	MIsched: add an ILP window property to machine model. This was an experimental option, but needs to be defined per-target. e.g. PPC A2 needs to aggressively hide latency. I converted some in-order scheduling tests to A2. Hal is working on more test cases. llvm-svn: 171946	2013-01-09 03:36:49 +00:00
Eric Christopher	38c8e00aa9	These functions have default arguments of 0 for the last arg. Use them. llvm-svn: 171933	2013-01-09 01:57:54 +00:00
Nadav Rotem	9c27f36e59	Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM. llvm-svn: 171928	2013-01-09 01:15:42 +00:00
Preston Gurd	4b0d66f924	Pad Short Functions for Intel Atom The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. This patch has been updated to address Nadav's review comments - Optimize only at >= O1 and don't do optimization if -Os is set - Stores MachineBasicBlock* instead of BBNum - Uses DenseMap instead of std::map - Fixes placement of braces Patch by Andy Zhang. llvm-svn: 171879	2013-01-08 18:27:24 +00:00
Eli Bendersky	4699968d0b	Renamed MCInstFragment to MCRelaxableFragment and added some comments. No change in functionality. llvm-svn: 171822	2013-01-08 00:22:56 +00:00
Jordan Rose	c95190a559	Change SMRange to be half-open (exclusive end) instead of closed (inclusive) This is necessary not only for representing empty ranges, but for handling multibyte characters in the input. (If the end pointer in a range refers to a multibyte character, should it point to the beginning or the end of the character in a char array?) Some of the code in the asm parsers was already assuming this anyway. llvm-svn: 171765	2013-01-07 19:00:49 +00:00
Craig Topper	8884832622	Remove # from the beginning and end of def names. llvm-svn: 171696	2013-01-07 05:26:58 +00:00
Craig Topper	b80024c8e6	Remove unnecessary # tokens at the beginning and end of defm names. llvm-svn: 171694	2013-01-07 05:04:39 +00:00
Chandler Carruth	7723d75e9e	Fix the enumerator names for ShuffleKind to match the coding standards, and make its comments doxygen comments. llvm-svn: 171688	2013-01-07 03:20:02 +00:00
Chandler Carruth	601fa4e996	Make the popcnt support enums and methods have more clear names and follow the coding conventions regarding enumerating a set of "kinds" of things. llvm-svn: 171687	2013-01-07 03:16:03 +00:00
Chandler Carruth	3c0f5d4efb	Move TargetTransformInfo to live under the Analysis library. This no longer would violate any dependency layering and it is in fact an analysis. =] llvm-svn: 171686	2013-01-07 03:08:10 +00:00
Chandler Carruth	30bd563e01	Switch TargetTransformInfo from an immutable analysis pass that requires a TargetMachine to construct (and thus isn't always available), to an analysis group that supports layered implementations much like AliasAnalysis does. This is a pretty massive change, with a few parts that I was unable to easily separate (sorry), so I'll walk through it. The first step of this conversion was to make TargetTransformInfo an analysis group, and to sink the nonce implementations in ScalarTargetTransformInfo and VectorTargetTranformInfo into a NoTargetTransformInfo pass. This allows other passes to add a hard requirement on TTI, and assume they will always get at least one implementation. The TargetTransformInfo analysis group leverages the delegation chaining trick that AliasAnalysis uses, where the base class for the analysis group delegates to the previous analysis pass, allowing all but the NoFoo analysis passes to only implement the parts of the interfaces they support. It also introduces a new trick where each pass in the group retains a pointer to the top-most pass that has been initialized. This allows passes to implement one API in terms of another API and benefit when some other pass above them in the stack has more precise results for the second API. The second step of this conversion is to create a pass that implements the TargetTransformInfo analysis using the target-independent abstractions in the code generator. This replaces the ScalarTargetTransformImpl and VectorTargetTransformImpl classes in lib/Target with a single pass in lib/CodeGen called BasicTargetTransformInfo. This class actually provides most of the TTI functionality, basing it upon the TargetLowering abstraction and other information in the target independent code generator. The third step of the conversion adds support to all TargetMachines to register custom analysis passes.
This allows building those passes with access to TargetLowering or other target-specific classes, and it also allows each target to customize the set of analysis passes desired in the pass manager. The baseline LLVMTargetMachine implements this interface to add the BasicTTI pass to the pass manager, and all of the tools that want to support target-aware TTI passes call this routine on whatever target machine they end up with to add the appropriate passes. The fourth step of the conversion created target-specific TTI analysis passes for the X86 and ARM backends. These passes contain the custom logic that was previously in their extensions of the ScalarTargetTransformInfo and VectorTargetTransformInfo interfaces. I separated them into their own file, as now all of the interface bits are private and they just expose a function to create the pass itself. Then I extended these target machines to set up a custom set of analysis passes, first adding BasicTTI as a fallback, and then adding their customized TTI implementations. The fourth step required logic that was shared between the target independent layer and the specific targets to move to a different interface, as they no longer derive from each other. As a consequence, helper functions were added to TargetLowering representing the common logic needed both in the target implementation and the codegen implementation of the TTI pass. While technically this is the only change that could have been committed separately, it would have been a nightmare to extract. The final step of the conversion was just to delete all the old boilerplate. This got rid of the ScalarTargetTransformInfo and VectorTargetTransformInfo classes, all of the support in all of the targets for producing instances of them, and all of the support in the tools for manually constructing a pass based around them. Now that TTI is a relatively normal analysis group, two things become straightforward.
First, we can sink it into lib/Analysis which is a more natural layer for it to live. Second, clients of this interface can depend on it always being available which will simplify their code and behavior. These (and other) simplifications will follow in subsequent commits, this one is clearly big enough. Finally, I'm very aware that much of the comments and documentation needs to be updated. As soon as I had this working, and plausibly well commented, I wanted to get it committed and in front of the build bots. I'll be doing a few passes over documentation later if it sticks. Commits to update DragonEgg and Clang will be made presently. llvm-svn: 171681	2013-01-07 01:37:14 +00:00
Craig Topper	7af95b6c84	Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior. cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix. cvtt2si/cvt2si should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix. Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein. llvm-svn: 171668	2013-01-06 20:39:29 +00:00
Evan Cheng	80b19ffea6	Fix for PR14739. It's not safe to fold a load into a call across a store. Thanks to Nick Lewycky for the initial patch. llvm-svn: 171665	2013-01-06 19:00:15 +00:00
Craig Topper	942e03f627	Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. llvm-svn: 171608	2013-01-05 07:39:25 +00:00
Nadav Rotem	900cb45dec	Revert revision 171524. Original message: URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171603	2013-01-05 05:42:48 +00:00
Jakub Staszak	369da81e4b	Move 'break' to the right place to prevent fallthru. There is no test-case because conditions in the next case prevented it from doing anything nasty. llvm-svn: 171549	2013-01-04 23:01:26 +00:00
Preston Gurd	b1c34fa73f	The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. llvm-svn: 171524	2013-01-04 20:54:54 +00:00
Nadav Rotem	cb3562a88e	LoopVectorizer: 1. Add code to estimate register pressure. 2. Add code to select the unroll factor based on register pressure. 3. Add bits to TargetTransformInfo to provide the number of registers. llvm-svn: 171469	2013-01-04 17:48:25 +00:00
Nadav Rotem	08d6ff1eaf	Revert revision: 171467. This transformation is incorrect and makes some tests fail. Original message: Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. llvm-svn: 171468	2013-01-04 17:35:21 +00:00
Elena Demikhovsky	d675e085b0	Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. llvm-svn: 171467	2013-01-03 08:48:33 +00:00
Michael Gottesman	5f81e1c2d0	Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks." This reverts commit r171461 since it breaks the following tests: Clang :: Analysis/outofbound-notwork.c Clang :: Analysis/string-fail.c Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp Clang :: CXX/temp/temp.param/p14.cpp Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c Clang :: CodeGen/blocks-2.c Clang :: CodeGen/libcalls-d.c Clang :: CodeGen/libcalls-ld.c Clang :: CodeGenCXX/conversion-function.cpp Clang :: CodeGenCXX/debug-info-limit-type.cpp Clang :: CodeGenCXX/inheriting-constructor.cpp Clang :: FixIt/fixit-errors.c Clang :: FixIt/fixit-pmem.cpp Clang :: Modules/namespaces.cpp Clang :: PCH/changed-files.c Clang :: PCH/pr4489.c Clang :: PCH/source-manager-stack.c Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp Clang :: SemaTemplate/instantiate-function-1.mm llvm-svn: 171466	2013-01-03 08:18:30 +00:00

9169 Commits