llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Nuno Lopes	ea7b37e3ae	teach DSE and isInstructionTriviallyDead() about calloc llvm-svn: 156553	2012-05-10 17:14:00 +00:00
Joel Jones	cc8aa34ea8	formatting change: strip debug info from test llvm-svn: 156551	2012-05-10 16:55:31 +00:00
Manman Ren	727b7d5e4c	ARM: peephole optimization to remove cmp instruction. This patch will optimize the following cases: sub r1, r3 | sub r1, imm; cmp r3, r1 or cmp r1, r3 | cmp r1, imm; bge L1 TO subs r1, r3; bge L1 or ble L1. If the branch instruction can use the flags from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156550	2012-05-10 16:48:21 +00:00
Joel Jones	305ecb3495	Fix a problem with incomplete equality testing of PHINodes in Instruction::IsIdenticalToWhenDefined. This manifested itself when inlining two calls to the same function. The inlined function had a switch statement that returned one of a set of global variables. Without this modification, the two phi instructions that chose values from the branches of the switch instruction inlined from the callee were considered equivalent, and jump-threading replaced a load for the first switch value with a phi selecting from the second switch, thereby producing incorrect code. This patch has been tested with "make check-all", "lnt runtest nt", and llvm self-hosted, and on the original program that had this problem, wireshark. <rdar://problem/11025519> llvm-svn: 156548	2012-05-10 15:59:41 +00:00
Nadav Rotem	157be301c5	AVX2: Add an additional broadcast idiom. llvm-svn: 156540	2012-05-10 12:39:13 +00:00
Nadav Rotem	64319ce27c	Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program. Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users. Fix PR11900. llvm-svn: 156539	2012-05-10 12:22:05 +00:00
Dan Gohman	9e72870dd1	Fix the objc_storeStrong recognizer to stop before walking off the end of a basic block if there's no store. llvm-svn: 156520	2012-05-09 23:08:33 +00:00
Nuno Lopes	3d7a8137ee	objectsize: refactor code a bit to enable future changes to support run-time information; add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs) llvm-svn: 156515	2012-05-09 21:30:57 +00:00
Danil Malyshev	0e378f7bcb	Added a regression test for bug #9964 before closing it. This bug was fixed by Jim Grosbach in #138879, thanks Jim! llvm-svn: 156505	2012-05-09 19:07:04 +00:00
Nuno Lopes	e8880a9916	change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept. This commit only adds the parameter. Code taking advantage of it will follow. llvm-svn: 156473	2012-05-09 15:52:43 +00:00
Filipe Cabecinhas	fe00fb1f06	Fixed a typo llvm-svn: 156471	2012-05-09 14:43:50 +00:00
Akira Hatanaka	574e68feec	Add another peephole pattern for conditional moves. llvm-svn: 156460	2012-05-09 02:29:29 +00:00
Akira Hatanaka	a53bdc878f	Make register FP allocatable if the compiled function does not have dynamic allocas. llvm-svn: 156458	2012-05-09 01:38:13 +00:00
Akira Hatanaka	bd2f3d1c46	Expand 64-bit shifts if target ABI is O32. llvm-svn: 156457	2012-05-09 00:55:21 +00:00
Dan Gohman	b47d02f929	Fix objc_storeStrong pattern matching to catch a potential use of the old value after the store but before it is released. This fixes rdar:/11116986. llvm-svn: 156442	2012-05-08 23:34:08 +00:00
Eric Christopher	63b73ede75	Handle OpDeref in case it comes in as a register operand. Part of rdar://11352000 llvm-svn: 156405	2012-05-08 18:56:00 +00:00
Daniel Dunbar	8e13944f35	Revert r156393, "[tests] Remove some remaining DejaGNU related cruft.", this patch wasn't ready yet. llvm-svn: 156395	2012-05-08 18:26:07 +00:00
Daniel Dunbar	882b236879	[tests] Remove some remaining DejaGNU related cruft. llvm-svn: 156393	2012-05-08 18:11:49 +00:00
Duncan Sands	c9f011a85b	Calling ReassociateExpression recursively is extremely dangerous since it will replace the operands of expressions with only one use with undef and generate a new expression for the original without using RAUW to update the original. Thus any copies of the original expression held in a vector may end up referring to some bogus value - and using a ValueHandle won't help since there is no RAUW. There is already a mechanism for getting the effect of recursion non-recursively: adding the value to be recursed on to RedoInsts. But it wasn't being used systematically. Have various places where recursion had snuck in at some point use the RedoInsts mechanism instead. Fixes PR12169. llvm-svn: 156379	2012-05-08 12:16:05 +00:00
Stepan Dyatkovskiy	b150cd5ced	Rejected r156374 (ordinary PR1255 patch) due to a clang-x86_64-debian-fnt buildbot failure. llvm-svn: 156377	2012-05-08 08:33:21 +00:00
Craig Topper	77b1a4cee5	Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit. llvm-svn: 156375	2012-05-08 06:58:15 +00:00
Stepan Dyatkovskiy	33fd2a5bf4	Ordinary patch for PR1255. Added new case-range-oriented methods for adding/removing cases in SwitchInst. After this patch, cases will be represented internally as ConstantArray-s instead of ConstantInt; externally, cases are wrapped within the ConstantRangesSet object. The old methods of SwitchInst also still work, but are marked as deprecated. So at this stage we have no side effects, except that I added support for case ranges in BitcodeReader/Writer; a test for Bitcode is also added. The old "switch" format is also supported. llvm-svn: 156374	2012-05-08 06:36:08 +00:00
Owen Anderson	8adb0322ce	Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. llvm-svn: 156324	2012-05-07 20:51:25 +00:00
Owen Anderson	1e7a4f0f91	Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. This allows for greater CSE opportunities. llvm-svn: 156323	2012-05-07 20:47:23 +00:00
Chad Rosier	3e284d8bd6	Fix a regression from r147481. This combine should only happen if there is a single use. rdar://11360370 llvm-svn: 156316	2012-05-07 18:47:44 +00:00
Manman Ren	6fde9f74b4	X86: optimization for -(x != 0). This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi; sbbl %eax,%eax; notl %eax TO negl %edi; sbbl %eax,%eax. In order to generate negl, I added a pattern in Target/X86/X86InstrCompiler.td: def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>; rdar: 10961709 llvm-svn: 156312	2012-05-07 18:06:23 +00:00
Eric Christopher	87e8163c57	Add support for the 'l' constraint. Patch by Jack Carter. llvm-svn: 156294	2012-05-07 06:25:15 +00:00
Eric Christopher	af8eabbbd8	Add support for the 'c' constraint. Patch by Jack Carter. llvm-svn: 156293	2012-05-07 06:25:10 +00:00
Eric Christopher	0f1a0afa75	Add support for the 'P' constraint. Patch by Jack Carter. llvm-svn: 156292	2012-05-07 06:25:02 +00:00
Eric Christopher	a6552ba637	Add support for the 'O' constraint. Patch by Jack Carter. llvm-svn: 156285	2012-05-07 05:46:48 +00:00
Eric Christopher	5e1efebf09	Add support for the 'N' inline asm constraint. Patch by Jack Carter. llvm-svn: 156284	2012-05-07 05:46:43 +00:00
Eric Christopher	e5a46b70b3	Add support for the 'L' inline asm constraint. Patch by Jack Carter. llvm-svn: 156283	2012-05-07 05:46:37 +00:00
Eric Christopher	267aa256cb	Add support for the inline asm constraint 'K'. llvm-svn: 156282	2012-05-07 05:46:29 +00:00
Craig Topper	c6d0bc2afc	Add SSE4A MOVNTSS/MOVNTSD instructions. llvm-svn: 156281	2012-05-07 05:36:19 +00:00
Eric Christopher	bf784be9ae	Support the 'J' constraint. Patch by Jack Carter. llvm-svn: 156280	2012-05-07 03:13:42 +00:00
Eric Christopher	929ba63dcf	Add support for the 'I' inline asm constraint. Also add tests from the previous 2 patches. Patch by Jack Carter. llvm-svn: 156279	2012-05-07 03:13:32 +00:00
Benjamin Kramer	786f7671ab	Switch the select to branch transformation on by default. The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! llvm-svn: 156258	2012-05-06 14:25:16 +00:00
Benjamin Kramer	0463564612	CodeGenPrepare: Add a transform to turn selects into branches in some cases. This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0; cmovbel %edx, %esi. cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic: it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-order architectures are unlikely to benefit either; those are covered by the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform. Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) llvm-svn: 156234	2012-05-05 12:49:22 +00:00
Stepan Dyatkovskiy	469935e0ae	Small fix in InstCombineCasts.cpp. Restored the "alloca + bitcast" reduction for the case when the alloca's size is calculated within the "add/sub/... nsw". Also added a fix to the 2011-06-13-nsw-alloca.ll test. llvm-svn: 156231	2012-05-05 07:09:40 +00:00
Justin Holewinski	4ca961430f	This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it. The new target machines are: nvptx (old ptx32) => 32-bit PTX; nvptx64 (old ptx64) => 64-bit PTX. The sources are based on the internal NVIDIA NVPTX back-end, and contain more functionality than the current PTX back-end provides. NV_CONTRIB llvm-svn: 156196	2012-05-04 20:18:50 +00:00
Sebastian Pop	2b868d474e	Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits the 16-bit encoding of CMN instructions. llvm-svn: 156195	2012-05-04 19:53:56 +00:00
Craig Topper	6881f1067c	Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles. llvm-svn: 156156	2012-05-04 04:44:49 +00:00
Kevin Enderby	1ab00df6a7	Fix issues with the ARM bl and blx thumb instructions and the J1 and J2 bits for the assembler and disassembler, which were not being set/read correctly for offsets greater than 22 bits in some cases. Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles! llvm-svn: 156118	2012-05-03 22:41:56 +00:00
Nuno Lopes	2762496a1a	remove calls to calloc if the allocated memory is not used (it was already being done for malloc); fix a few typos found by Chad in my previous commit llvm-svn: 156110	2012-05-03 22:08:19 +00:00
Sirish Pande	253d16b1b0	Support for target dependent Hexagon VLIW packetizer. This patch creates and optimizes packets as per Hexagon ISA rules. llvm-svn: 156109	2012-05-03 21:52:53 +00:00
Nuno Lopes	26239aeb99	add support for calloc to objectsize lowering llvm-svn: 156102	2012-05-03 21:19:58 +00:00
Silviu Baranga	b5e46c12d6	Fixed disassembler for vstm/vldm ARM VFP instructions. llvm-svn: 156077	2012-05-03 16:38:40 +00:00
Craig Topper	52869bf5bf	Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982. llvm-svn: 156059	2012-05-03 07:12:59 +00:00
Evan Cheng	59c8d1af93	Fix two-address pass's aggressive instruction commuting heuristics. It's meant to catch cases like: %reg1024<def> = MOV r1; %reg1025<def> = MOV r0; %reg1026<def> = ADD %reg1024, %reg1025; r0 = MOV %reg1026. By commuting ADD, it lets the coalescer eliminate all of the copies. However, there was a bug in the heuristics where it ended up commuting the ADD in: %reg1024<def> = MOV r0; %reg1025<def> = MOV 0; %reg1026<def> = ADD %reg1024, %reg1025; r0 = MOV %reg1026. That brought no benefit and instead ensured the last MOV would not be coalesced. rdar://11355268 llvm-svn: 156048	2012-05-03 01:45:13 +00:00
Owen Anderson	e3d41b44cc	Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. llvm-svn: 156029	2012-05-02 22:17:40 +00:00
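
The -(x != 0) commit above (r156312) replaces a cmp/sbb/not sequence with neg/sbb. As a rough illustration of why the two sequences agree, the following Python sketch models the carry-flag behavior of the instructions on 32-bit values. It is not LLVM code: the function names are hypothetical, and it assumes the usual x86 semantics (cmp sets CF on unsigned borrow, neg sets CF unless its operand is zero, and sbb of a register with itself yields -CF).

```python
MASK = 0xFFFFFFFF  # model 32-bit registers


def cmp_sbb_not(x):
    """Original sequence: cmpl $1, x ; sbbl eax, eax ; notl eax."""
    cf = 1 if (x & MASK) < 1 else 0   # cmp sets CF on unsigned borrow (x < 1)
    eax = (0 - cf) & MASK             # sbb eax, eax computes eax - eax - CF = -CF
    return (~eax) & MASK              # not flips every bit


def neg_sbb(x):
    """Optimized sequence: negl x ; sbbl eax, eax."""
    cf = 1 if (x & MASK) != 0 else 0  # neg sets CF unless the operand is 0
    return (0 - cf) & MASK            # sbb eax, eax computes -CF


def reference(x):
    """-(x != 0) as a 32-bit value (hypothetical reference helper)."""
    return (-int((x & MASK) != 0)) & MASK


if __name__ == "__main__":
    for x in (0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF):
        assert cmp_sbb_not(x) == neg_sbb(x) == reference(x)
```

Both sequences produce 0 when x is 0 and all ones otherwise; the shorter form saves the trailing notl because neg already leaves the carry flag in the needed polarity.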

16180 Commits