llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-30 23:42:52 +01:00

Author	SHA1	Message	Date
Jakob Stoklund Olesen	207108d4a4	Fix 12892. Dead code elimination during coalescing could cause a virtual register to be split into connected components. The following rewriting would be confused about the already joined copies present in the code, but without a corresponding value number in the live range. Erase all joined copies instantly when joining intervals such that the MI and LiveInterval representations are always in sync. llvm-svn: 157135	2012-05-19 23:34:59 +00:00
Peter Collingbourne	0baed83df2	Do not eliminate allocas whose alignment exceeds that of the copied-in constant, as a subsequent user may rely on over alignment. Fixes PR12885. llvm-svn: 157134	2012-05-19 22:52:10 +00:00
Jakob Stoklund Olesen	81d77434d9	Erase joined copies immediately. The late dead code elimination is no longer necessary. The test changes are caused by a register hint that can be either %rdi or %rax. The choice depends on the use list order, which this patch changes. llvm-svn: 157131	2012-05-19 20:54:07 +00:00
Nadav Rotem	a46c75d9a7	On Haswell, prefer storing YMM registers using a single instruction. llvm-svn: 157129	2012-05-19 20:30:08 +00:00
Nadav Rotem	441318ee29	Add support for additional in-reg vbroadcast patterns llvm-svn: 157127	2012-05-19 19:57:37 +00:00
Eric Christopher	74be9f0f18	Actually support DW_TAG_rvalue_reference_type that we were trying to generate out of the front end. rdar://11479676 llvm-svn: 157094	2012-05-19 01:36:37 +00:00
Eric Christopher	2d50bc73f8	Add support for the 'd' mips inline asm output modifier. Patch by Jack Carter. llvm-svn: 157093	2012-05-19 00:51:56 +00:00
Andrew Trick	c3765e6af0	SCEV: Add MarkPendingLoopPredicates to avoid recursive isImpliedCond. getUDivExpr attempts to simplify by checking for overflow. isLoopEntryGuardedByCond then evaluates the loop predicate which may lead to the same getUDivExpr causing endless recursion. Fixes PR12868: clang 3.2 segmentation fault. llvm-svn: 157092	2012-05-19 00:48:25 +00:00
Dan Gohman	a487e2b57e	Fix replacing all the users of objc weak runtime routines when deleting them. rdar://11434915. llvm-svn: 157080	2012-05-18 22:17:29 +00:00
Nuno Lopes	4b9a5ae769	allow LazyValueInfo::getEdgeValue() to reason about multiple edges from the same switch instruction by doing union of ranges (which may still be conservative, but it's more aggressive than before) llvm-svn: 157071	2012-05-18 21:02:10 +00:00
Jim Grosbach	343a996ca5	Refactor data-in-code annotations. Use a dedicated MachO load command to annotate data-in-code regions. This is the same format the linker produces for final executable images, allowing consistency of representation and use of introspection tools for both object and executable files. Data-in-code regions are annotated via ".data_region"/".end_data_region" directive pairs, with an optional region type. data_region_directive := ".data_region" { region_type } region_type := "jt8" | "jt16" | "jt32" | "jta32" end_data_region_directive := ".end_data_region" The previous handling of ARM-style "$d.*" labels was broken and has been removed. Specifically, it didn't handle ARM vs. Thumb mode when marking the end of the section. rdar://11459456 llvm-svn: 157062	2012-05-18 19:12:01 +00:00
Nuno Lopes	4af12ec75d	add test case for bugfix in r157032 llvm-svn: 157058	2012-05-18 17:44:58 +00:00
Eric Christopher	9308072a49	Add support for the mips 'x' inline asm modifier. Patch by Jack Carter. llvm-svn: 157057	2012-05-18 17:39:35 +00:00
Joel Jones	f99b58276c	FileCheck-ify, apropos of nothing llvm-svn: 157051	2012-05-18 16:24:01 +00:00
Craig Topper	d86e4c0088	Simplify handling of v16i8 shuffles and fix a missed optimization. llvm-svn: 157043	2012-05-18 06:42:06 +00:00
Evan Cheng	6ffd037105	Teach two-address pass to update the "source" map so it doesn't perform a non-profitable commute using outdated info. The test case would still fail because of poor pre-RA schedule. That will be fixed by MI scheduler. rdar://11472010 llvm-svn: 157038	2012-05-18 01:33:51 +00:00
Danil Malyshev	0d982140e7	Temporarily disabled the MCJIT tests for Darwin, because RuntimeDyldMachO has problems with relocations for 32-bit x86. llvm-svn: 157035	2012-05-18 00:30:58 +00:00
Kevin Enderby	ef39671f98	Fixed a bug in llvm-objdump when disassembling using -macho option for a binary containing no symbols. Fixed the crash and fixed it not disassembling anything. llvm-svn: 157031	2012-05-18 00:13:56 +00:00
Jakob Stoklund Olesen	f6cadaaf06	Remove a test that was only testing for physreg joining. This is the same as the other tests: Clever tricks are required to make the arguments and return value line up in a single-instruction function. It rarely happens in real life. We have plenty other examples of this behavior. llvm-svn: 157030	2012-05-18 00:07:14 +00:00
Jakob Stoklund Olesen	b3487aa334	Remove -join-physregs from the test suite. This option has been disabled for a while, and it is going away so I can clean up the coalescer code. The tests that required physreg joining to be enabled were almost all of the form "tiny function with interference between arguments and return value". Such functions are usually inlined in the real world. The problem exposed by phys_subreg_coalesce-3.ll is real, but fairly rare. llvm-svn: 157027	2012-05-17 23:44:19 +00:00
Kevin Enderby	9747fd81e3	Fix the encoding of the armv7m (MClass) for MSR APSR writes which was missing the 0b10 mask encoding bits. Make MSR APSR writes without a _<bits> qualifier an alias for MSR APSR_nzcvq even though ARM has deprecated its use. Also add support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions. Some FIXMEs in the code for better error checking when versions shouldn't be used. rdar://11457025 llvm-svn: 157019	2012-05-17 22:18:01 +00:00
Danil Malyshev	dade6759c4	- Added ExecutionEngine/MCJIT tests - Added HOST_ARCH to Makefile.config.in HOST_ARCH will be used by the MCJIT tests filter, because MCJIT currently supports only the x86 and ARM architectures. llvm-svn: 157015	2012-05-17 21:07:47 +00:00
Tim Northover	fc7737e7dc	Remove incorrect pattern for ARM SMML instruction. Patch by Meador Inge. llvm-svn: 156989	2012-05-17 13:12:13 +00:00
Chandler Carruth	6bc19df503	Teach the 'opt' tool about '-Os' and '-Oz', corresponding to the Clang options, to enable easier testing of the innards of LLVM that are enabled by such optimization strategies. Note that this doesn't provide the (much needed) function attribute support for -Oz (as opposed to -Os), but still seems like a positive step to better test the logic that Clang currently relies on. Patch by Patrik Hägglund. llvm-svn: 156913	2012-05-16 08:32:49 +00:00
Evan Cheng	9a33fc17be	Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474 llvm-svn: 156896	2012-05-16 01:54:27 +00:00
Jakob Stoklund Olesen	1da786c936	Enable sub-sub-register copy coalescing. It is now possible to coalesce weird skewed sub-register copies by picking a super-register class larger than both original registers. The included test case produces code like this: vld2.32 {d16, d17, d18, d19}, [r0]! vst2.32 {d18, d19, d20, d21}, [r0] We still perform interference checking as if it were a normal full copy join, so this is still quite conservative. In particular, the f1 and f2 functions in the included test case still have remaining copies because of false interference. llvm-svn: 156878	2012-05-15 23:31:35 +00:00
Kevin Enderby	8feefef6ff	Add a test case for r156840, a fix to llvm-objdump when disassembling using -macho to disassemble the last symbol to the end of the section. llvm-svn: 156850	2012-05-15 20:20:50 +00:00
Sirish Pande	7081e43d5c	Enable all Hexagon tests. llvm-svn: 156824	2012-05-15 16:13:12 +00:00
David Majnemer	ea3e1ea334	Teach SimplifyLibCalls about stpcpy. llvm-svn: 156815	2012-05-15 11:46:21 +00:00
Jakob Stoklund Olesen	a3e3afc746	Fix PR12821. RAFast must add an <imp-def> operand when it is rewriting a sub-register def that isn't a read-modify-write. llvm-svn: 156777	2012-05-14 21:10:25 +00:00
Chad Rosier	c3a90c47b9	Move the capture analysis from MemoryDependencyAnalysis to a more general place so that it can be reused in MemCpyOptimizer. This analysis is needed to remove an unnecessary memcpy when returning a struct into a local variable. rdar://11341081 PR12686 llvm-svn: 156776	2012-05-14 20:35:04 +00:00
Brendon Cahoon	8b14ad918f	Revert 156634 upon request until code improvement changes are made. llvm-svn: 156775	2012-05-14 19:35:42 +00:00
Dan Gohman	cc1f60a86c	Rename @llvm.debugger to @llvm.debugtrap. llvm-svn: 156774	2012-05-14 18:58:10 +00:00
Rafael Espindola	b6ca820fbb	Add support for the .rept directive. Patch by Vladmir Sorokin. I added support for nesting. llvm-svn: 156714	2012-05-12 16:31:10 +00:00
Benjamin Kramer	b778bbd91b	ELF: Add support for the asm .version directive. llvm-svn: 156712	2012-05-12 14:30:47 +00:00
Benjamin Kramer	549c257415	AsmParser: Add support for the .purgem directive. Based on a patch by Team PaX. llvm-svn: 156709	2012-05-12 11:21:46 +00:00
Benjamin Kramer	3cf84357e0	AsmParser: ignore the .extern directive. llvm-svn: 156707	2012-05-12 11:18:59 +00:00
Benjamin Kramer	09b38e9f61	AsmParser: Add support for .ifc and .ifnc directives. Based on a patch from PaX Team. llvm-svn: 156706	2012-05-12 11:18:51 +00:00
Benjamin Kramer	dc54b252bb	AsmParser: Add support for .ifb and .ifnb directives. Based on a patch from PaX Team. llvm-svn: 156705	2012-05-12 11:18:42 +00:00
Stepan Dyatkovskiy	fa0cf8dc2e	Recommitted r156374 with critical fixes in BitcodeReader/Writer: Ordinary patch for PR1255. Added new case-range-oriented methods for adding/removing cases in SwitchInst. After this patch, cases will be represented internally as ConstantArray-s instead of ConstantInt, and externally wrapped within the ConstantRangesSet object. The old SwitchInst methods still work, but are marked as deprecated. So at this stage there are no side effects, except that I added support for case ranges in BitcodeReader/Writer; a test for Bitcode is also added. The old "switch" format is also supported. llvm-svn: 156704	2012-05-12 10:48:17 +00:00
Jay Foad	65d25fa204	Teach Function::hasAddressTaken that BlockAddress doesn't really take the address of a function. llvm-svn: 156703	2012-05-12 08:30:16 +00:00
Sirish Pande	2eadb696a5	Support for Hexagon feature, New Value Jump. llvm-svn: 156698	2012-05-12 05:10:30 +00:00
Akira Hatanaka	a80ec224bf	Fix test cases. llvm-svn: 156697	2012-05-12 03:25:16 +00:00
Akira Hatanaka	431ee824c6	Make the following changes in MipsAsmPrinter.cpp: - Remove code which lowers pseudo SETGP01. - Fix LowerSETGP01. The first two of the three instructions that are emitted to initialize the global pointer register now use register $2. - Stop emitting .cpload directive. llvm-svn: 156689	2012-05-12 00:48:43 +00:00
Akira Hatanaka	bc52a1662b	Insert instructions to the entry basic block which initializes the global pointer register. This is the first of the series of patches which clean up the way global pointer register is used. The patches will make the following improvements: - Make $gp an allocatable temporary register rather than reserving it. - Use a virtual register as the global pointer register and let the register allocator decide which register to assign to it or whether spill/reloads are needed. - Make sure $gp is valid at the entry of a called function, which is necessary for functions using lazy binding. - Remove the need for emitting .cprestore and .cpload directives. llvm-svn: 156671	2012-05-12 00:17:17 +00:00
Akira Hatanaka	3e39081c4a	Do not replace operands of pseudo instructions with register $zero. llvm-svn: 156663	2012-05-11 23:22:18 +00:00
Akira Hatanaka	d81273be58	Use regular expression to match register names. llvm-svn: 156656	2012-05-11 23:00:40 +00:00
Chad Rosier	4a65a2a197	[fast-isel] Add support for selecting @llvm.trap(). llvm-svn: 156646	2012-05-11 21:33:49 +00:00
Brendon Cahoon	90dddafa44	Hexagon constant extender support. Patch by Jyotsna Verma. llvm-svn: 156634	2012-05-11 19:56:59 +00:00
Chad Rosier	72bd34ca71	[fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup. llvm-svn: 156632	2012-05-11 19:40:25 +00:00
Chad Rosier	c20de37076	[fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg retval. Hoists check before emitting the call to avoid unnecessary work. rdar://11430407 PR12796 llvm-svn: 156628	2012-05-11 18:51:55 +00:00
Nuno Lopes	11d6ecb6db	objectsize: add a few more tests and fix a bug llvm-svn: 156625	2012-05-11 18:25:29 +00:00
Hans Wennborg	ea694231ad	Fix test/CodeGen/X86/tls-pie.ll. llvm-svn: 156612	2012-05-11 10:19:54 +00:00
Hans Wennborg	a5a417fcd3	Implement initial-exec TLS model for 32-bit PIC x86. This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong code here (see the update to test/CodeGen/X86/tls-pie.ll). llvm-svn: 156611	2012-05-11 10:11:01 +00:00
Silviu Baranga	5138c169b1	Added the missing bit definition for the 4th bit of the STR (post reg) instruction. It is now set to 0. The patch also sets the unpredictable mask for SEL and SXTB-type instructions. llvm-svn: 156609	2012-05-11 09:28:27 +00:00
Silviu Baranga	dad5ffc779	Fixed the LLVM ARM v7 assembler and instruction printer for 8-bit immediate offset addressing. The assembler and instruction printer were not properly handling the #-0 immediate. llvm-svn: 156608	2012-05-11 09:10:54 +00:00
Eli Friedman	1746bfc50e	Fix a minor logic mistake transforming compares in instcombine. PR12514. llvm-svn: 156600	2012-05-11 01:32:59 +00:00
Manman Ren	c82d0e71b9	ARM: peephole optimization to remove cmp instruction. This patch will optimize the following cases: sub r1, r3 | sub r1, imm cmp r3, r1 or cmp r1, r3 | cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156599	2012-05-11 01:30:47 +00:00
Dan Gohman	ed475ad173	Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(), but it generates int3 on x86 instead of ud2. llvm-svn: 156593	2012-05-11 00:19:32 +00:00
Nuno Lopes	415911a5c7	objectsize: add support for GEPs with non-constant indexes; add an additional parameter to InstCombiner::EmitGEPOffset() to force it to not emit operations with NUW flag llvm-svn: 156585	2012-05-10 23:17:35 +00:00
Eric Christopher	dbb26083e5	Add support for the 'X' inline asm operand modifier. Patch by Jack Carter. llvm-svn: 156577	2012-05-10 21:48:22 +00:00
Sirish Pande	7fbfe4a1d6	Hexagon V5 FP Support. llvm-svn: 156568	2012-05-10 20:20:25 +00:00
Dan Gohman	8b1a3cec89	Teach DeadStoreElimination to eliminate exit-block stores with phi addresses. llvm-svn: 156558	2012-05-10 18:57:38 +00:00
Manman Ren	5abcae1320	Revert: 156550 "ARM: peephole optimization to remove cmp instruction" This commit broke an external linux bot and gave a compile-time warning. llvm-svn: 156556	2012-05-10 18:49:43 +00:00
Nuno Lopes	ea7b37e3ae	teach DSE and isInstructionTriviallyDead() about calloc llvm-svn: 156553	2012-05-10 17:14:00 +00:00
Joel Jones	cc8aa34ea8	formatting change: strip debug info from test llvm-svn: 156551	2012-05-10 16:55:31 +00:00
Manman Ren	727b7d5e4c	ARM: peephole optimization to remove cmp instruction. This patch will optimize the following cases: sub r1, r3 | sub r1, imm cmp r3, r1 or cmp r1, r3 | cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156550	2012-05-10 16:48:21 +00:00
Joel Jones	305ecb3495	Fix a problem with incomplete equality testing of PHINodes in Instruction::IsIdenticalToWhenDefined. This manifested itself when inlining two calls to the same function. The inlined function had a switch statement that returned one of a set of global variables. Without this modification, the two phi instructions that chose values from the branches of the switch instruction inlined from the callee were considered equivalent and jump-threading replaced a load for the first switch value with a phi selecting from the second switch, thereby producing incorrect code. This patch has been tested with "make check-all", "lnt runtest nt", and llvm self-hosted, and on the original program that had this problem, wireshark. <rdar://problem/11025519> llvm-svn: 156548	2012-05-10 15:59:41 +00:00
Nadav Rotem	157be301c5	AVX2: Add an additional broadcast idiom. llvm-svn: 156540	2012-05-10 12:39:13 +00:00
Nadav Rotem	64319ce27c	Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program. Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users. Fix PR11900. llvm-svn: 156539	2012-05-10 12:22:05 +00:00
Dan Gohman	9e72870dd1	Fix the objc_storeStrong recognizer to stop before walking off the end of a basic block if there's no store. llvm-svn: 156520	2012-05-09 23:08:33 +00:00
Nuno Lopes	3d7a8137ee	objectsize: refactor code a bit to enable future changes to support run-time information; add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs) llvm-svn: 156515	2012-05-09 21:30:57 +00:00
Danil Malyshev	0e378f7bcb	Added a regression test for bug #9964 before closing it. This bug was fixed by Jim Grosbach in #138879, thanks Jim! llvm-svn: 156505	2012-05-09 19:07:04 +00:00
Nuno Lopes	e8880a9916	change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept. This commit only adds the parameter. Code taking advantage of it will follow. llvm-svn: 156473	2012-05-09 15:52:43 +00:00
Filipe Cabecinhas	fe00fb1f06	Fixed a typo llvm-svn: 156471	2012-05-09 14:43:50 +00:00
Akira Hatanaka	574e68feec	Add another peephole pattern for conditional moves. llvm-svn: 156460	2012-05-09 02:29:29 +00:00
Akira Hatanaka	a53bdc878f	Make register FP allocatable if the compiled function does not have dynamic allocas. llvm-svn: 156458	2012-05-09 01:38:13 +00:00
Akira Hatanaka	bd2f3d1c46	Expand 64-bit shifts if target ABI is O32. llvm-svn: 156457	2012-05-09 00:55:21 +00:00
Dan Gohman	b47d02f929	Fix objc_storeStrong pattern matching to catch a potential use of the old value after the store but before it is released. This fixes rdar:/11116986. llvm-svn: 156442	2012-05-08 23:34:08 +00:00
Eric Christopher	63b73ede75	Handle OpDeref in case it comes in as a register operand. Part of rdar://11352000 llvm-svn: 156405	2012-05-08 18:56:00 +00:00
Daniel Dunbar	8e13944f35	Revert r156393, "[tests] Remove some remaining DejaGNU related cruft.", this patch wasn't ready yet. llvm-svn: 156395	2012-05-08 18:26:07 +00:00
Daniel Dunbar	882b236879	[tests] Remove some remaining DejaGNU related cruft. llvm-svn: 156393	2012-05-08 18:11:49 +00:00
Duncan Sands	c9f011a85b	Calling ReassociateExpression recursively is extremely dangerous since it will replace the operands of expressions with only one use with undef and generate a new expression for the original without using RAUW to update the original. Thus any copies of the original expression held in a vector may end up referring to some bogus value - and using a ValueHandle won't help since there is no RAUW. There is already a mechanism for getting the effect of recursion non-recursively: adding the value to be recursed on to RedoInsts. But it wasn't being used systematically. Have various places where recursion had snuck in at some point use the RedoInsts mechanism instead. Fixes PR12169. llvm-svn: 156379	2012-05-08 12:16:05 +00:00
Stepan Dyatkovskiy	b150cd5ced	Rejected r156374: Ordinary PR1255 patch. Due to clang-x86_64-debian-fnt buildbot failure. llvm-svn: 156377	2012-05-08 08:33:21 +00:00
Craig Topper	77b1a4cee5	Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit. llvm-svn: 156375	2012-05-08 06:58:15 +00:00
Stepan Dyatkovskiy	33fd2a5bf4	Ordinary patch for PR1255. Added new case-range-oriented methods for adding/removing cases in SwitchInst. After this patch, cases will be represented internally as ConstantArray-s instead of ConstantInt, and externally wrapped within the ConstantRangesSet object. The old SwitchInst methods still work, but are marked as deprecated. So at this stage there are no side effects, except that I added support for case ranges in BitcodeReader/Writer; a test for Bitcode is also added. The old "switch" format is also supported. llvm-svn: 156374	2012-05-08 06:36:08 +00:00
Owen Anderson	8adb0322ce	Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. llvm-svn: 156324	2012-05-07 20:51:25 +00:00
Owen Anderson	1e7a4f0f91	Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. This allows for greater CSE opportunities. llvm-svn: 156323	2012-05-07 20:47:23 +00:00
Chad Rosier	3e284d8bd6	Fix a regression from r147481. This combine should only happen if there is a single use. rdar://11360370 llvm-svn: 156316	2012-05-07 18:47:44 +00:00
Manman Ren	6fde9f74b4	X86: optimization for -(x != 0). This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi sbbl %eax,%eax notl %eax TO negl %edi sbbl %eax,%eax In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td: def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>; rdar: 10961709 llvm-svn: 156312	2012-05-07 18:06:23 +00:00
Eric Christopher	87e8163c57	Add support for the 'l' constraint. Patch by Jack Carter. llvm-svn: 156294	2012-05-07 06:25:15 +00:00
Eric Christopher	af8eabbbd8	Add support for the 'c' constraint. Patch by Jack Carter. llvm-svn: 156293	2012-05-07 06:25:10 +00:00
Eric Christopher	0f1a0afa75	Add support for the 'P' constraint. Patch by Jack Carter. llvm-svn: 156292	2012-05-07 06:25:02 +00:00
Eric Christopher	a6552ba637	Add support for the 'O' constraint. Patch by Jack Carter. llvm-svn: 156285	2012-05-07 05:46:48 +00:00
Eric Christopher	5e1efebf09	Add support for the 'N' inline asm constraint. Patch by Jack Carter. llvm-svn: 156284	2012-05-07 05:46:43 +00:00
Eric Christopher	e5a46b70b3	Add support for the 'L' inline asm constraint. Patch by Jack Carter. llvm-svn: 156283	2012-05-07 05:46:37 +00:00
Eric Christopher	267aa256cb	Add support for the inline asm constraint 'K'. llvm-svn: 156282	2012-05-07 05:46:29 +00:00
Craig Topper	c6d0bc2afc	Add SSE4A MOVNTSS/MOVNTSD instructions. llvm-svn: 156281	2012-05-07 05:36:19 +00:00
Eric Christopher	bf784be9ae	Support the 'J' constraint. Patch by Jack Carter. llvm-svn: 156280	2012-05-07 03:13:42 +00:00
Eric Christopher	929ba63dcf	Add support for the 'I' inline asm constraint. Also add tests from the previous 2 patches. Patch by Jack Carter. llvm-svn: 156279	2012-05-07 03:13:32 +00:00
Benjamin Kramer	786f7671ab	Switch the select to branch transformation on by default. The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! llvm-svn: 156258	2012-05-06 14:25:16 +00:00
Benjamin Kramer	0463564612	CodeGenPrepare: Add a transform to turn selects into branches in some cases. This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0 cmovbel %edx, %esi cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic: it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-order architectures are unlikely to benefit as well; those are included in the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform. Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the microarchitecture used. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case, and to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) llvm-svn: 156234	2012-05-05 12:49:22 +00:00
Stepan Dyatkovskiy	469935e0ae	Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for case when alloca's size is calculated within the "add/sub/... nsw". Also added fix to 2011-06-13-nsw-alloca.ll test. llvm-svn: 156231	2012-05-05 07:09:40 +00:00
Justin Holewinski	4ca961430f	This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it. The new target machines are: nvptx (old ptx32) => 32-bit PTX nvptx64 (old ptx64) => 64-bit PTX The sources are based on the internal NVIDIA NVPTX back-end, and contain more functionality than the current PTX back-end currently provides. NV_CONTRIB llvm-svn: 156196	2012-05-04 20:18:50 +00:00
Sebastian Pop	2b868d474e	Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits 16-bits encoding of CMN instructions. llvm-svn: 156195	2012-05-04 19:53:56 +00:00
Craig Topper	6881f1067c	Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles. llvm-svn: 156156	2012-05-04 04:44:49 +00:00
Kevin Enderby	1ab00df6a7	Fix issues with the ARM bl and blx thumb instructions and the J1 and J2 bits for the assembler and disassembler. These were not being set/read correctly for offsets greater than 22 bits in some cases. Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles! llvm-svn: 156118	2012-05-03 22:41:56 +00:00
Nuno Lopes	2762496a1a	remove calls to calloc if the allocated memory is not used (it was already being done for malloc); fix a few typos found by Chad in my previous commit llvm-svn: 156110	2012-05-03 22:08:19 +00:00
Sirish Pande	253d16b1b0	Support for target dependent Hexagon VLIW packetizer. This patch creates and optimizes packets as per Hexagon ISA rules. llvm-svn: 156109	2012-05-03 21:52:53 +00:00
Nuno Lopes	26239aeb99	add support for calloc to objectsize lowering llvm-svn: 156102	2012-05-03 21:19:58 +00:00
Silviu Baranga	b5e46c12d6	Fixed disassembler for vstm/vldm ARM VFP instructions. llvm-svn: 156077	2012-05-03 16:38:40 +00:00
Craig Topper	52869bf5bf	Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982. llvm-svn: 156059	2012-05-03 07:12:59 +00:00
Evan Cheng	59c8d1af93	Fix two-address pass's aggressive instruction commuting heuristics. It's meant to catch cases like: %reg1024<def> = MOV r1 %reg1025<def> = MOV r0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 By commuting ADD, it lets the coalescer eliminate all of the copies. However, there was a bug in the heuristics where it ended up commuting the ADD in: %reg1024<def> = MOV r0 %reg1025<def> = MOV 0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 That provided no benefit and instead ensured the last MOV would not be coalesced. rdar://11355268 llvm-svn: 156048	2012-05-03 01:45:13 +00:00
Owen Anderson	e3d41b44cc	Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. llvm-svn: 156029	2012-05-02 22:17:40 +00:00
Owen Anderson	3cfe269707	Teach DAG combine that multiplication by 1.0 can always be constant folded. llvm-svn: 156023	2012-05-02 21:32:35 +00:00
Jim Grosbach	658b3efc30	ARM: Add missing two-operand VBIC aliases. llvm-svn: 156019	2012-05-02 21:11:56 +00:00
Manman Ren	0bdd46e32e	Revert r155853 The commit is intended to fix rdar://10961709. But it is the root cause of PR12720. Revert it for now. llvm-svn: 155992	2012-05-02 15:24:32 +00:00
Bill Wendling	4cb38868b9	The value held in the vector may be RAUW'ed by some of the canonicalization methods. Use a weak value handle to keep up with this. PR12245 llvm-svn: 155984	2012-05-02 09:59:45 +00:00
Richard Barton	249de35aaa	Disallow YIELD and other allocated nop hints in pre-ARMv6 architectures. llvm-svn: 155983	2012-05-02 09:43:18 +00:00
Craig Topper	00ccecdc84	Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter. llvm-svn: 155982	2012-05-02 08:03:44 +00:00
Bill Wendling	c520775de3	Strip the pointer casts off of allocas so that the selection DAG can find them. PR10799 llvm-svn: 155954	2012-05-01 22:50:45 +00:00
Jim Grosbach	c77cae905e	ARM: Add a few missing add->sub aliases w/ 'w' suffix. Aliases for adding a negative immediate when using an explicit 'w' suffix. E.g., adds.w r2, #-16 adds.w r2, r2, #-16 addw r2, #-16 addw r2, #-16 addw r2, r2, #-16 rdar://11330769 llvm-svn: 155946	2012-05-01 21:17:34 +00:00
Jim Grosbach	0e16cdc5e2	ARM: allow vanilla expressions for movw/movt. Expressions for movw/movt don't always have an :upper16: or :lower16: on them and that's ok. When they don't, it's just a plain [0-65535] immediate result, effectively the same as a :lower16: variant kind. rdar://10550147 llvm-svn: 155941	2012-05-01 20:43:21 +00:00
Jim Grosbach	a538b5240e	MC: Unknown assembler directives are now hard errors. Previously, an unsupported/unknown assembler directive issued a warning. That's generally unsafe, and inconsistent with the behaviour of pretty much every system assembler. Now that the MC assemblers are mature enough to be the default on multiple targets, it's reasonable to issue errors for these. For target or platform directives that need to stay warnings, we should add explicit handlers for them in, e.g., ELFAsmParser.cpp, DarwinAsmParser.cpp, et al., and issue the warning there. rdar://9246275 llvm-svn: 155926	2012-05-01 18:38:27 +00:00
Manman Ren	2a032bd8f9	X86: optimization for max-like struct This patch will optimize the following cases on X86 (a > b) ? (a-b) : 0 (a >= b) ? (a-b) : 0 (b < a) ? (a-b) : 0 (b <= a) ? (a-b) : 0 FROM movl %edi, %ecx subl %esi, %ecx cmpl %edi, %esi movl $0, %eax cmovll %ecx, %eax TO xorl %eax, %eax subl %esi, %edi cmovll %eax, %edi movl %edi, %eax rdar: 10734411 llvm-svn: 155919	2012-05-01 17:16:15 +00:00
Alexey Samsonov	246af5318a	X86: Use StackRegister instead of FrameRegister in getFrameIndexReference (to generate debug info for local variables) if stack needs realignment llvm-svn: 155917	2012-05-01 15:16:06 +00:00
Jay Foad	5af5a8548b	Regression test for PR2960. llvm-svn: 155912	2012-05-01 11:11:34 +00:00
Nick Lewycky	fd4342c2f1	An instruction in a loop is not guaranteed to be executed just because the loop has no exit blocks. Fixes PR12706! llvm-svn: 155884	2012-05-01 04:03:01 +00:00
Lang Hames	26d71c9d0a	Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. Fixes <rdar://problem/11291436>. This is a second attempt at a fix for this, the first was r155468. Thanks to Chandler, Bob and others for the feedback that helped me improve this. llvm-svn: 155866	2012-05-01 00:20:38 +00:00
Manman Ren	0a8b8b491f	X86: optimization for -(x != 0). This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi sbbl %eax,%eax notl %eax TO negl %edi sbbl %eax,%eax llvm-svn: 155853	2012-04-30 22:51:25 +00:00
Manman Ren	71aa37cdf5	test/CodeGen/X86/select.ll: remove spaces llvm-svn: 155840	2012-04-30 18:54:27 +00:00
Derek Schuff	85abcc8498	Fix fastcc structure return with fast-isel on x86-32. On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention in X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. (this time, actually commit what was reviewed!) llvm-svn: 155825	2012-04-30 16:57:15 +00:00
Bob Wilson	d2f6ff588b	Don't introduce illegal types when creating vmull operations. <rdar://11324364> ARM BUILD_VECTORs created after type legalization cannot use i8 or i16 operands, since those types are not legal. Instead use i32 operands, which will be implicitly truncated by the BUILD_VECTOR to match the element type. llvm-svn: 155824	2012-04-30 16:53:34 +00:00
Duncan Sands	8fa4166239	Just mark the sign bit as known zero, rather than any other irrelevant bits known zero in the LHS. Fixes PR12541. llvm-svn: 155818	2012-04-30 11:56:58 +00:00
Bill Wendling	5a1a6421ca	Second attempt at PR12573: Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If the pass is sure that it thinks it knows what it's doing, then it may go ahead and specify that the landing pad can have its critical edge split. The loop unswitch pass is one of these passes. It will split the critical edges of all edges coming from a loop to a landing pad not within the loop. Doing so will retain important loop analysis information, such as loop simplify. llvm-svn: 155817	2012-04-30 10:44:54 +00:00
Rafael Espindola	314a1a477a	Make sure HoistInsertPosition finds a position that is dominated by all inputs. llvm-svn: 155809	2012-04-30 03:53:06 +00:00
Andrew Trick	55623eaf5a	Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice. This time, also fix the caller of AddGlue to properly handle incomplete chains. AddGlue had failure modes, but shamefully hid them from its caller. Its luck ran out. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155749	2012-04-28 01:03:23 +00:00
Jim Grosbach	92e628a9c2	ARM: Thumb add(sp plus register) asm constraints. Make sure when parsing the Thumb1 sp+register ADD instruction that the source and destination operands match. In thumb2, just use the wide encoding if they don't. In Thumb1, issue a diagnostic. rdar://11219154 llvm-svn: 155748	2012-04-27 23:51:36 +00:00
Derek Schuff	7fe1fbbe81	Revert r155745 llvm-svn: 155746	2012-04-27 23:37:41 +00:00
Derek Schuff	80bd01f406	Fix fastcc structure return with fast-isel on x86-32 On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention in X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. llvm-svn: 155745	2012-04-27 23:27:17 +00:00
Andrew Trick	cbe7b03dbe	Temporarily revert r155668: Fix the SD scheduler to avoid gluing. This definitely caused regression with ARM -mno-thumb. llvm-svn: 155743	2012-04-27 22:55:59 +00:00
Chad Rosier	d627fcbf2a	Add x86-specific DAG combine to simplify: x == -y --> x+y == 0 x != -y --> x+y != 0 On x86, the generated code goes from negl %esi cmpl %esi, %edi je .LBB0_2 to addl %esi, %edi je .L4 This case is correctly handled for ARM with "cmn". Patch by Manman Ren. rdar://11245199 PR12545 llvm-svn: 155739	2012-04-27 22:33:25 +00:00
Evan Cheng	d337b7cc8a	Make test less fragile. llvm-svn: 155732	2012-04-27 20:48:18 +00:00
Hal Finkel	a565d03d78	Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.). Target specific types should not be vectorized. As a practical matter, these types are already register matched (at least in the x86 case), and codegen does not always work correctly (at least in the ppc case, and this is not worth fixing because ppc_fp128 is currently broken and will probably go away soon). llvm-svn: 155729	2012-04-27 19:34:00 +00:00
Lang Hames	7d83af4ed0	Fix the order of the operands in the llvm.fma intrinsic patterns for ARM, <rdar://problem/11325085>. llvm-svn: 155724	2012-04-27 18:51:24 +00:00
Dan Gohman	25a863dcf7	Reapply r155682, making constant folding more consistent, with a fix to work properly with how the code handles all-undef PHI nodes. llvm-svn: 155721	2012-04-27 17:50:22 +00:00
Richard Barton	f9237b25e6	Fix ARM assembly parsing for upper case condition codes on IT instructions. llvm-svn: 155720	2012-04-27 17:34:01 +00:00
Benjamin Kramer	15690164a3	Missed some register numbers. llvm-svn: 155706	2012-04-27 12:21:46 +00:00
Benjamin Kramer	17378d3a7f	Update edis test for r155704. llvm-svn: 155705	2012-04-27 12:14:03 +00:00
Benjamin Kramer	1380494168	X86: Don't emit conditional floating point moves when targeting pre-pentiumpro architectures. * Model FPSW (the FPU status word) as a register. * Add ISel patterns for the FUCOM, FNSTSW and SAHF instructions. During Legalize/Lowering, build a node sequence to transfer the comparison result from FPSW into EFLAGS. If you're wondering about the right-shift: That's an implicit sub-register extraction (%ax -> %ah) which is handled later on by the instruction selector. Fixes PR6679. Patch by Christoph Erhardt! llvm-svn: 155704	2012-04-27 12:07:43 +00:00
NAKAMURA Takumi	a28147f072	Revert r155682, "Use ConstantExpr::getExtractElement when constant-folding vectors" It broke stage2 build. stage1/clang sometimes crashed. llvm-svn: 155699	2012-04-27 07:59:20 +00:00
Kostya Serebryany	0cd695bd39	[tsan] Atomic support for ThreadSanitizer, patch by Dmitry Vyukov llvm-svn: 155698	2012-04-27 07:31:53 +00:00
Craig Topper	b53325d70e	Add mcpu to tests to prevent them from using AVX instructions on Sandy Bridge after r155618. llvm-svn: 155696	2012-04-27 07:11:58 +00:00
Evan Cheng	f35523d08a	Implement a bastardized ABI. llvm-svn: 155686	2012-04-27 02:11:10 +00:00
Evan Cheng	594fb11f12	- thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't support 32-bit Thumb2 instructions. - However, it does support dmb, dsb, isb, mrs, and msr. rdar://11331541 llvm-svn: 155685	2012-04-27 01:27:19 +00:00
Dan Gohman	a72b2f97a6	Use ConstantExpr::getExtractElement when constant-folding vectors instead of getAggregateElement. This has the advantage of being more consistent and allowing higher-level constant folding to proceed even if an inner extract element cannot be folded. Make ConstantFoldInstruction call ConstantFoldConstantExpression on the instruction's operands, making it more consistent with ConstantFoldConstantExpression itself. This makes sure that ConstantExprs get TargetData-aware folding before being handed off as operands for further folding. This causes more expressions to be folded, but due to a known shortcoming in constant folding, this currently has the side effect of stripping a few more nuw and inbounds flags in the non-targetdata side of constant-fold-gep.ll. This is mostly harmless. This fixes rdar://11324230. llvm-svn: 155682	2012-04-27 00:54:36 +00:00
Chad Rosier	f3d4646377	Add instcombine patterns for the following transformations: (x & y) | (x ^ y) -> x | y (x & y) + (x ^ y) -> x | y Patch by Manman Ren. rdar://10770603 llvm-svn: 155674	2012-04-26 23:29:14 +00:00
Andrew Trick	1aa00c0baa	Fix the SD scheduler to avoid gluing the same node twice. DAGCombine strangeness may result in multiple loads from the same offset. They both may try to glue themselves to another load. We could insist that the redundant loads glue themselves to each other, but the better fix is to bail out from bad gluing at the time we detect it. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155668	2012-04-26 21:48:25 +00:00
Tim Northover	876c151146	Use VLD1 in NEON extending-load patterns instead of VLDR. On some cores it's a bad idea for performance to mix VFP and NEON instructions and since these patterns are NEON anyway, the NEON load should be used. llvm-svn: 155630	2012-04-26 08:46:29 +00:00
Chandler Carruth	587c136c31	Teach the reassociate pass to fold chains of multiplies with repeated elements to minimize the number of multiplies required to compute the final result. This uses a heuristic to attempt to form near-optimal binary exponentiation-style multiply chains. While there are some cases it misses, it seems to do at least a decent job on a very diverse range of inputs. Initial benchmarks show no interesting regressions, and an 8% improvement on SPASS. Let me know if any other interesting results (in either direction) crop up! Credit to Richard Smith for the core algorithm, and helping code the patch itself. llvm-svn: 155616	2012-04-26 05:30:30 +00:00
Evan Cheng	505263cb36	Specify cpu to unbreak tests. llvm-svn: 155604	2012-04-26 01:38:10 +00:00
Evan Cheng	4d570a3f0e	If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume the feature set of v7a. This comes about if the user specifies something like -arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as uxtab in this case. rdar://11318438 llvm-svn: 155601	2012-04-26 01:13:36 +00:00
Jakob Stoklund Olesen	b35c6a6369	Try to fix llvm-arm-linux builder with -mcpu. llvm-svn: 155589	2012-04-25 21:22:33 +00:00
Preston Gurd	237e411bb4	Trivial change to make the test use -mcpu=generic so as to avoid a failure if run on an Intel Atom with post RA instruction scheduling. llvm-svn: 155587	2012-04-25 21:04:54 +00:00
Chandler Carruth	8821a79947	Actually delete now-empty file. llvm-svn: 155532	2012-04-25 02:30:00 +00:00
Lang Hames	7f69fbca29	Reverting r155468. Chris and Chandler have convinced me that it's dangerous and in poor taste. Talking through some alternate solutions with Chandler. llvm-svn: 155530	2012-04-25 02:16:54 +00:00
Akira Hatanaka	b3ecf903f1	Do not use $gp as a dedicated global register if the target ABI is not O32. llvm-svn: 155522	2012-04-25 01:24:52 +00:00
Jim Grosbach	7ac2ac85a8	ARM: improved assembler diagnostics for missing CPU features. When an instruction match is found, but the subtarget features it requires are not available (missing floating point unit, or thumb vs arm mode, for example), issue a diagnostic that identifies what the feature mismatch is. rdar://11257547 llvm-svn: 155499	2012-04-24 22:40:08 +00:00
Nadav Rotem	3c817bb807	ConstantFoldSelectInstruction swapped the operands of the select. Fix 12592. Patch by Matt Pharr. llvm-svn: 155480	2012-04-24 20:18:49 +00:00
Nadav Rotem	62a0eeb276	Fix the testcase. We do expect two vblendw on XMMs. llvm-svn: 155477	2012-04-24 19:57:38 +00:00
Nadav Rotem	d35ec2e4ad	Add a testcase for 155440 llvm-svn: 155475	2012-04-24 19:45:28 +00:00
Evan Cheng	7f9bf43bcf	MachineBasicBlock::SplitCriticalEdge() should follow LLVM IR variant and refuse to break edge to EH landing pad. rdar://11300144 llvm-svn: 155470	2012-04-24 19:06:55 +00:00
Lang Hames	08eb5f2340	Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. This fixes <rdar://problem/11291436>. llvm-svn: 155468	2012-04-24 18:58:36 +00:00
Chandler Carruth	eb9c5df516	Fix a crash on valid (if UB) bitcode that is produced for some global constants in C++11 mode. I have no idea why it required such particular circumstances to get here, the code seems clearly to rely upon unchecked assumptions. Specifically, when we decide to form an index into a struct type, we may have gone through (at least one) zero-length array indexing round, which would have left the offset un-adjusted, and thus not necessarily valid for use when indexing the struct type. This is just a canonicalization step, so the correct thing is to refuse to canonicalize nonsensical GEPs of this form. Implemented, and test case added. Fixes PR12642. Pair debugged and coded with Richard Smith. =] I credit him with most of the debugging, and preventing me from writing the wrong code. llvm-svn: 155466	2012-04-24 18:42:47 +00:00
Kevin Enderby	f954efdbea	Add missing test cases for ARM VLD3 (single 3-element structure to all lanes) instructions. llvm-svn: 155453	2012-04-24 17:45:56 +00:00
Kevin Enderby	e7378cb42d	Add missing test cases for ARM VLD4 (single 4-element structure to all lanes) instructions. llvm-svn: 155444	2012-04-24 15:55:00 +00:00
Nadav Rotem	d060c25823	AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions using the pattern (vbroadcast (i32load src)). In some cases, after we generate this pattern new users are added to the load node, which prevent the selection of the blend pattern. This commit provides fallback patterns which perform in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1). llvm-svn: 155437	2012-04-24 11:07:03 +00:00
Bill Wendling	9f736e7c65	FileCheck-ize tests. llvm-svn: 155434	2012-04-24 10:45:44 +00:00
Bill Wendling	6824095c62	FileCheck-ize these tests. llvm-svn: 155433	2012-04-24 10:36:42 +00:00
Bill Wendling	b394f59f17	FileCheck-ize these tests. Harden some of them. llvm-svn: 155432	2012-04-24 09:15:38 +00:00
Nadav Rotem	c60ef21760	Optimize the vector UINT_TO_FP, SINT_TO_FP and FP_TO_SINT operations where the integer type is i8 (commonly used in graphics). llvm-svn: 155397	2012-04-23 21:53:37 +00:00
Preston Gurd	0a730de3c3	This patch fixes a problem which arose when using the Post-RA scheduler on X86 Atom. Some of our tests failed because the tail merging part of the BranchFolding pass was creating new basic blocks which did not contain live-in information. When the anti-dependency code in the Post-RA scheduler ran, it would sometimes rename the register containing the function return value because the fact that the return value was live-in to the subsequent block had been lost. To fix this, it is necessary to run the RegisterScavenging code in the BranchFolding pass. This patch makes sure that the register scavenging code is invoked in the X86 subtarget only when post-RA scheduling is being done. Post RA scheduling in the X86 subtarget is only done for Atom. This patch adds a new function to the TargetRegisterClass to control whether or not live-ins should be preserved during branch folding. This is necessary in order for the anti-dependency optimizations done during the PostRASchedulerList pass to work properly when doing Post-RA scheduling for the X86 in general and for the Intel Atom in particular. The patch adds and invokes the new function trackLivenessAfterRegAlloc() instead of using the existing requiresRegisterScavenging(). It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of requiresRegisterScavenging(). It changes all the targets that implemented requiresRegisterScavenging() to also implement trackLivenessAfterRegAlloc(). It adds an assertion in the Post RA scheduler to make sure that post RA liveness information is available when it is needed. It changes the X86 break-anti-dependencies test to use -mcpu=atom, in order to avoid running into the added assertion. Finally, this patch restores the use of anti-dependency checking (which was turned off temporarily for the 3.1 release) for Intel Atom in the Post RA scheduler. Patch by Andy Zhang! Thanks to Jakob and Anton for their reviews. llvm-svn: 155395	2012-04-23 21:39:35 +00:00
Jim Grosbach	649ba20f1a	ARM: Add testcases for two-operand variants of VSRA/VRSRA/VSRI. llvm-svn: 155391	2012-04-23 21:00:47 +00:00
Jim Grosbach	2d1db8e4a5	Add ARM mode tests for the NEON vector shift-accumulate tests. llvm-svn: 155390	2012-04-23 21:00:44 +00:00
Jim Grosbach	41406f1b7a	Tidy up. Reformat for ease of reading. llvm-svn: 155389	2012-04-23 21:00:42 +00:00
Chandler Carruth	9460759e4f	Revert r155365, r155366, and r155367. All three of these have regression test suite failures. The failures occur at each stage, and only get worse, so I'm reverting all of them. Please resubmit these patches, one at a time, after verifying that the regression test suite passes. Never submit a patch without running the regression test suite. llvm-svn: 155372	2012-04-23 18:25:57 +00:00
Sirish Pande	9f4844f7da	Hexagon V5 (floating point) support. llvm-svn: 155367	2012-04-23 17:49:40 +00:00
Sirish Pande	4bcbe40295	Support for Hexagon architectural feature, new value jump. llvm-svn: 155366	2012-04-23 17:49:28 +00:00
Sirish Pande	2230f1957e	Support for Hexagon VLIW Packetizer. llvm-svn: 155365	2012-04-23 17:49:20 +00:00
Jakob Stoklund Olesen	6c1440cf27	Reapply r155136 after fixing PR12599. Original commit message: Defer some shl transforms to DAGCombine. The shl instruction is used to represent multiplication by a constant power of two as well as bitwise left shifts. Some InstCombine transformations would turn an shl instruction into a bit mask operation, making it difficult for later analysis passes to recognize the constant multiplication. Disable those shl transformations, deferring them to DAGCombine time. An 'shl X, C' instruction is now treated mostly the same way as 'mul X, C'. These transformations are deferred: (X >>? C) << C --> X & (-1 << C) (When X >> C has multiple uses) (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2) (When C2 > C1) (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2) (When C1 > C2) The corresponding exact transformations are preserved, just like div-exact + mul: (X >>?,exact C) << C --> X (X >>?,exact C1) << C2 --> X << (C2-C1) (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2) The disabled transformations could also prevent the instruction selector from recognizing rotate patterns in hash functions and cryptographic primitives. I have a test case for that, but it is too fragile. llvm-svn: 155362	2012-04-23 17:39:52 +00:00
Elena Demikhovsky	587ea8d3fc	cleaned line endings in the newly added test file llvm-svn: 155315	2012-04-22 13:22:48 +00:00
Chandler Carruth	a4f8aa5231	Tidy up this test more: 1) Make the checked assertions a bit more precise. We really want the canonical forms coming out of reassociate to be exactly what is expected. 2) Remove other passes, and switch the test to actually directly check that reassociate makes the important transforms and canonicalizations. 3) Fold in a related test case now that we're using FileCheck. Make the same tidying changes to it. llvm-svn: 155311	2012-04-22 10:11:26 +00:00
Chandler Carruth	038c36b06c	FileCheck-ize a test, and tidy it up a touch. llvm-svn: 155310	2012-04-22 10:11:23 +00:00
Elena Demikhovsky	35721fc4f8	ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 llvm-svn: 155309	2012-04-22 09:39:03 +00:00
Nadav Rotem	97bbbe3368	Teach getVectorTypeBreakdown about promotion of vectors in addition to widening of vectors. llvm-svn: 155296	2012-04-21 20:08:32 +00:00
Jakob Stoklund Olesen	adfc8212cf	Fix PR12599. The X86 target is editing the selection DAG while isel is selecting nodes following a topological ordering. When the DAG hacking triggers CSE, nodes can be deleted and bad things happen. llvm-svn: 155257	2012-04-20 23:36:09 +00:00
Jim Grosbach	e33d0c7063	ARM: Update NEON assembly two-operand aliases. Use the new TwoOperandAliasConstraint to handle lots of the two-operand aliases for NEON instructions. There's still more to go, but this is a good chunk of them. llvm-svn: 155210	2012-04-20 18:12:54 +00:00
Manuel Klimek	dc83827995	Removes json-bench from the test dependencies. llvm-svn: 155197	2012-04-20 13:45:49 +00:00
Jakob Stoklund Olesen	3d22f26e88	Revert r155136 "Defer some shl transforms to DAGCombine." While the patch was perfect and defect free, it exposed a really nasty bug in X86 SelectionDAG that caused an llc crash when compiling lencod. I'll put the patch back in after fixing the SelectionDAG problem. llvm-svn: 155181	2012-04-20 00:38:45 +00:00
Jim Grosbach	c935649d5c	ARM some VFP tblgen'erated two-operand aliases. llvm-svn: 155178	2012-04-20 00:15:00 +00:00

16394 Commits