llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Devang Patel	99147805e4	Remove little used statistical counter. llvm-svn: 130955	2011-05-05 22:00:08 +00:00
Rafael Espindola	a84abb2226	Implement a really simple DwarfSjLjException. llvm-svn: 130947	2011-05-05 20:48:31 +00:00
Rafael Espindola	a99111bae9	List all exception types in a switch. llvm-svn: 130944	2011-05-05 19:48:34 +00:00
Andrew Trick	cd459f78ef	ARM post RA scheduler compile time fix. BuildSchedGraph was quadratic in the number of calls in the basic block. After this fix, it keeps only a single call at the top of the DefList so compile time doesn't blow up on large blocks. This reduces postRA sched time on an external test case from 81s to 0.3s. Although r130800 (reduced ARM register alias defs) also partially fixes the issue by reducing the constant overhead of checking call interference by an order of magnitude. Fixes <rdar://problem/7662664> very poor compile time with post RA scheduling. llvm-svn: 130943	2011-05-05 19:32:21 +00:00
Andrew Trick	5946376390	whitespace llvm-svn: 130942	2011-05-05 19:24:06 +00:00
Owen Anderson	35f6bae989	Allow FastISel of three-register-operand instructions. llvm-svn: 130934	2011-05-05 17:59:04 +00:00
Devang Patel	3d0c5dc9fd	If debug info for inlined function is missing then handle it gracefully. llvm-svn: 130933	2011-05-05 17:54:26 +00:00
Jakob Stoklund Olesen	fa1e44c83e	Add some statistics to the splitting and spilling frameworks. llvm-svn: 130931	2011-05-05 17:22:53 +00:00
Eli Friedman	09ec41fcde	Avoid extra vreg copies for arguments passed in registers. Specifically, this can make MachineCSE more effective in some cases (especially in small functions). PR8361 / part of rdar://problem/8259436 . llvm-svn: 130928	2011-05-05 16:53:34 +00:00
Eli Friedman	367747bccd	Small syntax cleanup; we don't need to #define constants in C++. No functionality change intended. llvm-svn: 130926	2011-05-05 16:25:23 +00:00
Eli Friedman	9d2cd1cac4	Minor correction to r130877; fixes PR9846 and hopefully the buildbot failures. llvm-svn: 130925	2011-05-05 16:18:11 +00:00
Bill Wendling	7a13f3bde3	Remove a flag that would set the ".eh" symbol as .globl. MachO was the only one who used this flag, and it now emits CFI and doesn't emit this anymore. All other targets left this flag "false". <rdar://problem/8486371> llvm-svn: 130918	2011-05-05 06:49:15 +00:00
Jakob Stoklund Olesen	357867ae35	Disable physical register coalescing by default. Joining physregs is inherently dangerous because it uses a heuristic to avoid creating invalid code. Linear scan had an emergency spilling mechanism to deal with those rare cases. The new greedy allocator does not. The greedy register allocator is much better at taking hints, so this has almost no impact on code size and quality. The few cases where it matters show up as unit tests that now have -join-physregs enabled explicitly. llvm-svn: 130896	2011-05-04 23:59:00 +00:00
Bill Wendling	279e17e523	SjLj EH could produce a machine basic block that legitimately has more than one landing pad as its successor. SjLj exception handling jumps to the correct landing pad via a switch statement that's generated right before code-gen. Loosen the constraint in the machine instruction verifier to allow for this. Note, this isn't the most rigorous check since we cannot determine where that switch statement came from. But it's marginally better than turning this check off when SjLj exceptions are used. <rdar://problem/9187612> llvm-svn: 130881	2011-05-04 22:54:05 +00:00
Eli Friedman	5b78092546	Re-commit r130862 with a minor change to avoid an iterator running off the edge in some cases. Original message: Teach MachineCSE how to do simple cross-block CSE involving physregs. This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 . llvm-svn: 130877	2011-05-04 22:10:36 +00:00
Eli Friedman	cc74616be6	Back out r130862; it appears to be breaking bootstrap. llvm-svn: 130867	2011-05-04 20:48:42 +00:00
Eli Friedman	e086e00208	Teach MachineCSE how to do simple cross-block CSE involving physregs. This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 . llvm-svn: 130862	2011-05-04 19:54:24 +00:00
Rafael Espindola	8346c59e6a	Producing a DW_FORM_addr for DW_AT_stmt_list is probably correct, but it is both inefficient and unexpected by dwarfdump. Change to a DW_FORM_data4. While in here, change the predicate name to reflect that the position is not really absolute (it is an offset), just that the linker needs a relocation. llvm-svn: 130846	2011-05-04 17:44:06 +00:00
Jakob Stoklund Olesen	372f33d560	Rename -disable-physical-join to -join-physregs and invert it. Physreg joining is still on by default, but I will turn it off shortly. llvm-svn: 130844	2011-05-04 16:45:05 +00:00
Devang Patel	a6e69a5541	Tighten up check for empty (i.e. no meaningful debug info) module. This fixes dwarf-die2.c test case from gcc test suite. llvm-svn: 130842	2011-05-04 16:34:02 +00:00
Devang Patel	3506ad5c7f	Even if the subprogram is going to use AT_specification, emit DW_AT_MIPS_linkage_name. This helps gdb and fixes var-path-expr.exp regression reported by gdb testsuite. llvm-svn: 130794	2011-05-03 21:50:34 +00:00
Jakob Stoklund Olesen	9884b02bef	Gracefully handle invalid live ranges. Fix PR9831. Register coalescing can sometimes create live ranges that end in the middle of a basic block without any killing instruction. When SplitKit detects this, it will repair the live range by shrinking it to its uses. Live range splitting also needs to know about this. When the range shrinks so much that it becomes allocatable, live range splitting fails because it can't find a good split point. It is paranoid about making progress, so an allocatable range is considered an error. The coalescer should really not be creating these bad live ranges. They appear when coalescing dead copies. llvm-svn: 130787	2011-05-03 20:42:13 +00:00
Devang Patel	2d8861a690	If the front end has emitted llvm.dbg.cu and other debug info anchors (clang does it now) then use them directly. This saves one scan of entire module, to collect debug info, which in turns saves few machine cycles at compile time. llvm-svn: 130759	2011-05-03 16:45:22 +00:00
Owen Anderson	285b2ec92d	Other parts of the SelectionDAG framework assume that targets use their pointer type for vector indices. Make the vector unrolling code respect that. llvm-svn: 130733	2011-05-02 22:25:45 +00:00
Jakob Stoklund Olesen	86f7ef57c2	Handle <def,undef> in the second loop as well. llvm-svn: 130718	2011-05-02 20:36:53 +00:00
Jakob Stoklund Olesen	4ad266b02c	Use the PrintReg adaptor to correctly print live-in registers in debug output. llvm-svn: 130715	2011-05-02 20:06:30 +00:00
Jakob Stoklund Olesen	3c707fcb60	Only ignore <undef> use operands, keep the <def,undef> ops. Def operands may also have an <undef> flag, but that just means that a sub-register redef doesn't actually read the super-register. For physical registers, it has no meaning. llvm-svn: 130714	2011-05-02 20:06:28 +00:00
Devang Patel	7d7ac5512a	Emit debug info for global variables first. This works around a limitation in gdb which is reported by following inherit.exp test failures from gdb testsuite. gdb.cp/inherit.exp: print g_vB.vB::vb gdb.cp/inherit.exp: print g_vB.vB::vx gdb.cp/inherit.exp: print g_vC.vC::vc gdb.cp/inherit.exp: print g_vC.vC::vx gdb.cp/inherit.exp: print g_vD.vB::vb ... llvm-svn: 130702	2011-05-02 18:19:17 +00:00
Rafael Espindola	8ef7012675	Only produce the eh_frame section if we have at least one personality function. llvm-svn: 130692	2011-05-02 15:49:52 +00:00
Jakob Stoklund Olesen	281067b1b8	Minimize the slot indexes spanned by register ranges created when splitting. When an interfering live range ends at a dead slot index between two instructions, make sure that the inserted copy instruction gets a slot index after the dead ones. This makes it possible to avoid the interference. Ideally, there shouldn't be interference ending at a deleted instruction, but physical register coalescing can sometimes do that to sub-registers. This fixes PR9823. llvm-svn: 130687	2011-05-02 05:29:58 +00:00
Rafael Espindola	eb5d0cb4f4	GCC uses a different encoding of pointers in the FDE when using -fno-dwarf2-cfi-asm. Implement the same behavior. llvm-svn: 130637	2011-05-01 04:49:54 +00:00
Jakob Stoklund Olesen	ffe1dbc840	When a physreg is live-in and live through a basic block, make sure its live range covers the entire block. The live range can't be terminated at a random instruction. llvm-svn: 130619	2011-04-30 19:12:33 +00:00
Jakob Stoklund Olesen	4ec9e1c33a	Avoid using stale entries form the sibling value map. This could happen when trying to use a value that had been eliminated after dead code elimination and folding loads. llvm-svn: 130597	2011-04-30 06:42:21 +00:00
Jakob Stoklund Olesen	8c578bf1cf	Use hysteresis for local live range splitting as well. llvm-svn: 130596	2011-04-30 05:07:46 +00:00
Rafael Espindola	7901d3790e	Add all the plumbing needed for MC to expand cfi to the old tables in the final assembly. It is the same technique used when targeting assemblers that don't support .loc. llvm-svn: 130587	2011-04-30 03:44:37 +00:00
Jakob Stoklund Olesen	720ec44d13	Update comment. llvm-svn: 130582	2011-04-30 03:13:08 +00:00
Jakob Stoklund Olesen	a397ddfd59	Use a greedy algorithm for allocating registers. llvm-svn: 130568	2011-04-30 01:37:54 +00:00
Bill Wendling	0a7857b4fa	Print out the 'nontemporal' info on a store. llvm-svn: 130562	2011-04-29 23:45:22 +00:00
Eli Friedman	919bf1ca71	Make FastEmit_ri_ try a bit harder to succeed for supported operations; FastEmit_i can fail for non-Thumb2 ARM. Makes ARMSimplifyAddress work correctly, and reduces the number of fast-isel bailouts on non-Thumb ARM. llvm-svn: 130560	2011-04-29 23:34:52 +00:00
Devang Patel	acce3eb6b2	Hoist MCLineEntry construction AsmPrinter so that anyone who derives from AsmPrinter can have line number entries. PR 9810 llvm-svn: 130518	2011-04-29 18:00:54 +00:00
Rafael Espindola	7c632e0001	The last hack for producing bit identical output with cfi on OS X. llvm-svn: 130504	2011-04-29 15:09:53 +00:00
Rafael Espindola	16455286cb	Change DwarfCFIException's member variables to track what it actually emmits: .cfi_personality, .cfi_lsda and the moves. llvm-svn: 130503	2011-04-29 14:48:51 +00:00
Rafael Espindola	16b23d9ff7	Factor some code to needsCFIMoves. Avoid printing moves when we don't have to. llvm-svn: 130501	2011-04-29 14:14:06 +00:00
Devang Patel	900ceb725b	Teach dwarf writer to handle complex address expression for .debug_loc entries. This fixes clang generated blocks' variables' debug info. Radar 9279956. llvm-svn: 130373	2011-04-28 02:22:40 +00:00
Eli Friedman	86181251f3	Fix a silly mistake in r130338. llvm-svn: 130360	2011-04-28 00:42:03 +00:00
Rafael Espindola	36e419b524	Remove unnecessary argument. llvm-svn: 130343	2011-04-27 23:17:57 +00:00
Rafael Espindola	0525497a16	Rename getPersonalityPICSymbol to getCFIPersonalitySymbol, document it, and give it a bit more responsibility. Also implement it for MachO. If hacked to use cfi, 32 bit MachO will produce .cfi_personality 155, L___gxx_personality_v0$non_lazy_ptr and 64 bit will produce .cfi_presonality ___gxx_personality_v0 The general idea is that .cfi_personality gets passed the final symbol. It is up to codegen to produce it if using indirect representation (like 32 bit MachO), but it is up to MC to decide which relocations to create. llvm-svn: 130341	2011-04-27 23:08:15 +00:00
Devang Patel	42736cb058	Simplify handling of variables with complex address (i.e. blocks variables) llvm-svn: 130339	2011-04-27 22:45:24 +00:00
Eli Friedman	c5406cdb50	Make the fast-isel code for literal 0.0 a bit shorter/faster, since 0.0 is common. rdar://problem/9303592 . llvm-svn: 130338	2011-04-27 22:41:55 +00:00
Eli Friedman	fc1152d772	Remove unused function. llvm-svn: 130337	2011-04-27 22:21:02 +00:00
Rafael Espindola	6bf5acb38d	Fix indentation. llvm-svn: 130331	2011-04-27 21:29:52 +00:00
Devang Patel	42f4a7ff92	Revert r130178. It turned out to be not the optimal path to emit complex location expressions. llvm-svn: 130326	2011-04-27 20:29:27 +00:00
Evan Cheng	fa34d31aa4	If converter was being too cute. It look for root BBs (which don't have successors) and use inverse depth first search to traverse the BBs. However that doesn't work when the CFG has infinite loops. Simply do a linear traversal of all BBs work just fine. rdar://9344645 llvm-svn: 130324	2011-04-27 19:32:43 +00:00
Jakob Stoklund Olesen	adb564f3cd	Also add <imp-def> operands for defined and dead super-registers when rewriting. We cannot rely on the <imp-def> operands added by LiveIntervals in all cases as demonstrated by the test case. llvm-svn: 130313	2011-04-27 17:42:31 +00:00
Jakob Stoklund Olesen	2fa051f068	Add a safe-guard against repeated splitting for some rare cases. The number of blocks covered by a live range must be strictly decreasing when splitting, otherwise we can't allow repeated splitting. llvm-svn: 130249	2011-04-26 22:33:12 +00:00
Evan Cheng	dea3347167	Be careful about scheduling nodes above previous calls. It increase usages of more callee-saved registers and introduce copies. Only allows it if scheduling a node above calls would end up lessen register pressure. Call operands also has added ABI restrictions for register allocation, so be extra careful with hoisting them above calls. rdar://9329627 llvm-svn: 130245	2011-04-26 21:31:35 +00:00
Rafael Espindola	e238ffe4ba	Print the label if we will use it in debug_frame. llvm-svn: 130232	2011-04-26 19:26:53 +00:00
Devang Patel	09b1585aac	Refactor code. Keep dwarf register operation selection logic at one place. llvm-svn: 130231	2011-04-26 19:06:18 +00:00
Jakob Stoklund Olesen	c9cf507d93	Use the new TRI->getLargestLegalSuperClass hook to constrain register class inflation. This has two effects: 1. We never inflate to a larger register class than what the sub-target can handle. 2. Completely unconstrained virtual registers get the largest possible register class. llvm-svn: 130229	2011-04-26 18:52:36 +00:00
Dan Gohman	fbb7ade7ae	Fast-isel support for simple inline asms. llvm-svn: 130205	2011-04-26 17:18:34 +00:00
Chris Lattner	37fec9f729	don't emit the symbol name twice for local bss and common symbols. For example, don't emit: .comm _i,4,2 ## @i ## @i instead emit: .comm _i,4,2 ## @i llvm-svn: 130192	2011-04-26 06:14:13 +00:00
Evan Cheng	73a9ae3388	Fix typo llvm-svn: 130190	2011-04-26 04:57:37 +00:00
Rafael Espindola	59c3a084c6	Print all the moves at a given label instead of just the first one. Remove previous DwarfCFI hack. llvm-svn: 130187	2011-04-26 03:58:56 +00:00
Devang Patel	4969322bc4	Let dwarf writer allocate extra space in the debug location expression. This space, if requested, will be used for complex addresses of the Blocks' variables. llvm-svn: 130178	2011-04-26 00:12:46 +00:00
Devang Patel	3da97b7d34	Rename a local variable. llvm-svn: 130171	2011-04-25 23:05:21 +00:00
Devang Patel	e28211b031	Rename a method to match what it really does. s/addVariableAddress/addFrameVariableAddress/g llvm-svn: 130170	2011-04-25 23:02:17 +00:00
Devang Patel	b1b33d6569	Do not drop a variable's complex address if it is not based on frame base. Observed this while reading code, so I do not have a test case handy here. llvm-svn: 130167	2011-04-25 22:52:55 +00:00
Devang Patel	83eac5e134	A dbg.declare may not be in entry block, even if it is referring to an incoming argument. However, It is appropriate to emit DBG_VALUE referring to this incoming argument in entry block in MachineFunction. llvm-svn: 130129	2011-04-25 16:33:52 +00:00
Rafael Espindola	a14f5303dd	Simplify the logic. Noticed by aKor. llvm-svn: 130116	2011-04-24 19:55:34 +00:00
Rafael Espindola	8c824c73b6	Synchronize the conditions for producing a .cfi_startproc and a .cfi_endproc. Fixes PR9787. llvm-svn: 130115	2011-04-24 19:00:34 +00:00
Sebastian Redl	5fea40be23	Give SplitKit.h a header guard. llvm-svn: 130095	2011-04-24 15:46:51 +00:00
Jay Foad	c146569beb	Remove unused STL header includes. llvm-svn: 130068	2011-04-23 19:53:52 +00:00
Owen Anderson	e1b33b92a3	Teach FastISel to deal with instructions that have two immediate operands. llvm-svn: 130033	2011-04-22 23:38:06 +00:00
Devang Patel	929bbb6bf9	Let front-end tie subprogram declaration with subprogram definition directly. llvm-svn: 130028	2011-04-22 23:10:17 +00:00
Jakob Stoklund Olesen	0dcf650f0a	Always compare the cost of region splitting with the cost of per-block splitting. Sometimes it is better to split per block, and we missed those cases. llvm-svn: 130025	2011-04-22 22:47:40 +00:00
Chris Lattner	d9c0db9bd7	Recommit the fix for rdar://9289512 with a couple tweaks to fix bugs exposed by the gcc dejagnu testsuite: 1. The load may actually be used by a dead instruction, which would cause an assert. 2. The load may not be used by the current chain of instructions, and we could move it past a side-effecting instruction. Change how we process uses to define the problem away. llvm-svn: 130018	2011-04-22 21:59:37 +00:00
Benjamin Kramer	f6eab5f86e	DAGCombine: fold "(zext x) == C" into "x == (trunc C)" if the trunc is lossless. On x86 this allows to fold a load into the cmp, greatly reducing register pressure. movzbl (%rdi), %eax cmpl $47, %eax -> cmpb $47, (%rdi) This shaves 8k off gcc.o on i386. I'll leave applying the patch in README.txt to Chris :) llvm-svn: 130005	2011-04-22 18:47:44 +00:00
Devang Patel	6aac901f77	Do not leak argument's DbgVariables. llvm-svn: 130004	2011-04-22 18:09:57 +00:00
Evan Cheng	1f353e2438	Typo llvm-svn: 129970	2011-04-22 01:40:20 +00:00
Bill Wendling	b0df282414	Branch folding is folding a landing pad into a regular BB. An exception is thrown via a call to _cxa_throw, which we don't expect to return. Therefore, the "true" part of the invoke goes to a BB that has 'unreachable' as its only instruction. This is lowered into an empty MachineBB. The landing pad for this invoke, however, is directly after the "true" MBB. When the empty MBB is removed, the landing pad is directly below the BB with the invoke call. The unconditional branch is removed and then the two blocks are merged together. The testcase is too big for a regression test. <rdar://problem/9305728> llvm-svn: 129965	2011-04-22 01:07:09 +00:00
Devang Patel	4f25432e4e	Refactor. llvm-svn: 129938	2011-04-21 21:07:35 +00:00
Matt Beaumont-Gay	eb07568f1b	Don't recycle loop variables. llvm-svn: 129928	2011-04-21 19:46:23 +00:00
Jakob Stoklund Olesen	5053b8795b	Allow allocatable ranges from global live range splitting to be split again. These intervals are allocatable immediately after splitting, but they may be evicted because of later splitting. This is rare, but when it happens they should be split again. The remainder intervals that cannot be allocated after splitting still move directly to spilling. SplitEditor::finish can optionally provide a mapping from new live intervals back to the original interval indexes returned by openIntv(). Each original interval index can map to multiple new intervals after connected components have been separated. Dead code elimination may also add existing intervals to the list. The reverse mapping allows the SplitEditor client to treat the new intervals differently depending on the split region they came from. llvm-svn: 129925	2011-04-21 18:38:15 +00:00
Devang Patel	a31b73427e	Add comment in output stream. llvm-svn: 129921	2011-04-21 17:50:24 +00:00
Daniel Dunbar	3a96439b36	Revert r1296656, "Fix rdar://9289512 - not folding load into compare at -O0...", which broke a couple GCC test suite tests at -O0. llvm-svn: 129914	2011-04-21 16:14:46 +00:00
Jakob Stoklund Olesen	22089a66fb	Add debug output for rematerializable instructions. llvm-svn: 129883	2011-04-20 22:14:20 +00:00
Jakob Stoklund Olesen	266de7e1e1	Permit remat when a virtual register has multiple defs. TII::isTriviallyReMaterializable() shouldn't depend on any properties of the register being defined by the instruction. Rematerialization is going to create a new virtual register anyway. llvm-svn: 129882	2011-04-20 22:14:17 +00:00
Jakob Stoklund Olesen	6501ea2557	Prefer cheap registers for busy live ranges. On the x86-64 and thumb2 targets, some registers are more expensive to encode than others in the same register class. Add a CostPerUse field to the TableGen register description, and make it available from TRI->getCostPerUse. This represents the cost of a REX prefix or a 32-bit instruction encoding required by choosing a high register. Teach the greedy register allocator to prefer cheap registers for busy live ranges (as indicated by spill weight). llvm-svn: 129864	2011-04-20 18:19:48 +00:00
Stuart Hastings	a552942e02	ARM byval support. Will be enabled by another patch to the FE. <rdar://problem/7662569> llvm-svn: 129858	2011-04-20 16:47:52 +00:00
Rafael Espindola	09e9797728	Remove unused arguments. llvm-svn: 129844	2011-04-20 03:08:09 +00:00
Eric Christopher	4c3c7c8211	Rewrite the expander for umulo/smulo to remember to sign extend the input manually and pass all (now) 4 arguments to the mul libcall. Add a new ExpandLibCall for just this (copied gratuitously from type legalization). Fixes rdar://9292577 llvm-svn: 129842	2011-04-20 01:19:45 +00:00
Daniel Dunbar	82a4062a4e	ADT/Triple: Renambe isOSX... methods to isMacOSX for consistency with the OS triple component. llvm-svn: 129838	2011-04-20 00:14:25 +00:00
Daniel Dunbar	140e365c49	CodeGen: Eliminate a use of getDarwinMajorNumber(). - There is a minor semantic change here (evidenced by the test change) for Darwin triples that have no version component. I debated changing the default behavior of isOSVersionLT, but decided it made more sense for triples to be explicit. llvm-svn: 129802	2011-04-19 20:32:39 +00:00
Stuart Hastings	89cb281cf8	Delete unnecessary variable. <rdar://problem/7662569> llvm-svn: 129796	2011-04-19 20:09:38 +00:00
Bob Wilson	886994b683	Avoid write-after-write issue hazards for Cortex-A9. Add a avoidWriteAfterWrite() target hook to identify register classes that suffer from write-after-write hazards. For those register classes, try to avoid writing the same register in two consecutive instructions. This is currently disabled by default. We should not spill to avoid hazards! The command line flag -avoid-waw-hazard can be used to enable waw avoidance. llvm-svn: 129772	2011-04-19 18:11:45 +00:00
Jakob Stoklund Olesen	dceb96c62d	Force the greedy register allocator to be linked alongside linear scan. This means that the new register allocator can be used with 'clang -mllvm -regalloc=greedy'. llvm-svn: 129764	2011-04-19 17:17:58 +00:00
Eli Friedman	bbf7d2ac38	SelectBasicBlock is rather slow even when it doesn't do anything; skip the unnecessary work where possible. llvm-svn: 129763	2011-04-19 17:01:08 +00:00
Stuart Hastings	f838ea4959	Support nested CALLSEQ_BEGIN/END; necessary for ARM byval support. <rdar://problem/7662569> llvm-svn: 129761	2011-04-19 16:16:58 +00:00
Chris Lattner	f15db6c86f	Implement support for x86 fastisel of small fixed-sized memcpys, which are generated en-mass for C++ PODs. On my c++ test file, this cuts the fast isel rejects by 10x and shrinks the generated .s file by 5% llvm-svn: 129755	2011-04-19 05:52:03 +00:00
Eli Friedman	b306371396	Simplify declarations slightly by using typedefs. llvm-svn: 129720	2011-04-18 21:21:37 +00:00
Devang Patel	7220c1a021	Reduce clutter in asm output. Do not emit source location as comment for each instruction. llvm-svn: 129715	2011-04-18 20:26:49 +00:00
Jakob Stoklund Olesen	c2f25578a4	Handle spilling around an instruction that has an early-clobber re-definition of the spilled register. This is quite common on ARM now that some stores have early-clobber defines. llvm-svn: 129714	2011-04-18 20:23:27 +00:00
Eric Christopher	e1103d0a86	Fix a bug where we were counting the alias sets as completely used registers for fast allocation a different way. This has us updating used registers only when we're using that exact register. Fixes rdar://9207598 llvm-svn: 129711	2011-04-18 19:26:25 +00:00
Chris Lattner	f8f4d3c30a	while we're at it, handle 'sdiv exact' of a power of 2 also, this fixes a few rejects on c++ iterator loops. llvm-svn: 129694	2011-04-18 07:00:40 +00:00
Chris Lattner	dd2f1ec77c	fix rdar://9297011 - udiv by power of two causing fast-isel rejects llvm-svn: 129693	2011-04-18 06:55:51 +00:00
Chris Lattner	28eaf6be7f	1. merge fast-isel-shift-imm.ll into fast-isel-x86-64.ll 2. implement rdar://9289501 - fast isel should fold trivial multiplies to shifts 3. teach tblgen to handle shift immediates that are different sizes than the shifted operands, eliminating some code from the X86 fast isel backend. 4. Have FastISel::SelectBinaryOp use (the poorly named) FastEmit_ri_ function instead of FastEmit_ri to simplify code. llvm-svn: 129666	2011-04-17 20:23:29 +00:00
Chris Lattner	f9d9976374	fix an oversight which caused us to compile the testcase (and other less trivial things) into a dummy lea. Before we generated: _test: ## @test movq _G@GOTPCREL(%rip), %rax leaq (%rax), %rax ret now we produce: _test: ## @test movq _G@GOTPCREL(%rip), %rax ret This is part of rdar://9289558 llvm-svn: 129662	2011-04-17 17:12:08 +00:00
Chris Lattner	5e00f501ff	Fix rdar://9289512 - not folding load into compare at -O0 The basic issue here is that bottom-up isel is matching the branch and compare, and was failing to fold the load into the branch/compare combo. Fixing this (by allowing folding into any instruction of a sequence that is selected) allows us to produce things like: cmpb $0, 52(%rax) je LBB4_2 instead of: movb 52(%rax), %cl cmpb $0, %cl je LBB4_2 This makes the generated -O0 code run a bit faster, but also speeds up compile time by putting less pressure on the register allocator and generating less code. This was one of the biggest classes of missing load folding. Implementing this shrinks 176.gcc's c-decl.s (as a random example) by about 4% in (verbose-asm) line count. llvm-svn: 129656	2011-04-17 06:35:44 +00:00
Chris Lattner	1fe5f78b7e	split a complex predicate out to a helper function. Simplify two for loops, which don't need to check for falling off the end of a block and end of phi nodes, since terminators are never phis. llvm-svn: 129655	2011-04-17 06:03:19 +00:00
Chris Lattner	cb194276e0	fix rdar://9289583 - fast isel should handle non-canonical commutative binops allowing us to fold the immediate into the 'and' in this case: int test1(int i) { return 8&i; } llvm-svn: 129653	2011-04-17 01:16:47 +00:00
Eli Friedman	2798137293	PR9055: extend the fix to PR4050 (r70179) to apply to zext and anyext. Returning a new node makes the code try to replace the old node, which in the included testcase is killed by CSE. llvm-svn: 129650	2011-04-16 23:25:34 +00:00
Francois Pichet	1cc1375d03	Unbreak the MSVC 2010 build. For further information on this particular issue see: http://connect.microsoft.com/VisualStudio/feedback/details/520043/error-converting-from-null-to-a-pointer-type-in-std-pair llvm-svn: 129642	2011-04-16 14:20:39 +00:00
Benjamin Kramer	0b3416e2f5	Remove unused variable. llvm-svn: 129639	2011-04-16 10:30:47 +00:00
Rafael Espindola	9e5aaa3b78	Put each personality function in a section. This fixes the gnu ld warning: error in foo.o; no .eh_frame_hdr table will be created. llvm-svn: 129635	2011-04-16 03:51:21 +00:00
Evan Cheng	b720f37282	Fix divmod libcall lowering. Convert to {S\|U}DIVREM first and then expand the node to a libcall. rdar://9280991 llvm-svn: 129633	2011-04-16 03:08:26 +00:00
Devang Patel	eddab1d186	Introduce support to encode Objective-C property information in debugging information generated for an interface. llvm-svn: 129624	2011-04-16 00:11:51 +00:00
Rafael Espindola	694ad2f25c	Some refactoring suggested by Anton Korobeynikov. llvm-svn: 129600	2011-04-15 20:32:03 +00:00
Jakob Stoklund Olesen	bdd6204582	Teach the SplitKit blitter to handle multiply defined values as well. The transferValues() function can now handle both singly and multiply defined values, as long as the resulting live range is known. Only rematerialized values have their live range recomputed by extendRange(). The updateSSA() function can now insert PHI values in bulk across multiple values in multiple target registers in one pass. The list of blocks received from transferValues() is in layout order which seems to work well for the iterative algorithm. Blocks from extendRange() are still in reverse BFS order, but this function is used so rarely now that it doesn't matter. llvm-svn: 129580	2011-04-15 17:24:49 +00:00
Jakob Stoklund Olesen	ea8581b792	Remember to set flag. llvm-svn: 129579	2011-04-15 17:24:46 +00:00
Rafael Espindola	99831068c8	Add 129518 back with a fix for when we are producing eh just because of debug info. Change ELF systems to use CFI for producing the EH tables. This reduces the size of the clang binary in Debug builds from 690MB to 679MB. llvm-svn: 129571	2011-04-15 15:11:06 +00:00
Chris Lattner	0304b82f80	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
NAKAMURA Takumi	7aed456653	Revert r129518, "Change ELF systems to use CFI for producing the EH tables. This reduces the" It broke several builds. llvm-svn: 129557	2011-04-15 03:35:57 +00:00
Owen Anderson	0ce6c0f86e	Fix another instance of the DAG combiner not using the correct type for the RHS of a shift. llvm-svn: 129522	2011-04-14 17:30:49 +00:00
Rafael Espindola	d5eed657e2	Change ELF systems to use CFI for producing the EH tables. This reduces the size of the clang binary in Debug builds from 690MB to 679MB. llvm-svn: 129518	2011-04-14 15:18:53 +00:00
Andrew Trick	e89c19ab7b	In the pre-RA scheduler, maintain cmp+br proximity. This is done by pushing physical register definitions close to their use, which happens to handle flag definitions if they're not glued to the branch. This seems to be generally a good thing though, so I didn't need to add a target hook yet. The primary motivation is to generate code closer to what people expect and rule out missed opportunity from enabling macro-op fusion. As a side benefit, we get several 2-5% gains on x86 benchmarks. There is one regression: SingleSource/Benchmarks/Shootout/lists slows down be -10%. But this is an independent scheduler bug that will be tracked separately. See rdar://problem/9283108. Incidentally, pre-RA scheduling is only half the solution. Fixing the later passes is tracked by: <rdar://problem/8932804> [pre-RA-sched] on x86, attempt to schedule CMP/TEST adjacent with condition jump Fixes: <rdar://problem/9262453> Scheduler unnecessary break of cmp/jump fusion llvm-svn: 129508	2011-04-14 05:15:06 +00:00
Chris Lattner	d4ba43dc76	sink a call into its only use. llvm-svn: 129503	2011-04-14 04:12:47 +00:00
Owen Anderson	d98929ed6c	During post-legalization DAG combining, be careful to only create shifts where the RHS is of the legal type for the new operation. llvm-svn: 129484	2011-04-13 23:22:23 +00:00
Devang Patel	43cbfe2ba7	Remove extra bytes that were added for gdb. We do not have good poiner to understand actual reason behind this fixme. Spot checking suggest that newer gdb does not need this. llvm-svn: 129461	2011-04-13 19:41:17 +00:00
Jakob Stoklund Olesen	d7db076abc	Stop using dead function. llvm-svn: 129442	2011-04-13 15:00:11 +00:00
Andrew Trick	916e01c917	Recommit r129383. PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency. Additional fixes: Do something reasonable for subtargets with generic itineraries by handle node latency the same as for an empty itinerary. Now nodes default to unit latency unless an itinerary explicitly specifies a zero cycle stage or it is a TokenFactor chain. Original fixes: UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make the ndoe latency adjustments work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129421	2011-04-13 00:38:32 +00:00
Eric Christopher	147cad907a	Temporarily revert r129408 to see if it brings the bots back. llvm-svn: 129417	2011-04-13 00:20:59 +00:00
Eric Christopher	c72bd6024f	Fix a bug where we were counting the alias sets as completely used registers for fast allocation. Fixes rdar://9207598 llvm-svn: 129408	2011-04-12 23:23:14 +00:00
Devang Patel	9cceebfde4	I missed this new file in previous commit. llvm-svn: 129407	2011-04-12 23:21:44 +00:00
Devang Patel	5f8111e1ca	Simplify. There is no need to use static variable. llvm-svn: 129406	2011-04-12 23:10:47 +00:00
Devang Patel	f078958e43	Do not reuse parameter name. llvm-svn: 129405	2011-04-12 23:09:06 +00:00
Devang Patel	f288e23b3f	This mechanical patch moves type handling into CompileUnit from DwarfDebug. In case of multiple compile unit in one object file, each compile unit is responsible for its own set of type entries anyway. This refactoring makes this obvious. llvm-svn: 129402	2011-04-12 22:53:02 +00:00
Eric Christopher	553418ccd4	Add more comments... err debug statements to the fast allocator. llvm-svn: 129400	2011-04-12 22:17:44 +00:00
Jakob Stoklund Olesen	7f28263ab0	SparseBitVector is SLOW. Use a Bitvector instead, we didn't need the smaller memory footprint anyway. This makes the greedy register allocator 10% faster. llvm-svn: 129390	2011-04-12 21:30:53 +00:00
Andrew Trick	d83e7b6a5d	Revert 129383. It causes some targets to hit a scheduler assert. llvm-svn: 129385	2011-04-12 20:14:07 +00:00
Andrew Trick	1e0821075d	PreRA scheduler heuristic fixes: VRegCycle, TokenFactor latency. UnitsSharePred was a source of randomness in the scheduler: node priority depended on the queue data structure. I rewrote the recent VRegCycle heuristics to completely replace the old heuristic without any randomness. To make these heuristic adjustments to node latency work, I also needed to do something a little more reasonable with TokenFactor. I gave it zero latency to its consumers and always schedule it as low as possible. llvm-svn: 129383	2011-04-12 19:54:36 +00:00
Jakob Stoklund Olesen	1db776a52e	Create new intervals for isolated blocks during region splitting. This merges the behavior of splitSingleBlocks into splitAroundRegion, so the RS_Region and RS_Block register stages can be coalesced. That means the leftover intervals after region splitting go directly to spilling instead of a second pass of per-block splitting. llvm-svn: 129379	2011-04-12 19:32:53 +00:00
Jakob Stoklund Olesen	33a5706748	Add SplitKit API to query and select the current interval being worked on. This makes it possible to target multiple registers in one pass. llvm-svn: 129374	2011-04-12 18:11:31 +00:00
Jakob Stoklund Olesen	1b4a5fa3e4	Fix a bug in RegAllocBase::addMBBLiveIns() where a basic block could accidentally be skipped. llvm-svn: 129373	2011-04-12 18:11:28 +00:00
Devang Patel	c115961589	Remove dead typedef. llvm-svn: 129368	2011-04-12 17:43:12 +00:00
Devang Patel	6c1785d527	Refactor CompileUnit into a separate header. llvm-svn: 129367	2011-04-12 17:40:32 +00:00
Eric Christopher	72a09952de	Fix typo. llvm-svn: 129334	2011-04-12 00:48:08 +00:00
Jakob Stoklund Olesen	ea0a2b637b	Reuse live interval union between functions. This saves a bit of compile time when compiling many small functions. llvm-svn: 129321	2011-04-11 23:57:14 +00:00
Nick Lewycky	75e67d4dc2	Just because a GlobalVariable's initializer is [N x { i32, void ()* }] doesn't mean that it has to be ConstantArray of ConstantStruct. We might have ConstantAggregateZero, at either level, so don't crash on that. Also, semi-deprecate the sentinal value. The linker isn't aware of sentinals so we end up with the two lists appended, each with their "sentinals" on them. Different parts of LLVM treated sentinals differently, so make them all just ignore the single entry and continue on with the rest of the list. llvm-svn: 129307	2011-04-11 22:11:20 +00:00
Jakob Stoklund Olesen	7796876061	Speed up eviction by stopping collectInterferingVRegs as soon as the spill weight limit has been exceeded. llvm-svn: 129305	2011-04-11 21:47:01 +00:00
Bill Wendling	966775ce8a	The default of the dispatch switch statement was to branch to a BB that executed the 'unwind' instruction. However, later on that instruction was converted into a jump to the basic block it was located in, causing an infinite loop when we get there. It turns out, we get there if the _Unwind_Resume_or_Rethrow call returns (which it's not supposed to do). It returns if it cannot find a place to unwind to. Thus we would get what appears to be a "hang" when in reality it's just that the EH couldn't be propagated further along. Instead of infinitely looping (or calling `unwind', which none of our back-ends support (it's lowered into nothing...)), call the @llvm.trap() intrinsic instead. This may not conform to specific rules of a particular language, but it's rather better than infinitely looping. <rdar://problem/9175843&9233582> llvm-svn: 129302	2011-04-11 21:32:34 +00:00

1 2 3 4 5 ...

11831 Commits