llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00

Author	SHA1	Message	Date
Benjamin Kramer	f4041e8936	SDAGBuilder: Don't create an invalid iterator when there is only one switch case. Found by libstdc++'s debug mode. llvm-svn: 157522	2012-05-26 21:19:12 +00:00
Benjamin Kramer	07f18e78ba	SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights. SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move the most likely condition to the front so it is checked first and the others can be skipped. This is currently not as effective as it could be because SimplifyCFG destroys profiling metadata when merging branches and switches. Merging branch weight metadata is tricky though. This code touches at most 3 cases so I didn't use a proper sorting algorithm. llvm-svn: 157521	2012-05-26 20:01:32 +00:00
Benjamin Kramer	35427d9cab	ScoreboardHazardRecognizer: Remove dead conditional in debug code. Negative cycles are filtered out earlier. llvm-svn: 157514	2012-05-26 11:37:37 +00:00
Justin Holewinski	77c4679dae	Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall to pass around a struct instead of a large set of individual values. This cleans up the interface and allows more information to be added to the struct for future targets without requiring changes to each and every target. NV_CONTRIB llvm-svn: 157479	2012-05-25 16:35:28 +00:00
Andrew Trick	240a1dd6e0	misched: trace formatting llvm-svn: 157455	2012-05-25 02:02:39 +00:00
Eli Friedman	d89582030a	Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process. llvm-svn: 157446	2012-05-25 00:09:29 +00:00
Kaelyn Uhrain	74138c341a	Silence unused variable warnings from when assertions are disabled. llvm-svn: 157438	2012-05-24 23:37:49 +00:00
Andrew Trick	72d8f7c1df	misched: Use the same scheduling heuristics with -misched-topdown/bottomup. (except the part about choosing direction) llvm-svn: 157437	2012-05-24 23:11:17 +00:00
Andrew Trick	699439a90d	misched: Trace regpressure. llvm-svn: 157429	2012-05-24 22:11:14 +00:00
Andrew Trick	95b9cef3da	misched: Give each ReadyQ a unique ID llvm-svn: 157428	2012-05-24 22:11:12 +00:00
Andrew Trick	3152745a8f	misched: Added ScoreboardHazardRecognizer. The Hazard checker implements in-order constraints, or interlocked resources. Ready instructions with hazards do not enter the available queue and are not visible to other heuristics. The major code change is the addition of SchedBoundary to encapsulate the state at the top or bottom of the schedule, including both a pending and available queue. The scheduler now counts cycles in sync with the hazard checker. These are minimum cycle counts based on known hazards. Targets with no itinerary (x86_64) currently remain at cycle 0. To fix this, we need to provide some maximum issue width for all targets. We also need to add the concept of expected latency vs. minimum latency. llvm-svn: 157427	2012-05-24 22:11:09 +00:00
Andrew Trick	5ca7d67b39	misched: Release bottom roots in reverse order. llvm-svn: 157426	2012-05-24 22:11:05 +00:00
Andrew Trick	0e174b3a0e	misched: rename ReadyQ class llvm-svn: 157425	2012-05-24 22:11:03 +00:00
Andrew Trick	155f812f54	misched: copy comments so compareRPDelta is readable by itself. llvm-svn: 157424	2012-05-24 22:11:01 +00:00
Andrew Trick	523d3614e0	regpressure: Added RegisterPressure::dump llvm-svn: 157423	2012-05-24 22:10:59 +00:00
Andrew Trick	d0de06a312	regpressure: physreg livein/out fix llvm-svn: 157422	2012-05-24 22:10:57 +00:00
Craig Topper	936702e142	Mark some static arrays as const. llvm-svn: 157377	2012-05-24 06:35:32 +00:00
Jakob Stoklund Olesen	2313afcd00	Add a last resort tryInstructionSplit() to RAGreedy. Live ranges with a constrained register class may benefit from splitting around individual uses. It allows the remaining live range to use a larger register class where it may allocate. This is like spilling to a different register class. This is only attempted on constrained register classes. <rdar://problem/11438902> llvm-svn: 157354	2012-05-23 22:37:27 +00:00
Bill Wendling	ad5fae97e3	Forgot to reverse conditional. llvm-svn: 157349	2012-05-23 22:12:50 +00:00
Bill Wendling	9235b4e8b1	Reduce indentation by early detection of 'continue'. No functionality change. llvm-svn: 157348	2012-05-23 22:09:50 +00:00
Jakob Stoklund Olesen	ce44a7a9ae	Correctly deal with identity copies in RegisterCoalescer. Now that the coalescer keeps live intervals and machine code in sync at all times, it needs to deal with identity copies differently. When merging two virtual registers, all identity copies are removed right away. This means that other identity copies must come from somewhere else, and they are going to have a value number. Deal with such copies by merging the value numbers before erasing the copy instruction. Otherwise, we leave dangling value numbers in the live interval. This fixes PR12927. llvm-svn: 157340	2012-05-23 20:21:06 +00:00
Patrik Hägglund	822fe63c97	Small fix for the debug output from PBQP (PR12822). llvm-svn: 157319	2012-05-23 12:12:58 +00:00
Eric Christopher	809a39bd54	Add support for C++11 enum classes in llvm. Part of rdar://11496790 llvm-svn: 157303	2012-05-23 00:09:20 +00:00
Eric Christopher	7b3899918f	Untabify and 80-col. llvm-svn: 157274	2012-05-22 18:45:24 +00:00
Eric Christopher	a087fadf96	Formatting consistency. llvm-svn: 157273	2012-05-22 18:45:18 +00:00
Jakob Stoklund Olesen	151b044b75	Only erase virtregs with no uses left. Also make sure registers aren't erased twice if the dead def mentions the register twice. This fixes PR12911. llvm-svn: 157254	2012-05-22 14:52:12 +00:00
Owen Anderson	da2ddf18dc	Fix use of an uninitialized value in the LegalizeOps expansion for ISD::SUB. No in-tree targets exercise this path. Patch by Micah Villmow. llvm-svn: 157215	2012-05-21 22:39:20 +00:00
Chad Rosier	8d8d0f7479	Typo. llvm-svn: 157195	2012-05-21 17:13:41 +00:00
Jakob Stoklund Olesen	c058c41261	Give a small negative bias to giant edge bundles. This helps compile time when the greedy register allocator splits live ranges in giant functions. Without the bias, we would try to grow regions through the giant edge bundles, usually to find out that the region became too big and expensive. If a live range has many uses in blocks near the giant bundle, the small negative bias doesn't make a big difference, and we still consider regions including the giant edge bundle. Giant edge bundles are usually connected to landing pads or indirect branches. llvm-svn: 157174	2012-05-21 03:11:23 +00:00
Jakob Stoklund Olesen	dadb6e60b9	Clear kill flags on the fly when joining intervals. With physreg joining out of the way, it is easy to recognize the instructions that need their kill flags cleared while testing for interference. This allows us to skip the final scan of all instructions for an 11% speedup of the coalescer pass. llvm-svn: 157169	2012-05-20 21:41:05 +00:00
Jakob Stoklund Olesen	904bc4185f	Constrain regclasses in PeepholeOptimizer. It can be necessary to restrict to a sub-class before accessing sub-registers. llvm-svn: 157164	2012-05-20 18:42:55 +00:00
Jakob Stoklund Olesen	12cd5a3007	Constrain register classes in TailDup. When rewriting operands, make sure the new registers have a compatible register class. llvm-svn: 157163	2012-05-20 18:42:51 +00:00
Peter Collingbourne	c9e3c75c4d	When legalising shifts, do not pre-build a list of operands which may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve the other operands when calling UpdateNodeOperands. Fixes PR12889. llvm-svn: 157162	2012-05-20 18:36:15 +00:00
Benjamin Kramer	e42f976213	Plug a leak when using MCJIT. Found by valgrind. llvm-svn: 157160	2012-05-20 17:24:08 +00:00
Benjamin Kramer	fda11b2327	Use TargetMachine's register info instead of creating a new one and leaking it. llvm-svn: 157155	2012-05-20 11:24:27 +00:00
Jakob Stoklund Olesen	ed19f92618	Properly constrain register classes for sub-registers. Not all GR64 registers have sub_8bit sub-registers. llvm-svn: 157150	2012-05-20 06:38:37 +00:00
Jakob Stoklund Olesen	3979d32abd	Properly constrain register classes in 2-addr. X86 has 2-addr instructions with different constraints on the tied def and use operands. One is GR32, one is GR32_NOSP. llvm-svn: 157149	2012-05-20 06:38:32 +00:00
Jakob Stoklund Olesen	e2ec343323	Missed a push_back in r157147. llvm-svn: 157148	2012-05-20 05:28:53 +00:00
Jakob Stoklund Olesen	72efb98f61	Avoid deleting extra copies when RegistersDefinedFromSameValue is true. This function adds copies to be erased to DupCopies, avoid also adding them to DeadCopies. llvm-svn: 157147	2012-05-20 04:52:48 +00:00
Jakob Stoklund Olesen	04202f992f	Fix build bots. Avoid looking at the operands of a potentially erased instruction. llvm-svn: 157146	2012-05-20 03:57:12 +00:00
Jakob Stoklund Olesen	9d172f4b64	LiveRangeQuery simplifies shrinkToUses(). llvm-svn: 157145	2012-05-20 02:54:52 +00:00
Jakob Stoklund Olesen	d43dc753ee	Use LiveRangeQuery in ScheduleDAGInstrs. llvm-svn: 157144	2012-05-20 02:44:38 +00:00
Jakob Stoklund Olesen	5d95a00a3b	Eliminate some uses of struct LiveRange. That struct ought to be a LiveInterval implementation detail. llvm-svn: 157143	2012-05-20 02:44:36 +00:00
Jakob Stoklund Olesen	ef9f8dc31c	Use LiveRangeQuery instead of getLiveRangeContaining(). llvm-svn: 157142	2012-05-20 02:44:33 +00:00
Jakob Stoklund Olesen	3a4b342af1	Simplify overlap check. llvm-svn: 157137	2012-05-19 23:59:27 +00:00
Jakob Stoklund Olesen	207108d4a4	Fix 12892. Dead code elimination during coalescing could cause a virtual register to be split into connected components. The following rewriting would be confused about the already joined copies present in the code, but without a corresponding value number in the live range. Erase all joined copies instantly when joining intervals such that the MI and LiveInterval representations are always in sync. llvm-svn: 157135	2012-05-19 23:34:59 +00:00
Jakob Stoklund Olesen	01ab5e172d	Remove the late DCE in RegisterCoalescer. Dead code and joined copies are now eliminated on the fly, and there is no need for a post pass. This makes the coalescer work like other modern register allocator passes: Code is changed on the fly, there is no pending list of changes to be committed. llvm-svn: 157132	2012-05-19 21:02:31 +00:00
Jakob Stoklund Olesen	81d77434d9	Erase joined copies immediately. The late dead code elimination is no longer necessary. The test changes are caused by a register hint that can be either %rdi or %rax. The choice depends on the use list order, which this patch changes. llvm-svn: 157131	2012-05-19 20:54:07 +00:00
Jakob Stoklund Olesen	b2eb997a2c	Fix an ancient bug in removeCopyByCommutingDef(). Before rewriting uses of one value in A to register B, check that there are no tied uses. That would require multiple A values to be rewritten. This bug can't bite in the current version of the code for a fairly subtle reason: A tied use would have caused 2-addr to insert a copy before the use. If the copy has been coalesced, it will be found by the same loop changed by this patch, and the optimization is aborted. This was exposed by 400.perlbench and lua after applying a patch that deletes joined copies aggressively. llvm-svn: 157130	2012-05-19 20:54:03 +00:00
Jakob Stoklund Olesen	7b0808e107	Collect inflatable virtual registers on the fly. There is no reason to defer the collection of virtual registers whose register class may be replaced with a larger class. llvm-svn: 157125	2012-05-19 19:25:00 +00:00
Jakob Stoklund Olesen	7b47611be6	Eliminate dead code after remat. This will remove the original def once it has no more uses. llvm-svn: 157104	2012-05-19 05:25:59 +00:00
Jakob Stoklund Olesen	719bee51d1	Don't remat during updateRegDefsUses(). Remaining virtreg->physreg copies were rematerialized during updateRegDefsUses(), but we already do the same thing in joinCopy() when visiting the physreg copy instruction. Eliminate the preserveSrcInt argument to reMaterializeTrivialDef(). It is now always true. llvm-svn: 157103	2012-05-19 05:25:56 +00:00
Jakob Stoklund Olesen	ab9a075f26	Immediately erase trivially useless copies. There is no need for these instructions to stick around since they are known to be not dead. llvm-svn: 157102	2012-05-19 05:25:53 +00:00
Jakob Stoklund Olesen	1cb1222e6c	Run proper recursive dead code elimination during coalescing. Dead copies cause problems because they are trivial to coalesce, but removing them gives the live range a dangling end point. This patch enables full dead code elimination which trims live ranges to their uses so end points don't dangle. DCE may erase multiple instructions. Put the pointers in an ErasedInstrs set so we never risk visiting erased instructions in the work list. There aren't supposed to be any dead copies entering RegisterCoalescer, but they do slip by as evidenced by test/CodeGen/X86/coalescer-dce.ll. llvm-svn: 157101	2012-05-19 05:25:50 +00:00
Jakob Stoklund Olesen	6a5bbcc25c	Allow LiveRangeEdit to be created with a NULL parent. The dead code elimination with callbacks is still useful. llvm-svn: 157100	2012-05-19 05:25:46 +00:00
Jakob Stoklund Olesen	ffe87e7e00	Modernize naming convention for class members. No functional change. llvm-svn: 157079	2012-05-18 22:10:15 +00:00
Jakob Stoklund Olesen	46e05f2e79	Move all work list processing to copyCoalesceWorkList(). This will make it possible to filter out erased instructions later. llvm-svn: 157073	2012-05-18 21:09:40 +00:00
Jim Grosbach	343a996ca5	Refactor data-in-code annotations. Use a dedicated MachO load command to annotate data-in-code regions. This is the same format the linker produces for final executable images, allowing consistency of representation and use of introspection tools for both object and executable files. Data-in-code regions are annotated via ".data_region"/".end_data_region" directive pairs, with an optional region type. data_region_directive := ".data_region" { region_type } region_type := "jt8" | "jt16" | "jt32" | "jta32" end_data_region_directive := ".end_data_region" The previous handling of ARM-style "$d.*" labels was broken and has been removed. Specifically, it didn't handle ARM vs. Thumb mode when marking the end of the section. rdar://11459456 llvm-svn: 157062	2012-05-18 19:12:01 +00:00
Eric Christopher	01248cbb6f	Remove duplicate code that we could just fallthrough to. llvm-svn: 157060	2012-05-18 18:24:15 +00:00
Jakob Stoklund Olesen	1536ca7ca6	Simplify RegisterCoalescer::copyCoalesceInMBB(). It is no longer necessary to separate VirtCopies, PhysCopies, and ImpDefCopies. Implicitly defined copies are extremely rare after we added the ProcessImplicitDefs pass, and physical register copies are not joined any longer. llvm-svn: 157059	2012-05-18 18:21:48 +00:00
Jakob Stoklund Olesen	e581443aad	Remove support for PhysReg joining. This has been disabled for a while, and it is not a feature we want to support. Copies between physical and virtual registers are eliminated by good hinting support in the register allocator. Joining virtual and physical registers is really a form of register allocation, and the coalescer is not properly equipped to do that. In particular, it cannot backtrack coalescing decisions, and sometimes that would cause it to create programs that were impossible to register allocate, by exhausting a small register class. It was also very difficult to keep track of the live ranges of aliasing registers when extending the live range of a physreg. By disabling physreg joining, we can let fixed physreg live ranges remain constant throughout the register allocator super-pass. One type of physreg joining remains: A virtual register that has a single value which is a copy of a reserved register can be merged into the reserved physreg. This always lowers register pressure, and since we don't compute live ranges for reserved registers, there are no problems with aliases. llvm-svn: 157055	2012-05-18 17:18:58 +00:00
Stepan Dyatkovskiy	eb65d844be	Recommitted reworked r156804: SelectionDAGBuilder::Clusterify : main functionality was replaced with CRSBuilder::optimize, so big part of Clusterify's code was reduced. llvm-svn: 157046	2012-05-18 08:32:28 +00:00
Evan Cheng	6ffd037105	Teach two-address pass to update the "source" map so it doesn't perform a non-profitable commute using outdated info. The test case would still fail because of poor pre-RA schedule. That will be fixed by MI scheduler. rdar://11472010 llvm-svn: 157038	2012-05-18 01:33:51 +00:00
Andrew Trick	b57d7deff8	comments llvm-svn: 157020	2012-05-17 22:37:09 +00:00
Andrew Trick	de303d5be0	misched: trace ReadyQ. llvm-svn: 157007	2012-05-17 18:35:13 +00:00
Andrew Trick	865d5c0a6d	misched: Added 3-level regpressure back-off. Introduce the basic strategy for register pressure scheduling. 1) Respect target limits at all times. 2) Identify critical register classes (pressure sets). Track pressure within the scheduled region. Avoid increasing scheduled pressure for critical registers. 3) Avoid exceeding the max pressure of the region prior to scheduling. Added logic for picking between the top and bottom ready Q's based on regpressure heuristics. Status: functional but needs to be adjusted to achieve good results. llvm-svn: 157006	2012-05-17 18:35:10 +00:00
Andrew Trick	8494212e03	comment llvm-svn: 157005	2012-05-17 18:35:07 +00:00
Andrew Trick	7c81798f81	regpressure: Fix getMaxUpwardPressureDelta. llvm-svn: 157004	2012-05-17 18:35:05 +00:00
Andrew Trick	9f0a58d92b	misched: fix liveness iterators llvm-svn: 157003	2012-05-17 18:35:03 +00:00
Andrew Trick	adad394842	whitespace llvm-svn: 157002	2012-05-17 18:35:00 +00:00
Jakob Stoklund Olesen	8a0b0df10a	Never clear <undef> flags on already joined copies. RegisterCoalescer set <undef> flags on all operands of copy instructions that are scheduled to be removed. This is so they won't affect shrinkToUses() by introducing false register reads. Make sure those <undef> flags are never cleared, or shrinkToUses() could cause live intervals to end at instructions about to be deleted. This would be a lot simpler if RegisterCoalescer could just erase joined copies immediately instead of keeping all the to-be-deleted instructions around. This fixes PR12862. Unfortunately, bugpoint can't create a sane test case for this. Like many other coalescer problems, this failure depends on a very fragile series of events. <rdar://problem/11474428> llvm-svn: 157001	2012-05-17 18:32:42 +00:00
Jakob Stoklund Olesen	0069dd0862	Fix a verifier bug. Make sure useless (def-only) intervals also get verified. llvm-svn: 157000	2012-05-17 18:32:40 +00:00
Bill Wendling	ab2fc29c78	Relax the requirement that the exception object must be an instruction. During bugpoint-ing, it may turn into something else. llvm-svn: 156998	2012-05-17 17:59:51 +00:00
Stepan Dyatkovskiy	5fd076f3e6	SelectionDAGBuilder: CaseBlock, CaseRanges and CaseCmp changed representation of Low and High from signed to unsigned, since unsigned ints are usually simpler and faster and reduce some of the extra sign-bit checks needed before <, >, <=, >= comparisons. llvm-svn: 156985	2012-05-17 08:56:30 +00:00
Jakob Stoklund Olesen	92778c5641	Set sub-register <undef> flags more accurately. When widening an existing <def,reads-undef> operand to a super-register, it may be necessary to clear the <undef> flag because the wider register is now read-modify-write through the instruction. Conversely, it may be necessary to add an <undef> flag when the coalescer turns a full-register def into a sub-register def, but the larger register wasn't live before the instruction. This happens in test/CodeGen/ARM/coalesce-subregs.ll, but the test is too small for the <undef> flags to affect the generated code. llvm-svn: 156951	2012-05-16 21:22:35 +00:00
Duncan Sands	00ff48d3e3	Fix a thinko in DisintegrateMERGE_VALUES. Patch by Xiaoyi Guo. llvm-svn: 156909	2012-05-16 07:57:18 +00:00
Jakob Stoklund Olesen	1da786c936	Enable sub-sub-register copy coalescing. It is now possible to coalesce weird skewed sub-register copies by picking a super-register class larger than both original registers. The included test case produces code like this: vld2.32 {d16, d17, d18, d19}, [r0]! vst2.32 {d18, d19, d20, d21}, [r0] We still perform interference checking as if it were a normal full copy join, so this is still quite conservative. In particular, the f1 and f2 functions in the included test case still have remaining copies because of false interference. llvm-svn: 156878	2012-05-15 23:31:35 +00:00
Jakob Stoklund Olesen	d9e5addfc0	Teach RegisterCoalescer to handle symmetric sub-register copies. It is possible to coalesce two overlapping registers to a common super-register that is larger than both of the original registers. The important difference is that it may be necessary to rewrite DstReg operands as well as SrcReg operands because the sub-register index has changed. This behavior is still disabled by CoalescerPair. llvm-svn: 156869	2012-05-15 22:26:28 +00:00
Jakob Stoklund Olesen	0e3b11196f	Handle NewReg==OldReg in renameRegister(). This can happen when widening a virtual register to a super-register class. llvm-svn: 156867	2012-05-15 22:20:27 +00:00
Jakob Stoklund Olesen	109ee15f5d	We never call adjustCopiesBackFrom() for partial copies. There is no need to look at an always null SrcIdx. llvm-svn: 156866	2012-05-15 22:18:49 +00:00
Jakob Stoklund Olesen	032842545f	Extend the CoalescerPair interface to handle symmetric sub-register copies. Now both SrcReg and DstReg can be sub-registers of the final coalesced register. CoalescerPair::setRegisters still rejects such copies because RegisterCoalescer doesn't yet handle them. llvm-svn: 156848	2012-05-15 20:09:43 +00:00
Andrew Trick	2bc63e3ee1	Add -enable-aa-sched-mi, off by default, for AliasAnalysis inside MachineScheduler. This feature avoids creating edges in the scheduler's dependence graph for non-aliasing memory operations according to whichever alias analysis is available. It has been fully tested in Hexagon. Before making this default, it needs to be extended to handle multiple MachineMemOperands, compile time needs more evaluation, and benchmarking on X86 and ARM is needed. Patch by Sergei Larin! llvm-svn: 156842	2012-05-15 18:59:41 +00:00
Jim Grosbach	2e62e2f664	Allow MCCodeEmitter access to the target MCRegisterInfo. Add the MCRegisterInfo to the factories and constructors. Patch by Tom Stellard <Tom.Stellard@amd.com>. llvm-svn: 156828	2012-05-15 17:35:52 +00:00
Stepan Dyatkovskiy	f0f42687c6	Rejected r156804 due to buildbot failures. llvm-svn: 156808	2012-05-15 06:50:18 +00:00
Stepan Dyatkovskiy	8f3e310361	SelectionDAGBuilder::Clusterify : main functionality was replaced with CRSBuilder::optimize, so big part of Clusterify's code was reduced. llvm-svn: 156804	2012-05-15 05:09:41 +00:00
Jakob Stoklund Olesen	a78f184a30	Don't access MO reference after invalidating operand list. This should unbreak llvm-x86_64-linux. llvm-svn: 156778	2012-05-14 21:30:58 +00:00
Jakob Stoklund Olesen	a3e3afc746	Fix PR12821. RAFast must add an <imp-def> operand when it is rewriting a sub-register def that isn't a read-modify-write. llvm-svn: 156777	2012-05-14 21:10:25 +00:00
Dan Gohman	cc1f60a86c	Rename @llvm.debugger to @llvm.debugtrap. llvm-svn: 156774	2012-05-14 18:58:10 +00:00
Jakob Stoklund Olesen	aff911c34c	Don't look for empty live ranges in the unions. Empty live ranges represent undef and still get allocated, but they won't appear in LiveIntervalUnions. Patch by Patrik Hägglund! llvm-svn: 156685	2012-05-12 00:33:28 +00:00
Chad Rosier	dba9908c4b	Revert 156658. llvm-svn: 156662	2012-05-11 23:21:01 +00:00
Chad Rosier	20f6e62e43	[fast-isel] Fast-isel doesn't use the expect intrinsic. llvm-svn: 156658	2012-05-11 23:10:58 +00:00
Manman Ren	c82d0e71b9	ARM: peephole optimization to remove cmp instruction This patch will optimize the following cases: sub r1, r3 | sub r1, imm cmp r3, r1 or cmp r1, r3 | cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156599	2012-05-11 01:30:47 +00:00
Dan Gohman	ed475ad173	Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(), but it generates int3 on x86 instead of ud2. llvm-svn: 156593	2012-05-11 00:19:32 +00:00
Andrew Trick	e907a8bc0e	misched: Print machineinstrs with -debug-only=misched llvm-svn: 156576	2012-05-10 21:06:21 +00:00
Andrew Trick	19b39882d9	misched: tracing register pressure heuristics. llvm-svn: 156575	2012-05-10 21:06:19 +00:00
Andrew Trick	ba6b818855	misched: Add register pressure backoff to ConvergingScheduler. Prioritize the instruction that comes closest to keeping pressure under the target's limit. Then prioritize instructions that avoid increasing the max pressure in the scheduled region. The max pressure heuristic is a tad aggressive. Later I'll fix it to consider the unscheduled pressure as well. WIP: This is mostly functional but untested and not likely to do much good yet. llvm-svn: 156574	2012-05-10 21:06:16 +00:00
Andrew Trick	949bbbecd6	misched: Release only unscheduled nodes into ReadyQ. llvm-svn: 156573	2012-05-10 21:06:14 +00:00
Andrew Trick	50028ab54e	misched: Added ReadyQ container wrapper for Top and Bottom Queues. llvm-svn: 156572	2012-05-10 21:06:12 +00:00
Andrew Trick	828845f0e8	misched: Introducing Top and Bottom register pressure trackers during scheduling. llvm-svn: 156571	2012-05-10 21:06:10 +00:00
Andrew Trick	bfd4d328b1	RegPressure: API for speculatively checking instruction pressure. Added getMaxExcessUpward/DownwardPressure. They somewhat abuse the tracker by speculatively handling an instruction out of order. But it is convenient for now. In the future, we will cache each instruction's pressure contribution to make this efficient. llvm-svn: 156561	2012-05-10 19:11:52 +00:00
Andrew Trick	980318a1e5	RegPressure: fix array index iteration style. llvm-svn: 156560	2012-05-10 19:11:49 +00:00
Manman Ren	5abcae1320	Revert: 156550 "ARM: peephole optimization to remove cmp instruction" This commit broke an external linux bot and gave a compile-time warning. llvm-svn: 156556	2012-05-10 18:49:43 +00:00
Manman Ren	727b7d5e4c	ARM: peephole optimization to remove cmp instruction This patch will optimize the following cases: sub r1, r3 | sub r1, imm cmp r3, r1 or cmp r1, r3 | cmp r1, imm bge L1 TO subs r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can replace "sub" with "subs" and eliminate the "cmp" instruction. rdar: 10734411 llvm-svn: 156550	2012-05-10 16:48:21 +00:00
Eric Christopher	218078152b	Fix thinko in conditional. Part of rdar://11352000 and should bring the buildbots back. llvm-svn: 156421	2012-05-08 21:24:39 +00:00
Jim Grosbach	15662155b6	DAGCombiner should not change the type of an extract_vector index. When a combine twiddles an extract_vector, care should be taken to preserve the type of the index operand. No luck extracting a reasonable testcase, unfortunately. rdar://11391009 llvm-svn: 156419	2012-05-08 20:56:07 +00:00
Akira Hatanaka	39b0b85f7f	Formatting fixes. Patch by Jack Carter. llvm-svn: 156409	2012-05-08 19:14:42 +00:00
Eric Christopher	63b73ede75	Handle OpDeref in case it comes in as a register operand. Part of rdar://11352000 llvm-svn: 156405	2012-05-08 18:56:00 +00:00
Jakob Stoklund Olesen	3eaa5c5eff	Extract methods for joining physregs. No functional change. llvm-svn: 156345	2012-05-08 00:08:35 +00:00
Jakob Stoklund Olesen	3b71666770	Naming convention and whitespace. No functional change. llvm-svn: 156342	2012-05-07 23:46:16 +00:00
Jakob Stoklund Olesen	cb4b315ee8	Coalesce subreg-subreg copies. At least some of them: %vreg1:sub_16bit = COPY %vreg2:sub_16bit; GR64:%vreg1, GR32: %vreg2 Previously, we couldn't figure out that the above copy could be eliminated by coalescing %vreg2 with %vreg1:sub_32bit. The new getCommonSuperRegClass() hook makes it possible. This is not very useful yet since the unmodified part of the destination register usually interferes with the source register. The coalescer needs to understand sub-register interference checking first. llvm-svn: 156334	2012-05-07 22:57:55 +00:00
Jakob Stoklund Olesen	cc0cf22b98	Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass(). The getPointerRegClass() hook can return register classes that depend on the calling convention of the current function (ptr_rc_tailcall). So far, we have been able to infer the calling convention from the subtarget alone, but as we add support for multiple calling conventions per target, that no longer works. Patch by Yiannis Tsiouris! llvm-svn: 156328	2012-05-07 22:10:26 +00:00
Owen Anderson	8adb0322ce	Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled. llvm-svn: 156324	2012-05-07 20:51:25 +00:00
Benjamin Kramer	7a9528b540	Add a new target hook "predictableSelectIsExpensive". This will be used to determine whether it's profitable to turn a select into a branch when the branch is likely to be predicted. Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM. I'm not entirely happy with the name of this flag, suggestions welcome ;) llvm-svn: 156233	2012-05-05 12:49:14 +00:00
Jakob Stoklund Olesen	90ad9e9f13	Make sure findRepresentativeClass picks the widest super-register. We want the representative register class to contain the largest super-registers available. This makes the function less sensitive to the register class numbering. llvm-svn: 156220	2012-05-04 22:53:28 +00:00
Jakob Stoklund Olesen	c169683227	Remove extra comma in debug output. llvm-svn: 156219	2012-05-04 22:53:26 +00:00
Jakob Stoklund Olesen	8fbea83a95	Use SuperRegClassIterator for findRepresentativeClass(). The masks returned by SuperRegClassIterator are computed automatically by TableGen. This is better than depending on the manually specified SuperRegClasses. llvm-svn: 156147	2012-05-04 02:19:22 +00:00
Evan Cheng	59c8d1af93	Fix two-address pass's aggressive instruction commuting heuristics. It's meant to catch cases like: %reg1024<def> = MOV r1 %reg1025<def> = MOV r0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 By commuting ADD, it lets the coalescer eliminate all of the copies. However, there was a bug in the heuristics where it ended up commuting the ADD in: %reg1024<def> = MOV r0 %reg1025<def> = MOV 0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 That brought no benefit and instead ensured the last MOV would not be coalesced. rdar://11355268 llvm-svn: 156048	2012-05-03 01:45:13 +00:00
Andrew Trick	4d16c1f958	Added TargetRegisterInfo::getAllocatableClass. This ensures that virtual registers always belong to an allocatable class. If your target attempts to create a vreg for an operand that has no allocatable register subclass, you will crash quickly. This ensures that targets define register classes as intended. llvm-svn: 156046	2012-05-03 01:14:37 +00:00
Owen Anderson	e3d41b44cc	Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs. llvm-svn: 156029	2012-05-02 22:17:40 +00:00
Owen Anderson	3cfe269707	Teach DAG combine that multiplication by 1.0 can always be constant folded. llvm-svn: 156023	2012-05-02 21:32:35 +00:00
Jim Grosbach	1e9562afb1	Tidy up. Naming conventions. llvm-svn: 155960	2012-05-01 23:21:41 +00:00
Jakub Staszak	c45d47462b	Use dyn_cast instead of checking opcode and cast. llvm-svn: 155957	2012-05-01 23:06:00 +00:00
Bill Wendling	c520775de3	Strip the pointer casts off of allocas so that the selection DAG can find them. PR10799 llvm-svn: 155954	2012-05-01 22:50:45 +00:00
Sirish Pande	42a5ef931c	Target independent Hexagon Packetizer fix. llvm-svn: 155947	2012-05-01 21:28:30 +00:00
Bill Wendling	003b1bf46c	Change the PassManager from a reference to a pointer. The TargetPassManager's default constructor wants to initialize the PassManager to 'null'. But it's illegal to bind a null reference to a null l-value. Make the ivar a pointer instead. PR12468 llvm-svn: 155902	2012-05-01 08:27:43 +00:00
Jakub Staszak	73c62748da	Add some constantness. No functionality change. llvm-svn: 155859	2012-04-30 23:41:30 +00:00
Benjamin Kramer	ad49f4d6b5	RegisterPressure: ArrayRefize some functions for better readability. No functionality change. llvm-svn: 155795	2012-04-29 18:52:56 +00:00
Jakob Stoklund Olesen	b1322a9056	Don't update spill weights when joining intervals. We don't compute spill weights until after coalescing anyway. llvm-svn: 155766	2012-04-28 19:19:11 +00:00
Jakob Stoklund Olesen	9182fb5fce	Spring cleaning - Delete dead code. llvm-svn: 155765	2012-04-28 19:19:07 +00:00
Andrew Trick	55623eaf5a	Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice. This time, also fix the caller of AddGlue to properly handle incomplete chains. AddGlue had failure modes, but shamefully hid them from its caller. Its luck ran out. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155749	2012-04-28 01:03:23 +00:00
Andrew Trick	cbe7b03dbe	Temporarily revert r155668: Fix the SD scheduler to avoid gluing. This definitely caused regression with ARM -mno-thumb. llvm-svn: 155743	2012-04-27 22:55:59 +00:00
Andrew Trick	1aa00c0baa	Fix the SD scheduler to avoid gluing the same node twice. DAGCombine strangeness may result in multiple loads from the same offset. They both may try to glue themselves to another load. We could insist that the redundant loads glue themselves to each other, but the better fix is to bail out from bad gluing at the time we detect it. Fixes rdar://11314175: BuildSchedUnits assert. llvm-svn: 155668	2012-04-26 21:48:25 +00:00
Jakob Stoklund Olesen	24c99d2966	Remove more dead code. llvm-svn: 155566	2012-04-25 18:01:30 +00:00
Jakob Stoklund Olesen	7f1be74a4a	Remove the -disable-cross-class-join option. Cross-class joins have been normal and fully supported for a while now. With TableGen generating the getMatchingSuperRegClass() hook, they are unlikely to cause problems again. llvm-svn: 155552	2012-04-25 16:17:50 +00:00
Jakob Stoklund Olesen	b8d98c5060	Cross-class joining is winning. Remove the heuristic for disabling cross-class joins. The greedy register allocator can handle the narrow register classes, and when it splits a live range, it can pick a larger register class. Benchmarks were unaffected by this change. <rdar://problem/11302212> llvm-svn: 155551	2012-04-25 16:17:47 +00:00
Andrew Trick	47f01c373e	Fix a naughty header include that breaks "installed" builds. llvm-svn: 155486	2012-04-24 20:36:19 +00:00
Evan Cheng	7f9bf43bcf	MachineBasicBlock::SplitCriticalEdge() should follow LLVM IR variant and refuse to break edge to EH landing pad. rdar://11300144 llvm-svn: 155470	2012-04-24 19:06:55 +00:00
Andrew Trick	b18b7c6e55	cmake: new file llvm-svn: 155460	2012-04-24 18:06:49 +00:00
Andrew Trick	4c57a9bcf1	misched: DAG builder must special case earlyclobber llvm-svn: 155459	2012-04-24 18:04:41 +00:00
Andrew Trick	0b0b66833e	misched: try (not too hard) to place debug values where they belong llvm-svn: 155458	2012-04-24 18:04:37 +00:00
Andrew Trick	02daf554b7	misched: ignore debug values during scheduling llvm-svn: 155457	2012-04-24 18:04:34 +00:00
Andrew Trick	cc1e9fe38e	misched: DAG builder support for tracking register pressure within the current scheduling region. The DAG builder is a convenient place to do it. Hopefully this is more efficient than a separate traversal over the same region. llvm-svn: 155456	2012-04-24 17:56:43 +00:00
Andrew Trick	61df7b36e9	RegisterPressure: A utility for computing register pressure within a MachineInstr sequence. This uses the new target interface for tracking register pressure using pressure sets to model overlapping register classes and subregisters. RegisterPressure results can be tracked incrementally or stored at region boundaries. Global register pressure can be deduced from local RegisterPressure results if desired. This is an early, somewhat untested implementation. I'm working on testing it within the context of a register pressure reducing MachineScheduler. llvm-svn: 155454	2012-04-24 17:53:35 +00:00
Bill Wendling	a9402cc275	Look for the 'Is Simulated' module flag. This indicates that the program is compiled to run on a simulator. llvm-svn: 155435	2012-04-24 11:03:50 +00:00
Preston Gurd	0a730de3c3	This patch fixes a problem which arose when using the Post-RA scheduler on X86 Atom. Some of our tests failed because the tail merging part of the BranchFolding pass was creating new basic blocks which did not contain live-in information. When the anti-dependency code in the Post-RA scheduler ran, it would sometimes rename the register containing the function return value because the fact that the return value was live-in to the subsequent block had been lost. To fix this, it is necessary to run the RegisterScavenging code in the BranchFolding pass. This patch makes sure that the register scavenging code is invoked in the X86 subtarget only when post-RA scheduling is being done. Post RA scheduling in the X86 subtarget is only done for Atom. This patch adds a new function to the TargetRegisterClass to control whether or not live-ins should be preserved during branch folding. This is necessary in order for the anti-dependency optimizations done during the PostRASchedulerList pass to work properly when doing Post-RA scheduling for the X86 in general and for the Intel Atom in particular. The patch adds and invokes the new function trackLivenessAfterRegAlloc() instead of using the existing requiresRegisterScavenging(). It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of requiresRegisterScavenging(). It changes all the targets that implemented requiresRegisterScavenging() to also implement trackLivenessAfterRegAlloc(). It adds an assertion in the Post RA scheduler to make sure that post RA liveness information is available when it is needed. It changes the X86 break-anti-dependencies test to use -mcpu=atom, in order to avoid running into the added assertion. Finally, this patch restores the use of anti-dependency checking (which was turned off temporarily for the 3.1 release) for Intel Atom in the Post RA scheduler. Patch by Andy Zhang! Thanks to Jakob and Anton for their reviews. llvm-svn: 155395	2012-04-23 21:39:35 +00:00
Chandler Carruth	c442bad8f8	Temporarily revert r155364 until the upstream review can complete, per the stated developer policy. llvm-svn: 155373	2012-04-23 18:28:57 +00:00
Sirish Pande	85e47a24fe	Hexagon Packetizer's target independent fix. llvm-svn: 155364	2012-04-23 17:49:09 +00:00
Elena Demikhovsky	35721fc4f8	ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2 llvm-svn: 155309	2012-04-22 09:39:03 +00:00
Nadav Rotem	97bbbe3368	Teach getVectorTypeBreakdown about promotion of vectors in addition to widening of vectors. llvm-svn: 155296	2012-04-21 20:08:32 +00:00
Jakob Stoklund Olesen	adfc8212cf	Fix PR12599. The X86 target is editing the selection DAG while isel is selecting nodes following a topological ordering. When the DAG hacking triggers CSE, nodes can be deleted and bad things happen. llvm-svn: 155257	2012-04-20 23:36:09 +00:00
Jakob Stoklund Olesen	21b2b2d965	Make ISelPosition a local variable. Now that multiple DAGUpdateListeners can be active at the same time, ISelPosition can become a local variable in DoInstructionSelection. We simply register an ISelUpdater with CurDAG while ISelPosition exists. llvm-svn: 155249	2012-04-20 22:08:50 +00:00
Jakob Stoklund Olesen	1947930692	Register DAGUpdateListeners with SelectionDAG. Instead of passing listener pointers to RAUW, let SelectionDAG itself keep a linked list of interested listeners. This makes it possible to have multiple listeners active at once, like RAUWUpdateListener was already doing. It also makes it possible to register listeners up the call stack without controlling all RAUW calls below. DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG list of active listeners. llvm-svn: 155248	2012-04-20 22:08:46 +00:00
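The RAII registration described in r155248 can be sketched roughly as follows. This is a toy model, not LLVM's actual code: `Graph`, `UpdateListener`, `RecordingListener`, and `demoListeners` are hypothetical names standing in for SelectionDAG and DAGUpdateListener. Each listener pushes itself onto an intrusive linked list in its constructor and pops itself in its destructor, so several listeners can be active at once without being passed to every call.

```cpp
#include <string>
#include <utility>
#include <vector>

class Graph;

// Hypothetical listener base class: registers itself with the graph on
// construction and unregisters on destruction (RAII).
class UpdateListener {
public:
  explicit UpdateListener(Graph &Owner);
  virtual ~UpdateListener();
  UpdateListener(const UpdateListener &) = delete;
  UpdateListener &operator=(const UpdateListener &) = delete;

  virtual void nodeDeleted(int NodeId) = 0;
  UpdateListener *next() const { return Next; }

private:
  Graph &Owner;
  UpdateListener *const Next; // listener that was active before this one
};

// Hypothetical stand-in for the graph: it only keeps the list head.
class Graph {
public:
  // Notify every active listener, most recently registered first.
  void deleteNode(int NodeId) {
    for (UpdateListener *L = Listeners; L; L = L->next())
      L->nodeDeleted(NodeId);
  }

private:
  friend class UpdateListener;
  UpdateListener *Listeners = nullptr;
};

inline UpdateListener::UpdateListener(Graph &Owner)
    : Owner(Owner), Next(Owner.Listeners) {
  Owner.Listeners = this;
}

inline UpdateListener::~UpdateListener() {
  // Listeners are assumed to be destroyed in reverse order of creation,
  // which stack-allocated objects guarantee.
  Owner.Listeners = Next;
}

// Example listener that records which notifications it received.
struct RecordingListener : UpdateListener {
  RecordingListener(Graph &G, std::string Name, std::vector<std::string> &Log)
      : UpdateListener(G), Name(std::move(Name)), Log(Log) {}
  void nodeDeleted(int NodeId) override {
    Log.push_back(Name + ":" + std::to_string(NodeId));
  }
  std::string Name;
  std::vector<std::string> &Log;
};

// Registers a listener "up the call stack" and a second one in a nested scope.
std::vector<std::string> demoListeners() {
  std::vector<std::string> Log;
  Graph G;
  RecordingListener Outer(G, "outer", Log);
  {
    RecordingListener Inner(G, "inner", Log);
    G.deleteNode(1); // both listeners are active here
  }
  G.deleteNode(2); // only the outer listener remains
  return Log;
}
```

In this sketch the nested listener sees only the first deletion, while the outer one sees both, without either being passed to `deleteNode`.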
Jakob Stoklund Olesen	e93e6ab7f6	Print <def,read-undef> to avoid confusion. The <undef> flag on a def operand only applies to partial register redefinitions. Only print the flag when relevant, and print it as <def,read-undef> to make it clearer what it means. llvm-svn: 155239	2012-04-20 21:45:33 +00:00
Andrew Trick	6e57806ea9	New and improved comment. llvm-svn: 155229	2012-04-20 20:24:33 +00:00
Andrew Trick	56264ae675	SparseSet: Add support for key-derived indexes and arbitrary key types. This nicely handles the most common case of virtual register sets, but also handles anticipated cases where we will map pointers to IDs. The goal is not to develop a completely generic SparseSet template. Instead we want to handle the expected uses within llvm without any template antics in the client code. I'm adding a bit of template nastiness here, and some assumptions about expected usage in order to make the client code very clean. The expected common use cases I'm designing for: - integer keys that need to be reindexed, and may map to additional data - densely numbered objects where we want pointer keys because no number->object map exists. llvm-svn: 155227	2012-04-20 20:05:28 +00:00
Andrew Trick	2e8365f6d2	misched: initialize BB llvm-svn: 155226	2012-04-20 20:05:21 +00:00
Andrew Trick	93005d8a61	Allow targets to select the default scheduler by name. llvm-svn: 155090	2012-04-19 01:34:10 +00:00
Chandler Carruth	090e90a242	This reverts a long string of commits to the Hexagon backend. These commits have had several major issues pointed out in review, and those issues are not being addressed in a timely fashion. Furthermore, this was all committed leading up to the v3.1 branch, and we don't need piles of code with outstanding issues in the branch. It is possible that not all of these commits were necessary to revert to get us back to a green state, but I'm going to let the Hexagon maintainer sort that out. They can recommit, in order, after addressing the feedback. Reverted commits, with some notes: Primary commit r154616: HexagonPacketizer - There are lots of review comments here. This is the primary reason for reverting. In particular, it introduced a large number of warnings due to a bad construct in tablegen. - Follow-up commits that should be folded back into this when reposting: - r154622: CMake fixes - r154660: Fix numerous build warnings in release builds. - Please don't resubmit this until the three commits above are included, and the issues in review addressed. Primary commit r154695: Pass to replace transfer/copy ... - Reverted to minimize merge conflicts. I'm not aware of specific issues with this patch. Primary commit r154703: New Value Jump. - Primarily reverted due to merge conflicts. - Follow-up commits that should be folded back into this when reposting: - r154703: Remove iostream usage - r154758: Fix CMake builds - r154759: Fix build warnings in release builds - Please incorporate these fixes and review feedback before resubmitting. Primary commit r154829: Hexagon V5 (floating point) support. - Primarily reverted due to merge conflicts. - Follow-up commits that should be folded back into this when reposting: - r154841: Remove unused variable (fixing build warnings) There are also accompanying Clang commits that will be reverted for consistency. llvm-svn: 155047	2012-04-18 21:31:19 +00:00
Pete Cooper	d839376c4b	LiveIntervalUpdate validators weren't recorded after the calls to std::for_each. Turns out std::for_each doesn't update the variable passed in for the functor but instead copy constructs a new one. llvm-svn: 155041	2012-04-18 20:29:17 +00:00
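The std::for_each behaviour behind r155041 can be reproduced in a few lines. This is a minimal illustration, not the LiveIntervalUpdate code: `Counter`, `countIgnoringReturn`, and `countUsingReturn` are hypothetical names. std::for_each takes its functor by value and returns the copy it used, so state accumulated during the traversal is only visible through the return value.

```cpp
#include <algorithm>
#include <vector>

// A stateful functor: counts how many elements it was called on.
struct Counter {
  int Seen = 0;
  void operator()(int) { ++Seen; }
};

// Buggy pattern: the functor is passed by value, so C itself is never updated.
int countIgnoringReturn(const std::vector<int> &V) {
  Counter C;
  std::for_each(V.begin(), V.end(), C);
  return C.Seen; // still 0: only the internal copy was incremented
}

// Fixed pattern: std::for_each returns the functor it used; keep that one.
int countUsingReturn(const std::vector<int> &V) {
  Counter C;
  C = std::for_each(V.begin(), V.end(), C);
  return C.Seen;
}
```

An alternative that avoids the copy is to pass `std::ref(C)` from `<functional>`, or to use a range-based for loop.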
Joel Jones	73aa4ce484	Fixes a problem in instruction selection with testing whether or not the transformation: (X op C1) ^ C2 --> (X op C1) & ~C2 iff (C1&C2) == C2 should be done. This change has been tested: Using a debug+asserts build: on the specific test case that brought this bug to light make check-all lnt nt using this clang to build a release version of clang Using the release+asserts clang-with-clang build: on the specific test case that brought this bug to light make check-all lnt nt Checking in because Evan wants it checked in. Test case forthcoming after scrubbing. llvm-svn: 154955	2012-04-17 22:23:10 +00:00
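The guard condition in r154955 can be checked by brute force on a small bit width. The sketch below is only an illustration and makes one assumption the commit message does not state: it takes `op` to be OR, because then every bit of C1 is known to be set in `X op C1`, and XOR with a subset of those bits simply clears them. The function names are hypothetical, and AND is included to show that the identity does not carry over to an arbitrary `op`.

```cpp
#include <cstdint>

// The side condition from the commit message: C2 must be a subset of C1.
constexpr bool canFoldXorToAndNot(uint8_t C1, uint8_t C2) {
  return (C1 & C2) == C2;
}

// Exhaustively test (X | C1) ^ C2 == (X | C1) & ~C2 over all 8-bit X.
bool foldHoldsForOr(uint8_t C1, uint8_t C2) {
  for (unsigned X = 0; X < 256; ++X) {
    uint8_t T = static_cast<uint8_t>(X | C1);
    if (static_cast<uint8_t>(T ^ C2) != static_cast<uint8_t>(T & ~C2))
      return false;
  }
  return true;
}

// The same test with AND in place of OR, to show the fold is op-dependent.
bool foldHoldsForAnd(uint8_t C1, uint8_t C2) {
  for (unsigned X = 0; X < 256; ++X) {
    uint8_t T = static_cast<uint8_t>(X & C1);
    if (static_cast<uint8_t>(T ^ C2) != static_cast<uint8_t>(T & ~C2))
      return false;
  }
  return true;
}
```

With C1 = 0x0F and C2 = 0x05 the condition holds and the OR form is an identity, while the AND form already fails at X = 0. With C2 = 0x30 the condition fails and so does the OR form.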
Lang Hames	c9489b786a	SlotIndexes used to store the index list in a crufty custom linked-list. I can't for the life of me remember why I wrote it this way, but I can't see any good reason for it now. This patch replaces the custom linked list with an ilist. This change should preserve the existing numberings exactly, so no generated code should change (if it does, file a bug!). llvm-svn: 154904	2012-04-17 04:15:51 +00:00
Eric Christopher	00c02f1556	Make comment here more clear. llvm-svn: 154878	2012-04-16 23:54:23 +00:00
Chandler Carruth	5780b826b0	Fix updateTerminator to be resilient to degenerate terminators where both fallthrough and a conditional branch target the same successor. Gracefully delete the conditional branch and introduce any unconditional branch needed to reach the actual successor. This fixes memory corruption in 2009-06-15-RegScavengerAssert.ll and possibly other tests. Also, while I'm here fix a latent bug I spotted by inspection. I never applied the same fundamental fix to this fallthrough successor finding logic that I did to the logic used when there are no conditional branches. As a consequence it would have selected landing pads had they been aligned in just the right way here. I don't have a test case as I spotted this by inspection, and the previous time I found this required half of TableGen's source code to produce it. =/ I hate backend bugs. ;] Thanks to Jim Grosbach for helping me reason through this and reviewing the fix. llvm-svn: 154867	2012-04-16 22:03:00 +00:00
Chandler Carruth	728acc9bd9	Flip the new block-placement pass to be on by default. This is mostly to test the waters. I'd like to get results from FNT build bots and other bots running on non-x86 platforms. This feature has been pretty heavily tested over the last few months by me, and it fixes several of the execution time regressions caused by the inlining work by preventing inlining decisions from radically impacting block layout. I've seen very large improvements in yacr2 and ackermann benchmarks, along with the expected noise across all of the benchmark suite whenever code layout changes. I've analyzed all of the regressions and fixed them, or found them to be impossible to fix. See my email to llvmdev for more details. I'd like for this to be in 3.1 as it complements the inliner changes, but if any failures are showing up or anyone has concerns, it is just a flag flip and so can be easily turned off. I'm switching it on tonight to try and get at least one run through various folks' performance suites in case SPEC or something else has serious issues with it. I'll watch bots and revert if anything shows up. llvm-svn: 154816	2012-04-16 13:49:17 +00:00
Chandler Carruth	fbb6219d5b	Add a somewhat hacky heuristic to do something different from whole-loop rotation. When there is a loop backedge which is an unconditional branch, we will end up with a branch somewhere no matter what. Try placing this backedge in a fallthrough position above the loop header as that will definitely remove at least one branch from the loop iteration, where whole loop rotation may not. I haven't seen any benchmarks where this is important but loop-blocks.ll tests for it, and so this will be covered when I flip the default. llvm-svn: 154812	2012-04-16 13:33:36 +00:00
Chandler Carruth	33b200ad13	Tweak the loop rotation logic to check whether the loop is naturally laid out in a form with a fallthrough into the header and a fallthrough out of the bottom. In that case, leave the loop alone because any rotation will introduce unnecessary branches. If either side looks like it will require an explicit branch, then the rotation won't add any, do it to ensure the branch occurs outside of the loop (if possible) and maximize the benefit of the fallthrough in the bottom. llvm-svn: 154806	2012-04-16 09:31:23 +00:00
Hal Finkel	457fbe481c	Remove dead SD nodes after the combining pass. Fixes PR12201. llvm-svn: 154786	2012-04-16 03:33:22 +00:00
Chandler Carruth	fc5ab5d388	Rewrite how machine block placement handles loop rotation. This is a complex change that resulted from a great deal of experimentation with several different benchmarks. The one which proved the most useful is included as a test case, but I don't know that it captures all of the relevant changes, as I didn't have specific regression tests for each, they were more the result of reasoning about what the old algorithm would possibly do wrong. I'm also failing at the moment to craft more targeted regression tests for these changes, if anyone has ideas, it would be welcome. The first big thing broken with the old algorithm is the idea that we can take a basic block which has a loop-exiting successor and a looping successor and use the looping successor as the layout top in order to get that particular block to be the bottom of the loop after layout. This happens to work in many cases, but not in all. The second big thing broken was that we didn't try to select the exit which fell into the nearest enclosing loop (to which we exit at all). As a consequence, even if the rotation worked perfectly, it would result in one of two bad layouts. Either the bottom of the loop would get fallthrough, skipping across a nearer enclosing loop and thereby making it discontiguous, or it would be forced to take an explicit jump over the nearest enclosing loop to reach its successor. The point of the rotation is to get fallthrough, so we need it to fallthrough to the nearest loop it can. The fix to the first issue is to actually layout the loop from the loop header, and then rotate the loop such that the correct exiting edge can be a fallthrough edge. This is actually much easier than I anticipated because we can handle all the hard parts of finding a viable rotation before we do the layout. We just store that, and then rotate after layout is finished. No inner loops get split across the post-rotation backedge because we check for them when selecting the rotation. That fix exposed a latent problem with our exiting block selection -- we should allow the backedge to point into the middle of some inner-loop chain as there is no real penalty to it, the whole point is that it won't be a fallthrough edge. This may have blocked the rotation at all in some cases, I have no idea and no test case as I've never seen it in practice, it was just noticed by inspection. Finally, all of these fixes, and studying the loops they produce, highlighted another problem: in rotating loops like this, we sometimes fail to align the destination of these backwards jumping edges. Fix this by actually walking the backwards edges rather than relying on loopinfo. This fixes regressions on heapsort if block placement is enabled as well as lots of other cases where the previous logic would introduce an abundance of unnecessary branches into the execution. llvm-svn: 154783	2012-04-16 01:12:56 +00:00
Nadav Rotem	b8710ee43f	When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type. llvm-svn: 154764	2012-04-15 15:08:09 +00:00
Andrew Trick	550cf63beb	misched: Added CanHandleTerminators. This is a special flag for targets that really want their block terminators in the DAG. The default scheduler cannot handle this correctly, so it becomes the specialized scheduler's responsibility to schedule terminators. llvm-svn: 154712	2012-04-13 23:29:54 +00:00
Benjamin Kramer	191fe619aa	Reduce malloc traffic in DwarfAccelTable - Don't copy offsets into HashData, the underlying vector won't change once the table is finalized. - Allocate HashData and HashDataContents in a BumpPtrAllocator. - Allocate string map entries in the same allocator. - Random cleanups. llvm-svn: 154694	2012-04-13 20:06:17 +00:00
Sirish Pande	ff74c0b4e8	HexagonPacketizer patch. llvm-svn: 154616	2012-04-12 21:06:38 +00:00
Nadav Rotem	b05ea8c9af	Reapply 154397. Original message: Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is true for SSE/AVX/NEON but not for all targets. llvm-svn: 154490	2012-04-11 08:26:11 +00:00
Craig Topper	28df4bf296	Fix an overly indented line. Remove an 'else' after an 'if' that returns. llvm-svn: 154479	2012-04-11 04:55:51 +00:00
Craig Topper	82772b86d6	Inline implVisitAluOverflow by introducing a nested switch to convert the intrinsic to an nodetype. llvm-svn: 154478	2012-04-11 04:34:11 +00:00
Craig Topper	0590d2cdea	Optimize code a bit by calling push_back only once in some loops. Reduces compiled code size a bit. llvm-svn: 154473	2012-04-11 03:06:35 +00:00
Jakob Stoklund Olesen	4bfc07ceb5	Tweak MachineLICM heuristics for cheap instructions. Allow cheap instructions to be hoisted if they are register pressure neutral or better. This happens if the instruction is the last loop use of another virtual register. Only expensive instructions are allowed to increase loop register pressure. llvm-svn: 154455	2012-04-11 00:00:28 +00:00
Jakob Stoklund Olesen	b1ec8d8548	Only check for PHI uses inside the current loop. Hoisting a value that is used by a PHI in the loop will introduce a copy because the live range is extended to cross the PHI. The same applies to PHIs in exit blocks. Also use this opportunity to make HasLoopPHIUse() non-recursive. llvm-svn: 154454	2012-04-11 00:00:26 +00:00
Owen Anderson	a8319713a4	Move the constant-folding support for FP_ROUND in SelectionDAG from the one-operand version of getNode() to the two-operand version, since it became a two-operand node at some point. Zap a testcase that this allows us to completely fold away. llvm-svn: 154447	2012-04-10 22:46:53 +00:00
Duncan Sands	6d360055c5	Add a comment noting that the fdiv -> fmul conversion won't generate multiplication by a denormal, and some tests checking that. llvm-svn: 154431	2012-04-10 20:35:27 +00:00
Eric Christopher	ec1405e930	To ensure that we have more accurate line information for a block don't elide the branch instruction if it's the only one in the block, otherwise it's ok. PR9796 and rdar://11215207 llvm-svn: 154417	2012-04-10 18:18:10 +00:00
Owen Anderson	540d48ddb5	Revert r154397, which was causing make check failures on the buildbots. llvm-svn: 154414	2012-04-10 18:02:12 +00:00
Nadav Rotem	e5008bb774	Fix a dagcombine optimization which assumes that the vsetcc result type is always of the same size as the compared values. This is true for SSE/AVX/NEON but not for all targets. llvm-svn: 154397	2012-04-10 14:58:31 +00:00
Chandler Carruth	3c9796d9b0	Make a somewhat subtle change in the logic of block placement. Sometimes the loop header has a non-loop predecessor which has been pre-fused into its chain due to unanalyzable branches. In this case, rotating the header into the body of the loop in order to place a loop exit at the bottom of the loop is a Very Bad Idea as it makes the loop non-contiguous. I'm working on a good test case for this, but it's a bit annoying to craft. I should get one shortly, but I'm submitting this now so I can begin the (lengthy) performance analysis process. An initial run of LNT looks really, really good, but there is too much noise there for me to trust it much. llvm-svn: 154395	2012-04-10 13:35:57 +00:00
Anton Korobeynikov	0fc5fe0430	Transform div to mul with reciprocal only when fp imm is legal. This fixes PR12516 and uncovers one weird problem in legalize (worked around) llvm-svn: 154394	2012-04-10 13:22:49 +00:00
Evan Cheng	460634e917	Make the code slightly more palatable. llvm-svn: 154378	2012-04-10 03:15:18 +00:00
Evan Cheng	5825e9dbf5	Fix a long standing tail call optimization bug. When a libcall is emitted, the legalizer always uses the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370	2012-04-10 01:51:00 +00:00
Rafael Espindola	9febd1fbf7	Don't try to zExt just to check if an integer constant is zero, it might not fit in a i64. llvm-svn: 154364	2012-04-10 00:16:22 +00:00
Akira Hatanaka	1b46e841a2	Have TargetLowering::getPICJumpTableRelocBase return a node that points to the GOT if jump table uses 64-bit gp-relative relocation. llvm-svn: 154341	2012-04-09 20:32:12 +00:00
Lang Hames	751eb83306	Patch r153892 for PR11861 apparently broke an external project (see PR12493). This patch restores TwoAddressInstructionPass's pre-r153892 behaviour when rescheduling instructions in TryInstructionTransform. Hopefully this will fix PR12493. To refix PR11861, lowering of INSERT_SUBREGS is deferred until after the copy that unties the operands is emitted (this seems to be a more appropriate fix for that issue anyway). llvm-svn: 154338	2012-04-09 20:17:30 +00:00
Rafael Espindola	6b7bf4d0aa	Pattern match a setcc of boolean value with 0 as a truncate. llvm-svn: 154322	2012-04-09 16:06:03 +00:00
Craig Topper	b06257c64d	Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out. llvm-svn: 154309	2012-04-09 07:19:09 +00:00
Craig Topper	a248e92058	Remove unnecessary 'else' on an 'if' that always returns llvm-svn: 154308	2012-04-09 05:59:53 +00:00
Craig Topper	ee38217fe4	Optimize code slightly. No functionality change. llvm-svn: 154307	2012-04-09 05:55:33 +00:00
Craig Topper	24c4646a77	Replace some explicit checks with asserts for conditions that should never happen. llvm-svn: 154305	2012-04-09 05:16:56 +00:00
Craig Topper	1960db33c0	Optimize code a bit. No functional change intended. llvm-svn: 154299	2012-04-08 23:15:04 +00:00
Benjamin Kramer	e99c184047	Silence sign-compare warning. llvm-svn: 154297	2012-04-08 19:04:45 +00:00
Duncan Sands	28b9aa998e	Only have codegen turn fdiv by a constant into fmul by the reciprocal when -ffast-math, i.e. don't just always do it if the reciprocal can be formed exactly. There is already an IR level transform that does that, and it does it more carefully. llvm-svn: 154296	2012-04-08 18:08:12 +00:00
Craig Topper	e0c286243b	Simplify code that tries to do vector extracts for shuffles when the mask width and the input vector widths don't match. No need to check the min and max are in range before calculating the start index. The range check after having the start index is sufficient. Also no need to check for an extract from the beginning differently. llvm-svn: 154295	2012-04-08 17:53:33 +00:00
Chandler Carruth	233e7232ae	Move the TLSModel information into the TargetMachine rather than hiding in TargetLowering. There was already a FIXME about this location being odd. The interface is simplified as a consequence. This will also make it easier to change TLS models when compiling with PIE. llvm-svn: 154292	2012-04-08 17:20:55 +00:00
Chandler Carruth	5ec9b9fd94	Remove an overzealous assert. The assert was trying to catch places where a chain outside of the loop block-set ended up in the worklist for scheduling as part of the contiguous loop. However, asserting the first block in the chain is in the loop-set isn't a valid check -- we may be forced to drag a chain into the worklist due to one block in the chain being part of the loop even though the first block is not in the loop. This occurs when we have been forced to form a chain early due to un-analyzable branches. No test case here as I have no idea how to even begin reducing one, and it will be hopelessly fragile. We have to somehow end up with a loop header of an inner loop which is a successor of a basic block with an unanalyzable pair of branch instructions. Ow. Self-host triggers it so it is unlikely it will regress. This at least gets block placement back to passing selfhost and the test suite. There are still a lot of slowdowns that I don't like coming out of block placement, although there are now also a lot of speedups. =[ I'm seeing swings in both directions up to 10%. I'm going to try to find time to dig into this and see if we can turn this on for 3.1 as it does a really good job of cleaning up after some loops that degraded with the inliner changes. llvm-svn: 154287	2012-04-08 14:37:02 +00:00
Chandler Carruth	2fe7a17703	Add a debug-only 'dump' method to the BlockChain structure to ease debugging. llvm-svn: 154286	2012-04-08 14:37:01 +00:00
Craig Topper	a6412fb8c0	Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1. llvm-svn: 154272	2012-04-07 22:32:29 +00:00
Craig Topper	d40e2513b2	Remove 'else' after 'if' that ends in return. llvm-svn: 154267	2012-04-07 21:23:41 +00:00
Nadav Rotem	37734277f0	1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new shuffle node because it could introduce new shuffle nodes that were not supported efficiently by the target. 2. Add a more restrictive shuffle-of-shuffle optimization for cases where the second shuffle reverses the transformation of the first shuffle. llvm-svn: 154266	2012-04-07 21:19:08 +00:00
Duncan Sands	cd52f3d447	Convert floating point division by a constant into multiplication by the reciprocal if converting to the reciprocal is exact. Do it even if inexact if -ffast-math. This substantially speeds up ac.f90 from the polyhedron benchmarks. llvm-svn: 154265	2012-04-07 20:04:00 +00:00
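The "exact reciprocal" condition in r154265 can be illustrated with a small check. This is a sketch under assumptions, not the DAGCombiner code: `hasExactReciprocal` is a hypothetical helper that treats the reciprocal of a double as exact when the constant is a power of two and the reciprocal is a normal number (neither overflowing nor denormal). In that case `x / C` and `x * (1.0 / C)` are both the correctly rounded value of the same real number, so they agree.

```cpp
#include <cmath>

// Returns true if 1.0 / C is exactly representable as a normal double.
// Assumption for this sketch: that is the case exactly when |C| is a power
// of two and the reciprocal neither overflows nor becomes denormal.
bool hasExactReciprocal(double C) {
  if (!std::isfinite(C) || C == 0.0)
    return false;
  int Exp = 0;
  // frexp returns a mantissa in [0.5, 1); a power of two has mantissa 0.5.
  double Mant = std::frexp(std::fabs(C), &Exp);
  if (Mant != 0.5)
    return false;
  return std::isnormal(1.0 / C);
}

// Divide by C, using a multiplication when that is guaranteed to be exact.
double divideByConstant(double X, double C) {
  return hasExactReciprocal(C) ? X * (1.0 / C) : X / C;
}
```

For a constant such as 3.0 the helper returns false and the division is kept, which matches the non-fast-math behaviour the commit message describes.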
Eric Christopher	2e17b32e69	Patch to set is_stmt a little better for prologue lines in a function. This enables debuggers to see what are interesting lines for a breakpoint rather than any line that starts a function. rdar://9852092 llvm-svn: 154120	2012-04-05 20:39:05 +00:00
Jakob Stoklund Olesen	28edb011c4	Don't break the IV update in TLI::SimplifySetCC(). LSR always tries to make the ICmp in the loop latch use the incremented induction variable. This allows the induction variable to be kept in a single register. When the induction variable limit is equal to the stride, SimplifySetCC() would break LSR's hard work by transforming: (icmp (add iv, stride), stride) --> (cmp iv, 0) This forced us to use lea for the IC update, preventing the simpler incl+cmp. <rdar://problem/7643606> <rdar://problem/11184260> llvm-svn: 154119	2012-04-05 20:30:20 +00:00
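The rewrite that r154119 suppresses is semantically valid for equality comparisons; the commit disables it only because it undoes LSR's choice of comparing the incremented induction variable. A brute-force check over 8-bit wrapping arithmetic makes the equivalence concrete. The function name is hypothetical, and treating the comparison as an equality test on unsigned values is an assumption of this sketch.

```cpp
#include <cstdint>

// Checks, for every 8-bit induction variable value, that
//   (iv + stride) == stride   and   iv == 0
// agree under wrapping (modular) arithmetic.
bool rewriteIsEquivalent(uint8_t Stride) {
  for (unsigned IV = 0; IV < 256; ++IV) {
    bool Original = static_cast<uint8_t>(IV + Stride) == Stride;
    bool Rewritten = static_cast<uint8_t>(IV) == 0;
    if (Original != Rewritten)
      return false;
  }
  return true;
}
```

Since both forms compute the same predicate, choosing between them is purely a code-quality decision: the original form lets the loop keep a single register for the incremented value.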
Owen Anderson	b21312019c	Treat f16 the same as f80/f128 for the purposes of generating constants during instruction selection. llvm-svn: 154113	2012-04-05 18:50:32 +00:00
Pete Cooper	4f727ef169	REG_SEQUENCE expansion to COPY instructions wasn't taking account of sub register indices on the source registers. No simple test case llvm-svn: 154051	2012-04-04 21:03:25 +00:00
Pete Cooper	8d002ed0bb	f16 FREM can now be legalized by promoting to f32 llvm-svn: 154039	2012-04-04 19:36:31 +00:00
Jakob Stoklund Olesen	f0c39f0a1e	Remove spurious debug output. llvm-svn: 154032	2012-04-04 18:23:38 +00:00
Rafael Espindola	88a1aeb123	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Craig Topper	98fc96208f	Remove default case from switch that was already covering all cases. llvm-svn: 153996	2012-04-04 04:42:42 +00:00
Pete Cooper	702973d27c	Removed useless switch for default case when switch was covering all the enum values llvm-svn: 153984	2012-04-04 00:53:04 +00:00
Pete Cooper	4164f86b8a	Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELECT as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095> llvm-svn: 153976	2012-04-03 22:57:55 +00:00
Pete Cooper	983fc686b4	Removed one last bad continue statement meant to be removed in r153914. llvm-svn: 153975	2012-04-03 22:18:49 +00:00
Chad Rosier	e7870c71eb	Fix an issue in SimplifySetCC() specific to vector comparisons. When folding X == X we need to check getBooleanContents() to determine if the result is a vector of ones or a vector of negative ones. I tried creating a test case, but the problem seems to only be exposed on a much older version of clang (around r144500). rdar://10923049 llvm-svn: 153966	2012-04-03 20:11:24 +00:00
Eric Christopher	53ef0cf4a5	Fix thinko check for number of operands to be the one that actually might have more than 19 operands. Add a testcase to make sure I never screw that up again. Part of rdar://11026482 llvm-svn: 153961	2012-04-03 17:55:42 +00:00
Eric Christopher	ba40985484	Add a line number for the scope of the function (starting at the first brace) so that we get more accurate line number information about the declaration of a given function and the line where the function first starts. Part of rdar://11026482 llvm-svn: 153916	2012-04-03 00:43:49 +00:00
Pete Cooper	fb86d3b6bc	Fixes to r153903. Added missing explanation of behaviour when the VirtRegMap is NULL. Also changed it in this case to just avoid updating the map, but live ranges or intervals will still get updated and created llvm-svn: 153914	2012-04-03 00:28:46 +00:00
Pete Cooper	426b167bc5	Moved LiveRangeEdit.h so that it can be called from other parts of the backend, not just libCodeGen llvm-svn: 153906	2012-04-02 22:44:18 +00:00
Jakob Stoklund Olesen	97f47c37b6	Allocate virtual registers in ascending order. This is just the fallback tie-breaker ordering, the main allocation order is still descending size. Patch by Shamil Kurmangaleev! llvm-svn: 153904	2012-04-02 22:30:39 +00:00
Pete Cooper	a76a82ef6f	Refactored the LiveRangeEdit interface so that MachineFunction, TargetInstrInfo, MachineRegisterInfo, LiveIntervals, and VirtRegMap are all passed into the constructor and stored as members instead of passed in to each method. llvm-svn: 153903	2012-04-02 22:22:53 +00:00
Owen Anderson	157487e7c5	Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do. llvm-svn: 153901	2012-04-02 22:10:29 +00:00
Lang Hames	dbc3175c89	During two-address lowering, rescheduling an instruction does not untie operands. Make TryInstructionTransform return false to reflect this. Fixes PR11861. llvm-svn: 153892	2012-04-02 19:58:43 +00:00
Eric Christopher	6c4e6016b5	Turn on the accelerator tables for Darwin. llvm-svn: 153880	2012-04-02 17:58:52 +00:00
Nadav Rotem	a9ec0e024f	Optimizing swizzles of complex shuffles may generate additional complex shuffles. Do not try to optimize swizzles of shuffles if the source shuffle has more than a single user, except when the source shuffle is also a swizzle. llvm-svn: 153864	2012-04-02 07:11:12 +00:00
Craig Topper	dbc259a436	Make MCInstrInfo available to the MCInstPrinter. This will be used to remove getInstructionName and the static data it contains since the same tables are already in MCInstrInfo. llvm-svn: 153860	2012-04-02 06:09:36 +00:00
Nadav Rotem	2729f54295	This commit contains a few changes that had to go in together. 1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) (and also scalar_to_vector). 2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src). Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B)) 3. Optimize swizzles of shuffles: shuff(shuff(x, y), undef) -> shuff(x, y). 4. Fix an X86ISelLowering optimization which was very bitcast-sensitive. Code which was previously compiled to this: movd (%rsi), %xmm0 movdqa .LCPI0_0(%rip), %xmm2 pshufb %xmm2, %xmm0 movd (%rdi), %xmm1 pshufb %xmm2, %xmm1 pxor %xmm0, %xmm1 pshufb .LCPI0_1(%rip), %xmm1 movd %xmm1, (%rdi) ret Now compiles to this: movl (%rsi), %eax xorl %eax, (%rdi) ret llvm-svn: 153848	2012-04-01 19:31:22 +00:00
Lang Hames	44174d3b7a	Fix typo. llvm-svn: 153846	2012-04-01 19:27:25 +00:00
Andrew Trick	31337c9d64	misched: Add finalizeScheduler to complete the target interface. llvm-svn: 153827	2012-04-01 07:24:23 +00:00
Rafael Espindola	2da83dbcd4	Teach CodeGen's version of computeMaskedBits to understand the range metadata. This is the CodeGen equivalent of r153747. I tested that there is no noticeable performance difference with any combination of -O0/-O2/-g when compiling gcc as a single compilation unit. llvm-svn: 153817	2012-03-31 18:14:00 +00:00
Bill Wendling	c6f065c054	If we have a VLA that has a "use" in a metadata node that's then used here but it has no other uses, then we have a problem. E.g., int foo (const int x) { char a[x]; return 0; } If we assign 'a' a vreg and fast isel later on has to use the selection DAG isel, it will want to copy the value to the vreg. However, there are no uses, which goes counter to what selection DAG isel expects. <rdar://problem/11134152> llvm-svn: 153705	2012-03-30 00:02:55 +00:00
Eric Christopher	469ec18341	Add support for objc property decls according to the page at: http://llvm.org/docs/SourceLevelDebugging.html#objcproperty including type and DECL. Expand the metadata needed accordingly. rdar://11144023 llvm-svn: 153639	2012-03-29 08:42:56 +00:00
Jakob Stoklund Olesen	753b1e33e0	Enable machine code verification in the entire code generator. Some targets still mess up the liveness information, but that isn't verified after MRI->invalidateLiveness(). The verifier can still check other useful things like register classes and CFG, so it should be enabled after all passes. llvm-svn: 153615	2012-03-28 23:54:28 +00:00
Jakob Stoklund Olesen	4b4ee58c4c	Enable machine code verification after PreSched2 passes. The late scheduler depends on accurate liveness information if it is breaking anti-dependencies, so we should be able to verify it. Relax the terminator checking in the machine code verifier so it can handle the basic blocks created by if conversion. llvm-svn: 153614	2012-03-28 23:31:15 +00:00
Jakob Stoklund Olesen	6ce4ff3ff7	Also verify after ExpandPostRAPseudos. llvm-svn: 153599	2012-03-28 20:49:30 +00:00
Jakob Stoklund Olesen	f5df00f0fb	Enable machine code verification after the late machine optimization passes. Branch folding invalidates liveness and disables liveness verification on some targets. llvm-svn: 153597	2012-03-28 20:47:37 +00:00
Jakob Stoklund Olesen	37927fe83c	Skip liveness verification when MRI->tracksLiveness() is false. Extract the liveness verification into its own method. This makes it possible to run the machine code verifier after liveness information is no longer required to be valid. llvm-svn: 153596	2012-03-28 20:47:35 +00:00
Jakob Stoklund Olesen	7c56ad07a6	Allow removeLiveIn to be called with a register that isn't live-in. This avoids the silly double search: if (isLiveIn(Reg)) removeLiveIn(Reg); llvm-svn: 153592	2012-03-28 20:11:42 +00:00
Pete Cooper	3bf62a9db3	Fixed commuteInstructions bug where if it's called pre-regalloc the subreg indices weren't commuted llvm-svn: 153579	2012-03-28 17:02:22 +00:00
Eric Christopher	0f1d5b363c	More debug output. llvm-svn: 153571	2012-03-28 07:34:36 +00:00
Eric Christopher	57ec9a8587	Fix the output of the DW_TAG_friend tag to include DW_AT_friend and not the rest of the member tag. Fixes PR11695 llvm-svn: 153570	2012-03-28 07:34:31 +00:00
Lang Hames	6c935e3700	Use a SmallVector and linear lookup instead of a DenseSet - SourceMap values will always be tiny sets, so DenseSet is overkill (SmallSet won't work as we need iteration support). llvm-svn: 153529	2012-03-27 19:10:45 +00:00
Eric Christopher	5f34828440	Use DW_AT_low_pc for a single entry point into a routine. Fixes PR10105 llvm-svn: 153524	2012-03-27 18:35:54 +00:00
Jakob Stoklund Olesen	21fd015586	Print SSA and liveness tracking flags in MF::print(). llvm-svn: 153518	2012-03-27 17:17:16 +00:00
Jakob Stoklund Olesen	01ae333053	Branch folding may invalidate liveness. Branch folding can use a register scavenger to update liveness information when required. Don't do that if liveness information is already invalid. llvm-svn: 153517	2012-03-27 17:06:09 +00:00
Chris Lattner	bab5825de7	fix what looks like a real logic bug, found by PVS-Studio (part of PR12357) llvm-svn: 153513	2012-03-27 16:27:21 +00:00
Jakob Stoklund Olesen	7ba0f121e5	Add an MRI::tracksLiveness() flag. Late optimization passes like branch folding and tail duplication can transform the machine code in a way that makes it expensive to keep the register liveness information up to date. There is a fuzzy line between register allocation and late scheduling where the liveness information degrades. The MRI::tracksLiveness() flag makes the line clear: While true, liveness information is accurate, and can be used for register scavenging. Once the flag is false, liveness information is not accurate, and can only be used as a hint. Late passes generally don't need the liveness information, but they will sometimes use the register scavenger to help update it. The scavenger enforces strict correctness, and we have to spend a lot of code to update register liveness that may never be used. llvm-svn: 153511	2012-03-27 15:13:58 +00:00
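
The liveness-tracking commits above (r153511, r153592, r153596) describe a flag that late passes can consult before spending effort on liveness updates. A minimal sketch of that idea follows. `ToyRegInfo`, `ToyBlock`, `updateLiveInsAfterTransform`, and `countLiveIns` are hypothetical stand-ins written for illustration, not LLVM's classes; only the method names `tracksLiveness()`, `invalidateLiveness()`, and `removeLiveIn()` are taken from the commit messages.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical stand-in for per-function register info (not LLVM's class).
class ToyRegInfo {
  bool TracksLiveness = true;

public:
  // While true, liveness information is assumed accurate.
  bool tracksLiveness() const { return TracksLiveness; }
  // After this call, liveness may only be treated as a hint.
  void invalidateLiveness() { TracksLiveness = false; }
};

// Hypothetical basic block holding a set of live-in register numbers.
struct ToyBlock {
  std::set<unsigned> LiveIns;

  // Removing a register that is not live-in is a harmless no-op, so callers
  // do not need a separate isLiveIn() check first.
  void removeLiveIn(unsigned Reg) { LiveIns.erase(Reg); }
};

// A late pass updates live-ins only while liveness is still tracked.
// Returns true if the update was performed.
bool updateLiveInsAfterTransform(const ToyRegInfo &RI, ToyBlock &BB,
                                 unsigned DeadReg) {
  if (!RI.tracksLiveness())
    return false; // Liveness is only a hint now; skip the bookkeeping.
  BB.removeLiveIn(DeadReg);
  return true;
}

// A strict consumer (in the spirit of a register scavenger) insists on
// accurate liveness before reading it.
std::size_t countLiveIns(const ToyRegInfo &RI, const ToyBlock &BB) {
  assert(RI.tracksLiveness() && "liveness must be valid for strict users");
  return BB.LiveIns.size();
}
```

In this sketch, a pass that degrades liveness would call `invalidateLiveness()` once, and every later pass could skip its update code by checking the flag rather than maintaining data that may never be read.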


13774 Commits