mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

14033 Commits

Author SHA1 Message Date
Bill Wendling
06a580b5da For non-Darwin platforms, we want to generate stack protectors only for
character arrays. This is in line with what GCC does.
<rdar://problem/10529227>

llvm-svn: 161446
2012-08-07 20:59:05 +00:00
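As a hedged illustration of the policy above (not code from the patch; the helper name is hypothetical): with stack protectors enabled on a non-Darwin target, only the function containing a character array would be expected to receive a guard.

  extern void use(void *buf);

  void with_char_buffer() {
    char buf[64]; // character array => expected to get a stack protector
    use(buf);
  }

  void with_int_buffer() {
    int buf[64];  // no character array => expected to be left unprotected
    use(buf);
  }
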
Jakob Stoklund Olesen
301af79343 Add a new kind of MachineOperand: MO_TargetIndex.
A target index operand looks a lot like a constant pool reference, but
it is completely target-defined. It contains the 8-bit TargetFlags, a
32-bit index, and a 64-bit offset. It is preserved by all code generator
passes.

TargetIndex operands can be used to carry target-specific information in
cases where immediate operands won't suffice.

llvm-svn: 161441
2012-08-07 18:56:39 +00:00
Jakob Stoklund Olesen
8836660866 Fix a couple of typos.
llvm-svn: 161437
2012-08-07 18:32:57 +00:00
Jakob Stoklund Olesen
438bc30c3d Add trace accessor methods, implement primitive if-conversion heuristic.
Compare the critical paths of the two traces through an if-conversion
candidate. If the difference is larger than the branch prediction
penalty, reject the if-conversion. It would never pay.

llvm-svn: 161433
2012-08-07 18:02:19 +00:00
Chandler Carruth
49d4e3f282 Add a much more conservative strategy for aligning branch targets.
Previously, MBP essentially aligned every branch target it could. This
bloats code quite a bit, especially non-looping code which has no real
reason to prefer aligned branch targets so heavily.

As Andy said in review, it's still a bit odd to do this without a real
cost model, but this at least has much more plausible heuristics.

Fixes PR13265.

llvm-svn: 161409
2012-08-07 09:45:24 +00:00
Manman Ren
5d43c19d9e MachineCSE: Update the heuristics for isProfitableToCSE.
If the result of a common subexpression is used at all uses of the candidate
expression, CSE should not increase the live range of the common subexpression.

rdar://11393714 and rdar://11819721

llvm-svn: 161396
2012-08-07 06:16:46 +00:00
Jakob Stoklund Olesen
d5b3babd6f Delete a dead variable.
TwoAddressInstructionPass doesn't remat any more.

llvm-svn: 161285
2012-08-04 00:04:03 +00:00
Jakob Stoklund Olesen
69611c470c TwoAddressInstructionPass refactoring: Extract another method.
llvm-svn: 161284
2012-08-03 23:57:58 +00:00
Bob Wilson
9f6e25017a Refactor and check "onlyReadsMemory" before optimizing builtins.
This patch is mostly just refactoring a bunch of copy-and-pasted code, but
it also adds a check that the call instructions are readnone or readonly.
That check was already present for sin, cos, sqrt, log2, and exp2 calls, but
it was missing for the rest of the builtins being handled in this code.

llvm-svn: 161282
2012-08-03 23:29:17 +00:00
Jakob Stoklund Olesen
7713568689 TwoAddressInstructionPass refactoring: Extract a method.
No functional change intended, except replacing a DenseMap with a
SmallDenseMap which should behave identically.

llvm-svn: 161281
2012-08-03 23:25:45 +00:00
Jakob Stoklund Olesen
643fdb449e Begin adding support for updating LiveIntervals in TwoAddressInstructionPass.
This is far from complete, and only changes behavior when the
-early-live-intervals flag is passed to llc.

llvm-svn: 161273
2012-08-03 22:58:34 +00:00
Jakob Stoklund Olesen
14c88af5f2 Add an experimental -early-live-intervals option.
This option runs LiveIntervals before TwoAddressInstructionPass which
will eventually learn to exploit and update the analysis.

Eventually, LiveIntervals will run before PHIElimination, and we can get
rid of LiveVariables.

llvm-svn: 161270
2012-08-03 22:12:54 +00:00
Jakob Stoklund Olesen
1e192f98dd Delete merged physreg copies in joinReservedPhysReg().
Previously, the identity copy would survive through register allocation
before it was removed by the rewriter.

llvm-svn: 161269
2012-08-03 22:12:51 +00:00
Bob Wilson
7e2fb62620 Try to reduce the compile time impact of r161232.
The previous change caused fast isel to not attempt handling any calls to
builtin functions.  That included things like "printf" and caused some
noticeable regressions in compile time.  I wanted to avoid having fast isel
keep a separate list of functions that had to be kept in sync with what the
code in SelectionDAGBuilder.cpp was handling.  I've resolved that here by
moving the list into TargetLibraryInfo.  This is somewhat redundant in
SelectionDAGBuilder but it will ensure that we keep things consistent.

llvm-svn: 161263
2012-08-03 21:26:24 +00:00
Bob Wilson
7d92f01d57 Fix memcmp code-gen to honor -fno-builtin.
I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp
in TargetLibraryInfo, so that it would use custom code for memcmp calls even
with -fno-builtin.  I also had to add a new -disable-simplify-libcalls option
to llc so that I could write a test for this.

llvm-svn: 161262
2012-08-03 21:26:18 +00:00
Jakob Stoklund Olesen
bf7a000191 Completely eliminate VNInfo flags.
The 'unused' state of a value number can be represented as an invalid
def SlotIndex. This also exposed code that shouldn't have been looking
at unused value VNInfos.

llvm-svn: 161258
2012-08-03 20:59:32 +00:00
Jakob Stoklund Olesen
9af03604de Fix a couple of loops that were processing unused value numbers.
Unused VNInfos should be left alone. Their def SlotIndex doesn't point
to anything.

llvm-svn: 161257
2012-08-03 20:59:29 +00:00
Matt Beaumont-Gay
acc7302f80 Silence unused variable warning in -asserts build
llvm-svn: 161256
2012-08-03 20:54:11 +00:00
Jakob Stoklund Olesen
38aec46f18 Eliminate the VNInfo::hasPHIKill() flag.
The only real user of the flag was removeCopyByCommutingDef(), and it
has been switched to LiveIntervals::hasPHIKill().

All the code changed by this patch was only concerned with computing and
propagating the flag.

llvm-svn: 161255
2012-08-03 20:19:44 +00:00
Jakob Stoklund Olesen
74dfaec8d7 Make the hasPHIKills flag a computed property.
The VNInfo::HAS_PHI_KILL flag is only half supported. We precompute it in
LiveIntervalAnalysis, but it isn't properly updated by live range
splitting and functions like shrinkToUses().

It is only used in one place: RegisterCoalescer::removeCopyByCommutingDef().

This patch changes that function to use a new LiveIntervals::hasPHIKill()
function that computes the flag for a given value number.

llvm-svn: 161254
2012-08-03 20:10:24 +00:00
Jakob Stoklund Olesen
df448a4b76 Delete dead function.
llvm-svn: 161242
2012-08-03 15:21:21 +00:00
Jakob Stoklund Olesen
95dc0b7b93 Don't delete dead code in TwoAddressInstructionPass.
This functionality was added before we started running
DeadMachineInstructionElim on all targets. It serves no purpose now.

llvm-svn: 161241
2012-08-03 15:11:57 +00:00
Bob Wilson
d1eefbeac2 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

llvm-svn: 161232
2012-08-03 04:06:28 +00:00
Manman Ren
2c236a30f6 X86 Peephole: fold loads to the source register operand if possible.
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.

rdar://10554090 and rdar://11873276

llvm-svn: 161207
2012-08-02 19:37:32 +00:00
Jakob Stoklund Olesen
a299894307 Compute the critical path length through a trace.
Whenever both instruction depths and instruction heights are known in a
block, it is possible to compute the length of the critical path as
max(depth+height) over the instructions in the block.

The stored live-in lists make it possible to accurately compute the
length of a critical path that bypasses the current (small) block.

llvm-svn: 161197
2012-08-02 18:45:54 +00:00
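A minimal standalone sketch of the computation described in the commit above (illustrative only, not the MachineTraceMetrics API): given per-instruction depths and heights in cycles, the critical path length through the block is max(depth + height).

  #include <algorithm>
  #include <vector>

  struct InstrCycles {
    unsigned Depth;  // cycles from the start of the trace until this instruction issues
    unsigned Height; // cycles from issue until the end of the trace
  };

  unsigned criticalPathLength(const std::vector<InstrCycles> &Block) {
    unsigned Len = 0;
    for (const InstrCycles &IC : Block)
      Len = std::max(Len, IC.Depth + IC.Height);
    return Len;
  }
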
Jakob Stoklund Olesen
4be65d0d0c Verify regunit intervals along with virtreg intervals.
Don't cause regunit intervals to be computed just to verify them. Only
check the already cached intervals.

llvm-svn: 161183
2012-08-02 16:36:50 +00:00
Jakob Stoklund Olesen
ac79668cdf Avoid creating dangling physreg live ranges during DCE.
LiveRangeEdit::eliminateDeadDefs() can delete a dead instruction that
reads unreserved physregs. This would leave the corresponding regunit
live interval dangling because we don't have shrinkToUses() for physical
registers.

Fix this problem by turning the instruction into a KILL instead of
deleting it. This happens in a landing pad in
test/CodeGen/X86/2012-05-19-CoalescerCrash.ll:

  %vreg27<def,dead> = COPY %EDX<kill>; GR32:%vreg27

becomes:

  KILL %EDX<kill>

An upcoming fix to the machine verifier will catch problems like this by
verifying regunit live intervals.

This fixes PR13498. I am not including the test case from the PR since
we already have one exposing the problem once the verifier is fixed.

llvm-svn: 161182
2012-08-02 16:36:47 +00:00
Jakob Stoklund Olesen
802cd7688f Add report() functions that take a LiveInterval argument.
llvm-svn: 161178
2012-08-02 14:31:49 +00:00
Manman Ren
78b8d454cc X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions, so folding
is possible at the peephole stage while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276

llvm-svn: 161152
2012-08-02 00:56:42 +00:00
Jakob Stoklund Olesen
5848ce22b0 Extract some methods from verifyLiveIntervals.
No functional change.

llvm-svn: 161149
2012-08-02 00:20:20 +00:00
Jakob Stoklund Olesen
85840cd877 Also verify RegUnit intervals at uses.
llvm-svn: 161147
2012-08-01 23:52:40 +00:00
Jakob Stoklund Olesen
b223807a3d Compute instruction heights through a trace.
The height of an instruction is the minimum number of cycles from when the
instruction is issued to the end of the trace. Heights are computed for
all instructions in and below the trace center block.

The method for computing heights is different from the depth
computation. As we visit instructions in the trace bottom-up, heights of
used instructions are pushed upwards. This way, we avoid scanning long
use lists, looking for uses in the current trace.

At each basic block boundary, a list of live-in registers and their
minimum heights is saved in the trace block info. These live-in lists
are used when restarting depth computations on a trace that
converges with an already computed trace. They will also be used to
accurately compute the critical path length.

llvm-svn: 161138
2012-08-01 22:36:00 +00:00
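A rough standalone sketch of the bottom-up pass described in the commit above (simplified data layout, not the actual implementation): each instruction pushes its height, plus the producer's latency, up to the instructions it reads, so use lists never have to be scanned.

  #include <algorithm>
  #include <vector>

  struct Instr {
    std::vector<unsigned> Operands; // indices of earlier defining instructions
    unsigned Latency;               // def-to-use latency charged to the producer
    unsigned Height = 0;            // min cycles from issue to the end of the trace
  };

  // Instructions are stored in program order; visit them bottom-up.
  void computeHeights(std::vector<Instr> &Trace) {
    for (auto I = Trace.rbegin(); I != Trace.rend(); ++I)
      for (unsigned DefIdx : I->Operands) {
        Instr &Def = Trace[DefIdx];
        Def.Height = std::max(Def.Height, I->Height + Def.Latency);
      }
  }
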
Eric Christopher
dde7784606 Temporarily revert c23b933d5f8be9b51a1d22e717c0311f65f87dcd. It's causing
failures in the debug testsuite and possibly PR13486.

llvm-svn: 161121
2012-08-01 18:19:01 +00:00
Jakob Stoklund Olesen
beae25c46c Add DataDep constructors. Explicitly check SSA form.
llvm-svn: 161115
2012-08-01 16:02:59 +00:00
Elena Demikhovsky
0fec7026d9 Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Manman Ren
b9e9c55911 MachineSink: Sort the successors before trying to find SuccToSinkTo.
Use stable_sort instead of sort. Follow-up to r161062.

rdar://11980766

llvm-svn: 161075
2012-07-31 20:45:38 +00:00
Jakob Stoklund Olesen
3dc189f67b Compute instruction depths through the current trace.
Assuming infinite issue width, compute the earliest cycle at which each
instruction in the trace can issue, considering the latency of data
dependencies. The issue cycle is recorded as a 'depth' from the beginning of
the trace.

This is half the computation required to find the length of the critical
path through the trace. Heights are next.

llvm-svn: 161074
2012-07-31 20:44:38 +00:00
Jakob Stoklund Olesen
b2c7febf35 Rename CT -> MTM. MachineTraceMetrics is abbreviated MTM.
llvm-svn: 161072
2012-07-31 20:25:13 +00:00
Manman Ren
3d7e85d5b8 MachineSink: Sort the successors before trying to find SuccToSinkTo.
One motivating example is to sink an instruction from a basic block which has
two successors: one outside the loop, the other inside the loop. We should try
to sink the instruction outside the loop.

rdar://11980766

llvm-svn: 161062
2012-07-31 18:10:39 +00:00
Micah Villmow
122d115419 Conform to LLVM coding style.
llvm-svn: 161061
2012-07-31 18:07:43 +00:00
Micah Villmow
8b7fb5e605 Don't generate ordered or unordered comparison operations if it is not legal to do so.
llvm-svn: 161053
2012-07-31 16:48:03 +00:00
Jakob Stoklund Olesen
ed1a4d695a Clear kill flags in removeCopyByCommutingDef().
We are extending live ranges, so kill flags are not accurate. They
aren't needed until they are recomputed after RA anyway.

<rdar://problem/11950722>

llvm-svn: 161023
2012-07-31 02:47:24 +00:00
Manman Ren
3769ac64a6 Reverse order of the two branches at end of a basic block if it is profitable.
We branch to the successor with higher edge weight first.
Convert from
     je    LBB4_8  --> to outer loop
     jmp   LBB4_14 --> to inner loop
to
     jne   LBB4_14
     jmp   LBB4_8

PR12750
rdar://11393714

llvm-svn: 161018
2012-07-31 01:11:07 +00:00
Andrew Trick
2773dbddf3 Use the latest MachineRegisterInfo APIs. No functionality.
llvm-svn: 161010
2012-07-30 23:48:17 +00:00
Andrew Trick
dc8c12f62b Inline MachineRegisterInfo::hasOneUse
llvm-svn: 161007
2012-07-30 23:48:12 +00:00
Jakob Stoklund Olesen
12a22d4f0f Avoid looking at stale data in verifyAnalysis().
llvm-svn: 161004
2012-07-30 23:15:12 +00:00
Jakob Stoklund Olesen
a9e1543215 Allow traces to enter nested loops.
This lets traces include the final iteration of a nested loop above the
center block, and the first iteration of a nested loop below the center
block.

We still don't allow traces to contain backedges, and traces are
truncated where they would leave a loop, as seen from the center block.

llvm-svn: 161003
2012-07-30 23:15:10 +00:00
Jakob Stoklund Olesen
e9523d88c3 Clarify invalidation strategy in comment.
llvm-svn: 160997
2012-07-30 21:16:22 +00:00
Jakob Stoklund Olesen
2a129fa92c Assert that all trace candidate blocks have been visited by the PO.
When computing a trace, all the candidates for pred/succ must have been
visited. Filter out back-edges first, though. The PO traversal ignores
them.

Thanks to Andy for spotting this in review.

llvm-svn: 160995
2012-07-30 21:10:27 +00:00
Jakob Stoklund Olesen
d91215215c Hook into PassManager's analysis verification.
Overriding Pass::verifyAnalysis() lets the pass manager verify the pass
contents.

llvm-svn: 160994
2012-07-30 20:57:50 +00:00
Pete Cooper
e45da564cf Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd
llvm-svn: 160987
2012-07-30 20:23:19 +00:00
Jakob Stoklund Olesen
7bbe0b0328 Add MachineInstr::isTransient().
This is a cleaned up version of the isFree() function in
MachineTraceMetrics.cpp.

Transient instructions are very unlikely to produce any code in the
final output, either because they get eliminated by RegisterCoalescing
or because they are pseudo-instructions like labels and debug values.

llvm-svn: 160977
2012-07-30 18:34:14 +00:00
Jakob Stoklund Olesen
c3b8765d57 Add MachineTraceMetrics::verify().
This function verifies the consistency of cached data in the
MachineTraceMetrics analysis.

llvm-svn: 160976
2012-07-30 18:34:11 +00:00
Jakob Stoklund Olesen
6653a31973 Verify that the CFG hasn't changed during invalidate().
The MachineTraceMetrics analysis must be invalidated before modifying
the CFG. This will catch some of the violations of that rule.

llvm-svn: 160969
2012-07-30 17:36:49 +00:00
Jakob Stoklund Olesen
4f3254f73c Add MachineBasicBlock::isPredecessor().
A->isPredecessor(B) is the same as B->isSuccessor(A), but it can
tolerate a B that is null or dangling. This shouldn't happen normally,
but it is useful for verification code.

llvm-svn: 160968
2012-07-30 17:36:47 +00:00
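A sketch of why the query above is verification-friendly (hypothetical Block type, not LLVM's MachineBasicBlock): only A's own predecessor list is walked and B is merely compared as a pointer, never dereferenced, so a null or dangling B is tolerated.

  #include <vector>

  struct Block {
    std::vector<Block *> Preds; // predecessor list owned by this block
  };

  // Returns true if B appears in A's predecessor list. B is only compared,
  // never dereferenced.
  bool isPredecessor(const Block &A, const Block *B) {
    for (const Block *Pred : A.Preds)
      if (Pred == B)
        return true;
    return false;
  }
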
Manman Ren
ceef7c4d9b Revert r160920 and r160919 due to dragonegg and clang selfhost failure
llvm-svn: 160927
2012-07-29 02:44:09 +00:00
Manman Ren
ea77f9076b X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions, so folding
is possible at the peephole stage while not possible at ISel.

rdar://10554090 and rdar://11873276

llvm-svn: 160919
2012-07-28 16:48:01 +00:00
Andrew Trick
0320969afa Reenable a basic SSA DAG builder optimization.
Jakob fixed ProcessImplicitDefs in r159149.

llvm-svn: 160910
2012-07-28 01:48:15 +00:00
Jakob Stoklund Olesen
0eacb18967 Add more debug output to MachineTraceMetrics.
llvm-svn: 160905
2012-07-27 23:58:38 +00:00
Jakob Stoklund Olesen
fefd43f7a9 Keep track of the head and tail of the trace through each block.
This makes it possible to quickly detect blocks that are outside the
trace.

llvm-svn: 160904
2012-07-27 23:58:36 +00:00
Eric Christopher
acd91c534d Add a DW_AT_high_pc for CUs that are a single address range. Update
all tests accordingly.

Fixes PR13351.

Patch by shinichiro hamaji!

llvm-svn: 160899
2012-07-27 22:00:05 +00:00
Jakob Stoklund Olesen
88319a3e66 Also compute register mask lists under -new-live-intervals.
llvm-svn: 160898
2012-07-27 21:56:39 +00:00
Jakob Stoklund Olesen
8e957f3c0b Eliminate the IS_PHI_DEF flag and VNInfo::setIsPHIDef().
A value number is a PHI def if and only if it begins at a block
boundary. This can be derived from the def slot, a separate flag is not
necessary.

llvm-svn: 160893
2012-07-27 21:11:14 +00:00
Jakob Stoklund Olesen
d60f4942e6 Add a -new-live-intervals experimental option.
This option replaces the existing live interval computation with one
based on LiveRangeCalc.cpp. The new algorithm does not depend on
LiveVariables, and it can be run at any time, before or after leaving
SSA form.

llvm-svn: 160892
2012-07-27 20:58:46 +00:00
Jakob Stoklund Olesen
03a59af504 Add <imp-def> of super-register when lowering SUBREG_TO_REG.
Patch by Tyler Nowicki!

llvm-svn: 160888
2012-07-27 20:19:49 +00:00
Jakob Stoklund Olesen
f953068467 Use an otherwise unused variable.
llvm-svn: 160798
2012-07-26 19:42:56 +00:00
Jakob Stoklund Olesen
0d3c0a9aea Start scaffolding for a MachineTraceMetrics analysis pass.
This is still a work in progress.

Out-of-order CPUs usually execute instructions from multiple basic
blocks simultaneously, so it is necessary to look at longer traces when
estimating the performance effects of code transformations.

The MachineTraceMetrics analysis will pick a typical trace through a
given basic block and provide performance metrics for the trace. Metrics
will include:

- Instruction count through the trace.
- Issue count per functional unit.
- Critical path length, and per-instruction 'slack'.

These metrics can be used to determine the performance limiting factor
when executing the trace, and how it will be affected by a code
transformation.

Initially, this will be used by the early if-conversion pass.

llvm-svn: 160796
2012-07-26 18:38:11 +00:00
Dan Gohman
7ff5ef1757 Add a floor intrinsic.
llvm-svn: 160791
2012-07-26 17:43:27 +00:00
Manman Ren
6b3550a998 Disable rematerialization in TwoAddressInstructionPass.
It is redundant; RegisterCoalescer will do the remat if it can't eliminate
the copy. I collected instruction counts before and after this change. A few extra
instructions are generated due to spilling but it is normal to see these kinds
of changes with almost any small codegen change, according to Jakob.

This also fixed rdar://11830760 where xor is expected instead of movi0.

llvm-svn: 160749
2012-07-25 18:28:13 +00:00
Jakob Stoklund Olesen
e470961b23 Preserve 2-addr constraints in ConnectedVNInfoEqClasses.
When a live range splits into multiple connected components, we would
arbitrarily assign <undef> uses to component 0. This is wrong when the
use is tied to a def that gets assigned to a different component:

  %vreg69<def> = ADD8ri %vreg68<undef>, 1

The use and def must get the same virtual register.

Fix this by assigning <undef> uses to the same component as the value
defined by the instruction, if any:

  %vreg69<def> = ADD8ri %vreg69<undef>, 1

This fixes PR13402. The PR has a test case which I am not including
because it is unlikely to keep exposing this behavior in the future.

llvm-svn: 160739
2012-07-25 17:15:15 +00:00
Jakob Stoklund Olesen
45755341ad Verify two-address constraints more carefully.
Include <undef> operands and virtual registers after leaving SSA form.

llvm-svn: 160734
2012-07-25 16:49:11 +00:00
Craig Topper
227c1316f4 Change llvm_unreachable in SplitVectorOperand to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type.
llvm-svn: 160661
2012-07-24 04:11:21 +00:00
Sylvestre Ledru
bf8acb65ac Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Nadav Rotem
180a9e3758 Fixed DAGCombine optimizations which generate select_cc for targets
that do not support it (X86 does not lower select_cc).

PR: 13428

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160619
2012-07-23 07:59:50 +00:00
Craig Topper
7bb456e013 Tidy up. Fix indentation and remove trailing whitespace.
llvm-svn: 160617
2012-07-23 05:38:07 +00:00
Craig Topper
63f980aea7 Change llvm_unreachable in SplitVectorResult to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type. For instance 256-bit AVX intrinsics without having AVX enabled.
llvm-svn: 160616
2012-07-23 04:34:49 +00:00
Benjamin Kramer
87e459b047 Remove unused private member variables uncovered by the recent changes to clang's -Wunused-private-field.
llvm-svn: 160583
2012-07-20 22:05:57 +00:00
Jakob Stoklund Olesen
e3c4840e77 Avoid folding loads that are unsafe to move.
LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.

This fixes PR13414.

llvm-svn: 160575
2012-07-20 21:29:31 +00:00
Jakob Stoklund Olesen
3eb650542a Split loop exiting edges more aggressively.
PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved because the phi-use register is
live in the critical edge anyway.

Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This keeps the necessary
copies out of the loop, which is still an improvement over injecting the
copies into the loop.

The test case demonstrates the improvement. Before:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  movl  %esi, %eax
  je  LBB0_1

After:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  je  LBB0_1

  movl  %esi, %eax

llvm-svn: 160571
2012-07-20 20:49:53 +00:00
Pete Cooper
4a544942c5 Fix crash in machine verifier when trying to print the def of a register which has no def
llvm-svn: 160531
2012-07-19 23:40:38 +00:00
Benjamin Kramer
ec66856001 Replace some explicit compare loops with std::equal.
No functionality change.

llvm-svn: 160501
2012-07-19 10:46:05 +00:00
Galina Kistanova
c4a2a7cce5 Fixed a few warnings.
llvm-svn: 160493
2012-07-19 04:50:12 +00:00
Bill Wendling
343eebdbe4 Remove tabs.
llvm-svn: 160475
2012-07-19 00:04:14 +00:00
Chandler Carruth
5d1c4f0605 Fix a somewhat nasty crasher in PR13378. This crashes inside of
LiveIntervals due to the two-addr pass generating bogus MI code.

The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address form code. Thus, the assertion only fired much
later in the pipeline.

The fix is to hoist the transformation logic up a layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.

As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]

Thanks to Jakob for helpful hints on the way here, and the review.

llvm-svn: 160443
2012-07-18 18:58:22 +00:00
Nuno Lopes
99e140d517 ignore 'invoke @llvm.donothing', but still keep the edge to the continuation BB
llvm-svn: 160411
2012-07-18 00:07:17 +00:00
Evan Cheng
5e82ad04d5 Back out r160101 and instead implement a dag combine to recover from the instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
Jakob Stoklund Olesen
fb9eb735f6 Add some trace output to TwoAddressInstructionPass.
llvm-svn: 160380
2012-07-17 17:57:23 +00:00
Benjamin Kramer
0d26646425 Remove unused variable.
llvm-svn: 160372
2012-07-17 17:00:11 +00:00
Nadav Rotem
9df24d20a6 Fix a crash in the legalization of large vectors.
When truncating the result of a vector that is split, we need
to use the result of the split vector and not re-split the dead node.

llvm-svn: 160357
2012-07-17 09:07:37 +00:00
Evan Cheng
f84dd0cf40 Implement r160312 as a target-independent DAG combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng
302a948c17 Make sure constant bitwidth is <= 64 bit before calling getSExtValue().
llvm-svn: 160350
2012-07-17 07:47:50 +00:00
Evan Cheng
0b6bcb6e06 This is another case where instcombine's demanded-bits optimization created
large immediates. Add DAG combine logic to recover in case the large
immediate doesn't fit in the cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
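A quick brute-force sanity check (not part of the patch) of the algebra SimplifySetCC relies on here, restricted to the 8-bit-shift case for brevity: (X & -256) == 256 holds exactly when (X >> 8) == 1 for unsigned X.

  #include <cassert>
  #include <cstdint>

  int main() {
    for (uint32_t x = 0; x <= 0x3FFFF; ++x) {
      bool masked  = (x & ~0xFFu) == 0x100u; // (X & -256) == 256
      bool shifted = (x >> 8) == 1u;         // (X >> 8) == 1
      assert(masked == shifted);
    }
    return 0;
  }
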
Nadav Rotem
0837b79904 Minor cleanup and docs.
llvm-svn: 160311
2012-07-16 18:56:39 +00:00
Nadav Rotem
ae88f0486b Make ComputeDemandedBits return a deterministic result when computing an AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160305
2012-07-16 18:34:53 +00:00
Nadav Rotem
338506d265 Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160235
2012-07-15 20:39:08 +00:00
Nadav Rotem
ff9fadbd0a Refactor the code that checks that all operands of a node are UNDEFs.
Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160229
2012-07-15 08:38:23 +00:00
Chandler Carruth
a974e1667b Reapply r160194, switching to use LV information for finding local kills.
The notable fix is to look at any dependencies attached to the kill
instruction (or other instructions between MI and the kill) where the
dependencies are specific to the register in question.

The old code implicitly handled this by rejecting the transform if *any*
other uses were found within the block, but after the start point. The
new code directly finds the kill, and has to re-use the existing
dependency scan to check for non-kill uses.

This was caught by self-host, but I found the bug via inspection and use
of absurd assert scaffolding to compute the kills in two ways and
compare them. So I have no useful testcase for this other than
"bootstrap". I'd work harder to reduce a test case if this particular
code were likely to live for a long time.

Thanks to Benjamin Kramer for reviewing the fix itself.

llvm-svn: 160228
2012-07-15 03:29:46 +00:00
Nadav Rotem
a6dfd89cd0 Add a dagcombine optimization to convert concat_vectors of undefs into a single undef.
The unoptimized concat_vectors ISD node prevented the canonicalization of the vector_shuffle node.

llvm-svn: 160221
2012-07-14 21:30:27 +00:00
Jakob Stoklund Olesen
f5156bd4b2 Account for early-clobber reload instructions.
No test case, there are no in-tree targets that require this.

llvm-svn: 160219
2012-07-14 18:45:35 +00:00
Jakob Stoklund Olesen
70ab9a6638 Be more verbose when detecting dominance problems.
Catch uses of undefined physregs that haven't been added to basic block
live-in lists. Run the verifier to pinpoint the problem.

Also run the verifier when a virtual register use is not jointly
dominated by defs.

llvm-svn: 160207
2012-07-13 23:39:05 +00:00
Chandler Carruth
0d0cb77247 Revert r160194, which switched to use LV information for finding local
kills.

This is causing miscompiles that I'm working on tracking down.

llvm-svn: 160196
2012-07-13 22:23:32 +00:00
Chandler Carruth
7b376840a6 Use the LiveVariables information to efficiently get local kills. This
removes the largest scaling problem in the test cases from PR13225 when
ASan is switched to insert basic blocks in the natural CFG order.

It may also solve some scaling problems for more normal code with large
numbers of basic blocks and variables.

llvm-svn: 160194
2012-07-13 21:18:38 +00:00
Jim Grosbach
91775b53f2 Provide function name in 'Cannot select' fatal error.
When dumping the DAG for a fatal 'Cannot select' back-end error, also
provide the name of the function the construct is in. Useful when dealing
with large testcases, as the next step is to llvm-extract the function
in question to get a small(er) testcase.

llvm-svn: 160152
2012-07-13 00:29:09 +00:00
Eric Christopher
2db5457cb8 The end of the prologue should be marked with is_stmt.
Fixes PR13303.

Patch by Paul Robinson!

llvm-svn: 160148
2012-07-12 23:30:25 +00:00
Duncan Sands
30441cb37c The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of
the input vector; it can be bigger (this is helpful for powerpc where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC).  However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.

llvm-svn: 160116
2012-07-12 09:01:35 +00:00
Evan Cheng
6906c28f5b InstrEmitter::EmitSubregNode() optimizes extract_subreg in this case:
r1025 = s/zext r1024, 4
r1026 = extract_subreg r1025, 4

to a copy:
r1026 = copy r1024

This is correct. However, it uses TII->isCoalescableExtInstr(), which can return
true for instructions that essentially do a sext_in_reg, so this can end up
with an illegal copy where the source and destination register classes do not
match. Add a check to avoid it. Sorry, no test case possible at this time.

rdar://11849816

llvm-svn: 160059
2012-07-11 18:55:07 +00:00
Nadav Rotem
f09721a97f Rename many of the Tmp1, Tmp2, Tmp3 variables to names such as Chain, Value, Ptr, etc.
No functionality change.

llvm-svn: 160042
2012-07-11 11:02:16 +00:00
Benjamin Kramer
f848954df7 Remove unused variable.
llvm-svn: 160040
2012-07-11 09:39:04 +00:00
Nadav Rotem
71ba361f40 Refactor the DAG Legalizer by extracting the legalization of
Load and Store nodes into their own functions.
No functional change.

llvm-svn: 160037
2012-07-11 08:52:09 +00:00
Owen Anderson
ebcd7c43c9 Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an MVT::i1, i.e. before type legalization.
This is a speculative fix for a problem on Mips reported by Akira Hatanaka.

llvm-svn: 160036
2012-07-11 06:38:55 +00:00
Jakob Stoklund Olesen
55790859d7 Require and preserve LoopInfo for early if-conversion.
It will surely be needed by heuristics.

llvm-svn: 160027
2012-07-10 22:39:56 +00:00
Chandler Carruth
a72e56574e Teach the LiveInterval::join function to use the fast merge algorithm,
generalizing its implementation sufficiently to support this value
number scenario as well.

This cuts out another significant performance hit in large functions
(over 10k basic blocks, etc), especially those with "natural" CFG
structures.

llvm-svn: 160026
2012-07-10 22:25:21 +00:00
Jakob Stoklund Olesen
2cdac0c7af Run early if-conversion in domtree post-order.
This ordering allows nested if-conversion without using a work list, and
it makes it possible to update the dominator tree on the fly as well.

Any erased basic blocks will always be dominated by the current
post-order position, so the domtree can be pruned without invalidating
the iterator.

llvm-svn: 160025
2012-07-10 22:18:23 +00:00
Chandler Carruth
7837c8dbb9 Fix a bug where I didn't test for an empty range before inspecting the
back of it.

I don't have anything even remotely close to a test case for this. It
only broke two build bots, both of them doing bootstrap builds, one of
them a dragonegg bootstrap. It doesn't break for me when I bootstrap
either. It doesn't reproduce every time or on many machines during the
bootstrap. Many thanks to Duncan Sands who got the exact command (and
stage of the bootstrap) which failed on the dragonegg bootstrap and
managed to get it to trigger under valgrind with debug symbols. The fix
was then found by inspection.

llvm-svn: 159993
2012-07-10 15:41:33 +00:00
Nadav Rotem
5f6e9d5ffe Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, where the
migration of bitcasts happened too late in the SelectionDAG process.

llvm-svn: 159991
2012-07-10 13:25:08 +00:00
Chandler Carruth
3ce0f2a6de Add an efficient merge operation to LiveInterval and use it to avoid
quadratic behavior when performing pathological merges. Fixes the core
element of PR12652.

There is only one user of addRangeFrom left: join. I'm hoping to
refactor further in a future patch and have join use this merge
operation as well.

llvm-svn: 159982
2012-07-10 05:16:17 +00:00
Chandler Carruth
9c9af3cc67 Teach LiveIntervals how to verify themselves and start using it in some
of the tricky merge routines. This adds a layer of testing that was
necessary when implementing more efficient (and complex) merge logic for
this datastructure.

No functionality changed here.

llvm-svn: 159981
2012-07-10 05:06:03 +00:00
Andrew Trick
6b62cb7b17 indentation
llvm-svn: 159958
2012-07-09 20:43:01 +00:00
Owen Anderson
e0c4f94d45 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.

llvm-svn: 159957
2012-07-09 20:31:12 +00:00
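A plain C++ illustration of the observation behind the combine (not the DAG code itself): an i1 input has only two possible values, so the conversion collapses to a choice between two floating-point constants instead of an extension followed by a real int-to-float conversion. Note that a signed i1 'true' sign-extends to -1.

  // uitofp from i1: true -> 1.0, false -> 0.0
  float uitofp_from_i1(bool b) { return b ? 1.0f : 0.0f; }

  // sitofp from i1: true (-1 after sign extension) -> -1.0, false -> 0.0
  float sitofp_from_i1(bool b) { return b ? -1.0f : 0.0f; }
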
Andrew Trick
b9c8074dcd I'm introducing a new machine model to simultaneously allow simple
subtarget CPU descriptions and support new features of
MachineScheduler.

MachineModel has three categories of data:
1) Basic properties for coarse grained instruction cost model.
2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
3) Instruction itineraries for detailed per-cycle reservation tables.

These will all live side-by-side. Any subtarget can use any
combination of them. Instruction itineraries will not change in the
near term. In the long run, I expect them to only be relevant for
in-order VLIW machines that have complex constraints and require a
precise scheduling/bundling model. Once itineraries are only actively
used by VLIW-ish targets, they could be replaced by something more
appropriate for those targets.

This tablegen backend rewrite sets things up for introducing
MachineModel type #2: per opcode/operand cost model.

llvm-svn: 159891
2012-07-07 04:00:00 +00:00
Chad Rosier
9309733543 Whitespace.
llvm-svn: 159839
2012-07-06 17:44:22 +00:00
Chad Rosier
2bffb657ca [fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic.
llvm-svn: 159837
2012-07-06 17:33:39 +00:00
Alexey Samsonov
195a9e9c55 Fix PR13202 and a regtest.
DwarfDebug class could generate the same (inlined) DIVariable twice:
1) when trying to find an abstract debug variable for a concrete inlined instance.
2) when explicitly collecting info for variables that were optimized out.

This change makes sure that this duplication won't happen and makes
Clang pass "gdb.opt/inline-locals" test from gdb testsuite.

Reviewed by Eric Christopher.

llvm-svn: 159811
2012-07-06 08:45:08 +00:00
Jakob Stoklund Olesen
8f9719a55e Add some comments suggested in code review.
llvm-svn: 159800
2012-07-06 02:31:22 +00:00
Chandler Carruth
f23ce020cb Optimize extendIntervalEndTo a tiny bit by saving one call through the
vector erase. No functionality changed.

llvm-svn: 159746
2012-07-05 12:40:45 +00:00
Chandler Carruth
fa7806504f Finish fixing the MachineOperand hashing, providing a nice modern
hash_value overload for MachineOperands. This addresses a FIXME
sufficiently for me to remove it, and cleans up the code nicely too.

The important changes to the hashing logic:
- TargetFlags are now included in all of the hashes. These were completely
  missed.
- Register operands have their subregisters and whether they are a def
  included in the hash.
- We now actually hash all of the operand types. Previously, many
  operand types were simply *dropped on the floor*. For example:
  - Floating point immediates
  - Large integer immediates (>64-bit)
  - External globals!
  - Register masks
  - Metadata operands
- It removes the offset from the block-address hash; I'm a bit
  suspicious of this, but isIdenticalTo doesn't consider the offset for
  block addresses.

Any patterns involving these entities could have triggered extreme
slowdowns in MachineCSE or PHIElimination. Let me know if there are PRs
you think might be closed now... I'm looking myself, but I may miss
them.

llvm-svn: 159743
2012-07-05 11:06:22 +00:00
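An illustrative toy version of the kind of hash function described above (field names are hypothetical and deliberately simplified, not MachineOperand's real layout): the point is that every distinguishing field, including the target flags, the sub-register index, and the def bit, feeds into the hash.

  #include <cstddef>
  #include <cstdint>
  #include <functional>

  struct ToyRegOperand {
    unsigned Reg;        // register number
    unsigned SubReg;     // sub-register index
    uint8_t TargetFlags; // previously dropped from the hash entirely
    bool IsDef;          // def vs. use
  };

  static void combine(std::size_t &Seed, std::size_t V) {
    Seed ^= V + 0x9e3779b97f4a7c15ULL + (Seed << 6) + (Seed >> 2);
  }

  std::size_t hashOperand(const ToyRegOperand &MO) {
    std::size_t H = std::hash<unsigned>{}(MO.Reg);
    combine(H, std::hash<unsigned>{}(MO.SubReg));
    combine(H, std::hash<unsigned>{}(MO.TargetFlags));
    combine(H, std::hash<bool>{}(MO.IsDef));
    return H;
  }
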
Duncan Sands
763b96beb7 All cases are covered, no need for a default. This deals with the
corresponding clang warning.

llvm-svn: 159742
2012-07-05 10:14:33 +00:00
Chandler Carruth
7623c9ec6d The hash function for MI expressions, used by MachineCSE, is really
broken. This patch fixes the superficial problems which lead to the
intractably slow compile times reported in PR13225.

The specific issue is that we were failing to include the *offset* of
a global variable in the hash code. Oops. This would in turn cause all
MIs which were only distinguishable due to operating on different
offsets of a global variable to produce identical hash functions. In
some of the test cases attached to the PR I saw hash table activity
where there were O(1000) probes-per-lookup *on average*. A very few
entries were responsible for most of these probes.

There is still quite a bit more to do here. The ad-hoc layering of data
in MachineOperands makes them *extremely* brittle to hash correctly.
We're missing quite a few other cases; the only ones I've fixed here are
the specific MO types which were allowed through the assert() in
getOffset().

llvm-svn: 159741
2012-07-05 10:03:57 +00:00
Duncan Sands
f9eec0d373 Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1
booleans.  Patch by James Benton.

llvm-svn: 159739
2012-07-05 09:32:46 +00:00
Nick Lewycky
7e3f11f34d Remove ParentMap. You can just ask the domnode for its parent. No functionality
change.

Move the "Not profitable, avoid CSE!" debug message next to where we fail the
check for profitability and use a different message for avoiding CSE due to
being in different register classes.

llvm-svn: 159729
2012-07-05 06:19:21 +00:00
Jakob Stoklund Olesen
e8399d2b3a Allow trailing physreg RegisterSDNode operands on non-variadic instructions.
Also allow trailing register mask operands on both non-variadic
MachineSDNodes and MachineInstrs.

The extra physreg RegisterSDNode operands are added to the MI as
<imp-use> operands. This makes it possible to have non-variadic call
instructions.

Call and return instructions really are non-variadic; the argument
registers should only be used implicitly - they are not part of the
encoding.

llvm-svn: 159727
2012-07-04 23:53:23 +00:00
Jakob Stoklund Olesen
2d29683988 Print SlotIndexes when available for -print-machineinstrs.
llvm-svn: 159726
2012-07-04 23:53:19 +00:00
Jakob Stoklund Olesen
bd6116e78c Allow multiple terminators to read virtual registers.
Find the kill as the last terminator to read SrcReg.

Patch by Philipp Brüschweiler!

llvm-svn: 159722
2012-07-04 19:52:05 +00:00
Jakob Stoklund Olesen
df04640e90 Make sure -print-machineinstrs applies to the first pass as well.
llvm-svn: 159720
2012-07-04 19:28:27 +00:00
Stepan Dyatkovskiy
bb30cc0d2f Reverted r156659 due to probable performance regressions; DenseMap should be used here:
IntegersSubsetMapping
  - Replaced the type of the Items field from std::list with std::map. In the near future I'll test it with DenseMap and do the corresponding replacement
    if possible.

llvm-svn: 159703
2012-07-04 05:53:05 +00:00
Eric Christopher
8266e56fdb Reduce some code duplication.
llvm-svn: 159701
2012-07-04 02:02:18 +00:00
Matt Beaumont-Gay
723207e88f Fix some ascii art in a comment to not have trailing backslashes (inspiration
from IfConversion.cc), and fix some spelling and grammar in the surrounding
prose.

llvm-svn: 159699
2012-07-04 01:09:45 +00:00
Jakob Stoklund Olesen
db187a51eb Add an experimental early if-conversion pass, off by default.
This pass performs if-conversion on SSA form machine code by
speculatively executing both sides of the branch and using a cmov
instruction to select the result. This can help lower the number of
branch mispredictions on architectures like x86 that don't have
predicable instructions.

The current implementation is very aggressive, and causes regressions on
most tests. It needs good heuristics that have yet to be implemented.

llvm-svn: 159694
2012-07-04 00:09:54 +00:00
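A source-level analogy of the transformation (hypothetical example, not taken from the pass): both arms are computed speculatively and a conditional select picks the result, trading a possibly mispredicted branch for a little extra straight-line work.

  // Branchy form: exactly one arm executes, but the branch may mispredict.
  int branchy(bool c, int a, int b) {
    int x;
    if (c)
      x = a + b;
    else
      x = a - b;
    return x;
  }

  // If-converted form: both arms execute and a select (a cmov on x86) picks one.
  int if_converted(bool c, int a, int b) {
    int t1 = a + b; // speculated
    int t2 = a - b; // speculated
    return c ? t1 : t2;
  }
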
Stepan Dyatkovskiy
a646b05bfe Part of r159527, split into a series of patches and landed with PR13256 fixed:
IntegersSubsetMapping
  - Replaced the type of the Items field from std::list with std::map. In the near future I'll test it with DenseMap and do the corresponding replacement
    if possible.

llvm-svn: 159659
2012-07-03 13:46:45 +00:00
Eric Christopher
185293d560 Revert "IntRange:" as it appears to be breaking self hosting.
This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c.

llvm-svn: 159618
2012-07-02 23:22:21 +00:00
Chandler Carruth
33def20e63 All glory to address sanitizer. ;]
It appears to have caught a use-after-free introduced by r159567
and/or friends, which call 'addPass' from many more places. The bug in
'addPass' doesn't appear to be new, and was spotted by inspection when
ASan shone a bright light of a stack trace at these functions.

Hopefully this will fix the ASan failure -- I have no test case other
than running an ASan-built clang over the test suite.

llvm-svn: 159614
2012-07-02 22:56:41 +00:00
Evan Cheng
6196c5f5f3 Target option DisableJumpTables is a gross hack. Move it to TargetLowering instead.
llvm-svn: 159611
2012-07-02 22:39:56 +00:00
Andrew Trick
8bd81116b5 misched: allow NULL InstrItineraries.
llvm-svn: 159599
2012-07-02 21:55:12 +00:00
Eric Christopher
6be444ca57 Turn an assert into an error to make it a bit more friendly.
Part of rdar://6880388 and rdar://11766377

llvm-svn: 159590
2012-07-02 21:16:43 +00:00
Bob Wilson
a848f156de Extend TargetPassConfig to allow running only a subset of the normal passes.
This is still a work in progress but I believe it is currently good enough
to fix PR13122 "Need unit test driver for codegen IR passes".  For example,
you can run llc with -stop-after=loop-reduce to have it dump out the IR after
running LSR.  Serializing machine-level IR is not yet supported but we have
some patches in progress for that.

The plan is to serialize the IR to a YAML file, containing separate sections
for the LLVM IR, machine-level IR, and whatever other info is needed.  Chad
suggested that we stash the stop-after pass in the YAML file and use that
instead of the start-after option to figure out where to restart the
compilation.  I think that's a great idea, but since it's not implemented yet
I put the -start-after option into this patch for testing purposes.

llvm-svn: 159570
2012-07-02 19:48:45 +00:00
Bob Wilson
979d49fd15 Move assertion with TargetPassConfig's Initialized flag.
llvm-svn: 159569
2012-07-02 19:48:39 +00:00
Bob Wilson
7d344104a7 Consistently use AnalysisID types in TargetPassConfig.
This makes it possible to just use a zero value to represent "no pass", so
the phony NoPassID global variable is no longer needed.

llvm-svn: 159568
2012-07-02 19:48:37 +00:00
Bob Wilson
0a1ef38836 Add all codegen passes to the PassManager via TargetPassConfig.
This is a preliminary step toward having TargetPassConfig be able to
start and stop the compilation at specified passes for unit testing
and debugging.  No functionality change.

llvm-svn: 159567
2012-07-02 19:48:31 +00:00
Manman Ren
23eb3ecf32 Added assertion in getVRegDef of MachineRegisterInfo to make sure the virtual
register does not have multiple definitions. Modified TwoAddressInstructionPass
to use getUniqueVRegDef instead of getVRegDef.

llvm-svn: 159545
2012-07-02 18:55:36 +00:00
Andrew Trick
baf8a62800 Reapply "Make NumMicroOps a variable in the subtarget's instruction itinerary."
Reapplies r159406 with minor cleanup. The regressions appear to have been spurious.

llvm-svn: 159541
2012-07-02 18:10:42 +00:00
Stepan Dyatkovskiy
fff4579249 IntRange:
- Changed isSingleNumber method behaviour. Now this flag is calculated on demand.
IntegersSubsetMapping
  - Optimized diff operation.
  - Replaced type of Items field from std::list with std::map.
  - Added new methods:
    bool isOverlapped(self &RHS)
    void add(self& RHS, SuccessorClass *S)
    void detachCase(self& NewMapping, SuccessorClass *Succ)
    void removeCase(SuccessorClass *Succ)
    SuccessorClass *findSuccessor(const IntTy& Val)
    const IntTy* getCaseSingleNumber(SuccessorClass *Succ)
IntegersSubsetTest
  - DiffTest: Added checks for successors.
SimplifyCFG
  Updated SwitchInst usage (now it is case-ranges compatible) for
    - SimplifyEqualityComparisonWithOnlyPredecessor
    - FoldValueComparisonIntoPredecessors

llvm-svn: 159527
2012-07-02 13:02:18 +00:00
Rafael Espindola
dd05a97f8e Now that RegistersDefinedFromSameValue handles one instruction being an
implicit_def, the other instruction can be anything, including instructions
that define multiple values. Be careful about that and don't assume what operand
0 is.
Fixes pr13249.

llvm-svn: 159509
2012-07-01 17:08:01 +00:00
Rafael Espindola
065c63c4ca Handle implicit_defs in the register coalescer. I am still trying to produce
a reduced testcase, but this fixes pr13209.

llvm-svn: 159479
2012-06-30 01:45:55 +00:00
Manman Ren
125c1ee4e9 Add SrcReg2 to analyzeCompare and optimizeCompareInstr to handle Compare
instructions with two register operands.

llvm-svn: 159465
2012-06-29 21:33:59 +00:00
Jakob Stoklund Olesen
b57dc22af0 Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

llvm-svn: 159461
2012-06-29 21:00:03 +00:00
Jakob Stoklund Olesen
3199cf0fda Check for extra kill flags on live-out virtual registers.
This would previously get reported as the misleading "Virtual register
def doesn't dominate all uses."

llvm-svn: 159460
2012-06-29 21:00:00 +00:00
Manman Ren
7d22489d4e Add getUniqueVRegDef to MachineRegisterInfo.
This comes in handy during peephole optimization.

llvm-svn: 159453
2012-06-29 19:16:05 +00:00
Alexey Samsonov
7d35420c19 Cleanup in DwarfDebug - fix a typo and remove two unused functions
llvm-svn: 159433
2012-06-29 16:04:14 +00:00
Chandler Carruth
4b51f99c87 Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.h
This was always part of the VMCore library out of necessity -- it deals
entirely in the IR. The .cpp file in fact was already part of the VMCore
library. This is just a mechanical move.

I've tried to go through and re-apply the coding standard's preferred
header sort, but at 40-ish files, I may have gotten some wrong. Please
let me know if so.

I'll be committing the corresponding updates to Clang and Polly, and
Duncan has DragonEgg.

Thanks to Bill and Eric for giving the green light for this bit of cleanup.

llvm-svn: 159421
2012-06-29 12:38:19 +00:00
Bill Wendling
74b96ac7b8 The DIBuilder class is just a wrapper around debug info creation
(a.k.a. MDNodes). The module doesn't belong in Analysis. Move it to the VMCore
instead.

llvm-svn: 159414
2012-06-29 08:32:07 +00:00
Andrew Trick
251f64f946 Revert "Make NumMicroOps a variable in the subtarget's instruction itinerary."
This reverts commit r159406. I noticed a performance regression so I'll back out for now.

llvm-svn: 159411
2012-06-29 07:10:41 +00:00
Andrew Trick
f990c5c8ba misched: avoid scheduling instructions that can't be dispatched.
llvm-svn: 159408
2012-06-29 03:23:24 +00:00
Andrew Trick
bcf581b08c misched: count micro-ops toward the issue limit.
llvm-svn: 159407
2012-06-29 03:23:22 +00:00
Andrew Trick
52238a0ce5 Make NumMicroOps a variable in the subtarget's instruction itinerary.
The TargetInstrInfo::getNumMicroOps API does not change, but soon it
will be used by MachineScheduler. Now each subtarget can specify the
number of micro-ops per itinerary class. For ARM, this is currently
always dynamic (-1), because it is used for load/store multiple which
depends on the number of register operands.

Zero is now a valid number of micro-ops. This can be used for
nop pseudo-instructions or instructions that the hardware can squash
during dispatch.

llvm-svn: 159406
2012-06-29 03:23:18 +00:00
Nuno Lopes
031ca196d0 add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
llvm-svn: 159383
2012-06-28 22:30:12 +00:00
Jim Grosbach
fa5486e817 'Promote' vector [su]int_to_fp should widen elements.
Teach vector legalization how to honor Promote for int to float
conversions. The code checking whether to promote the operation knew
to look at the operand, but the actual promotion code didn't. This
fixes that. The operand is promoted up via [zs]ext.

rdar://11762659

llvm-svn: 159378
2012-06-28 21:03:44 +00:00
Bill Wendling
e8949ecfa6 Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and
include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.

The reasoning is because the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.

llvm-svn: 159312
2012-06-28 00:05:13 +00:00
Jakob Stoklund Olesen
ee1b7565a6 Allow targets to inject passes before the virtual register rewriter.
Such passes can be used to tweak the register assignments in a
target-dependent way, for example to avoid write-after-write
dependencies.

llvm-svn: 159209
2012-06-26 17:09:29 +00:00
Chandler Carruth
872747b0f0 Update a bunch of stale comments that dated from when this followed the
very first (and worst) placement algorithm. These should now more
accurately reflect the reality of the pass.

llvm-svn: 159185
2012-06-26 05:16:37 +00:00
Andrew Trick
c5e08120a4 Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

llvm-svn: 159183
2012-06-26 04:11:38 +00:00
Evan Cheng
652c7c94d4 Make sure the type is not extended or untyped before creating a constant of the type. No test case. Found by inspection.
llvm-svn: 159179
2012-06-26 01:19:33 +00:00
Jakob Stoklund Olesen
cc79c28e91 Enforce stricter liveness rules for PHIs.
Verify that all paths from the entry block to a virtual register read
pass through a def. Enable this check even when MRI->isSSA() is false.

Verify that the live range of a virtual register is live out of all
predecessor blocks, even for PHI-values.

This requires that PHIElimination sometimes inserts IMPLICIT_DEF
instruction in predecessor blocks.

llvm-svn: 159150
2012-06-25 18:18:27 +00:00
Jakob Stoklund Olesen
9333a7fb3b Run ProcessImplicitDefs on SSA form where it can be much simpler.
Implicitly defined virtual registers can simply have the <undef> bit set
on all uses, and copies can be turned into implicit defs recursively.

Physical registers are a bit trickier. We handle the common case where a
physreg def is used by a nearby instruction in the same basic block. For
more complicated cases, just leave the IMPLICIT_DEF instruction in.

llvm-svn: 159149
2012-06-25 18:12:18 +00:00
Jakob Stoklund Olesen
dc90d3ffc2 Teach PHIElimination to handle <undef> operands.
When a PHI use is <undef>, don't emit a copy in the predecessor block,
but insert an IMPLICIT_DEF instruction instead. This ensures that
virtual register uses are always jointly dominated by defs, even if some
of them are IMPLICIT_DEF.

llvm-svn: 159121
2012-06-25 03:36:12 +00:00
Jakob Stoklund Olesen
8fc784fec2 Handle <undef> operands in TwoAddressInstructionPass.
When the source register to a 2-addr instruction is undefined, there is
no need to attempt any transformations - simply replace the source
register with the destination register.

This also comes up when lowering IMPLICIT_DEF instructions - make sure
the <undef> flag is moved to the new partial register def operand:

  %vreg8<def> = INSERT_SUBREG %vreg9<undef>, %vreg0<kill>, sub_16bit
rewrite undef:
  %vreg8<def> = INSERT_SUBREG %vreg8<undef>, %vreg0<kill>, sub_16bit
convert to:
  %vreg8:sub_16bit<def,read-undef> = COPY %vreg0<kill>

llvm-svn: 159120
2012-06-25 03:27:12 +00:00
NAKAMURA Takumi
4599dee67a llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.
llvm-svn: 159112
2012-06-24 13:32:01 +00:00
Pete Cooper
9f89f00988 DAG legalisation can now handle illegal fma vector types by scalarisation
llvm-svn: 159092
2012-06-24 00:05:44 +00:00
Jakob Stoklund Olesen
c904a78f1e Teach LiveVariables to handle <undef> operands.
It's simple: Don't treat <undef> operands as uses, and don't assume a
virtual register has a defining instruction unless a real use has been
seen.

llvm-svn: 159061
2012-06-23 02:23:00 +00:00
Jakob Stoklund Olesen
70a37b6a67 Remove ProcessImplicitDefs.h which was unused.
The ProcessImplicitDefs class can be local to its implementation file.

llvm-svn: 159041
2012-06-22 22:27:36 +00:00
Jakob Stoklund Olesen
f3226a960c Also verify the def index for early clobbers.
llvm-svn: 159039
2012-06-22 22:23:58 +00:00
Jakob Stoklund Olesen
3a972a4f8d Delete a boring statistic.
llvm-svn: 159030
2012-06-22 20:40:15 +00:00
Jakob Stoklund Olesen
5b5a4305f1 Store live intervals in an IndexedMap.
It is both smaller and faster than DenseMap.

llvm-svn: 159029
2012-06-22 20:37:52 +00:00
Hal Finkel
db4f1462bf Revert r158679 - use case is unclear (and it increases the memory footprint).
Original commit message:
    Allow up to 64 functional units per processor itinerary.

    This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
    This will be needed for some upcoming PowerPC itineraries.

llvm-svn: 159027
2012-06-22 20:27:13 +00:00
Jakob Stoklund Olesen
c50d6ad4cf Fix a crash in --debug code.
Don't try to print out the live range of a physreg.

llvm-svn: 159021
2012-06-22 19:51:41 +00:00
Jakob Stoklund Olesen
a949faa533 Don't depend on live ranges being present.
DBG_VALUE instructions could be referring to non-existing virtual
registers.

llvm-svn: 159020
2012-06-22 18:51:35 +00:00
Jakob Stoklund Olesen
8e2cfdc8f4 Simplify handleMove() a bit.
There is no need to check for physreg live ranges. They don't exist any
more.

llvm-svn: 159019
2012-06-22 18:38:57 +00:00
Jakob Stoklund Olesen
a925ef2596 Stop computing physreg live ranges.
Everyone is using on-demand regunit ranges now.

llvm-svn: 159018
2012-06-22 18:20:50 +00:00
Jakob Stoklund Olesen
20b93fa363 Remove some redundant LIS->hasInterval() checks.
These functions only operate on virtual registers now, and they all have
live ranges.

llvm-svn: 159015
2012-06-22 17:49:44 +00:00
Jakob Stoklund Olesen
9c98d0b233 Use MRI::isConstantPhysReg() to check remat feasibility.
Don't depend on LiveIntervals::hasInterval() to determine if a physreg
is reserved and constant.

llvm-svn: 159013
2012-06-22 17:31:01 +00:00
Jakob Stoklund Olesen
4a4346d2da Use regunit liveness to guide LiveDebugVariables.
This should produce the same results as using physreg liveness directly.

llvm-svn: 159009
2012-06-22 17:15:32 +00:00
Jakob Stoklund Olesen
0d48b013fb Remove LiveIntervals::trackingRegUnits().
With regunit liveness permanently enabled, this function would always
return true.

Also remove now obsolete code for checking physreg interference.

llvm-svn: 159006
2012-06-22 16:46:44 +00:00
Rafael Espindola
0280a5d85b Remove another duplicated variable. We only need one to tell us if the linker
knows dwarf or not.

llvm-svn: 158993
2012-06-22 13:32:49 +00:00
Rafael Espindola
13084dd6a3 Fix a FIXME: DwarfRequiresRelocationForSectionOffset is the same as
DwarfUsesRelocationsAcrossSections.

llvm-svn: 158992
2012-06-22 13:24:07 +00:00
Nick Lewycky
da52706728 Emit relocations for DW_AT_location entries on systems which need it. This is
a recommit of r127757. Fixes PR9493. Patch by Paul Robinson!

llvm-svn: 158957
2012-06-22 01:25:12 +00:00
Lang Hames
68cf87e3ef Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't affect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.

llvm-svn: 158956
2012-06-22 01:09:09 +00:00
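A small C++ illustration of what 'fused' means here (using standard std::fma, not the new flag itself): the fused form keeps the full-precision product internally and rounds once, whereas a separate multiply and add round twice. Under Standard mode only explicitly requested forms like the second function are fused; Fast mode also lets the optimizer fuse the plain expression.

  #include <cmath>

  // Two roundings: the product a*b is rounded, then the add is rounded.
  double separate(double a, double b, double c) { return a * b + c; }

  // One rounding: the product is kept at full precision inside the fma.
  double fused(double a, double b, double c) { return std::fma(a, b, c); }
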
Jack Carter
ecfcd0f81b The inline asm operand modifier 'n' is supposed
to be generic across architectures. It has the
following description in the gnu sources:

    Negate the immediate constant

Several Architectures such as x86 have local implementations
of operand modifier 'n' which go beyond the above description
slightly. This won't affect them.

Affected files:

    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'n' to the switch cases.

    test/CodeGen/Generic/asm-large-immediate.ll
        Generic compiled test (x86 for me)

    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributor: Jack Carter
llvm-svn: 158939
2012-06-21 21:37:54 +00:00
Pete Cooper
24914c84a0 Fix potential crash if DAGCombine on stores sees a half type
llvm-svn: 158927
2012-06-21 18:00:39 +00:00
Jack Carter
533bef32ae The inline asm operand modifier 'c' is supposed
to be generic across architectures. It has the
following description in the gnu sources:

    Substitute immediate value without immediate syntax

Several Architectures such as x86 have local implementations
of operand modifier 'c' which go beyond the above description
slightly. To make use of the generic modifiers without overriding the
local implementation, one can call the base class method
for AsmPrinter::PrintAsmOperand() in the locally derived method's 
"default" case in the switch statement. That way if it is already
defined locally the generic version will never get called.

This change is needed because test/CodeGen/generic/asm-large-immediate.ll
failed on a native Mips board. The test was assuming a generic
implementation was in place.

Affected files:

    lib/Target/Mips/MipsAsmPrinter.cpp:
        Changed the default case to call the base method.
    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'c' to the switch cases.
    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributor: Jack Carter
llvm-svn: 158925
2012-06-21 17:14:46 +00:00
Evan Cheng
5e3a175c65 Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and
_umodsi3 libcalls if they have the same arguments. This optimization
was apparently broken if one of the nodes was replaced in place.
rdar://11714607

llvm-svn: 158900
2012-06-21 05:56:05 +00:00
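The pattern being merged, shown at the source level (illustrative function, not from the patch): a division and a remainder with identical operands can be served by a single __udivmodsi4 libcall instead of separate __udivsi3 and __umodsi3 calls on targets that lower 32-bit division to libcalls.

  void divmod(unsigned a, unsigned b, unsigned *q, unsigned *r) {
    *q = a / b; // would otherwise lower to a __udivsi3 libcall
    *r = a % b; // would otherwise lower to a __umodsi3 libcall
  }
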
Jakob Stoklund Olesen
d23194c4ed Update regunits in RegisterCoalescer::reMaterializeTrivialDef.
Old code would only update physreg live intervals.

llvm-svn: 158881
2012-06-21 00:09:15 +00:00