llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Hans Wennborg	09865c30b6	SelectionDAG switch lowering: Replace unreachable default with most popular case. This can significantly reduce the size of the switch, allowing for more efficient lowering. I also worked with the idea of exploiting unreachable defaults by omitting the range check for jump tables, but always ended up with a non-negligible binary size increase. It might be worth looking into some more. llvm-svn: 223049	2014-12-01 17:08:32 +00:00
Akira Hatanaka	cad434d590	[stack protector] Set edge weights for newly created basic blocks. This commit fixes a bug in stack protector pass where edge weights were not set when new basic blocks were added to lists of successor basic blocks. Differential Revision: http://reviews.llvm.org/D5766 llvm-svn: 222987	2014-12-01 04:27:03 +00:00
Hans Wennborg	4856812966	Switch lowering: reformat some for loops etc. NFC llvm-svn: 222962	2014-11-29 21:24:12 +00:00
Hans Wennborg	22fda335d6	Switch lowering: Fix broken 'Figure out which block is next' code This doesn't seem to have worked in a long time, but other optimizations would clean it up. llvm-svn: 222961	2014-11-29 21:17:05 +00:00
Simon Pilgrim	3d5273ea80	Target triple OS detection tidyup. NFC Use Triple::isOS*() helpers where possible. llvm-svn: 222960	2014-11-29 19:18:21 +00:00
Duncan P. N. Exon Smith	73ce6dbb2b	Revert "Masked Vector Load and Store Intrinsics." This reverts commit r222632 (and follow-up r222636), which caused a host of LNT failures on an internal bot. I'll respond to the commit on the list with a reproduction of one of the failures. Conflicts: lib/Target/X86/X86TargetTransformInfo.cpp llvm-svn: 222936	2014-11-28 21:29:14 +00:00
Elena Demikhovsky	25f6c9047c	Converted back to Unix format (after my last commit 222632) llvm-svn: 222636	2014-11-23 15:21:53 +00:00
Elena Demikhovsky	36a2243ab7	Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 llvm-svn: 222632	2014-11-23 08:07:43 +00:00
Manman Ren	8be1069f3f	Debug Info: revert r222195, r222210 and r222239. This is no longer needed after David's fix at r222377 + r222485. rdar://18958417 llvm-svn: 222563	2014-11-21 19:55:23 +00:00
Manman Ren	ea36e798d4	[Objective-C] Support a new special module flag that will be put into the objc_imageinfo struct. rdar://17954668 llvm-svn: 222558	2014-11-21 19:24:55 +00:00
Sanjay Patel	5d493f5d01	Don't repeat class/function/variable names in comments. NFC. llvm-svn: 222555	2014-11-21 18:58:38 +00:00
Sanjay Patel	e65f60a9c9	Less space; NFC llvm-svn: 222546	2014-11-21 18:05:59 +00:00
Andrea Di Biagio	0a8cf1ad5a	[DAG] Teach how to turn a build_vector into a shuffle if some of the operands are zero. Before this patch, the DAGCombiner only tried to convert build_vector dag nodes into shuffles if all operands were either extract_vector_elt or undef. This patch improves that logic and teaches the DAGCombiner how to deal with build_vector dag nodes where one or more operands are zero. A build_vector dag node with some zero operands is turned into a shuffle only if the resulting shuffle mask is legal for the target. llvm-svn: 222536	2014-11-21 14:32:06 +00:00
Andrea Di Biagio	9c99df5e6c	[DAG] Refactor the shuffle combining logic in DAGCombiner. NFC. This patch simplifies the logic that combines a pair of shuffle nodes into a single shuffle if there is a legal mask. Also added comments to better describe the algorithm. No functional change intended. llvm-svn: 222522	2014-11-21 11:33:07 +00:00
Hao Liu	9cb82be410	DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor into FMULs by the reciprocal. E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip) A hook is added to allow the target to control whether it needs to do such combine. Reviewed in http://reviews.llvm.org/D6334 llvm-svn: 222510	2014-11-21 06:39:58 +00:00
Matthias Braun	6ce466d916	RegisterCoalescer: Improve debug messages - Show "Considering..." message after flipping so you actually see the final destination vreg as destination. - Add a message on final join, so you can grep for "Success" messages to obtain a list of which register got merged with which. llvm-svn: 222382	2014-11-19 19:46:17 +00:00
Matthias Braun	50dcec92ed	Add a print and verify pass after the RegisterCoalescer llvm-svn: 222381	2014-11-19 19:46:15 +00:00
Matthias Braun	e700647af2	MachineVerifier: Report register for bad liveranges llvm-svn: 222380	2014-11-19 19:46:13 +00:00
Matthias Braun	314ef39016	Introduce register dump helper llvm-svn: 222379	2014-11-19 19:46:11 +00:00
Simon Pilgrim	e5f972f1c1	[X86][SSE] pslldq/psrldq byte shifts/rotation for SSE2 This patch builds on http://reviews.llvm.org/D5598 to perform byte rotation shuffles (lowerVectorShuffleAsByteRotate) on pre-SSSE3 (palignr) targets - pre-SSSE3 is only enabled on i8 and i16 vector targets where it is a more definite performance gain. I've also added a separate byte shift shuffle (lowerVectorShuffleAsByteShift) that makes use of the ability of the SLLDQ/SRLDQ instructions to implicitly shift in zero bytes to avoid the need to create a zero register if we had used palignr. Differential Revision: http://reviews.llvm.org/D5699 llvm-svn: 222340	2014-11-19 10:06:49 +00:00
David Blaikie	60e6c80905	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This led to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... llvm-svn: 222334	2014-11-19 07:49:26 +00:00
David Blaikie	7499cbae4c	Remove StringMap::GetOrCreateValue in favor of StringMap::insert Having two ways to do this doesn't seem terribly helpful and consistently using the insert version (which we already have) seems like it'll make the code easier to understand for anyone working with standard data structures. (I also updated many references to the Entry's key and value to use first() and second instead of getKey{Data,Length,} and get/setValue - for similar consistency) Also removes the GetOrCreateValue functions so there's less surface area to StringMap to fix/improve/change/accommodate move semantics, etc. llvm-svn: 222319	2014-11-19 05:49:42 +00:00
Owen Anderson	4490a7cce1	Fix an incorrect chain operand when expanding INSERT_VECTOR operations through the stack. Patch by Daniil Troshkov! llvm-svn: 222254	2014-11-18 20:50:19 +00:00
Frederic Riss	f1bfe6e383	Allow DwarfCompileUnit::constructImportedEntityDIE to instantiate a GlobalVariable DIE. Usually global variables are in a retain list and instantiated before any call to constructImportedEntityDIE is made. This isn't true for forward declarations though. The testcase for this change is generated by a clang patched to emit such forward declarations (patch at http://reviews.llvm.org/D6173 which will land soon). The updated testcase tests more than just global variables, it now tests every type of 'using' clause we support. llvm-svn: 222217	2014-11-18 02:46:11 +00:00
Manman Ren	9b65b2864d	Debug Info: In DIBuilder, the context field of a global variable is updated to use DIScopeRef. A paired commit at clang will follow to show cases where we will use an identifier for the context of a global variable. rdar://18958417 llvm-svn: 222195	2014-11-18 00:29:08 +00:00
Oliver Stannard	2efad103c3	Fix optimisations of SELECT_CC which assumed result is boolean Some optimisations in DAGCombiner cause miscompilations for targets that use TargetLowering::UndefinedBooleanContent, because they assume that the results of a SELECT_CC node are boolean values, and can be safely ANDed, ORed and XORed. These optimisations are only valid for targets that use ZeroOrOneBooleanContent or ZeroOrNegativeOneBooleanContent. This is a follow-up to D6210/r221693. llvm-svn: 222123	2014-11-17 10:49:31 +00:00
Craig Topper	bc3d6e1d6d	Add missing semicolon from r222118. llvm-svn: 222119	2014-11-17 05:58:26 +00:00
Craig Topper	5b6e56da60	Move register class name strings to a single array in MCRegisterInfo to reduce static table size and number of relocation entries. Indices into the table are stored in each MCRegisterClass instead of a pointer. A new method, getRegClassName, is added to MCRegisterInfo and TargetRegisterInfo to lookup the string in the table. llvm-svn: 222118	2014-11-17 05:50:14 +00:00
Craig Topper	2aa3e27053	Replace a couple asserts with static_asserts. llvm-svn: 222114	2014-11-17 00:26:50 +00:00
Craig Topper	5121b0369b	Convert some EVTs to MVTs where only a SimpleValueType is needed. llvm-svn: 222109	2014-11-16 21:17:18 +00:00
Andrea Di Biagio	5475bc1d1b	[DAG] Improved target independent vector shuffle folding logic. This patch teaches the DAGCombiner how to combine shuffles according to rules: shuffle(shuffle(A, Undef, M0), B, M1) -> shuffle(B, A, M2) shuffle(shuffle(A, B, M0), B, M1) -> shuffle(B, A, M2) shuffle(shuffle(A, B, M0), A, M1) -> shuffle(B, A, M2) llvm-svn: 222090	2014-11-15 22:56:25 +00:00
Reid Kleckner	1bf13cbc83	Rename EH related stuff to be more precise Summary: The current "WinEH" exception handling type is more about Itanium-style LSDA tables layered on top of the Windows native unwind info format instead of .eh_frame tables or EHABI unwind info. Use the name "ItaniumWinEH" to better reflect the hybrid nature of the design. Also rename isExceptionHandlingDWARF to usesItaniumLSDAForExceptions, since the LSDA is part of the Itanium C++ ABI document, and not the DWARF standard. Reviewers: echristo Subscribers: llvm-commits, compnerd Differential Revision: http://reviews.llvm.org/D6279 llvm-svn: 222062	2014-11-14 23:31:07 +00:00
Reid Kleckner	c90bcabd2d	Allow the use of functions as typeinfo in landingpad clauses This is one step towards supporting SEH filter functions in LLVM. llvm-svn: 221954	2014-11-14 00:35:50 +00:00
Reid Kleckner	2fa78fff68	Use nullptr instead of NULL for variadic sentinels Windows defines NULL to 0, which when used as an argument to a variadic function, is not a null pointer constant. As a result, Clang's -Wsentinel fires on this code. Using '0' would be wrong on most 64-bit platforms, but both MSVC and Clang make it work on Windows. Sidestep the issue with nullptr. llvm-svn: 221940	2014-11-13 22:55:19 +00:00
Aditya Nandakumar	4d9c1ff994	We can get the TLOF from the TargetMachine - so constructor no longer requires TargetLoweringObjectFile to be passed. llvm-svn: 221926	2014-11-13 21:29:21 +00:00
Aditya Nandakumar	b93fb292df	This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine so that different subtargets could share the TLOF effectively llvm-svn: 221878	2014-11-13 09:26:31 +00:00
Frederic Riss	67f2855174	Add an assert and a test that verify r221709's fix. llvm-svn: 221854	2014-11-13 03:20:23 +00:00
Quentin Colombet	9239f4dcef	[CodeGenPrepare] Handle zero extensions in the TypePromotionHelper. Prior to this patch the TypePromotionHelper was promoting only sign extensions. Supporting zero extensions changes: - How constants are extended. - How sign extensions, zero extensions, and truncate are composed together. - How the type of the extended operation is recorded. Now we need to know the kind of the extension as well as its type. Each change is fairly small, unlike the diff. Most of the diff is comments/variable renaming to say "extension" instead of "sign extension". The performance improvements on the test suite are within the noise. Related to <rdar://problem/18310086>. llvm-svn: 221851	2014-11-13 01:44:51 +00:00
Frederic Riss	10a0fde419	Fix emission of Dwarf accelerator table when there are multiple CUs. The DIE offset in the accel tables is an offset relative to the start of the debug_info section, but we were encoding the offset to the start of the containing CU. llvm-svn: 221837	2014-11-12 23:48:14 +00:00
Ahmed Bougacha	d50e1d8236	[CodeGenPrepare] Replace other uses of EVT::getEVT with TL::getValueType. r221820 fixed a problem (PR21548) where an iPTR was used in TLI legality checks, which isn't valid and resulted in a failed assertion. The solution was to lower pointer types into the correct target's VT, by using TL::getValueType instead of EVT::getEVT. This commit changes 3 other uses of EVT::getEVT, but without any tests: - One of these non-lowered EVTs is passed to allowsMisalignedMemoryAccesses, which goes into target's TL implementation and doesn't cause any problem (yet.) - Two others are passed to TLI.isOperationLegalOrCustom: - one only looks at extensions, so doesn't concern pointers. - one only looks at binary operators, so also isn't a problem. The latter might some day be exposed to pointers and cause the same assert as the original PR, because there's a comment hinting at also supporting cast ops. For consistency, update all of them and be done with it. llvm-svn: 221827	2014-11-12 23:05:03 +00:00
Ahmed Bougacha	6f5323f5b8	[CodeGenPrepare][AArch64] Fix a TLI legality check on iPTR to use a lowered instead. Fixes PR21548. Related to PR20474. llvm-svn: 221820	2014-11-12 22:16:55 +00:00
Timur Iskhodzhanov	3a769074f4	Temporary fix for PR21528 - use mangled C++ function names in COFF debug info to un-break ASan on Windows llvm-svn: 221813	2014-11-12 20:21:20 +00:00
Timur Iskhodzhanov	30b699e1ad	[COFF] Make it clearer that the symbols subsection holds function display name rather than just name llvm-svn: 221812	2014-11-12 20:10:09 +00:00
Duncan P. N. Exon Smith	8770505e4e	Revert "IR: MDNode => Value" Instead, we're going to separate metadata from the Value hierarchy. See PR21532. This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994. llvm-svn: 221711	2014-11-11 21:30:22 +00:00
Tom Roeder	71097ab326	Fix build break: remove unused variable in FCFI. llvm-svn: 221710	2014-11-11 21:26:33 +00:00
Frederic Riss	b2c059475b	Totally forget deallocated SDNodes in SDDbgInfo. What would happen before that commit is that the SDDbgValues associated with a deallocated SDNode would be marked Invalidated, but SDDbgInfo would keep a map entry keyed by the SDNode pointer pointing to this list of invalidated SDDbgNodes. As the memory gets reused, the list might get wrongly associated with another new SDNode. As the SDDbgValues are cloned when they are transferred, this can lead to an exponential number of SDDbgValues being produced during DAGCombine like in http://llvm.org/bugs/show_bug.cgi?id=20893 Note that the previous behavior wasn't really buggy as the invalidation made sure that the SDDbgValues won't be used. This commit can be considered a memory optimization and as such is really hard to validate in a unit-test. llvm-svn: 221709	2014-11-11 21:21:08 +00:00
Tom Roeder	f8bc1a9968	Add Forward Control-Flow Integrity. This commit adds a new pass that can inject checks before indirect calls to make sure that these calls target known locations. It supports three types of checks and, at compile time, it can take the name of a custom function to call when an indirect call check fails. The default failure function ignores the error and continues. This pass incidentally moves the function JumpInstrTables::transformType from private to public and makes it static (with a new argument that specifies the table type to use); this is so that the CFI code can transform function types at call sites to determine which jump-instruction table to use for the check at that site. Also, this removes support for jumptables in ARM, pending further performance analysis and discussion. Review: http://reviews.llvm.org/D4167 llvm-svn: 221708	2014-11-11 21:08:02 +00:00
Oliver Stannard	553b791f8f	LLVM incorrectly folds xor into select LLVM replaces the SelectionDAG pattern (xor (set_cc cc x y) 1) with (set_cc !cc x y), which is only correct when the xor has type i1. Instead, we should check that the constant operand to the xor is all ones. llvm-svn: 221693	2014-11-11 17:36:01 +00:00
Saleem Abdulrasool	daa164bdfd	Transforms: address some late comments We already use the llvm namespace. Remove the unnecessary prefix. Use the StringRef::equals method to compare with C strings rather than instantiating std::strings. Addresses late review comments from David Majnemer. llvm-svn: 221564	2014-11-08 00:00:50 +00:00
Saleem Abdulrasool	95ca4876af	Transform: add SymbolRewriter pass This introduces the symbol rewriter. This is an IR->IR transformation that is implemented as a CodeGenPrepare pass. This allows for the transparent adjustment of the symbols during compilation. It provides a clean, simple, elegant solution for symbol inter-positioning. This technique is often used, such as in the various sanitizers and performance analysis. The control of this is via a custom YAML syntax map file that indicates source to destination mapping, so that the compiler does not need to know the exact details of the source to destination transformations. llvm-svn: 221548	2014-11-07 21:32:08 +00:00
Lang Hames	04f05ca81d	[RegAlloc] Kill off the trivial spiller - nobody is using it any more. llvm-svn: 221474	2014-11-06 19:12:38 +00:00
Rafael Espindola	876eed707a	Compute the correct jump table entries on 32 bit windows. On 32 bit windows we use label differences and .set does not suppress relocations, a combination that was not used before r220256. This fixes PR21497. llvm-svn: 221456	2014-11-06 14:39:49 +00:00
Rafael Espindola	c77d1073d6	Add three other sections when L symbols are allowed. llvm-svn: 221436	2014-11-06 05:01:21 +00:00
Rafael Espindola	7a9d694412	Allow L symbols in no_dead_strip sections. If a section cannot be dead stripped, it is safe to use L symbols, since the linker will keep all of it in the end. llvm-svn: 221431	2014-11-06 02:42:03 +00:00
Duncan P. N. Exon Smith	fb1e9e11dd	IR: MDNode => Value: NamedMDNode::getOperator() Change `NamedMDNode::getOperator()` from returning `MDNode *` to returning `Value *`. To reduce boilerplate at some call sites, add a `getOperatorAsMDNode()` for named metadata that's expected to only return `MDNode` -- for now, that's everything, but debug node named metadata (such as llvm.dbg.cu and llvm.dbg.sp) will soon change. This is part of PR21433. Note that there's a follow-up patch to clang for the API change. llvm-svn: 221375	2014-11-05 18:16:03 +00:00
Andrea Di Biagio	38de0c92c6	[X86] Teach method 'isVectorClearMaskLegal' how to check for legal blend masks. This patch improves the folding of vector AND nodes into blend operations for targets that feature SSE4.1. A vector AND node where one of the operands is a constant build_vector with elements that are either zero or all-ones can be converted into a blend. This allows for example to simplify the following code: define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) { %1 = and <4 x i32> %A, <i32 0, i32 0, i32 0, i32 -1> %2 = and <4 x i32> %B, <i32 -1, i32 -1, i32 -1, i32 0> %3 = or <4 x i32> %1, %2 ret <4 x i32> %3 } Before this patch llc (-mcpu=corei7) generated: andps LCPI1_0(%rip), %xmm0, %xmm0 andps LCPI1_1(%rip), %xmm1, %xmm1 orps %xmm1, %xmm0, %xmm0 retq With this patch we generate a single 'vpblendw'. llvm-svn: 221343	2014-11-05 13:04:14 +00:00
Craig Topper	f68768ec97	Improve logic that decides if it's profitable to commute when some of the virtual registers involved have uses/defs chains connecting them to a physical register. Fix up the tests that this change improves. llvm-svn: 221336	2014-11-05 06:43:02 +00:00
David Blaikie	51a4f6a794	Provide gmlt-like inline scope information in the skeleton CU to facilitate symbolication without needing the .dwo files Clang -gsplit-dwarf self-host -O0, binary increases by 0.0005%, -O2, binary increases by 25%. A large binary inside Google, split-dwarf, -O0, and other internal flags (GDB index, etc) increases by 1.8%, optimized build is 35%. The size impact may be somewhat greater in .o files (I haven't measured that much - since the linked executable -O0 numbers seemed low enough) due to relocations. These relocations could be removed if we taught the llvm-symbolizer to handle indexed addressing in the .o file (GDB can't cope with this just yet, but GDB won't be reading this info anyway). Also debug_ranges could be shared between .o and .dwo, though ideally debug_ranges would get a schema that could use index(+offset) addressing, and move to the .dwo file, then we'd be back to sharing addresses in the address pool again. But for now, these sizes seem small enough to go ahead with this. Verified that no other DW_TAGs are produced into the .o file other than subprograms and inlined_subroutines. llvm-svn: 221306	2014-11-04 22:12:25 +00:00
David Blaikie	a451dab55d	Move cross-unit DIE caching to the DwarfFile level, so it doesn't interfere with fission-gmlt data and produce skeleton<>full unit cross referencing. llvm-svn: 221305	2014-11-04 22:12:18 +00:00
Arnaud A. de Grandmaison	bd371f63c3	[PBQP] Callee saved regs should have a higher cost than scratch regs Registers are not all equal. Some are not allocatable (infinite cost), some have to be preserved but can be used, and some others are just free to use. Ensure there is a cost hierarchy reflecting this fact, so that the allocator will favor scratch registers over callee-saved registers. llvm-svn: 221293	2014-11-04 20:51:29 +00:00
Arnaud A. de Grandmaison	62bb366a9b	[PBQP] Tweak spill costs and coalescing benefits This patch improves how the different costs (register, interference, spill and coalescing) relate to one another. The assumption is now that: - coalescing (or any other "side effect" of reg alloc) is negative, and instead of being derived from a spill cost, they use the block frequency info. - spill costs are in the [MinSpillCost:+inf( range - register or interference costs are in [0.0:MinSpillCost( or +inf The current MinSpillCost is set to 10.0, which is a random value high enough that the current constraint builders do not need to worry about when setting costs. It would however be worth adding a normalization step for register and interference costs as the last step in the constraint builder chain to ensure they are not greater than MinSpillCost (unless this makes sense for some architectures). This would work well with the current builder pipeline, where all costs are tweaked relative to each other, but could grow above MinSpillCost if the pipeline is deep enough. The current heuristic is tuned to depend on the number of uses of a live interval rather than on a density of uses, as used by the greedy allocator. This heuristic provides a few percent improvement on a number of benchmarks (eembc, spec, ...) and will definitely need to change once spill placement is implemented: the current spill placement is really inefficient, so making the cost proportional to the number of uses is a clear win. llvm-svn: 221292	2014-11-04 20:51:24 +00:00
David Majnemer	d052a12dd9	CodeGen: Enable DWARF emission for MS ABI targets This is experimental, just barely enough to get things to not immediately combust. A note for those who are curious: Only lld can successfully link the object files, other linkers truncate the section names making the debug sections illegible to debuggers. Even with this in mind, we believe we are having trouble with SECREL relocations. llvm-svn: 221245	2014-11-04 08:03:31 +00:00
Sanjoy Das	2351ae23a7	The patchpoint lowering logic would crash with live constants equal to the tombstone or empty keys of a DenseMap<int64_t, T>. This patch fixes the issue (and adds a test case). llvm-svn: 221214	2014-11-04 00:59:21 +00:00
Sanjoy Das	f94294baf2	Change logic in StackMaps::recordStackMapOpers to use the isInt<32> predicate instead of bitwise operations. This is not a functional change. llvm-svn: 221209	2014-11-04 00:06:57 +00:00
David Blaikie	74faf3ecee	Use common range handling for the CU's ranges This generalizes the range handling for ranges in both the skeleton and full unit, laying the foundation for the addition of more ranges (rather than just the CU's special case) in the skeleton CU with fission+gmlt. llvm-svn: 221202	2014-11-03 23:10:59 +00:00
David Blaikie	d52ffe9078	Push the CURangeList down into the skeleton CU (where available) rather than the full CU So that it may be shared between skeleton/full compile unit, for CU ranges and other ranges to be added for fission+gmlt. (at some point we might want some kind of object shared between the skeleton and full compile units for all those things we only want one of in that scope, rather than having the full unit always look through to the skeleton... - alternatively, we might be able to have the skeleton pointer (or another, separate pointer) point to the skeleton or to the unit itself in non-fission, so we don't have to special case its absence) llvm-svn: 221186	2014-11-03 21:52:56 +00:00
David Blaikie	63db855d26	Add DwarfCompileUnit::BaseAddress to track the base address used by relative addressing in debug_ranges and debug_loc This is one of a few steps to generalize range handling to include the CU range (thus the CU's range list will be moved into the range list list, losing track of the base address in the process), which means generalizing ranges from both the skeleton and full unit under fission. And... then I can use that generalized support for ranges in fission+gmlt where there'll be a bunch more ranges in the skeleton. llvm-svn: 221182	2014-11-03 21:15:30 +00:00
Paul Robinson	2b27bd26b6	Normally an 'optnone' function goes through fast-isel, which does not call DAGCombiner. But we ran into a case (on Windows) where the calling convention causes argument lowering to bail out of fast-isel, and we end up in CodeGenAndEmitDAG() which does run DAGCombiner. So, we need to make DAGCombiner check for 'optnone' after all. Commit includes the test that found this, plus another one that got missed in the original optnone work. llvm-svn: 221168	2014-11-03 18:19:26 +00:00
David Blaikie	13ae175ea5	Cleanup some unused or trivial functions in DwarfCompileUnit llvm-svn: 221164	2014-11-03 17:10:38 +00:00
David Blaikie	5ec0b37419	Sink DwarfUnit::CURanges into DwarfCompileUnit llvm-svn: 221161	2014-11-03 16:40:43 +00:00
Oliver Stannard	394d298bcc	Revert r221150, as it broke sanitizer tests llvm-svn: 221151	2014-11-03 12:19:03 +00:00
Oliver Stannard	c14da7456a	Emit .eh_frame with relocations to functions, rather than sections When LLVM emits DWARF call frame information, it currently creates a local, section-relative symbol in the code section, which is pointed to by a relocation on the .eh_frame section. However, for C++ we emit some functions in section groups, and the SysV ABI has some rules to make it easier to remove these sections (http://www.sco.com/developers/gabi/latest/ch4.sheader.html#section_group_rules): A symbol table entry with STB_LOCAL binding that is defined relative to one of a group's sections, and that is contained in a symbol table section that is not part of the group, must be discarded if the group members are discarded. References to this symbol table entry from outside the group are not allowed. This means that we need to use the function symbol for the relocation, not a temporary symbol. There was a comment in the code claiming that the local symbol was used to avoid creating a relocation, but a relocation must be created anyway as the code and CFI are in different sections. llvm-svn: 221150	2014-11-03 12:02:51 +00:00
David Blaikie	904a6c8687	Sink range list handling down from DwarfUnit into its only use, in DwarfCompileUnit. llvm-svn: 221123	2014-11-03 02:41:49 +00:00
David Blaikie	36d2c1854d	Formatting llvm-svn: 221095	2014-11-02 08:52:37 +00:00
David Blaikie	cc391f34cf	Add DwarfUnit::isDwoUnit and use it to generalize string creation Currently we only need to emit skeleton strings into the CU header and we do this by explicitly calling "addLocalString". With gmlt-in-fission, we'll be emitting a bunch of other strings from other codepaths where it's not statically known that these strings will be local or not. Introduce a virtual function to indicate whether this unit is a DWO unit or not (I'm not sure if we have a good term for this, the opposite/alternative to 'skeleton' unit) and use that to generalize the string emission logic so that strings can be correctly emitted in both the skeleton and dwo unit when in split dwarf mode. And to demonstrate that this works, switch the existing special callers of addLocalString in the skeleton builder to addString - and they still work. Yay. llvm-svn: 221094	2014-11-02 08:51:37 +00:00
David Blaikie	7319643a43	Remove the last mention of LineTablesOnly from DwarfUnit, sinking it into DwarfCompileUnit This is a useful distinction/invariant/delineation to make because LineTablesOnly mode is never relevant to type units, so it's clear that we're not doing weird line-tables-only-with-types by making this API choice. It also lays the foundations nicely for adding gmlt-like data to fission skeleton CUs while limiting the effects to CUs and not TUs. llvm-svn: 221093	2014-11-02 08:18:06 +00:00
David Blaikie	bb5b8ec235	Sink DwarfUnit::applySubprogramAttributesToDefinition into DwarfCompileUnit llvm-svn: 221092	2014-11-02 08:09:09 +00:00
David Blaikie	17a3519bc0	Sink DwarfUnit::addExpr into DwarfCompileUnit llvm-svn: 221090	2014-11-02 07:11:55 +00:00
David Blaikie	ec951ad31c	Fix the build from the last commit llvm-svn: 221089	2014-11-02 07:08:12 +00:00
David Blaikie	96c3f13e27	Sink DwarfUnit::applyVariableAttributes into DwarfCompileUnit llvm-svn: 221088	2014-11-02 07:06:51 +00:00
David Blaikie	abebe0162c	Sink DwarfUnit::addLocationList down into DwarfCompileUnit llvm-svn: 221087	2014-11-02 07:03:19 +00:00
David Blaikie	aadd810834	Sink DwarfUnit::addComplexAddress down into DwarfCompileUnit llvm-svn: 221086	2014-11-02 06:58:44 +00:00
David Blaikie	d6a9a46935	Push DwarfUnit::addAddress down into DwarfCompileUnit llvm-svn: 221085	2014-11-02 06:46:40 +00:00
David Blaikie	33c9b0d67b	Sink DwarfUnit::addVariableAddress into DwarfCompileUnit since type units don't have variables llvm-svn: 221084	2014-11-02 06:37:23 +00:00
David Blaikie	9a1d2d413e	DebugInfo: Sink accelerator table lists down (GlobalNames/Types) into DwarfCompileUnit llvm-svn: 221083	2014-11-02 06:16:39 +00:00
David Blaikie	5e147c4bc9	Add DwarfUnit::addGlobalType to match DwarfUnit::addGlobalName (these will shortly become virtual, with a null implementation in DwarfUnit (since type units don't have accelerator tables in the current schema) and the current implementation down in DwarfCompileUnit, moving the actual maps there too) llvm-svn: 221082	2014-11-02 06:06:14 +00:00
David Blaikie	7b02323482	DebugInfo: Refactor index type DIE initialization by rolling it into the accessor llvm-svn: 221080	2014-11-02 03:09:13 +00:00
David Blaikie	3761128594	Be sure to initialize DwarfCompileUnit::LabelBegin now that it may be skipped in initSection llvm-svn: 221079	2014-11-02 02:40:26 +00:00
David Blaikie	530359d196	Don't bother creating LabelBegin for .dwo units This would help catch cases where we might otherwise try to reference a dwo CU label, which would be weird - because without relocations in the dwo file it's not generally meaningful to talk about the CU offsets there (or, if it is, we can do so in absolute terms without using a relocation to compute it). llvm-svn: 221078	2014-11-02 02:26:24 +00:00
David Blaikie	e89a2104cf	Drop DwarfCompileUnit::getLocalLabel* in favor of just mapping through the skeleton explicitly. Confusing to do this two different ways - I'm not too wedded to either one, but here goes. llvm-svn: 221076	2014-11-02 01:21:43 +00:00
David Blaikie	795e1a2a21	Sink DwarfUnit::LabelBegin down into DwarfCompileUnit since that's the only place it's needed. llvm-svn: 221075	2014-11-02 01:21:40 +00:00
David Blaikie	44f17e492d	Sink dwarf unit length emission down into DwarfUnit::emitHeader This allows the CU label to be emitted only for compile units, as they're the only ones that need it (so they can be referenced from pubnames) llvm-svn: 221072	2014-11-01 23:59:23 +00:00
David Blaikie	f36a454c35	Remove DwarfUnit::LabelEnd in favor of computing the length of the section directly This was a compile-unit specific label (unused in type units) and seems unnecessary anyway when we can more easily directly compute the size of the compile unit. llvm-svn: 221067	2014-11-01 23:07:14 +00:00
David Blaikie	21e8ef20de	Sink DwarfUnit::SectionSym into DwarfCompileUnit as it's only needed/used there. llvm-svn: 221062	2014-11-01 20:06:28 +00:00
David Blaikie	e011711e85	Make DwarfCompileUnit::Skeleton more narrowly typed (DwarfCompileUnit* instead of DwarfUnit*) now that it's specific to DwarfCompileUnit anyway. llvm-svn: 221060	2014-11-01 19:26:05 +00:00
David Blaikie	1c7f7fe885	Sink DwarfUnit::Skeleton down into DwarfCompileUnit Type units no longer have skeletons and it's misleading to be able to query for a type unit's skeleton (it might incorrectly lead one to conclude that if a unit doesn't have a skeleton it's not in a .dwo file... ). llvm-svn: 221055	2014-11-01 18:18:07 +00:00
David Blaikie	eddf5043ac	Sink DwarfDebug::AbstractSPDies down into DwarfFile This is the first big step to allowing gmlt-like inline scope information in the skeleton CU. While this commit doesn't change the functionality, it's only a small step to call "constructAbstractSubprogramDIE" on both the InfoHolder and the SkeletonHolder (when in use) and that will at least create the abstract SP dies in that case, though still not creating the other subprograms. llvm-svn: 221051	2014-11-01 17:21:26 +00:00
David Blaikie	108983e345	Remove unused function llvm-svn: 221037	2014-11-01 01:15:26 +00:00
David Blaikie	9542ebc7a1	And... fix the build some more. llvm-svn: 221036	2014-11-01 01:15:24 +00:00
David Blaikie	ec0709e97f	Just iterate the DwarfCompileUnits rather than trying to filter them out of the list of all units. llvm-svn: 221034	2014-11-01 01:11:19 +00:00
David Blaikie	d6a0e067d7	Add '*' to auto variable that is a pointer, as per the coding conventions. llvm-svn: 221033	2014-11-01 01:03:39 +00:00
David Blaikie	bc2a3611cd	Add DwarfCompileUnit::getSkeleton that returns DwarfCompileUnit* to avoid having to cast from DwarfUnit* on every call. llvm-svn: 221031	2014-11-01 00:50:34 +00:00
Duncan P. N. Exon Smith	7004fd9aac	IR: MDNode => Value: Instruction::getMetadata() Change `Instruction::getMetadata()` to return `Value` as part of PR21433. Update most callers to use `Instruction::getMDNode()`, which wraps the result in a `cast_or_null<MDNode>`. llvm-svn: 221024	2014-11-01 00:10:31 +00:00
David Blaikie	1e29b40b41	Sink some of DwarfDebug::collectDeadVariables down into DwarfCompileUnit. llvm-svn: 221010	2014-10-31 22:30:30 +00:00
David Blaikie	da34a7bac5	Sink most of DwarfDebug::constructAbstractSubprogramScopeDIE into DwarfCompileUnit llvm-svn: 221005	2014-10-31 21:57:02 +00:00
Quentin Colombet	06167df4ad	[CodeGenPrepare] Move extractelement close to store if they can be combined. This patch adds an optimization in CodeGenPrepare to move an extractelement right before a store when the target can combine them. The optimization may promote any scalar operations to vector operations along the way to make that possible. Context Some targets use different register files for both vector and scalar operations. This means that transitioning from one domain to another may incur a copy from one register file to another. These copies are not coalescable and may be expensive. For example, according to the scheduling model, on cortex-A8 a vector to GPR move is 20 cycles. Motivating Example Let us consider an example: define void @foo(<2 x i32>* %addr1, i32* %dest) { %in1 = load <2 x i32>* %addr1, align 8 %extract = extractelement <2 x i32> %in1, i32 1 %out = or i32 %extract, 1 store i32 %out, i32* %dest, align 4 ret void } As it is, this IR generates the following assembly on armv7: vldr d16, [r0] @ vector load vmov.32 r0, d16[1] @ cross-register-file copy: 20 cycles orr r0, r0, #1 @ scalar bitwise or str r0, [r1] @ scalar store bx lr Whereas we could generate much faster code: vldr d16, [r0] @ vector load vorr.i32 d16, #0x1 @ vector bitwise or vst1.32 {d16[1]}, [r1:32] @ vector extract + store bx lr Half of the computation made in the vector is useless, but this lets us get rid of the expensive cross-register-file copy. Proposed Solution To avoid this cross-register-copy penalty, we promote the scalar operations to vector operations. The penalty will be removed if we manage to promote the whole chain of computation in the vector domain. Currently, we do that only when the chain of computation ends with a store and the target is able to combine an extract with a store. Stores are the most likely candidates, because other instructions produce values that would need to be promoted and so, extracted at some point[1]. 
Moreover, it is customary for targets to feature stores that perform a vector extract (see AArch64 and X86 for instance). The proposed implementation relies on the TargetTransformInfo to decide whether or not it is beneficial to promote a chain of computation in the vector domain. Unfortunately, this interface is rather inaccurate for this level of detail and although this optimization may be beneficial for X86 and AArch64, the inaccuracy will lead to the optimization being too aggressive. Basically in TargetTransformInfo, everything that is legal has a cost of 1, whereas, even if a vector type is legal, usually a vector operation is slightly more expensive than its scalar counterpart. That will lead to too many promotions that may not be counterbalanced by the saving of the cross-register-file copy. For instance, on AArch64 this penalty is just 4 cycles. For now, the optimization is just enabled for ARM prior to v8, since those processors have a larger penalty on cross-register-file copies, and the scope is limited to basic blocks. Because of these two factors, we limit the effects of the inaccuracy. Indeed, I did not want to build up a fancy cost model with block frequency and everything on top of that. [1] We can imagine targets that can combine an extractelement with other instructions than just stores. If we want to go into that direction, the current interfaces must be augmented and, moreover, I think this becomes a global isel problem. Differential Revision: http://reviews.llvm.org/D5921 <rdar://problem/14170854> llvm-svn: 220978	2014-10-31 17:52:53 +00:00
David Blaikie	24cb75d1b9	Correct assert text from r220923 Noticed in post-commit review by Adrian Prantl. llvm-svn: 220967	2014-10-31 16:45:36 +00:00
Hao Liu	6cc87eb119	PR20557: Fix the bug that bogus cpu parameter crashes llc on AArch64 backend. Initial patch by Oleg Ranevskyy. llvm-svn: 220945	2014-10-31 02:35:34 +00:00
Ahmed Bougacha	38c0bf429c	[SelectionDAG] When scalarizing trunc, don't assert for legal operands. r212242 introduced a legalizer hook, originally to let AArch64 widen v1i{32,16,8} rather than scalarize, because the legalizer expected, when scalarizing the result of a conversion operation, to already have scalarized the operands. On AArch64, v1i64 is legal, so that commit ensured operations such as v1i32 = trunc v1i64 wouldn't assert. It did that by choosing to widen v1 types whenever possible. However, v1i1 types, for which there's no legal widened type, would still trigger the assert. This commit fixes that, by only scalarizing a trunc's result when the operand has already been scalarized, and introducing an extract_elt otherwise. This is similar to r205625. Fixes PR20777. llvm-svn: 220937	2014-10-30 23:46:50 +00:00
Louis Gerbarg	6f92b8978d	Fix incorrect invariant check in DAG Combine Earlier this summer I fixed an issue where we were incorrectly combining multiple loads that had different constraints such as alignment, invariance, temporality, etc. Apparently in one case I made a copy-paste error and swapped alignment and invariance. Tests included. rdar://18816719 llvm-svn: 220933	2014-10-30 22:21:03 +00:00
David Blaikie	dc51b0f8dd	PR21408: Workaround the appearance of duplicate variables due to problems when inlining two calls to the same function from the same call site. llvm-svn: 220923	2014-10-30 20:20:11 +00:00
NAKAMURA Takumi	ee3b3d3d09	Whitespace. llvm-svn: 220857	2014-10-29 15:23:11 +00:00
David Blaikie	35385231e3	Minimize the scope of some variables, NFC. llvm-svn: 220759	2014-10-28 02:57:26 +00:00
Lang Hames	77d387a954	[PBQP] Unique allowed-sets for nodes in the PBQP graph and use pairs of these sets as keys into a cache of interference matrix values in the Interference constraint adder. Creating interference matrices was one of the large remaining time-sinks in PBQP. Caching them reduces the total compile time (when using PBQP) on the nightly test suite by ~10%. llvm-svn: 220688	2014-10-27 17:44:25 +00:00
David Blaikie	5a24603108	Remove some unnecessary casts. llvm-svn: 220658	2014-10-26 23:37:04 +00:00
Frederic Riss	1a2ce34071	Sink DwarfUnit::constructImportedEntityDIE into DwarfCompileUnit. So that it has access to getOrCreateGlobalVariableDIE. If we ever support describing using directives in C++ classes (thus requiring support in type units), it will certainly use another mechanism anyway. Differential Revision: http://reviews.llvm.org/D5975 llvm-svn: 220594	2014-10-24 21:31:09 +00:00
Matt Arsenault	cef46eb164	Fix copy paste comment llvm-svn: 220581	2014-10-24 18:13:10 +00:00
David Blaikie	21ab861fa1	DebugInfo: Sink DwarfDebug::ScopeVariables down into DwarfFile (part of refactoring to allow subprogram emission in both the skeleton and main units to enable -gmlt-like data to be included in the skeleton for live inlined backtracing purposes) llvm-svn: 220578	2014-10-24 17:57:34 +00:00
David Blaikie	1054961712	Remove DwarfDebug::FirstCU as it has no use It was only being used as a flag to identify the lack of debug info from within endModule - use the section labels for that instead. llvm-svn: 220575	2014-10-24 17:53:38 +00:00
Sanjay Patel	d9b7837012	Use rsqrt (X86) to speed up reciprocal square root calcs This is a first step for generating SSE rsqrt instructions for reciprocal square root calcs when fast-math is allowed. For now, be conservative and only enable this for AMD btver2 where performance improves significantly - for example, 29% on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c (if we convert the data type to single-precision float). This patch adds a two constant version of the Newton-Raphson refinement algorithm to DAGCombiner that can be selected by any target via a parameter returned by getRsqrtEstimate(). See PR20900 for more details: http://llvm.org/bugs/show_bug.cgi?id=20900 Differential Revision: http://reviews.llvm.org/D5658 llvm-svn: 220570	2014-10-24 17:02:16 +00:00
Marcello Maggioni	835ff8fe13	Added reset of LexicalScope in LiveDebugVariables reset function. llvm-svn: 220545	2014-10-24 02:46:50 +00:00
Timur Iskhodzhanov	04c11d578f	Fix PR21189 -- Emit symbol subsection required to debug LLVM-built binaries with VS2012+ Reviewed at http://reviews.llvm.org/D5772 llvm-svn: 220544	2014-10-24 01:27:45 +00:00
David Blaikie	11c3a3f6f2	DebugInfo: Remove DwarfDebug::addScopeVariable now that it's just a trivial wrapper llvm-svn: 220542	2014-10-24 00:43:47 +00:00
Ahmed Bougacha	a035d19b50	[SelectionDAG] Teach the vector scalarizer about FP conversions. This adds support for legalization of instructions of the form: [fp_conv] <1 x i1> %op to <1 x double> where fp_conv is one of fpto[us]i, [us]itofp. This used to assert because they were simply missing from the vector operand scalarizer. A similar problem arose in r190830, with trunc instead. Fixes PR20778. Differential Revision: http://reviews.llvm.org/D5810 llvm-svn: 220533	2014-10-23 22:49:25 +00:00
Ahmed Bougacha	4e0335a62d	Update comment and fix typos in assert message. (NFC) llvm-svn: 220531	2014-10-23 22:40:34 +00:00
Tim Northover	d882ba8bc5	ScheduleDAG: record PhysReg dependencies represented by CopyFromReg nodes x86's CMPXCHG -> EFLAGS consumer wasn't being recorded as a real EFLAGS dependency because it was represented by a pair of CopyFromReg(EFLAGS) -> CopyToReg(EFLAGS) nodes. ScheduleDAG was expecting the source to be an implicit-def on the instruction, where the result numbers in the DAG and the Uses list in TableGen matched up precisely. The Copy notation seems much more robust, so this patch extends ScheduleDAG rather than refactoring x86. Should fix PR20376. llvm-svn: 220529	2014-10-23 22:31:48 +00:00
David Blaikie	92e415962d	DebugInfo: Remove DwarfDebug::CurrentFnArguments since we have to handle argument ordering of other arguments (abstract arguments) in the same way and already have code for that too. While refactoring this code I was confused by both the name I had introduced (addNonArgumentVariable... but it has all this logic to handle argument numbering and keep things in order?) and by the redundancy. Seems when I fixed the misordered inlined argument handling, I didn't realize it was mostly redundant with the argument ordering code (which I may've also written, I'm not sure). So let's just rely on the more general case. The only oddity in output this produces is that it means when we emit all the variables for the current function, we don't track when we've finished the argument variables and are about to start the local variables and insert DW_AT_unspecified_parameters (for varargs functions) there. Instead it ends up after the local variables, scopes, etc. But this isn't invalid and doesn't cause DWARF consumers problems that I know of... so we'll just go with that because it makes the code nice & simple. (though, let's see what the buildbots have to say about this - crosses fingers) There will be some cleanup commits to follow to remove the now trivial wrappers, etc. llvm-svn: 220527	2014-10-23 22:27:50 +00:00
David Blaikie	c325cd7123	DebugInfo: Sink DwarfDebug::addNonArgumentScopeVariable into DwarfFile. llvm-svn: 220520	2014-10-23 22:04:30 +00:00
David Blaikie	dbed952309	DebugInfo: Remove DwarfDebug::addCurrentFnArgument declaration now that it's moved to DwarfFile. llvm-svn: 220515	2014-10-23 21:53:17 +00:00
David Blaikie	1e1e6fb31c	DebugInfo: Simplify/tidy/correct global variable decl/def emission handling. This fixes a bug (introduced by fixing the IR emitted from Clang where the definition of a static member would be scoped within the class, rather than within its lexical decl context) where the definition of a static variable would be placed inside a class. It also improves source fidelity by scoping static class member definitions inside the lexical decl context in which they are written (eg: namespace n { class foo { static int i; } int foo::i; } - the definition of 'i' will be within the namespace 'n' in the DWARF output now). Lastly, and the original goal, this reduces debug info size slightly (and makes debug info easier to read, etc) by placing the definitions of non-member global variables within their namespace, rather than using a separate namespace-scoped declaration along with a definition at global scope. Based on patches and discussion with Frédéric. llvm-svn: 220497	2014-10-23 19:12:43 +00:00
David Blaikie	8eacc975e4	Remove explicit (void) use of DwarfFile::DD that was accidentally left in r220452. Caught in post-commit review by Frédéric. llvm-svn: 220487	2014-10-23 16:12:58 +00:00
David Blaikie	e3b5e8b37a	[DebugInfo] Sink DwarfDebug::addCurrentFnArgument down into DwarfFile. Variable handling will be sunk into DwarfFile so that abstract variables and the like can be shared across multiple CUs (to handle cross-CU inlining, for example). llvm-svn: 220453	2014-10-23 00:16:05 +00:00
David Blaikie	f0eb7b0322	[DebugInfo] Add DwarfDebug& to DwarfFile. Use the DwarfDebug in one function that previously took it as a parameter, and lay the foundation for using this for other operations coming soon. llvm-svn: 220452	2014-10-23 00:16:03 +00:00
David Blaikie	e782ff6864	[DebugInfo] Remove LexicalScopes::isCurrentFunctionScope and CSE a use of LexicalScopes::getCurrentFunctionScope Now that we're sure the only root (non-abstract) scope is the current function scope, there's no need for isCurrentFunctionScope, the property can be tested directly instead. llvm-svn: 220451	2014-10-23 00:06:27 +00:00
Benjamin Kramer	a6c059251d	Strength reduce constant-sized vectors into arrays. No functionality change. llvm-svn: 220412	2014-10-22 19:55:26 +00:00
Matt Arsenault	c95fbccb3f	Fix typo llvm-svn: 220353	2014-10-22 00:28:59 +00:00
Matt Arsenault	2257f6b589	Add minnum / maxnum codegen llvm-svn: 220342	2014-10-21 23:01:01 +00:00
Arnaud A. de Grandmaison	3555931ec7	Pacify bots and simplify r220321 llvm-svn: 220335	2014-10-21 21:50:49 +00:00
Arnaud A. de Grandmaison	73624b6ac4	[PBQP] Teach PassConfig to tell if the default register allocator is used. This enables targets to adapt their pass pipeline to the register allocator in use. For example, with the AArch64 backend, using PBQP with the cortex-a57, the FPLoadBalancing pass is no longer necessary. llvm-svn: 220321	2014-10-21 20:47:22 +00:00
Arnaud A. de Grandmaison	f39e772ae9	[PBQP] Fix coalescing benefits As coalescing registers is a benefit, the cost should be improved (i.e. made smaller) when coalescing is possible. llvm-svn: 220302	2014-10-21 16:24:15 +00:00
Rafael Espindola	6ffbd5bf5d	Fix a bit of confusion about .set and produce more readable assembly. Every target we support has support for assembly that looks like a = b - c .long a What is special about MachO is that the above combination suppresses the production of a relocation. With this change we avoid producing the intermediary labels when they don't add any value. llvm-svn: 220256	2014-10-21 01:17:30 +00:00
Rafael Espindola	72d274d9e2	Make AsmPrinter::EmitLabelOffsetDifference a static helper and simplify. It had exactly one caller in a position where we know hasSetDirective is true. llvm-svn: 220250	2014-10-21 00:25:49 +00:00
Philip Reames	c3e4c79873	Introduce enum values for previously defined metadata types. (NFC) Our metadata scheme lazily assigns IDs to string metadata, but we have a mechanism to preassign them as well. Using a preassigned ID is helpful since we get compile time type checking, and avoid some (minimal) string construction and comparison. This change adds enum values for three existing metadata types: + MD_nontemporal = 9, // "nontemporal" + MD_mem_parallel_loop_access = 10, // "llvm.mem.parallel_loop_access" + MD_nonnull = 11 // "nonnull" I went through and updated various uses as well. I made no attempt to get all uses; I focused on the ones which were easily grepable and easy to translate. For example, there were several items in LoopInfo.cpp I chose not to update. llvm-svn: 220248	2014-10-21 00:13:20 +00:00
Lang Hames	86d69f67b6	[PBQP] Replace the interference-constraints algorithm with a faster version loosely based on linear scan. On x86-64 this is good for a ~2% drop in compile time on the nightly test suite. llvm-svn: 220143	2014-10-18 17:26:07 +00:00
Pete Cooper	bedb6f3c4b	Check for dynamic alloca's when selecting lifetime intrinsics. TL;DR: Indexing maps with [] creates missing entries. The long version: When selecting lifetime intrinsics, we index the static alloca map with the AllocaInst we find for that lifetime. Trouble is, we don't first check to see if this is a dynamic alloca. On the attached example, this causes a dynamic alloca to create an entry in the static map, and returns 0 (the default) as the frame index for that lifetime. 0 was used for the frame index of the stack protector, which given that it now has a lifetime, is coloured, and merged with other stack slots. PEI would later trigger an assert because it expects the stack protector to not be dead. This fix ensures that we only get frame indices for static allocas, ie, those in the map. Dynamic ones are effectively dropped, which is suboptimal, but at least isn't completely broken. rdar://problem/18672951 llvm-svn: 220099	2014-10-17 22:59:33 +00:00
Juergen Ributzka	99ffd17333	[Stackmaps] Enable invoking the patchpoint intrinsic. Patch by Kevin Modzelewski Reviewers: atrick, ributzka Reviewed By: ributzka Subscribers: llvm-commits, reames Differential Revision: http://reviews.llvm.org/D5634 llvm-svn: 220055	2014-10-17 17:39:00 +00:00
Jan Vesely	31f817808d	SelectionDAG: Add sext_inreg optimizations v2: use dyn_cast fixup comments v3: use cast Reviewed-by: Matt Arsenault <arsenm2@gmail.com> Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> llvm-svn: 220044	2014-10-17 14:45:25 +00:00
Juergen Ributzka	00a783c163	Reduce code duplication between patchpoint and non-patchpoint lowering. NFC. This is in preparation for another patch that makes patchpoints invokable. Reviewers: atrick, ributzka Reviewed By: ributzka Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5657 llvm-svn: 219967	2014-10-16 21:26:35 +00:00
Robin Morisset	8dc41d55aa	Erase fence insertion from SelectionDAGBuilder.cpp (NFC) Summary: Backends can use setInsertFencesForAtomic to signal to the middle-end that monotonic is the only memory ordering they can accept for stores/loads/rmws/cmpxchg. The code lowering those accesses with a stronger ordering to fences + monotonic accesses is currently living in SelectionDAGBuilder.cpp. In this patch I propose moving this logic out of it for several reasons: - There is lots of redundancy to avoid: extremely similar logic already exists in AtomicExpand. - The current code in SelectionDAGBuilder does not use any target-hooks, it does the same transformation for every backend that requires it - As a result it is plain unsound, as it was apparently designed for ARM. It happens to mostly work for the other targets because they are extremely conservative, but Power for example had to switch to AtomicExpand to be able to use lwsync safely (see r218331). - Because it produces IR-level fences, it cannot be made sound! This is noted in the C++11 standard (section 29.3, page 1140): ``` Fences cannot, in general, be used to restore sequential consistency for atomic operations with weaker ordering semantics. ``` It can also be seen by the following example (called IRIW in the literature): ``` atomic<int> x = y = 0; int r1, r2, r3, r4; Thread 0: x.store(1); Thread 1: y.store(1); Thread 2: r1 = x.load(); r2 = y.load(); Thread 3: r3 = y.load(); r4 = x.load(); ``` r1 = r3 = 1 and r2 = r4 = 0 is impossible as long as the accesses are all seq_cst. But if they are lowered to monotonic accesses, no amount of fences can prevent it. This patch does three things (I could cut it into parts, but then some of them would not be tested/testable, please tell me if you would prefer that): - it provides a default implementation for emitLeadingFence/emitTrailingFence in terms of IR-level fences, that mimic the original logic of SelectionDAGBuilder. 
As we saw above, this is unsound, but the best that can be done without knowing the targets well (and there is a comment warning about this risk). - it then switches Mips/Sparc/XCore to use AtomicExpand, relying on this default implementation (that exactly replicates the logic of SelectionDAGBuilder, so no functional change) - it finally erases this logic from SelectionDAGBuilder as it is dead-code. Ideally, each target would define its own override for emitLeading/TrailingFence using target-specific fences, but I do not know the Sparc/Mips/XCore memory model well enough to do this, and they appear to be dealing fine with the ARM-inspired default expansion for now (probably because they are overly conservative, as Power was). If anyone wants to compile fences more aggressively on these platforms, the long comment should make it clear why he should first override emitLeading/TrailingFence. Test Plan: make check-all, no functional change Reviewers: jfb, t.p.northover Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D5474 llvm-svn: 219957	2014-10-16 20:34:57 +00:00
Eric Christopher	64e055d96c	Avoid caching the MachineFunction, we don't use it outside of runOnMachineFunction. llvm-svn: 219847	2014-10-15 21:06:25 +00:00

17573 Commits