llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00

Author	SHA1	Message	Date
Dimitry Andric	42e4363449	Fix mixed line terminators. NFC. llvm-svn: 308052	2017-07-14 21:14:58 +00:00
Jakub Kuderski	aa78fc4f6e	[Dominators] Make IsPostDominator a template parameter Summary: DominatorTreeBase used to have IsPostDominators (bool) member to indicate if the tree is a dominator or a postdominator tree. This made it possible to switch between the two 'modes' at runtime, but it isn't used in practice anywhere. This patch makes IsPostDominator a template argument. This way, it is easier to switch between different algorithms at compile-time based on this argument and design external utilities around it. It also makes it impossible to incidentally assign a postdominator tree to a dominator tree (and vice versa), and to further simplify template code in GenericDominatorTreeConstruction. Reviewers: dberlin, sanjoy, davide, grosser Reviewed By: dberlin Subscribers: mzolotukhin, llvm-commits Differential Revision: https://reviews.llvm.org/D35315 llvm-svn: 308040	2017-07-14 18:26:09 +00:00
Nirav Dave	02fbce1f6d	Improve Aliasing of operations to static alloca Recommitting after adding check to avoid miscomputing alias information on addresses of the same base but different subindices. Memory accesses offset from frame indices may alias, e.g., we may merge write from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer. Static allocs are realized as stack objects in SelectionDAG, but its offset is not set until post-DAG causing DAGCombiner's alias check to consider access to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing. Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll which has a minor degradation even though the pre-legalized DAG is improved. Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand Reviewed By: rnk Subscribers: sdardis, nemanjai, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33345 llvm-svn: 308025	2017-07-14 13:56:21 +00:00
Jakub Kuderski	c4b1544a02	[NFC] Move DEBUG_TYPE macro below includes... in MachineCombiner.cpp. llvm-svn: 307940	2017-07-13 19:30:52 +00:00
Simon Dardis	c43ce3d9fa	Reland "[mips] Fix multiprecision arithmetic." For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC, get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs. For MIPS, only the DSP ASE has a carry flag, so in the general case it is not useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes. Also improve the generation code in such cases for targets with TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the comparison node rather than using it in selects. Similarly for ISD::SUBE / ISD::SUBC. Address optimization breakage by moving the generation of MIPS specific integer multiply-accumulate nodes to before legalization. This resolves PR32713 and PR33424. Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue! Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D33494 The previous version of this patch was too aggressive in producing fused integer multiple-addition instructions. llvm-svn: 307906	2017-07-13 11:28:05 +00:00
Simon Pilgrim	8124dbf6ce	[DAGCombiner] Fix issue with rotate combines asserting if the constant value types differ from the result type. llvm-svn: 307900	2017-07-13 10:41:49 +00:00
Simon Pilgrim	5ae4ff7688	Use isNullConstantOrNullSplatConstant helper. NFCI. llvm-svn: 307895	2017-07-13 09:39:00 +00:00
Hiroshi Inoue	95d1010b48	fix typos in comments and error messages; NFC llvm-svn: 307885	2017-07-13 06:48:39 +00:00
Geoff Berry	b564df0f8b	[TargetLowering] Add hook for adding target MMO flags when doing ISel. Summary: Add TargetLowering hook getMMOFlags() to add target specific MMO flags to load/store instructions created by ISel. Reviewers: bogner, hfinkel, qcolombet, MatzeB Subscribers: mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34962 llvm-svn: 307879	2017-07-13 03:49:42 +00:00
Geoff Berry	9a32a3bacd	[MIR] Add support for printing and parsing target MMO flags Summary: Add target hooks for printing and parsing target MMO flags. Targets may override getSerializableMachineMemOperandTargetFlags() to return a mapping from string to flag value for target MMO values that should be serialized/parsed in MIR output. Add implementation of this hook for AArch64 SuppressPair MMO flag. Reviewers: bogner, hfinkel, qcolombet, MatzeB Subscribers: mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34962 llvm-svn: 307877	2017-07-13 02:28:54 +00:00
Eli Friedman	661b22e877	[CodeGenPrepare] Don't create dead instructions in addrmode sinking When we fail to sink an instruction, we must make sure not to modify the function; otherwise, we end up in an infinite loop because CodeGenPrepare iterates until it doesn't make any changes. Fixes https://bugs.llvm.org/show_bug.cgi?id=33608 . llvm-svn: 307866	2017-07-12 23:30:02 +00:00
Gerolf Hoflehner	7d489cae75	[SjLj] Replace recursive block marking algorithm with iterative algorithm Summary: Some programs run into a stack overflow issue. This change avoids this problem by replacing the recursive algorithm with the iterative version. Reviewers: MatzeB, t.p.northover, dblaikie Reviewed By: MatzeB Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D35105 llvm-svn: 307860	2017-07-12 23:05:15 +00:00
Daniel Neilson	84653da20b	Add element atomic memset intrinsic Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memset intrinsic. This intrinsic is essentially memset with the implementation requirement that all stores used for the assignment are done with unordered-atomic stores of a given element size. Reviewers: eli.friedman, reames, mkazantsev, skatkov Reviewed By: reames Subscribers: jfb, dschuff, sbc100, jgravelle-google, aheejin, efriedma, llvm-commits Differential Revision: https://reviews.llvm.org/D34885 llvm-svn: 307854	2017-07-12 21:57:23 +00:00
Sam Clegg	470d37c78e	Remove unneeded use of #undef DEBUG_TYPE. NFC Where it is needed (at the end of headers that define it), be consistent about its use. Also fix a few header guards that I found in the process. Differential Revision: https://reviews.llvm.org/D34916 llvm-svn: 307840	2017-07-12 20:49:21 +00:00
Evandro Menezes	296d928945	[CodeGen] Add dependency printer Add SDep printer to make debugging sessions more productive. Differential revision: https://reviews.llvm.org/D35144 llvm-svn: 307799	2017-07-12 15:30:59 +00:00
Daniel Neilson	5294f8b585	Add element atomic memmove intrinsic Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memmove intrinsic. This intrinsic is essentially memmove with the implementation requirement that all loads/stores used for the copy are done with unordered-atomic loads/stores of a given element size. Reviewers: eli.friedman, reames, mkazantsev, skatkov Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34884 llvm-svn: 307796	2017-07-12 15:25:26 +00:00
Konstantin Zhuravlyov	d382d6f3fc	Enhance synchscope representation OpenCL 2.0 introduces the notion of memory scopes in atomic operations to global and local memory. These scopes restrict how synchronization is achieved, which can result in improved performance. This change extends existing notion of synchronization scopes in LLVM to support arbitrary scopes expressed as target-specific strings, in addition to the already defined scopes (single thread, system). The LLVM IR and MIR syntax for expressing synchronization scopes has changed to use syncscope("<scope>"), where <scope> can be "singlethread" (this replaces singlethread keyword), or a target-specific name. As before, if the scope is not specified, it defaults to CrossThread/System scope. Implementation details: - Mapping from synchronization scope name/string to synchronization scope id is stored in LLVM context; - CrossThread/System and SingleThread scopes are pre-defined to efficiently check for known scopes without comparing strings; - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in the bitcode. Differential Revision: https://reviews.llvm.org/D21723 llvm-svn: 307722	2017-07-11 22:23:00 +00:00
Evandro Menezes	0e47762e3a	[CodeGen] Rename DEBUG_TYPE to match passnames Rename missing DEBUG_TYPE "machine-scheduler" from backend files, which were absent from https://reviews.llvm.org/rL303921. Differential revision: https://reviews.llvm.org/D35231 llvm-svn: 307719	2017-07-11 22:08:28 +00:00
Serguei Katkov	dca6ebe969	Revert Revert [MBP] do not rotate loop if it creates extra branch This is a second attempt to land this patch. The first one resulted in a crash of clang sanitizer buildbot. The fix is here and regression test is added. This is a last fix for the corner case of PR32214. Actually this is not really corner case in general. We should not do a loop rotation if we create an additional branch due to it. Consider the case where we have a loop chain H, M, B, C , where H is header with viable fallthrough from pre-header and exit from the loop M - some middle block B - backedge to Header but with exit from the loop also. C - some cold block of the loop. Let's say H is determined as the best exit. If we do a loop rotation M, B, C, H we can introduce the extra branch. Let's compute the change in number of branches: +1 branch from pre-header to header -1 branch from header to exit +1 branch from header to middle block if there is such -1 branch from cold block to header if there is one So if C is not a predecessor of H then we introduce extra branch. This change actually prohibits rotation of the loop if both are true: Best Exit has next element in chain as successor. Last element in chain is not a predecessor of first element of chain. Reviewers: iteratee, xur, sammccall, chandlerc Reviewed By: iteratee Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34745 llvm-svn: 307631	2017-07-11 08:34:58 +00:00
Serguei Katkov	e48484a1f9	[CGP] Relax a bit restriction for optimizeMemoryInst to extend scope CodeGenPrepare::optimizeMemoryInst contains a check that we do nothing if all instructions combining the address for the memory instruction are in the same block as the memory instruction itself. However, if any of these instructions are placed after the memory instruction, then the address calculation will not be folded into the memory instruction. The added test case shows an example. Reviewers: loladiro, spatel, efriedma Reviewed By: efriedma Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34862 llvm-svn: 307628	2017-07-11 06:24:44 +00:00
Matthias Braun	5bb73c2a66	Revert "[DAG] Improve Aliasing of operations to static alloca" Reverting as it breaks tramp3d-v4 in the llvm test-suite. I added some comments to https://reviews.llvm.org/D33345 about it. This reverts commit r307546. llvm-svn: 307589	2017-07-10 20:51:30 +00:00
Nirav Dave	c55f8bfffc	Add DAG argument to canMergeStoresTo NFC. llvm-svn: 307583	2017-07-10 20:25:54 +00:00
Nirav Dave	8c365b2ee8	[DAG] Improve Aliasing of operations to static alloca Memory accesses offset from frame indices may alias, e.g., we may merge write from function arguments passed on the stack when they are contiguous. As a result, when checking aliasing, we consider the underlying frame index's offset from the stack pointer. Static allocs are realized as stack objects in SelectionDAG, but its offset is not set until post-DAG causing DAGCombiner's alias check to consider access to static allocas to frequently alias. Modify isAlias to consider access between static allocas and access from other frame objects to be considered aliasing. Many test changes are included here. Most are fixes for tests which indirectly relied on our aliasing ability and needed to be modified to preserve their original intent. The remaining tests have minor improvements due to relaxed ordering. The exception is CodeGen/X86/2011-10-19-widen_vselect.ll which has a minor degradation even though the pre-legalized DAG is improved. Reviewers: rnk, mkuper, jonpa, hfinkel, uweigand Reviewed By: rnk Subscribers: sdardis, nemanjai, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33345 llvm-svn: 307546	2017-07-10 15:39:41 +00:00
Hiroshi Inoue	f127dd99e3	fix typos in comments and error messages; NFC llvm-svn: 307533	2017-07-10 12:44:25 +00:00
Davide Italiano	ac5e953a20	[X86] Relax an assertion when legalizing vector types. WidenVSELECTAndMask can fold (and it folds in this case) so we get a BUILD_VECTOR of constants as mask. convertMask() seems to work fine when the input is a vector of constants, and we still need to call it to extend/add elements at the end, but the current code just asserts on anything but a SETCC or AND/OR/XOR of 2xSETCC. This change was discussed briefly with Simon Pilgrim, who also suggests we might consider dropping this assertion in the future. Fixes PR33715. llvm-svn: 307508	2017-07-09 19:22:48 +00:00
Simon Pilgrim	78bcdd73d3	Handle ConstantExpr correctly in SelectionDAGBuilder This change fixes a bug in SelectionDAGBuilder::visitInsertValue and SelectionDAGBuilder::visitExtractValue where constant expressions (InsertValueConstantExpr and ExtractValueConstantExpr) would be treated as non-constant instructions (InsertValueInst and ExtractValueInst). This bug resulted in an incorrect memory access, which manifested as an assertion failure in SDValue::SDValue. Fixes PR#33094. Submitted on behalf of @Praetonus (Benoit Vey) Differential Revision: https://reviews.llvm.org/D34538 llvm-svn: 307502	2017-07-09 16:01:04 +00:00
Igor Breger	7805125c03	[FastISel] fix a fallback diagnostic. Summary: FastISel was marked as failed in case instruction selection succeeded. Reviewers: qcolombet, zvi, rovka, ab Reviewed By: zvi Subscribers: javed.absar, ab, qcolombet, bogner, llvm-commits Differential Revision: https://reviews.llvm.org/D34438 llvm-svn: 307489	2017-07-09 05:55:20 +00:00
Hiroshi Inoue	1d578b7e75	fix trivial typos; NFC sucessor -> successor llvm-svn: 307488	2017-07-09 05:54:44 +00:00
Sanjay Patel	1d94b62277	[DAGCombiner] use local variable to shorten code; NFCI llvm-svn: 307429	2017-07-07 19:34:42 +00:00
Quentin Colombet	2d20120a2d	[RegAllocFast] Don't insert kill flags of super-register for partial kill When reusing a register for a new definition, the fast register allocator used to insert a kill flag at the previous last use of that register to inform later passes that this register is free between the redef and the last use. However, this may be wrong when subregisters are involved. Indeed, a partial redef would have triggered a kill of the full super register, potentially wrongly marking all the other subregisters as free. Given we don't track which lanes are still live, we cannot set the kill flag in such case. Note: This bug has been latent for about 7 years (r104056). llvm.org/PR33677 llvm-svn: 307428	2017-07-07 19:25:45 +00:00
Quentin Colombet	d7ab6b5165	[RegAllocFast] Add the proper initialize method to use the .mir infrastructure NFC llvm-svn: 307427	2017-07-07 19:25:42 +00:00
Matthias Braun	100d5916de	RegisterScavenging: Fix PR33687 When scavenging for a use in instruction MI, we will reload after that instruction and hence cannot spill uses/defs of this instruction. This fixes http://llvm.org/PR33687 llvm-svn: 307352	2017-07-07 03:02:18 +00:00
Matthias Braun	3df3235c8d	LiveRegUnits: Rename accumulateBackward()->accumulate() Contrary to the stepForward()/stepBackward() method accumulate() doesn't have a direction as defs, uses and clobbers all have the same effect. Also improve the documentation comment. llvm-svn: 307351	2017-07-07 03:02:17 +00:00
Mikael Holmen	9b4c058bb3	[MachineVerifier] Add check that tied physregs aren't different. Summary: Added MachineVerifier code to check register ties more thoroughly, especially so that physical registers that are tied are the same. This may help e.g. when creating MIR files. Original patch by Jesper Antonsson Reviewers: stoklund, sanjoy, qcolombet Reviewed By: qcolombet Subscribers: qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D34394 llvm-svn: 307259	2017-07-06 13:18:21 +00:00
David Stuttard	720e14188d	[RegisterCoalescer] Fix for SubRange join unreachable Summary: During remat, some subranges might end up having invalid segments which caused problems for later coalescing. Added in a check to remove segments that are invalidated as part of the remat. See http://llvm.org/PR33524 Subscribers: MatzeB, qcolombet Differential Revision: https://reviews.llvm.org/D34391 llvm-svn: 307247	2017-07-06 10:07:57 +00:00
Diana Picus	1704a62e0b	[ARM] GlobalISel: Legalize G_FCMP for s32 This covers both hard and soft float. Hard float is easy, since it's just Legal. Soft float is more involved, because there are several different ways to handle it based on the predicate: one and ueq need not only one, but two libcalls to get a result. Furthermore, we have large differences between the values returned by the AEABI and GNU functions. AEABI functions return a nice 1 or 0 representing true and respectively false. GNU functions generally return a value that needs to be compared against 0 (e.g. for ogt, the value returned by the libcall is > 0 for true). We could introduce redundant comparisons for AEABI as well, but they don't seem easy to remove afterwards, so we do different processing based on whether or not the result really needs to be compared against something (and just truncate if it doesn't). llvm-svn: 307243	2017-07-06 09:09:33 +00:00
Vadim Chugunov	1a77669bc6	Fix libcall expansion creating DAG nodes with invalid type post type legalization. If we are lowering a libcall after legalization, we'll split the return type into a pair of legal values. Patch by Jatin Bhateja and Eli Friedman. Differential Revision: https://reviews.llvm.org/D34240 llvm-svn: 307207	2017-07-05 22:01:49 +00:00
Simon Pilgrim	a28964fc31	[DAGCombiner] Fold (rot x, 0) -> x llvm-svn: 307184	2017-07-05 18:27:11 +00:00
Andrew Zhogin	fd7bc34e21	[DAGCombiner] visitRotate patch to optimize pair of ROTR/ROTL instructions into one with combined shift operand. For two ROTR operations with shifts C1, C2; combined shift operand will be (C1 + C2) % bitsize. Differential revision: https://reviews.llvm.org/D12833 llvm-svn: 307179	2017-07-05 17:55:42 +00:00
Daniel Sanders	06935b98f4	[globalisel][tablegen] Finish fixing compile-time regressions by merging the matcher and emitter state machines. Summary: Also, made a few minor tweaks to shave off a little more cumulative memory consumption: * All rules share a single NewMIs instead of constructing their own. Only one will end up using it. * Use MIs.resize(1) instead of MIs.clear();MIs.push_back(I) and prevent GIM_RecordInsn from changing MIs[0]. Depends on D33764 Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar Reviewed By: ab Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33766 llvm-svn: 307159	2017-07-05 14:50:18 +00:00
Diana Picus	290114fb47	[GlobalISel] Refactor Legalizer helpers for libcalls We used to have a helper that replaced an instruction with a libcall. That turns out to be too aggressive, since sometimes we need to replace the instruction with at least two libcalls. Therefore, change our existing helper to only create the libcall and leave the instruction removal as a separate step. Also rename the helper accordingly. llvm-svn: 307149	2017-07-05 12:57:24 +00:00
Diana Picus	2a391c1fb3	[MachineIRBuilder] Fix formatting. NFC. llvm-svn: 307144	2017-07-05 11:47:23 +00:00
Diana Picus	ad68bc7a9f	[MachineIRBuilder] Add buildOr helper. NFC. This isn't used anywhere yet, but I need it for a future commit. llvm-svn: 307141	2017-07-05 11:32:12 +00:00
Igor Breger	98662f5c75	[GlobalIsel] allow x86_fp80 values to be dumped. Summary: Otherwise the fallback path fails with an assertion on x86_64 targets, when "x86_fp80" is encountered. Reviewers: t.p.northover, zvi, guyblank Reviewed By: zvi Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34975 llvm-svn: 307140	2017-07-05 11:11:10 +00:00
Diana Picus	6216b92663	[MachineIRBuilder] Add buildBinaryOp helper. NFC Add a helper for building simple binary ops like add, mul, sub, and. This can be used in the future for quickly adding support for or, xor. llvm-svn: 307139	2017-07-05 11:02:31 +00:00
Daniel Sanders	0251f5055d	[globalisel][tablegen] Fix an unused variable warning in release builds after r307133 llvm-svn: 307138	2017-07-05 10:16:48 +00:00
Daniel Sanders	067ddbd8c9	[globalisel][tablegen] Added instruction emission to the state-machine-based matcher. Summary: This further improves the compile-time regressions that will be caused by a re-commit of r303259. Also included preliminary work in preparation for the multi-insn emitter since I needed to change the relevant part of the API for this patch anyway. Depends on D33758 Reviewers: rovka, vitalybuka, ab, t.p.northover, qcolombet, aditya_nandakumar Reviewed By: ab Subscribers: kristof.beyls, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33764 llvm-svn: 307133	2017-07-05 09:39:33 +00:00
Nirav Dave	c3f1114c60	Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset Relanding after rewriting undef.ll test to avoid host-dependent endianness. As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks. Tests of note: * test/CodeGen/X86/build-vector* - Improved. * test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge * test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation. Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D34472 llvm-svn: 307114	2017-07-05 01:21:23 +00:00
Hiroshi Inoue	8c9589ce3a	fix trivial typos in comments; NFC llvm-svn: 307094	2017-07-04 16:35:26 +00:00
Andrew Zhogin	0d027b7284	[DAGCombiner] Intermediate variables in visitRotate promoted to the function's begin. NFC precommit for D12833. llvm-svn: 307091	2017-07-04 15:57:39 +00:00
Anna Thomas	9d7382c30f	[FastISel][SelectionDAG]Teach fastISel about GC intrinsics Summary: We are crashing in LLC at O0 when gc intrinsics are present in the block. The reason is that FastISel performs basic block ISel by modifying GC.relocates to be the first instruction in the block. This can cause us to visit the GC relocate before its corresponding GC.statepoint is visited, which is incorrect. When we lower the statepoint, we record the base and derived pointers, along with the gc.relocates. After this we can visit the gc.relocate. This patch prevents fastISel from incorrectly creating the block with gc.relocate as the first instruction. Reviewers: qcolombet, skatkov, qikon, reames Reviewed by: skatkov Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34421 llvm-svn: 307084	2017-07-04 15:09:09 +00:00
Daniel Sanders	f4173f55ab	[globalisel][tablegen] Partially fix compile-time regressions by converting matcher to state-machine(s) Summary: Replace the matcher if-statements for each rule with a state-machine. This significantly reduces compile time, memory allocations, and cumulative memory allocation when compiling AArch64InstructionSelector.cpp.o after r303259 is recommitted. The following patches will expand on this further to fully fix the regressions. Reviewers: rovka, ab, t.p.northover, qcolombet, aditya_nandakumar Reviewed By: ab Subscribers: vitalybuka, aemerson, javed.absar, igorb, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D33758 llvm-svn: 307079	2017-07-04 14:35:06 +00:00
Nirav Dave	d98172f6a7	[DAG] Fixed predicate for determining when two frame indices addresses are comparable. NFCI. llvm-svn: 307055	2017-07-04 02:20:17 +00:00
Anton Yartsev	0649b527ea	[legalize-types] Clean up softening machinery. The patch makes SoftenFloatResult/Operand logic just the same as all other legalization routines have: SoftenFloatResult() now fills the SoftenFloats map and SoftenFloatOperand() performs all needed replacements. This prevents softening machinery from leaving stale entries in SoftenFloats map (that resulted in errors during the legalize type checking) and clarifies softening. The patch replaces https://reviews.llvm.org/D29265. Differential Revision: https://reviews.llvm.org/D31946 llvm-svn: 307053	2017-07-04 01:08:55 +00:00
Zvi Rackover	e1f310fac6	DAGCombine: Combine BUILD_VECTOR to TRUNCATE Summary: Add a combine for creating a truncate to replace a build_vector composed of extracts with indices that form a stride-2^N series. Example: v8i32 V = ... v4i32 build_vector((extract_elt V, 0), (extract_elt V, 2), (extract_elt V, 4), (extract_elt V, 6)) --> v4i32 truncate (bitcast V to v4i64) Related discussion in llvm-dev about canonicalizing shuffles to truncates in LLVM IR: http://lists.llvm.org/pipermail/llvm-dev/2017-January/108936.html. Reviewers: spatel, RKSimon, efriedma, igorb, craig.topper, wolfgangp, delena Reviewed By: delena Subscribers: guyblank, delena, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D34077 llvm-svn: 307036	2017-07-03 15:47:40 +00:00
Hiroshi Inoue	8d919daa58	fix trivial typos in comments; NFC llvm-svn: 307004	2017-07-03 06:32:59 +00:00
Craig Topper	41999da9af	[SelectionDAGBuilder] Use EVT::getVectorVT instead of MVT::getVectorVT to prevent a crash if the type isn't a simple VT. llvm-svn: 306950	2017-07-01 06:46:09 +00:00
Sameer AbuAsal	b724aa3c1a	[RegisterCoalescer] Account for instructions deleted by removePartialredunduncy and in WorkList Summary: removePartialRedundency optimization introduces a state in the RegisterCoalescer where an instruction pointed to in the WorkList is deleted from the MBB and then removed from the ErasedList. This patch updates the ErasedList to be used globally by not erasing erased Instructions from it to solve the problem. The patch also accounts for the case where an Instruction was previously deleted and the same memory was reused by BuildMI to create a new instruction. Reviewers: kparzysz, qcolombet Reviewed By: qcolombet Subscribers: MatzeB, qcolombet, llvm-commits Differential Revision: https://reviews.llvm.org/D34902 llvm-svn: 306915	2017-06-30 23:49:07 +00:00
Brian Gesiak	f678ff58e2	[ORE] Add diagnostics hotness threshold Summary: Add an option to prevent diagnostics that do not meet a minimum hotness threshold from being output. When generating optimization remarks for large codebases with a ton of cold code paths, this option can be used to limit the optimization remark output at a reasonable size. Discussion of this change can be read here: http://lists.llvm.org/pipermail/llvm-dev/2017-June/114377.html Reviewers: anemet, davidxl, hfinkel Reviewed By: anemet Subscribers: qcolombet, javed.absar, fhahn, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D34867 llvm-svn: 306912	2017-06-30 23:14:53 +00:00
Reid Kleckner	0271f6ebc2	[codeview] Use the first valid source location at the top of every MBB If the instructions at the beginning of the block have no location, we're better off using the location of the first instruction in the current basic block. At the very least, that instruction post-dominates this one, whereas if we don't emit a .cv_loc directive, we end up using the potentially invalid location that falls through from the previous block. We could probably do better here by emitting some kind of ".cv_loc end" directive that stops the line table entry of the previous .cv_loc directive from bleeding out of its basic block. This would improve the line table when an entire MBB has no valid location info. llvm-svn: 306889	2017-06-30 21:33:44 +00:00
Tim Northover	f5949ef891	GlobalISel: add G_IMPLICIT_DEF instruction. It looks like there are two target-independent but not GISel instructions that need legalization, IMPLICIT_DEF and PHI. These are already anomalies since their operands have important LLTs attached, so to make things more uniform it seems like a good idea to add generic variants. Starting with G_IMPLICIT_DEF. llvm-svn: 306875	2017-06-30 20:27:36 +00:00
Reid Kleckner	7e9370cc34	Drop the LLVM mangler escape when printing the IR name in assembly comments I'm tired of seeing this: .globl "?Test@@YAXXZ" # -- Begin function ^A?Test@@YAXXZ llvm-svn: 306855	2017-06-30 18:22:51 +00:00
Brian Gesiak	0d22b63ef8	[ORE] Unify spelling as "diagnostics hotness" Summary: To enable profile hotness information in diagnostics output, Clang takes the option `-fdiagnostics-show-hotness` -- that's "diagnostics", with an "s" at the end. Clang also defines `CodeGenOptions::DiagnosticsWithHotness`. LLVM, on the other hand, defines `LLVMContext::getDiagnosticHotnessRequested` -- that's "diagnostic", not "diagnostics". It's a small difference, but it's confusing, typo-inducing, and frustrating. Add a new method with the spelling "diagnostics", and "deprecate" the old spelling. Reviewers: anemet, davidxl Reviewed By: anemet Subscribers: llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D34864 llvm-svn: 306848	2017-06-30 18:13:59 +00:00
Nirav Dave	987d4b24ce	Revert "[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset" This reverts commit r306819 which appears to be exposing underlying issues in a stage1 ppc64be build llvm-svn: 306820	2017-06-30 12:56:02 +00:00
Nirav Dave	28417300e7	[DAG] Rewrite areNonVolatileConsecutiveLoads to use BaseIndexOffset As discussed in D34087, rewrite areNonVolatileConsecutiveLoads using generic checks. Also, propagate missing local handling from there to BaseIndexOffset checks. Tests of note: * test/CodeGen/X86/build-vector* - Improved. * test/CodeGen/BPF/undef.ll - Improved store alignment allows an additional store merge * test/CodeGen/X86/clear_upper_vector_element_bits.ll - This is a case we already do not handle well. Here, the DAG is improved, but scheduling causes a code size degradation. Reviewers: RKSimon, craig.topper, spatel, andreadb, filcab Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D34472 llvm-svn: 306819	2017-06-30 12:23:41 +00:00
Kristof Beyls	fe183150a4	[GlobalISel] Make multi-step legalization work. In r301116, a custom lowering needed to be introduced to be able to legalize 8 and 16-bit divisions on ARM targets without a division instruction, since 2-step legalization (WidenScalar from 8 bit to 32 bit, then Libcall the 32-bit division) doesn't work. This fixes this and makes this kind of multi-step legalization, where first the size of the type needs to be changed and then some action is needed that doesn't require changing the size of the type, straightforward to specify. Differential Revision: https://reviews.llvm.org/D32529 llvm-svn: 306806	2017-06-30 08:26:20 +00:00
Wolfgang Pieb	0b90ef3d7b	[DWARF] Move a couple of member functions to the DWARFUnit baseclass. NFC. Reviewer: dblaikie Differential revision: https://reviews.llvm.org/D34765 llvm-svn: 306771	2017-06-30 00:27:45 +00:00
Aditya Nandakumar	7c00869517	[GISel]: New Opcode G_FLOG/G_FLOG2 https://reviews.llvm.org/D34837 llvm-svn: 306766	2017-06-29 23:43:44 +00:00
Taewook Oh	7e3c6fd16c	Remove redundant copy in recurrences Summary: If there is a chain of instructions formulating a recurrence, commuting operands can help remove a redundant copy. In the following example code, ``` BB#1: ; Loop Header %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13 ... BB#6: ; Loop Latch %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15 %vreg10<def,tied1> = ADD32rr %vreg1<kill,tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1,%vreg0 %vreg3<def,tied1> = ADD32rr %vreg2<kill,tied0>, %vreg10<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2,%vreg10 CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3 %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3 JL_1 <BB#1>, %EFLAGS<imp-use,kill> ``` Existing two-address generation pass generates following code: ``` BB#1: %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13 ... BB#6: Predecessors according to CFG: BB#5 BB#4 %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15 %vreg10<def> = COPY %vreg1<kill>; GR32:%vreg10,%vreg1 %vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg0<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg0 %vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10 %vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2 CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3 %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3 JL_1 <BB#1>, %EFLAGS<imp-use,kill> JMP_1 <BB#7> ``` This is suboptimal because the assembly code generated has a redundant copy at the end of BB#6 to feed %vreg13 to BB#1: ``` .LBB0_6: addl %esi, %edi addl %ebx, %edi cmpl $10, %edi movl %edi, %esi jl .LBB0_1 ``` This redundant copy can be eliminated by making instructions in the recurrence chain compute the value "into" the register that actually holds the feedback value. In this example, this can be achieved by commuting %vreg0 and %vreg1 to compute %vreg10.
With that change, code after two-address generation becomes ``` BB#1: %vreg0<def> = COPY %vreg13<kill>; GR32:%vreg0,%vreg13 ... BB#6: derived from LLVM BB %bb7 Predecessors according to CFG: BB#5 BB#4 %vreg2<def> = COPY %vreg15<kill>; GR32:%vreg2,%vreg15 %vreg10<def> = COPY %vreg0<kill>; GR32:%vreg10,%vreg0 %vreg10<def,tied1> = ADD32rr %vreg10<tied0>, %vreg1<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg10,%vreg1 %vreg3<def> = COPY %vreg10<kill>; GR32:%vreg3,%vreg10 %vreg3<def,tied1> = ADD32rr %vreg3<tied0>, %vreg2<kill>, %EFLAGS<imp-def,dead>; GR32:%vreg3,%vreg2 CMP32ri8 %vreg3, 10, %EFLAGS<imp-def>; GR32:%vreg3 %vreg13<def> = COPY %vreg3<kill>; GR32:%vreg13,%vreg3 JL_1 <BB#1>, %EFLAGS<imp-use,kill> JMP_1 <BB#7> ``` and the final assembly does not have redundant copy: ``` .LBB0_6: addl %edi, %eax addl %ebx, %eax cmpl $10, %eax jl .LBB0_1 ``` Reviewers: qcolombet, MatzeB, wmi Reviewed By: wmi Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31821 llvm-svn: 306758	2017-06-29 23:11:24 +00:00
Simon Dardis	ed026ad334	Revert "[mips] Fix multiprecision arithmetic." This reverts commit r305389. This broke chromium builds, so reverting while I investigate further. llvm-svn: 306741	2017-06-29 20:59:47 +00:00
Keno Fischer	8ef40f62c6	[CodeGenPrepare] Don't create inttoptr for ni ptrs Summary: Arguably non-integral pointers probably shouldn't show up here at all, but since the backend doesn't complain and this takes valid (according to the Verifier) IR and makes it invalid, make sure not to introduce any inttoptr instructions if we're dealing with non-integral pointers. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D33110 llvm-svn: 306737	2017-06-29 20:28:59 +00:00
Hiroshi Inoue	36ad5023db	fix trivial typo, NFC llvm-svn: 306716	2017-06-29 18:03:28 +00:00
Nirav Dave	c314f34707	[DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI. Relanding after restricting equalBaseIndex to not erroneously consider FrameIndices stemming from alloca as comparable, since their offset is set post-SelectionDAG. Pull FrameIndex comparison reasoning from DAGCombiner::isAlias to general BaseIndexOffset. llvm-svn: 306688	2017-06-29 15:48:11 +00:00
Daniel Jasper	8d76d09d77	Revert "r306529 - [X86] Correct dwarf unwind information in function epilogue" I am 99% sure that this breaks the PPC ASAN build bot: http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/3112/steps/64-bit%20check-asan/logs/stdio If it doesn't go back to green, we can recommit (and fix the original commit message at the same time :) ). llvm-svn: 306676	2017-06-29 13:58:24 +00:00
Stanislav Mekhanoshin	0b47718863	Fold fneg and fabs like multiplications Given no NaNs and no signed zeroes it folds: (fmul X, (select (fcmp X > 0.0), -1.0, 1.0)) -> (fneg (fabs X)) (fmul X, (select (fcmp X > 0.0), 1.0, -1.0)) -> (fabs X) Differential Revision: https://reviews.llvm.org/D34579 llvm-svn: 306592	2017-06-28 20:25:50 +00:00
Krzysztof Parzyszek	f0d7f8cfbb	Rangify loops, formatting changes, use bool instead of unsigned, NFC llvm-svn: 306557	2017-06-28 16:02:00 +00:00
Krzysztof Parzyszek	f7b4320d1e	Missed a check for UndefVI in r306466 llvm-svn: 306553	2017-06-28 15:46:16 +00:00
Petar Jovanovic	0199002e6e	[X86] Correct dwarf unwind information in function epilogue CFI instructions that set appropriate cfa offset and cfa register are now inserted in emitEpilogue() in X86FrameLowering. Majority of the changes in this patch: 1. Ensure that CFI instructions do not affect code generation. 2. Enable maintaining correct information about cfa offset and cfa register in a function when basic blocks are reordered, merged, split, duplicated. These changes are target independent and described below. Changed CFI instructions so that they: 1. are duplicable 2. are not counted as instructions when tail duplicating or tail merging 3. can be compared as equal Add information to each MachineBasicBlock about cfa offset and cfa register that are valid at its entry and exit (incoming and outgoing CFI info). Add support for updating this information when basic blocks are merged, split, duplicated, created. Add a verification pass (CFIInfoVerifier) that checks that outgoing cfa offset and register of predecessor blocks match incoming values of their successors. Incoming and outgoing CFI information is used by a late pass (CFIInstrInserter) that corrects CFA calculation rule for a basic block if needed. That means that additional CFI instructions get inserted at basic block beginning to correct the rule for calculating CFA. Having CFI instructions in function epilogue can cause incorrect CFA calculation rule for some basic blocks. This can happen if, due to basic block reordering, or the existence of multiple epilogue blocks, some of the blocks have wrong cfa offset and register values set by the epilogue block above them. Patch by Violeta Vukobrat. Differential Revision: https://reviews.llvm.org/D18046 llvm-svn: 306529	2017-06-28 10:21:17 +00:00
Nirav Dave	c2c9b865bc	Revert "[DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI." This reverts commit r306498 which appears to cause compiler-rt test failures llvm-svn: 306501	2017-06-28 03:20:04 +00:00
Stanislav Mekhanoshin	d6f4dc77a6	Allow to truncate left shift with non-constant shift amount That is pretty common for clang to produce code like (shl %x, (and %amt, 31)). In this situation we can still perform trunc (shl) into shl (trunc) conversion given the known value range of shift amount. Differential Revision: https://reviews.llvm.org/D34723 llvm-svn: 306499	2017-06-28 02:37:11 +00:00
Nirav Dave	48ea968c3a	[DAG] Fold FrameIndex offset into BaseIndexOffset analysis. NFCI. Pull FrameIndex comparison reasoning from DAGCombiner::isAlias to general BaseIndexOffset. llvm-svn: 306498	2017-06-28 02:09:50 +00:00
Sanjay Patel	2902e378eb	[CGP] add specialization for memcmp expansion with only one basic block llvm-svn: 306485	2017-06-27 23:15:01 +00:00
Tim Northover	1182731c7c	GlobalISel: add some more sanity-checking to MachineInstrBuilder. NFC. llvm-svn: 306481	2017-06-27 22:45:35 +00:00
Aditya Nandakumar	83ff413d56	[GISel]: Add G_FEXP, G_FEXP2 opcodes Also add IRTranslator support. https://reviews.llvm.org/D34710 llvm-svn: 306475	2017-06-27 22:19:32 +00:00
Sanjay Patel	3e73231abb	[CGP] eliminate a sub instruction in memcmp expansion As noted in D34071, there are some IR optimization opportunities that could be handled by normal IR passes if this expansion wasn't happening so late in CGP. Regardless of that, it seems wasteful to knowingly produce suboptimal IR here, so I'm proposing this change: %s = sub i32 %x, %y %r = icmp ne %s, 0 => %r = icmp ne %x, %y Changing the predicate to 'eq' mimics what InstCombine would do, so that's just an efficiency improvement if we decide this expansion should happen sooner. The fact that the PowerPC backend doesn't eliminate the 'subf.' might be something for PPC folks to investigate separately. Differential Revision: https://reviews.llvm.org/D34416 llvm-svn: 306471	2017-06-27 21:46:34 +00:00
Tim Northover	e8c4bdecf5	GlobalISel: verify that a COPY is trivial when created. Without this check, COPY instructions can actually be one of the generic casts in disguise. That's confusing and bad. At some point during ISel this restriction has to be relaxed since the fully selected instructions will usually use COPY for those purposes. Right now I think it's possible that relaxation occurs during RegBankSelect (hence the change there). I'm not convinced that's where it belongs long-term though. llvm-svn: 306470	2017-06-27 21:41:40 +00:00
Krzysztof Parzyszek	5f0aaebcf6	Create a PHI value when merging with a known undef live-in Differential Revision: https://reviews.llvm.org/D34640 llvm-svn: 306466	2017-06-27 21:30:46 +00:00
Sanjay Patel	3175626c72	[CGP] simplify code to get bswap in memcmp expansion; NFCI llvm-svn: 306452	2017-06-27 19:31:35 +00:00
Matt Arsenault	b4e591cd6e	RenameIndependentSubregs: Fix infinite loop Apparently this replacement can really be substituting the same as the original register. Avoid restarting the loop when there's been no change in the register uses. llvm-svn: 306441	2017-06-27 18:28:10 +00:00
Sanjay Patel	e4d3650f6b	[CGP] add an IR builder to memcmp expansion class instead of recreating it; NFCI This was a clean-up suggestion from: https://reviews.llvm.org/D34005 llvm-svn: 306438	2017-06-27 18:18:42 +00:00
Matthias Braun	f32356fc1f	LiveRangeCalc: Slightly improve map usage; NFC - DenseMap should be faster than std::map - Use the `InsertRes = insert() if (!InsertRes.inserted)` pattern rather than the `if (!X.contains(...)) { X.insert(...); }` to save one map lookup. llvm-svn: 306436	2017-06-27 18:05:26 +00:00
Hiroshi Inoue	544972b656	[SelectionDAG] set dereferenceable flag in MergeConsecutiveStores to fix assertion failure When SelectionDAG merges consecutive stores and loads in MergeConsecutiveStores, it does not set the dereferenceable flag for a created load instruction. This results in an assertion failure if SelectionDAG commonizes this load instruction with other load instructions, and it may also miss optimization opportunities. This patch sets the dereferenceable flag for the newly created load instruction if all the load instructions to be merged are dereferenceable. Differential Revision: https://reviews.llvm.org/D34679 llvm-svn: 306404	2017-06-27 12:43:08 +00:00
Hiroshi Inoue	4ea2be813b	fix trivial typos, NFC succesor -> successor llvm-svn: 306393	2017-06-27 10:35:37 +00:00
Matthias Braun	c9980d490b	ScheduleDAGInstrs: Fix fixupKills() adding too many kill flags. Remove invalid shortcut in fixupKills(): A register needs to be marked live even when we are not adding a kill flag. This is because a partially live register must not get a kill flag, but it still needs to be fully marked live when walking backwards. llvm-svn: 306352	2017-06-27 00:58:48 +00:00
Wolfgang Pieb	038c037dfc	DAGCombine: Make sure we only eliminate trunc/extend when the scales of truncation and extension match. This fixes PR33368. Reviewer: rksimon Differential Revision: https://reviews.llvm.org/D34069 llvm-svn: 306345	2017-06-26 23:05:51 +00:00
Eugene Zelenko	0e7a6818a7	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 306341	2017-06-26 22:44:03 +00:00
Matt Arsenault	76ea38a523	RenameIndependentSubregs: Fix iterator problem Fixes bug 33597. Use of substituteRegister in the tied operand case messes up the register use iterator, causing some uses to be left unprocessed. llvm-svn: 306333	2017-06-26 21:33:36 +00:00
Tim Northover	cd830aa42d	AArch64: legalize G_EXTRACT operations. This is the dual problem to legalizing G_INSERTs so most of the code and testing was cribbed from there. llvm-svn: 306328	2017-06-26 20:34:13 +00:00
Mikael Holmen	b58321ae94	[IfConversion] Hoist removeBranch calls out of if/else clauses [NFC] Summary: Also added a comment. Pulled out of https://reviews.llvm.org/D34099. Reviewers: iteratee Reviewed By: iteratee Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34388 llvm-svn: 306279	2017-06-26 09:33:04 +00:00
Serguei Katkov	63a64a0d08	This reverts commit r306272. Revert "[MBP] do not rotate loop if it creates extra branch" It breaks the sanitizer build bots. Need to fix this. llvm-svn: 306276	2017-06-26 06:51:45 +00:00
Serguei Katkov	0f120ff044	[MBP] do not rotate loop if it creates extra branch This is a last fix for the corner case of PR32214. Actually this is not really a corner case in general. We should not do a loop rotation if we create an additional branch due to it. Consider the case where we have a loop chain H, M, B, C, where H is the header with a viable fallthrough from the pre-header and an exit from the loop. M - some middle block. B - backedge to the header but also with an exit from the loop. C - some cold block of the loop. Suppose H is determined to be the best exit. If we do a loop rotation M, B, C, H we can introduce an extra branch. Let's compute the change in number of branches: +1 branch from pre-header to header -1 branch from header to exit +1 branch from header to middle block if there is such -1 branch from cold block to header if there is one So if C is not a predecessor of H then we introduce an extra branch. This change actually prohibits rotation of the loop if both are true: 1) Best Exit has next element in chain as successor. 2) Last element in chain is not a predecessor of first element of chain. Reviewers: iteratee, xur Reviewed By: iteratee Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34271 llvm-svn: 306272	2017-06-26 05:27:27 +00:00
Elena Demikhovsky	711ebc79e7	AVX-512: Fixed a crash during legalization of <3 x i8> type The compiler fails with an assertion during legalization of SETCC for <3 x i8> operands. The result is extended to <4 x i8> and then truncated to <4 x i1>. It does not happen on AVX2, because the final result of SETCC is <4 x i32>. Differential Revision: https://reviews.llvm.org/D34503 llvm-svn: 306242	2017-06-25 13:36:20 +00:00
Hiroshi Inoue	b3494afe28	[SelectionDAG] set dereferenceable flag when expanding memcpy/memmove When SelectionDAG expands a memcpy (or memmove) call into a sequence of load and store instructions, it disregards the dereferenceable flag even if the source pointer is known to be dereferenceable. This results in an assertion failure if SelectionDAG commonizes a load instruction generated for memcpy with another load instruction for the source pointer. This patch makes SelectionDAG set the dereferenceable flag for the load instructions properly to avoid the assertion failure. Differential Revision: https://reviews.llvm.org/D34467 llvm-svn: 306209	2017-06-24 15:17:38 +00:00
Tim Northover	2454dd4f6e	GlobalISel: remove G_SEQUENCE instruction. It was trying to do too many things. The basic lumping together of values for legalization purposes is now handled by G_MERGE_VALUES. More complex things involving gaps and odd sizes are handled by G_INSERT sequences. llvm-svn: 306120	2017-06-23 16:15:55 +00:00
Tim Northover	b1c935a08d	GlobalISel: convert buildSequence to use non-deprecated instructions. G_SEQUENCE is going away soon so as a first step the MachineIRBuilder needs to be taught how to emulate it with alternatives. We use G_MERGE_VALUES where possible, and a sequence of G_INSERTs if not. llvm-svn: 306119	2017-06-23 16:15:37 +00:00
Andrew Kaylor	cd3ba468bb	Restrict the definition of loop preheader to avoid EH blocks Differential Revision: https://reviews.llvm.org/D34487 llvm-svn: 306070	2017-06-22 23:27:16 +00:00
Nirav Dave	58a9006547	[DAG] Add Target Store Merge pass ordering function Allow targets to specify if they should merge stores before or after legalization. llvm-svn: 306006	2017-06-22 15:07:49 +00:00
Sam Clegg	b5685bbe56	Mark dump() methods as const. NFC Add const qualifier to any dump() method where adding one was trivial. Differential Revision: https://reviews.llvm.org/D34481 llvm-svn: 305963	2017-06-21 22:19:17 +00:00
Sanjay Patel	34471ba139	[CGP, memcmp] replace CreateZextOrTrunc with CreateZext because it can never trunc llvm-svn: 305936	2017-06-21 18:20:52 +00:00
Sanjay Patel	d81d05ae77	[CGP] fix variables to be unsigned in memcmp expansion llvm-svn: 305935	2017-06-21 18:06:13 +00:00
Nirav Dave	b7f2852991	[DAG] Move BaseIndexOffset into separate Library. NFC. Move BaseIndexOffset analysis out of DAGCombiner for use in other files. llvm-svn: 305921	2017-06-21 15:40:43 +00:00
Nirav Dave	5b554f9cb2	[DAG] Remove Node construction from BaseIndexOffset match. NFCI. Move GlobalAddress Offset decomposition from initial match into comparison check, removing the possibility of constructing a new offset global address when examining addresses. llvm-svn: 305917	2017-06-21 15:07:30 +00:00
Javed Absar	cb00bf36ce	Use range-loop in machine-scheduler. NFCI. Converts to range-loop usage in machine scheduler. This makes the code neater and easier to read, and also keeps pace of the machine scheduler implementation with C++11 features. Reviewed by: Matthias Braun Differential Revision: https://reviews.llvm.org/D34320 llvm-svn: 305887	2017-06-21 09:10:10 +00:00
Guy Blank	c35ff985b1	[DAGCombiner] Add another combine from build vector to shuffle Add support for combining a build vector to a shuffle. When the build vector is of extracted elements from 2 vectors (vec1, vec2) where vec2 is 2 times smaller than vec1. llvm-svn: 305883	2017-06-21 07:38:41 +00:00
Dean Michael Berris	04bb923d73	[XRay] Reduce synthetic references emitted by XRay Summary: When we're building with XRay instrumentation, we use a trick that preserves references from the function to a function sled index. This index table lives in a separate section, and without this trick the linker is free to garbage-collect this section and all the segments it refers to. Until we're able to tell the linkers to preserve these sections, we use this reference trick to keep around both the index and the entries in the instrumentation map. Before this change we emitted both a synthetic reference to the label in the instrumentation map, and to the entry in the function map index. This change removes the first synthetic reference and only emits one synthetic reference to the index -- the index entry has the references to the labels in the instrumentation map, so the linker will still preserve those if the function itself is preserved. This reduces the amount of synthetic references we emit from 16 bytes to just 8 bytes in x86_64, and similarly to other platforms. Reviewers: dblaikie Subscribers: javed.absar, kpw, pelikan, llvm-commits Differential Revision: https://reviews.llvm.org/D34340 llvm-svn: 305880	2017-06-21 06:39:42 +00:00
Serguei Katkov	8374972c7a	[ImplicitNullChecks] Uphold an invariant in areMemoryOpsAliased Right now areMemoryOpsAliased has an assertion justified as: MMO1 should have a value due it comes from operation we'd like to use as implicit null check. assert(MMO1->getValue() && "MMO1 should have a Value!"); However, it is possible for that invariant to not be upheld in the following situation (conceptually): Null check %RAX NotNullSucc: %RAX = LEA %RSP, 16 // I0 %RDX = MOV64rm %RAX // I1 With the current code, we will have an early exit from ImplicitNullChecks::isSuitableMemoryOp on I0 with SR_Unsuitable. However, I1 will look plausible (since it loads from %RAX) and will go ahead and call areMemoryOpsAliased(I1, I0). This will cause us to fail the assert mentioned above since I1 does not load from an IR level value and thus is allowed to have a non-Value base address. The fix is to bail out earlier whenever we see an unsuitable instruction overwrite PointerReg. This would guarantee that when we call areMemoryOpsAliased, we're guaranteed to be looking at an instruction that loads from or stores to an IR level value. Original Patch Author: sanjoy Reviewers: sanjoy, mkazantsev, reames Reviewed By: sanjoy Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34385 llvm-svn: 305879	2017-06-21 06:38:23 +00:00
Adrian Prantl	67e2f64767	Fix a crash in DwarfDebug::validThroughout. The instruction it falls over on is an IMPLICIT_DEF that also happens to be the only instruction in its lexical scope. That LexicalScope has never been created because its range is empty. This patch skips over all meta-instructions instead of just DBG_VALUEs. Thanks to David Blaikie for providing a testcase! llvm-svn: 305853	2017-06-20 21:08:52 +00:00
Aditya Nandakumar	48c7ec9aa0	[GISel]: Add G_FMA opcode for fused multiply adds https://reviews.llvm.org/D34372 Reviewed by dsanders llvm-svn: 305824	2017-06-20 19:25:23 +00:00
Matthias Braun	4512fd3488	RegisterScavenging: Followup to r305625 This does some improvements/cleanup to the recently introduced scavengeRegisterBackwards() functionality: - Rewrite findSurvivorBackwards algorithm to use the existing LiveRegUnit::accumulateBackward() code. This also avoids the Available and Candidates bitsets and needs just 1 LiveRegUnit instance (= 1 bitset). - Pick registers in allocation order instead of register number order. llvm-svn: 305817	2017-06-20 18:43:14 +00:00
Tim Northover	a30b7057aa	DAG: correctly legalize UMULO. We were incorrectly sign extending into the high word (as you would for SMULO) when legalizing UMULO in terms of a wider full multiplication. Patch by James Duley. llvm-svn: 305800	2017-06-20 15:01:38 +00:00
Daniel Sanders	0c5795464b	[globalisel][tablegen] Add support for COPY_TO_REGCLASS. Summary: As part of this * Emitted instructions now have named MachineInstr variables associated with them. This isn't particularly important yet but it's a small step towards multiple-insn emission. * constrainSelectedInstRegOperands() is no longer hardcoded. It's now added as the ConstrainOperandsToDefinitionAction() action. COPY_TO_REGCLASS uses an alternate constraint mechanism ConstrainOperandToRegClassAction() which supports arbitrary constraints such as that defined by COPY_TO_REGCLASS. Reviewers: ab, qcolombet, t.p.northover, rovka, kristof.beyls, aditya_nandakumar Reviewed By: ab Subscribers: javed.absar, igorb, llvm-commits Differential Revision: https://reviews.llvm.org/D33590 llvm-svn: 305791	2017-06-20 12:36:34 +00:00
Haojian Wu	d2efe6d376	[SelectionDAG] Fix a use-after-free issue introduced in r305775. vector.back() will be invalidated when memory reallocation happens. llvm-svn: 305785	2017-06-20 09:29:43 +00:00
Igor Breger	078eb813d6	[GlobalISel] combine not symmetric merge/unmerge nodes. Summary: In some cases legalization ends up with not symmetric merge/unmerge nodes. Transform it to merge/unmerge nodes. Reviewers: t.p.northover, qcolombet, zvi Reviewed By: t.p.northover Subscribers: rovka, kristof.beyls, guyblank, llvm-commits Differential Revision: https://reviews.llvm.org/D33626 llvm-svn: 305783	2017-06-20 08:54:17 +00:00
Max Kazantsev	178450709e	[SelectionDAG] Get rid of recursion in CalcNodeSethiUllmanNumber The recursive implementation of CalcNodeSethiUllmanNumber may overflow stack on extremely long pred chains. This patch replaces it with an equivalent iterative implementation. Differential Revision: https://reviews.llvm.org/D33769 llvm-svn: 305775	2017-06-20 07:07:09 +00:00
Nirav Dave	2d9d97ed85	[DAG] Simplify BaseIndexOffset. NFCI. Remove tail calls and cleanup codeflow. llvm-svn: 305768	2017-06-20 02:48:39 +00:00
Eugene Zelenko	027b1deacd	[Target] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 305757	2017-06-19 22:43:19 +00:00
Matt Arsenault	4985027a8e	Fix typos llvm-svn: 305749	2017-06-19 21:54:25 +00:00
Sanjay Patel	b84c07adbb	[CGP, PowerPC] try to constant fold before creating loads for memcmp expansion This is the last step needed to avoid regressions for x86 before we flip the switch to allow expansion of the smallest set of memcpy() via CGP. The DAG version checks for constant strings, so we need to do that here too. FWIW, the 2 constant test is not handled by LibCallSimplifier::optimizeMemCmp() because that code is limited to 8-bit constant arrays. LibCallSimplifier will also fail to optimize some 1 constant tests because its alignment requirements are too strict (shouldn't require alignment for a constant operand). Differential Revision: https://reviews.llvm.org/D34071 llvm-svn: 305734	2017-06-19 19:48:35 +00:00
Nirav Dave	fc3cc2762d	Allow truncated and extend memory operations in Store Merge. NFCI. As all store merges checks are based on the memory operation performed, allow use of truncated stores and extended loads as valid input candidates for merging. Relanding after fixing selection between truncated and normal store. llvm-svn: 305701	2017-06-19 15:32:28 +00:00
Florian Hahn	d1d5802289	Recommit rL305677: [CodeGen] Add generic MacroFusion pass Use llvm::make_unique to avoid ambiguity with MSVC. This patch adds a generic MacroFusion pass, that is used on X86 and AArch64, which both define target-specific shouldScheduleAdjacent functions. This generic pass should make it easier for other targets to implement macro fusion and I intend to add macro fusion for ARM shortly. Differential Revision: https://reviews.llvm.org/D34144 llvm-svn: 305690	2017-06-19 12:53:31 +00:00
Florian Hahn	8e7ee1f5d5	Revert r305677 [CodeGen] Add generic MacroFusion pass. This causes Windows buildbot failures due to an ambiguous call. llvm-svn: 305681	2017-06-19 11:26:15 +00:00
Florian Hahn	9de5c3ed15	[CodeGen] Add generic MacroFusion pass. Summary: This patch adds a generic MacroFusion pass, that is used on X86 and AArch64, which both define target-specific shouldScheduleAdjacent functions. This generic pass should make it easier for other targets to implement macro fusion and I intend to add macro fusion for ARM shortly. Reviewers: craig.topper, evandro, t.p.northover, atrick, MatzeB Reviewed By: MatzeB Subscribers: atrick, aemerson, mgorny, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D34144 llvm-svn: 305677	2017-06-19 10:51:38 +00:00
Galina Kistanova	09768137f0	Fixed the warning introduced by r305625 to make ubuntu-gcc7.1-werror bot green. llvm-svn: 305640	2017-06-17 21:05:28 +00:00
Matthias Braun	4e4ba838d9	RegScavenging: Add scavengeRegisterBackwards() Re-apply r276044/r279124/r305516. Fixed a problem where we would refuse to place spills as the very first instruction of a basic block and thus artificially increase pressure (test in test/CodeGen/PowerPC/scavenging.mir:spill_at_begin) This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 305625	2017-06-17 02:08:18 +00:00
Davide Italiano	6df246515f	[SelectionDAG] Update Loop info after splitting critical edges. The analysis is expected to be preserved by SelectionDAG. llvm-svn: 305621	2017-06-17 00:56:27 +00:00
Sam Clegg	9f1012760e	[WebAssembly] Use __stack_pointer global when writing wasm binary This ensures that symbolic relocations are generated for stack pointer manipulations. These relocations are of type R_WEBASSEMBLY_GLOBAL_INDEX_LEB. This change also adds support for reading relocations of this type in WasmObjectFile.cpp. Since it's a globally imported symbol, this does mean that the get_global/set_global instruction won't be valid until the objects are linked and the global used is no longer an imported global. Differential Revision: https://reviews.llvm.org/D34172 llvm-svn: 305616	2017-06-16 23:59:10 +00:00
Craig Topper	1b6ccd7bca	[SelectionDAG] Use APInt::isSubsetOf. NFC llvm-svn: 305606	2017-06-16 23:19:14 +00:00
Craig Topper	01a0913b1b	[SelectionDAG] Use APInt::isNullValue/isOneValue. NFC llvm-svn: 305605	2017-06-16 23:19:12 +00:00
Craig Topper	996c15f14b	[TargetLowering] Use ConstantSDNode::isOne and getSExtValue instead of getting the underlying APInt first. NFC llvm-svn: 305604	2017-06-16 23:19:10 +00:00
Adrian Prantl	fcd1037fbe	Improve the accuracy of variable ranges .debug_loc location lists. For the following motivating example bool c(); void f(); bool start() { bool result = c(); if (!c()) { result = false; goto exit; } f(); result = true; exit: return result; } we would previously generate a single DW_AT_const_value(1) because only the DBG_VALUE in the second-to-last basic block survived codegen. This patch improves the heuristic used to determine when a DBG_VALUE is available at the beginning of its variable's enclosing lexical scope: - Stop giving singular constants blanket permission to take over the entire scope. There is still a special case for constants in the function prologue that we also might want to retire later. - Use the lexical scope information to determine available-at-entry instead of proximity to the function prologue. After this patch we generate a location list with a more accurate narrower availability for the constant true value. As a pleasant side effect, we also generate inline locations instead of location lists where a location covers the entire range of the enclosing lexical scope. Measured on compiling llc with four targets this doesn't have an effect on compile time and reduces the size of the debug info for llc by ~600K. rdar://problem/30286912 llvm-svn: 305599	2017-06-16 22:40:04 +00:00
Matthias Braun	120ef06239	Revert "RegScavenging: Add scavengeRegisterBackwards()" Revert because of reports of some PPC input starting to spill when it was predicted that it wouldn't and no spillslot was reserved. This reverts commit r305516. llvm-svn: 305566	2017-06-16 17:48:08 +00:00
Daniel Neilson	88ff739fcf	[Atomics] Rename and change prototype for atomic memcpy intrinsic Summary: Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy that is unordered atomic. Reviewers: reames, sanjoy, efriedma Reviewed By: reames Subscribers: mzolotukhin, anna, llvm-commits, skatkov Differential Revision: https://reviews.llvm.org/D33240 llvm-svn: 305558	2017-06-16 14:43:59 +00:00
Hiroshi Inoue	186daa5345	[MachineBlockPlacement] trivial fix in comments, NFC - Topological is abbreviated as "topo" in comments, but "top" is used in only one comment. Modify it for consistency. - Capitalize "succ" and "pred" for consistency in one figure. - Other trivial fixes. llvm-svn: 305552	2017-06-16 12:23:04 +00:00
Ahmed Bougacha	71279ceee2	Revert "[DAG] Allow truncated and extend memory operations in Store Merge. NFCI." This reverts commit r305468, as it caused PR33475. llvm-svn: 305527	2017-06-15 23:29:47 +00:00
Matthias Braun	04697c0362	RegScavenging: Add scavengeRegisterBackwards() Re-apply r276044/r279124. Trying to reproduce or disprove the ppc64 problems reported in the stage2 build last time, which I cannot reproduce right now. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 llvm-svn: 305516	2017-06-15 22:14:55 +00:00
Lei Huang	cdf47c5983	[MachineLICM] Hoist TOC-based address instructions Add condition for MachineLICM to safely hoist instructions that utilize non constant registers that are reserved. On PPC, global variable access is done through the table of contents (TOC) which is always in register X2. The ABI reserves this register in any functions that have calls or access global variables. A call through a function pointer involves saving, changing and restoring this register around the call and thus MachineLICM does not consider it to be invariant. We can however guarantee the register is preserved across the call and thus is invariant. Differential Revision: https://reviews.llvm.org/D33562 llvm-svn: 305490	2017-06-15 18:29:59 +00:00
Benjamin Kramer	be435df139	Fold variable into assert. Silences an unused variable warning in Release builds. llvm-svn: 305488	2017-06-15 17:58:24 +00:00
Arnold Schwaighofer	c86f47bee2	ISel: Fix FastISel of swifterror values The code assumed that we process instructions in basic block order. FastISel processes instructions in reverse basic block order. We need to pre-assign virtual registers before selecting otherwise we get def-use relationships wrong. This only affects code with swifterror registers. rdar://32659327 llvm-svn: 305484	2017-06-15 17:34:42 +00:00
Nirav Dave	1a975acba3	[DAG] As StoreMerge now generates only legal nodes remove unnecessary guard when run post-legalization NFCI. llvm-svn: 305477	2017-06-15 16:27:49 +00:00
Nirav Dave	ba6db6fbfd	[DAG] Defer Pre/Post IndexStore merge to after mergestore. NFCI. In preparation for doing storemerge post-legalization, reorder visitSTORE passes to move pre/post-index combining after store merge. Reordered passes other than store merge are unaffected. llvm-svn: 305473	2017-06-15 15:05:48 +00:00
Nirav Dave	101d566fcb	[DAG] Allow truncated and extend memory operations in Store Merge. NFCI. As all store merges checks are based on the memory operation performed, allow use of truncated stores and extended loads as valid input candidates for merging. llvm-svn: 305468	2017-06-15 14:04:07 +00:00
Nirav Dave	1cb883a45b	[DAG] Make MergeStores generate legalized stores. NFCI. Realized merged stores as truncstores if store will be realized as such by legalization. llvm-svn: 305467	2017-06-15 13:34:54 +00:00
Nirav Dave	e6c69ce782	[DAG] Use correct size for truncated store merge of load. NFCI. Avoid non-legal memory ops by checking correct size when merging stores of loads into a extload-truncstore pair. llvm-svn: 305466	2017-06-15 13:28:06 +00:00
Diana Picus	743dbc42d8	[ARM] GlobalISel: Add support for i32 modulo Add support for modulo for targets that have hardware division and for those that don't. When hardware division is not available, we have to choose the correct libcall to use. This is generally straightforward, except for AEABI. The AEABI variant is trickier than the other libcalls because it returns { quotient, remainder }, instead of just one value like the other libcalls that we've seen so far. Therefore, we need to use custom lowering for it. However, we don't want to have too much special code, so we refactor the target-independent code in the legalizer by adding a helper for replacing an instruction with a libcall. This helper is used by the legalizer itself when dealing with simple calls, and also by the custom ARM legalization for the more complicated AEABI divmod calls. llvm-svn: 305459	2017-06-15 10:53:31 +00:00
David Callahan	20523d002f	Allow -profile-guided-section-prefix more than once Summary: At present, `-profile-guided-section-prefix` is a `cl::Optional` option, which means it demands to be passed exactly zero or one times. Our build system makes it pretty tricky to guarantee this. We often accidentally pass the flag more than once (but always with the same "false" value) which results in an error, after which compilation fails: ``` clang (LLVM option parsing): for the -profile-guided-section-prefix option: may only occur zero or one times! ``` While we work on improving our build system, it also seems reasonable just to allow `-profile-guided-section-prefix` to be passed more than once, by changing it to `cl::ZeroOrMore`. Quoting [[ http://llvm.org/docs/CommandLine.html#controlling-the-number-of-occurrences-required-and-allowed \| the documentation ]]: > The cl::ZeroOrMore modifier ... indicates that your program will allow the option to be specified zero or more times. > ... > If an option is specified multiple times for an option of the cl::opt class, only the last value will be retained. Reviewers: danielcdh Reviewed By: danielcdh Subscribers: twoh, david2050, llvm-commits Differential Revision: https://reviews.llvm.org/D34219 llvm-svn: 305413	2017-06-14 20:35:33 +00:00
Simon Dardis	a1f0320a27	[mips] Fix multiprecision arithmetic. For multiprecision arithmetic on MIPS, rather than using ISD::ADDE / ISD::ADDC, get SelectionDAG to break down the operation into ISD::ADDs and ISD::SETCCs. For MIPS, only the DSP ASE has a carry flag, so in the general case it is not useful to directly support ISD::{ADDE, ADDC, SUBE, SUBC} nodes. Also improve the generation code in such cases for targets with TargetLoweringBase::ZeroOrOneBooleanContent by directly using the result of the comparison node rather than using it in selects. Similarly for ISD::SUBE / ISD::SUBC. Address optimization breakage by moving the generation of MIPS specific integer multiply-accumulate nodes to before legalization. This resolves PR32713 and PR33424. Thanks to Simonas Kazlauskas and Pirama Arumuga Nainar for reporting the issue! Reviewers: slthakur Differential Revision: https://reviews.llvm.org/D33494 llvm-svn: 305389	2017-06-14 14:46:30 +00:00
Florian Hahn	1f9320a4cd	Align definition of DW_OP_plus with DWARF spec [3/3] Summary: This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things. The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack. This is done in three stages: • The first patch (LLVM) adds support for DW_OP_plus_uconst. • The second patch (Clang) changes all its uses from DW_OP_plus to DW_OP_plus_uconst. • The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions. Patch by Sander de Smalen. Reviewers: echristo, pcc, aprantl Reviewed By: aprantl Subscribers: fhahn, javed.absar, aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D33894 llvm-svn: 305386	2017-06-14 13:14:38 +00:00
Daniel Sanders	e9e6ba3b15	[globalisel][legalizer] G_LOAD/G_STORE NarrowScalar should not emit G_GEP x, 0. Summary: When legalizing G_LOAD/G_STORE using NarrowScalar, we should avoid emitting %0 = G_CONSTANT ty 0 %1 = G_GEP %x, %0 since it's cheaper to not emit the redundant instructions than it is to fold them away later. Reviewers: qcolombet, t.p.northover, ab, rovka, aditya_nandakumar, kristof.beyls Reviewed By: qcolombet Subscribers: javed.absar, llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D32746 llvm-svn: 305340	2017-06-13 23:42:32 +00:00
Florian Hahn	c9381ce2b9	Align definition of DW_OP_plus with DWARF spec [1/3] Summary: This patch is part of 3 patches that together form a single patch, but must be introduced in stages in order not to break things. The way that LLVM interprets DW_OP_plus in DIExpression nodes is basically that of the DW_OP_plus_uconst operator since LLVM expects an unsigned constant operand. This unnecessarily restricts the DW_OP_plus operator, preventing it from being used to describe the evaluation of runtime values on the expression stack. These patches try to align the semantics of DW_OP_plus and DW_OP_minus with that of the DWARF definition, which pops two elements off the expression stack, performs the operation and pushes the result back on the stack. This is done in three stages: • The first patch (LLVM) adds support for DW_OP_plus_uconst. • The second patch (Clang) changes all its uses from DW_OP_plus to DW_OP_plus_uconst. • The third patch (LLVM) changes the semantics of DW_OP_plus and DW_OP_minus to be in line with its DWARF meaning. This patch includes the bitcode upgrade from legacy DIExpressions. Patch by Sander de Smalen. Reviewers: pcc, echristo, aprantl Reviewed By: aprantl Subscribers: fhahn, aprantl, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33892 llvm-svn: 305304	2017-06-13 16:54:44 +00:00
Adrian Prantl	41b1f2f8c0	Fix an assertion failure when duplicate dbg.declares are present. This fixes PR33157. https://bugs.llvm.org//show_bug.cgi?id=33157 We might also think about disallowing duplicate dbg.declare intrinsics entirely, but this may complicate some passes needlessly. llvm-svn: 305244	2017-06-12 22:41:06 +00:00
Matthias Braun	b21f8848ef	SplitKit: Fix partially live subreg splitting Fix thinko/typo in subreg aware liverange splitting logic. I'm not sure how to write a proper testcase for this. The original problem only happens on an out-of-tree target. Forcing subreg enabled targets to spill and split in a predictable way is near impossible. llvm-svn: 305228	2017-06-12 20:30:52 +00:00
Peter Collingbourne	54103de7c1	IR: Replace the "Linker Options" module flag with "llvm.linker.options" named metadata. The new metadata is easier to manipulate than module flags. Differential Revision: https://reviews.llvm.org/D31349 llvm-svn: 305227	2017-06-12 20:10:48 +00:00
Geoff Berry	6ea2cff39a	[SelectionDAG] Allow sin/cos -> sincos optimization on GNU triples w/ just -fno-math-errno Summary: This change enables the sin(x) cos(x) -> sincos(x) optimization on GNU target triples. This optimization was being inhibited when -ffast-math wasn't set because sincos in GLibC does not set errno, while sin and cos do. However, this optimization will only run if the attributes on the sin/cos calls include readnone, which is how clang represents the fact that it doesn't care about the errno values set by these functions (via the -fno-math-errno flag). Reviewers: hfinkel, bogner Subscribers: mcrosier, javed.absar, llvm-commits, paul.redmond Differential Revision: https://reviews.llvm.org/D32921 llvm-svn: 305204	2017-06-12 17:15:41 +00:00
Than McIntosh	54e602ffe4	StackColoring: smarter check for slot overlap Summary: The old check for slot overlap treated 2 slots `S` and `T` as overlapping if there existed a CFG node in which both of the slots could possibly be active. That is overly conservative and caused stack blowups in Rust programs. Instead, check whether there is a single CFG node in which both of the slots are possibly active together. Fixes PR32488. Patch by Ariel Ben-Yehuda <ariel.byd@gmail.com> Reviewers: thanm, nagisa, llvm-commits, efriedma, rnk Reviewed By: thanm Subscribers: dotdash Differential Revision: https://reviews.llvm.org/D31583 llvm-svn: 305193	2017-06-12 14:56:02 +00:00
Sanjay Patel	21f1293d28	[DAG] add helper to bind memop chains; NFCI This step is just intended to reduce code duplication rather than change any functionality. A follow-up would be to replace PPCTargetLowering::spliceIntoChain() usage with this new helper. Differential Revision: https://reviews.llvm.org/D33649 llvm-svn: 305192	2017-06-12 14:41:48 +00:00
Amaury Sechet	86f06ba66a	[DAGCombine] Make sure we check the ResNo from UADDO before combining Summary: UADDO has 2 results, and one must check the result number before doing any kind of combine. Without it, the transform is invalid. Reviewers: joerg Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D34088 llvm-svn: 305162	2017-06-11 11:36:38 +00:00
Sanjay Patel	b14e115055	[CGP] add a reference to DataLayout in MemCmpExpansion; NFCI We're currently passing endian-ness around as a param (and not uniformly), so this eliminates the need for that. I'd like to add a constant fold call too, and that requires a DL. llvm-svn: 305129	2017-06-09 23:01:05 +00:00
Zvi Rackover	e37b726243	SelectionDAG: Remove deleted nodes from legalized set to avoid clash with newly created nodes Summary: During the DAG legalization loop in SelectionDAG::Legalize(), bookkeeping of the SDNodes that were already legalized is implemented with a SmallPtrSet (LegalizedNodes). This kind of set stores only pointers to objects, not the objects themselves. Unfortunately, if an SDNode is deleted during legalization for some reason, the LegalizedNodes set is not informed about this fact. This would not be a problem if SelectionDAG did not reuse space deallocated after deletion of unused nodes for the creation of new ones. Because of this, new nodes created during legalization can often have pointers identical to ones that have been previously legalized, added to the LegalizedNodes set, and deleted afterwards. As a result, newly created nodes that share a pointer with deleted old ones are already present in LegalizedNodes at the moment of creation, so we never call Legalize on them. The fix relies on the fact that the DAG notifies listeners about each modification. I have registered a DAGNodeDeletedListener inside SelectionDAG::Legalize, with a callback function that removes any pointer of any deleted SDNode from the LegalizedNodes set. With this modification, the LegalizedNodes set does not contain pointers to nodes that were deleted, so newly created nodes can always be inserted into it, even if they share pointers with old deleted nodes. Patch by pawel.szczerbuk@intel.com The issue this patch addresses causes failures in an out-of-tree target, and I was not able to create a reproducer for an in-tree target, hence there is no test-case. Reviewers: delena, spatel, RKSimon, hfinkel, davide, qcolombet Reviewed By: delena Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33891 llvm-svn: 305084	2017-06-09 14:53:45 +00:00
Simon Dardis	398cd5e620	Reland "[SelectionDAG] Enable target specific vector scalarization of calls and returns" By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown, backends can request that LLVM scalarize vector types for calls and returns. The MIPS vector ABI requires that vector arguments and returns are passed in integer registers. With SelectionDAG's new hooks, the MIPS backend can now handle LLVM-IR with vector types in calls and returns. E.g. 'call @foo(<4 x i32> %4)'. Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for calls and returns if vector types were not legal. If vector types were legal, a single 128bit vector argument would be assigned to a single 32 bit / 64 bit integer register. By teaching the MIPS backend to inspect the original types, it can now implement the MIPS vector ABI which requires a particular method of scalarizing vectors. Previously, the MIPS backend relied on clang to scalarize types such as "call @foo(<4 x float> %a)" into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3, i32 inreg %4)". This patch enables the MIPS backend to take either form for vector types. The previous version of this patch had a "conditional move or jump depends on uninitialized value". Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur Differential Revision: https://reviews.llvm.org/D27845 llvm-svn: 305083	2017-06-09 14:37:08 +00:00
Serge Rogatch	21bdccdce7	[XRay] Fix computation of function size subject to XRay threshold Summary: Currently XRay compares its threshold against `Function::size()` . However, `Function::size()` returns the number of basic blocks (as I understand, such as cycle bodies, if/else bodies, switch-case bodies, etc.), rather than the number of instructions. The name of the parameter `-fxray-instruction-threshold=N`, as well as XRay documentation at http://llvm.org/docs/XRay.html , suggests that instructions should be counted, rather than the number of basic blocks. I see two options: 1. Count the number of MachineInstr`s in MachineFunction : this gives better estimate for the number of assembly instructions on the target. So a user can check in disassembly that the threshold works more or less correctly. 2. Count the number of Instruction`s in a Function : AFAIK, this gives correct number of IR instructions, which the user can check in IR listing. However, this number may be far (several times for small functions) from the number of assembly instructions finally emitted. Option 1 is implemented in this patch because I think that having the closer estimate for the number of assembly instructions emitted is more important than to have a clear definition of the metric. Reviewers: dberris, rengolin Reviewed By: dberris Subscribers: llvm-commits, iid_iunknown Differential Revision: https://reviews.llvm.org/D34027 llvm-svn: 305072	2017-06-09 13:23:23 +00:00
Nirav Dave	645feb1e55	Prevent RemoveDeadNodes from deleting an already deleted node. This protects against assertion errors like PR32659 which occur from a replacement deleting a node after it's been added to the list argument of RemoveDeadNodes. The specific failure from PR32659 does not currently happen, but it is still potentially possible. The underlying cause is that the callers of the changed function build up a list of nodes to delete after having moved their uses, and it is possible that a move of a later node will cause a previously deleted node to be deleted. Reviewers: bkramer, spatel, davide Reviewed By: spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33731 llvm-svn: 305070	2017-06-09 12:57:35 +00:00
Saleem Abdulrasool	b3bb143c65	sink DebugCompressionType into MC for exposing to clang This is a preparatory change to expose the debug compression style to clang. It requires exposing the enumeration and passing the actual value through to the backend from the frontend in actual value form rather than a boolean that selects the GNU style of debug info compression. Minor tweak to the ELF Object Writer to use a variable for re-used values. Add an assertion that debug information format is one of the two currently known types if debug information is being compressed. llvm-svn: 305038	2017-06-09 00:40:19 +00:00
Matthias Braun	44b7f15be2	RegAllocPBQP: Do not assign reserved physical register (0) RegAllocPBQP: Since getRawAllocationOrder() may return a collection that includes reserved physical registers, iterate to find an un-reserved physical register. (1) VirtRegMap: Enforce the invariant: "no reserved physical registers" in assignVirt2Phys(). Previously, this was checked only after the fact in VirtRegRewriter::rewrite. (2) MachineVerifier: updated the test per MatzeB's review. (3) +testcase Patch by Nick Johnson<Nicholas.Paul.Johnson@deshawresearch.com>! Differential Revision: https://reviews.llvm.org/D33947 llvm-svn: 305016	2017-06-08 21:30:54 +00:00
Sanjay Patel	03e3bdee22	fix formatting; NFC llvm-svn: 305008	2017-06-08 20:00:09 +00:00
Sanjay Patel	33b4725d4a	[CGP] don't expand a memcmp with nobuiltin attribute This matches the behavior used in the SDAG when expanding memcmp. For reference, we're intentionally treating the earlier fortified call transforms differently after: https://bugs.llvm.org/show_bug.cgi?id=23093 https://reviews.llvm.org/rL233776 One motivation for not transforming nobuiltin calls is that it can interfere with sanitizers: https://reviews.llvm.org/D19781 https://reviews.llvm.org/D19801 Differential Revision: https://reviews.llvm.org/D34043 llvm-svn: 305007	2017-06-08 19:47:25 +00:00
Sanjay Patel	3ca822d107	[CGP / PowerPC] avoid multi-block overhead for simple memcmp expansion The test diff for PowerPC shows we can better optimize if this case is one block. For x86, there would be a substantial difference if CGP expansion was enabled because branches are assumed cheap and SDAG can't optimize across blocks. Instead of this: _cmp_eq8: movq (%rdi), %rax cmpq (%rsi), %rax je LBB23_1 ## BB#2: ## %res_block movl $1, %ecx jmp LBB23_3 LBB23_1: xorl %ecx, %ecx LBB23_3: ## %endblock xorl %eax, %eax testl %ecx, %ecx sete %al retq We get this: cmp_eq8: movq (%rdi), %rcx xorl %eax, %eax cmpq (%rsi), %rcx sete %al retq And that matches the optimal codegen that we get from the current expansion in SelectionDAGBuilder::visitMemCmpCall(). If this looks right, then I just need to confirm that vector-sized expansion will work from here, and we can enable CGP memcmp() expansion for x86. Ie, we'll bypass the power-of-2 special cases currently optimized in SDAG because we can lower the IR produced here optimally. Differential Revision: https://reviews.llvm.org/D34005 llvm-svn: 304987	2017-06-08 16:53:18 +00:00
Eugene Zelenko	88025fac90	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 304954	2017-06-07 23:53:32 +00:00
Nirav Dave	ef57f4db49	[DAG] Improve Store Merge candidate pruning. NFC. When considering merging stores whose values are the results of loads, only consider stores whose values come from loads from the same base. This fixes much of the longer compile times in PR33330. llvm-svn: 304934	2017-06-07 18:51:56 +00:00
Sanjay Patel	52cdd47910	[CGP] avoid zext/trunc of a memcmp expansion compare This could be viewed as another shortcoming of the DAGCombiner: when both operands of a compare are zexted from the same source type, we should be able to compare the original types. The effect on PowerPC perf is likely unnoticeable, but there's a visible regression for x86 if we feed the suboptimal IR for memcmp expansion to the DAG: _cmp_eq4_zexted_to_i64: movl (%rdi), %ecx movl (%rsi), %edx xorl %eax, %eax cmpq %rdx, %rcx sete %al _cmp_eq4_better: movl (%rdi), %ecx xorl %eax, %eax cmpl (%rsi), %ecx sete %al llvm-svn: 304923	2017-06-07 16:16:45 +00:00
Sanjay Patel	aca20904b7	[CGP] pass size as param in MemCmpExpansion; NFCI Avoid extracting the constant int twice. llvm-svn: 304920	2017-06-07 15:05:13 +00:00
Sanjay Patel	de78301a10	[CGP] pass size as param in MemCmpExpansion; NFCI Avoid extracting the constant int twice. llvm-svn: 304917	2017-06-07 14:45:49 +00:00
Sanjay Patel	d73630f05c	[CGP] getParent()->getParent() --> getFunction(); NFCI llvm-svn: 304916	2017-06-07 14:29:52 +00:00
Simon Pilgrim	18db4739a5	[DAG] Move SelectionDAG::isCommutativeBinOp to TargetLowering. This will allow commutation of target-specific DAG nodes in future patches Differential Revision: https://reviews.llvm.org/D33882 llvm-svn: 304911	2017-06-07 14:05:04 +00:00
Sanjay Patel	ff912dafed	[CGP] add helper function for generating compare of load pairs; NFCI In the special (but also the likely common) case, we can avoid the multi-block complexity of the general algorithm, so moving this part off on its own will make it re-usable. llvm-svn: 304908	2017-06-07 13:33:00 +00:00
Sanjay Patel	38da012a6a	[CGP] fix formatting in MemCmpExpansion; NFC llvm-svn: 304903	2017-06-07 12:44:36 +00:00
NAKAMURA Takumi	bf2abbfa92	Update libdeps to add BinaryFormat, introduced in r304864. llvm-svn: 304869	2017-06-07 04:48:49 +00:00
Zachary Turner	c5632126fc	Move Object format code to lib/BinaryFormat. This creates a new library called BinaryFormat that has all of the headers from llvm/Support containing structure and layout definitions for various types of binary formats like dwarf, coff, elf, etc as well as the code for identifying a file from its magic. Differential Revision: https://reviews.llvm.org/D33843 llvm-svn: 304864	2017-06-07 03:48:56 +00:00
Quentin Colombet	61c0121740	[InlineSpiller] Only account for real spills in the hoisting logic Spills of undef values shouldn't impact the placement of the relevant spills. Drive by review. llvm-svn: 304850	2017-06-07 00:22:07 +00:00
Sanjay Patel	fa9aa2642f	[CGP / PowerPC] use direct compares if there's only one load per block in memcmp() expansion I'd like to enable CGP memcmp expansion for x86, but the output from CGP would regress the special cases (memcmp(x,y,N) != 0 for N=1,2,4,8,16,32 bytes) that we already handle. I'm not sure if we'll actually be able to produce the optimal code given the block-at-a-time limitation in the DAG. We might have to just avoid those special-cases here in CGP. But regardless of that, I think this is a win for the more general cases. http://rise4fun.com/Alive/cbQ Differential Revision: https://reviews.llvm.org/D33963 llvm-svn: 304849	2017-06-07 00:17:08 +00:00
Eugene Zelenko	5ddbdccf6e	[CodeGen] Fix some Clang-tidy modernize-use-using and Include What You Use warnings; other minor fixes (NFC). llvm-svn: 304839	2017-06-06 22:22:41 +00:00
Sanjay Patel	f14b349546	[CGP] fix formatting/typos in MemCmpExpansion; NFC llvm-svn: 304830	2017-06-06 20:30:47 +00:00
Matthias Braun	938b489a87	llc: Add ability to parse mir from stdin - Add -x <language> option to switch between IR and MIR inputs. - Change MIR parser to read from stdin when filename is '-'. - Add a simple mir roundtrip test. llvm-svn: 304825	2017-06-06 20:06:57 +00:00
Sanjay Patel	02fa101cad	[DAG] remove duplicated code for isOnlyUsedInZeroEqualityComparison(); NFCI llvm-svn: 304822	2017-06-06 19:40:09 +00:00
Matthias Braun	9d94ca9229	MIRPrinter: Avoid assert() when printing empty INLINEASM strings. CodeGen uses MO_ExternalSymbol to represent the inline assembly strings. Empty strings for symbol names appear to be invalid. For now just special case the output code to avoid hitting an `assert()` in `printLLVMNameWithoutPrefix()`. This fixes https://llvm.org/PR33317 llvm-svn: 304815	2017-06-06 19:00:58 +00:00
Simon Pilgrim	0ed1e4f769	Fix spelling mistake in getRThroughput static function names. NFCI. llvm-svn: 304799	2017-06-06 14:25:34 +00:00
Chandler Carruth	eb66b33867	Sort the remaining #include lines in include/... and lib/.... I did this a long time ago with a janky python script, but now clang-format has built-in support for this. I fed clang-format every line with a #include and let it re-sort things according to the precise LLVM rules for include ordering baked into clang-format these days. I've reverted a number of files where the results of sorting includes isn't healthy. Either places where we have legacy code relying on particular include ordering (where possible, I'll fix these separately) or where we have particular formatting around #include lines that I didn't want to disturb in this patch. This patch is entirely mechanical. If you get merge conflicts or anything, just ignore the changes in this patch and run clang-format over your #include lines in the files. Sorry for any noise here, but it is important to keep these things stable. I was seeing an increasing number of patches with irrelevant re-ordering of #include lines because clang-format was used. This patch at least isolates that churn, makes it easy to skip when resolving conflicts, and gets us to a clean baseline (again). llvm-svn: 304787	2017-06-06 11:49:48 +00:00
Vivek Pandya	9d4d8b5728	[Improve CodeGen Testing] This patch re-enables MIRPrinter printing of fields whose value is equal to the default. If the -simplify-mir option is passed then MIRPrinter will not print such fields. This change also required some lit test cases in CodeGen directory to be changed. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D32304 llvm-svn: 304779	2017-06-06 08:16:19 +00:00
Mandeep Singh Grang	efd068d7d5	[llvm] Remove double semicolons Reviewers: craig.topper, arsenm, mehdi_amini Reviewed By: mehdi_amini Subscribers: mehdi_amini, wdng, nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D33924 llvm-svn: 304767	2017-06-06 05:08:36 +00:00
Matthias Braun	17b01651c8	CodeGen: Refactor MIR parsing When parsing .mir files immediately construct the MachineFunctions and put them into MachineModuleInfo. This allows us to get rid of the delayed construction (and delayed error reporting) through the MachineFunctionInitializer interface. Differential Revision: https://reviews.llvm.org/D33809 llvm-svn: 304758	2017-06-06 00:44:35 +00:00
Matthias Braun	13c1e17841	CodeGen/LLVMTargetMachine: Refactor ISel pass construction; NFCI - Move ISel (and pre-isel) pass construction into TargetPassConfig - Extract AsmPrinter construction into a helper function Putting the ISel code into TargetPassConfig seems a lot more natural and both changes together make it easier to build custom pipelines involving .mir in an upcoming commit. This moves MachineModuleInfo to an earlier place in the pass pipeline which shouldn't have any effect. llvm-svn: 304754	2017-06-06 00:26:13 +00:00
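
The "Align definition of DW_OP_plus with DWARF spec" entries above contrast two operators: DW_OP_plus_uconst, which adds an inline unsigned constant to the top of the expression stack, and the DWARF-defined DW_OP_plus / DW_OP_minus, which pop two stack entries and push the result. A minimal sketch of that difference follows. It is a toy stack evaluator, not LLVM's DIExpression code; the tuple encoding and the `evaluate` helper are hypothetical names made up for this illustration, and only the operator names come from DWARF.

```python
# Toy evaluator for a small subset of DWARF expression operators.
# The (name, operand...) tuple encoding and evaluate() are hypothetical;
# this is not how LLVM stores or interprets DIExpression nodes.

def evaluate(expr, initial_stack):
    """Run the operators in expr over a copy of initial_stack and return it."""
    stack = list(initial_stack)
    for op, *operands in expr:
        if op == "DW_OP_constu":
            # Push an unsigned constant.
            (constant,) = operands
            stack.append(constant)
        elif op == "DW_OP_plus_uconst":
            # Add an inline unsigned constant to the top entry.
            (constant,) = operands
            stack.append(stack.pop() + constant)
        elif op == "DW_OP_plus":
            # Pop two entries, push their sum.
            top = stack.pop()
            second = stack.pop()
            stack.append(second + top)
        elif op == "DW_OP_minus":
            # Pop two entries, push (second entry - former top).
            top = stack.pop()
            second = stack.pop()
            stack.append(second - top)
        else:
            raise ValueError(f"unsupported operator: {op}")
    return stack


# A constant offset can be written either way:
offset_uconst = evaluate([("DW_OP_plus_uconst", 8)], [100])
offset_plus = evaluate([("DW_OP_constu", 8), ("DW_OP_plus",)], [100])

# Only the two-operand form can combine two runtime values on the stack:
runtime_sum = evaluate([("DW_OP_plus",)], [100, 20])
runtime_diff = evaluate([("DW_OP_minus",)], [100, 20])
```

In this sketch the constant-offset case yields the same stack with either spelling, while adding or subtracting two runtime values is only expressible with the two-operand operators, which is the restriction the commit messages describe.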

23110 Commits