llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 03:02:36 +01:00

Author	SHA1	Message	Date
Simon Pilgrim	e3fad35070	[X86][AVX512] Tag VPMADD52/VPSADBW instruction scheduler classes llvm-svn: 319772	2017-12-05 14:59:40 +00:00
Bjorn Pettersson	bcce892345	[DAGCombine] Handle big endian correctly in CombineConsecutiveLoads Summary: Found out, at code inspection, that there was a fault in DAGCombiner::CombineConsecutiveLoads for big-endian targets. A BUILD_PAIR always has the least significant bits of the composite value in element 0. So when we are doing the checks for consecutive loads, for big endian targets, we should check if the load to elt 1 is at the lower address and the load to elt 0 is at the higher address. Normally this bug only resulted in missed opportunities for doing the load combine. I guess that in some rare situation it could lead to faulty combines, but I've not seen that happen. Note that this patch actually will trigger load combine for some big endian regression tests. One example is test/CodeGen/PowerPC/anon_aggr.ll where we now get t76: i64,ch = load<LD8[FixedStack-9]> instead of t37: i32,ch = load<LD4[FixedStack-10]> t35: i32,ch = load<LD4[FixedStack-9]> t41: i64 = build_pair t37, t35 before legalization. Then the legalization will split the LD8 into two loads, so the end result is the same. That should verify that the transformation is correct now. Reviewers: niravd, hfinkel Reviewed By: niravd Subscribers: nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D40444 llvm-svn: 319771	2017-12-05 14:50:05 +00:00
Simon Pilgrim	117cdd90a4	[X86][AVX512] Add missing scalar CMPSS/CMPSD logic scheduler classes llvm-svn: 319770	2017-12-05 14:34:42 +00:00
Mikael Holmen	b9975c4aa6	Bail out of a SimplifyCFG switch table opt at undef values. Summary: A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef. The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization. Patch by JesperAntonsson. Reviewers: spatel, eeckstein, hans Reviewed By: hans Subscribers: uabelho, llvm-commits Differential Revision: https://reviews.llvm.org/D40639 llvm-svn: 319768	2017-12-05 14:14:00 +00:00
Simon Pilgrim	1b86bca5d7	[X86][AVX512] Cleanup bit logic scheduler classes llvm-svn: 319767	2017-12-05 14:04:23 +00:00
Sam Parker	3ff1bb935a	[DAGCombine] isLegalNarrowLoad function (NFC) Pull the checks upon the load out from ReduceLoadWidth into their own function. Differential Revision: https://reviews.llvm.org/D40833 llvm-svn: 319766	2017-12-05 14:03:51 +00:00
Simon Pilgrim	ed1ab359bf	[X86][AVX512] Tag scalar CVT and CMP instruction scheduler classes llvm-svn: 319765	2017-12-05 13:49:44 +00:00
Dean Michael Berris	d303344909	[XRay][docs] Document xray_mode and log registration API. This marks certain flags in XRay as deprecated (in particular, `xray_naive_log=` and `xray_fdr_log=`), and recommends the use of the `xray_mode=` flag. llvm-svn: 319763	2017-12-05 12:43:12 +00:00
Igor Laevsky	d5ce9e969c	[InstCombine] Don't crash on out of bounds shifts Differential Revision: https://reviews.llvm.org/D40649 llvm-svn: 319761	2017-12-05 12:18:15 +00:00
Simon Pilgrim	3393cb0b28	[X86][AVX512] Tag VPCMP/VPCMPU instruction scheduler classes Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now. llvm-svn: 319760	2017-12-05 12:14:36 +00:00
Simon Pilgrim	c9d7b641ae	[X86][AVX512] Cleanup VPCMP scheduler classes Move hardcoded itinerary out to the instruction declarations. Not sure that IIC_SSE_ALU_F32P is the best schedule for integer comparisons, but I'm not going to change it right now. llvm-svn: 319758	2017-12-05 12:02:22 +00:00
Simon Pilgrim	993d861554	[X86][AVX512] Tag VFIXUPIMM instructions scheduler classes llvm-svn: 319757	2017-12-05 11:46:57 +00:00
Jonas Paulsson	b3c888286c	[SystemZ] set 'guessInstructionProperties = 0' and set flags as needed. This has proven a healthy exercise, as many cases of incorrect instruction flags were corrected in the process. As part of this, IntrWriteMem was added to several SystemZ intrinsics. Furthermore, a bug was exposed in TwoAddress with this change (as incorrect hasSideEffects flags were removed and instructions could now be sunk), and the test case for that bugfix (r319646) is included here as test/CodeGen/SystemZ/twoaddr-sink.ll. One temporary test regression (one extra copy) which will hopefully go away in upcoming patches for similar cases: test/CodeGen/SystemZ/vec-trunc-to-i1.ll Review: Ulrich Weigand. https://reviews.llvm.org/D40437 llvm-svn: 319756	2017-12-05 11:24:39 +00:00
Jonas Paulsson	b4cf0df8b1	[Regalloc] Generate and store multiple regalloc hints. MachineRegisterInfo used to allow just one regalloc hint per virtual register. This patch extends this to a vector of regalloc hints, which is filled in by common code with sorted copy hints. Such hints will make for more ID copies that can be removed. NB! This improvement is currently (and hopefully temporarily) disabled by default, except for SystemZ. The only reason for this is the big impact this has on tests, which has unfortunately proven unmanageable. It was a long while since all the tests were updated and just waiting for review (which didn't happen), but now targets have to enable this themselves instead. Several targets could get a head-start by downloading the tests updates from the Phabricator review. Thanks to those who helped, and sorry you now have to do this step yourselves. This should be an improvement generally for any target! The target may still create its own hint, in which case this has highest priority and is stored first in the vector. If it has target-type, it will not be recomputed, as per the previous behaviour. The temporary hook enableMultipleCopyHints() will be removed as soon as all targets return true. Review: Quentin Colombet, Ulrich Weigand. https://reviews.llvm.org/D38128 llvm-svn: 319754	2017-12-05 10:52:24 +00:00
George Rimar	d9bca550a8	Fix build bot after r319750 "[Support/TarWriter] - Don't allow TarWriter to add the same file more than once." Error was: error: comparison of integers of different signs: 'const unsigned long' and 'const int' [-Werror,-Wsign-compare] http://lab.llvm.org:8011/builders/ubuntu-gcc7.1-werror/builds/3469/steps/build-unified-tree/logs/stdio http://lab.llvm.org:8011/builders/clang-with-thin-lto-ubuntu/builds/7118/steps/build-stage2-compiler/logs/stdio llvm-svn: 319752	2017-12-05 10:35:11 +00:00
Pavel Labath	d9438e5fb6	Re-commit "[cmake] Enable zlib support on windows" This recommits r319533, which broke the llvm-config --system-libs output. The reason was that I used find_library for searching for the z library. This returns absolute paths, and when these paths made it into llvm-config, it made it produce nonsensical flags. To fix this, I hand-roll a search for the library in the same way that we search for the terminfo library a couple of lines below. This is a bit less flexible than the find_library option, as it does not allow the user to specify the path to the library at configure time (which is important on windows, as zlib is unlikely to be found in any of the standard places cmake searches), but I was able to guide the build to find it with appropriate values of LIB and INCLUDE environment variables. Reviewers: compnerd, rnk, beanz, rafael Subscribers: llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D40779 llvm-svn: 319751	2017-12-05 10:24:15 +00:00
George Rimar	bde67cb387	[Support/TarWriter] - Don't allow TarWriter to add the same file more than once. This is for PR35460. Currently when LLD adds files to TarWriter it may pass the same file multiple times. For example, this happens for a clang reproduce file that specifies archive (.a) files more than once on the command line. The patch makes TarWriter ignore files with the same path, so it will add only the first one to the archive. Differential revision: https://reviews.llvm.org/D40606 llvm-svn: 319750	2017-12-05 10:09:59 +00:00
Guy Blank	a0e8355018	[X86] Fix a bug in handling GRXX subclasses in Domain Reassignment pass When trying to determine the correct Mask register class corresponding to a GPR register class, not all register classes were handled. This caused an assertion to be raised on some scenarios. Differential Revision: https://reviews.llvm.org/D40290 llvm-svn: 319745	2017-12-05 09:08:24 +00:00
Craig Topper	4e6e5fe875	[SelectionDAG] Use WidenTargetBoolean in WidenVecRes_MLOAD and WidenVecOp_MSTORE instead of implementing it manually and incorrectly. The CONCAT_VECTORS operand get its type from getSetCCResultType, but if the mask type and the setcc have different scalar sizes this creates an illegal CONCAT_VECTORS operation. The concat type should be 2x the mask type, and then an extend should be added if needed. llvm-svn: 319744	2017-12-05 08:15:03 +00:00
Michael Trent	b34aad4a84	Test commit, as per the LLVM Developer Policy. Commit message, as per the same policy. I added a blank space to the end of the file. Excelsior. llvm-svn: 319743	2017-12-05 07:50:00 +00:00
Craig Topper	e2dd3ace24	[X86] Use vector widening to support sign extend from i1 when the dest type is not 512-bits and vlx is not enabled. Previously we used a wider element type and truncated. But it's more efficient to keep the element type and drop unused elements. If BWI isn't supported and we have an i16 or i8 type, we'll extend it to be i32 and still use a truncate. llvm-svn: 319740	2017-12-05 06:37:21 +00:00
Daniel Sanders	f0a9960826	Revert r319691: [globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64. Some concerns were raised with the direction. Revert while we discuss it and look into an alternative llvm-svn: 319739	2017-12-05 05:52:07 +00:00
Kuba Mracek	91eccfcbfe	Disable detect_leaks in the ASanified build of LLVM when using Apple LLVM. The released Apple LLVM versions don't support LSan. llvm-svn: 319738	2017-12-05 05:22:02 +00:00
Craig Topper	85c1f9fb40	[X86] Fix a crash if avx512bw and xop are both enabled when the IR contains a v32i8 bitreverse. llvm-svn: 319737	2017-12-05 04:47:12 +00:00
Matt Arsenault	acb3dd7d12	AMDGPU: Fix missing subtarget feature initializer llvm-svn: 319733	2017-12-05 03:15:44 +00:00
Matt Arsenault	00e7e6f5cc	AMDGPU: Fix crash when scheduling DBG_VALUE This calls handleMove with a DBG_VALUE instruction, which isn't tracked by LiveIntervals. I'm not sure this is the correct place to fix this. The generic scheduler seems to have more deliberate region selection that skips dbg_value. The test is also really hard to reduce. I haven't been able to figure out what exactly causes this particular case to try moving the dbg_value. llvm-svn: 319732	2017-12-05 03:09:23 +00:00
Craig Topper	0477280769	[X86] Use vector widening to support zero extend from i1 when the dest type is not 512-bits and vlx is not enabled. Previously we used a wider element type and truncated. But it's more efficient to keep the element type and drop unused elements. If BWI isn't supported and we have an i16 or i8 type, we'll extend it to be i32 and still use a truncate. llvm-svn: 319728	2017-12-05 01:45:46 +00:00
Craig Topper	36a8c4ab23	[X86] Don't use kunpck for vXi1 concat_vectors if the upper bits are undef. This can be efficiently selected by a COPY_TO_REGCLASS without the need for an extra instruction. llvm-svn: 319726	2017-12-05 01:28:06 +00:00
Craig Topper	b8ec6117da	[X86] Use getZeroVector and remove an unnecessary creation of an APInt before calling getConstant. NFCI The getConstant function can take care of creating the APInt internally. getZeroVector will take care of using the correct type for the build vector to avoid re-lowering. The test change here is because execution domain constraints apparently pass through undef inputs of a zeroing xor. So the different ordering of register allocation here caused the dependency to change. llvm-svn: 319725	2017-12-05 01:28:04 +00:00
Craig Topper	360e161965	[X86] Rearrange some of the code around AVX512 sign/zero extends. NFCI Move the AVX512 code out of LowerAVXExtend. LowerAVXExtend has two callers but one of them pre-checks for AVX-512 so the code is only live from the other caller. So move the AVX-512 checks up to that caller for symmetry. Move all of the i1 input type code in Lower_AVX512ZeroExend together. llvm-svn: 319724	2017-12-05 01:28:00 +00:00
Shoaib Meenai	54283334f4	[cmake] Modernize some conditionals. NFC The "x${...}" form was a workaround for CMake versions prior to 3.1, where the if command would interpret arguments as variables even when quoted [1]. We can drop the workaround now that our minimum CMake version is 3.4. [1] https://cmake.org/cmake/help/v3.1/policy/CMP0054.html Differential Revision: https://reviews.llvm.org/D40744 llvm-svn: 319723	2017-12-05 01:19:48 +00:00
Matthias Braun	e0fe5ea8ec	MachineFrameInfo: Cleanup some parameter naming inconsistencies; NFC Consistently use the same parameter names as the names of the affected fields. This avoids some unintuitive abbreviations like `isSS`. llvm-svn: 319722	2017-12-05 01:18:15 +00:00
Matthias Braun	893a25473b	TwoAddressInstructionPass: Trigger -O0 behavior on optnone While we cannot skip the whole TwoAddressInstructionPass even for -O0 there are some parts of the pass that are currently skipped at -O0 but not for optnone. Changing this as there is no reason to have those two hit different code paths here. llvm-svn: 319721	2017-12-05 00:56:14 +00:00
Petr Hosek	3f97323c09	[CMake] Don't use comma as an alternate separator Using comma can break in cases when we're passing flags that already use comma as a separator. Fixes PR35504. Differential Revision: https://reviews.llvm.org/D40761 llvm-svn: 319719	2017-12-05 00:15:18 +00:00
Jan Vesely	08d45d83d9	AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instruction Only used by pre-GCN targets v2: fix predicate setting for FMA_Common Differential Revision: https://reviews.llvm.org/D40692 llvm-svn: 319712	2017-12-04 23:07:28 +00:00
Jan Vesely	5624ed9d95	AMDGPU: Disable fp64 support on pre GCN asics It's not implemented. Passing +fp64-fp16-denormal feature enables fp64 even on asics that don't support it v2: fix hasFP64 query Differential Revision: https://reviews.llvm.org/D39931 llvm-svn: 319709	2017-12-04 22:57:29 +00:00
Evgeniy Stepanov	2872d7198e	[msan] Add a fixme note for a minor deficiency. llvm-svn: 319708	2017-12-04 22:50:39 +00:00
Hans Wennborg	96b1a36cd4	Revert r319490 "XOR the frame pointer with the stack cookie when protecting the stack" This broke the Chromium build (crbug.com/791714). Reverting while investigating. > Summary: This strengthens the guard and matches MSVC. > > Reviewers: hans, etienneb > > Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits > > Differential Revision: https://reviews.llvm.org/D40622 > > git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8 llvm-svn: 319706	2017-12-04 22:21:15 +00:00
Matt Arsenault	9da212a8d2	AMDGPU: Fix creating invalid copy when adjusting dmask Move the entire optimization to one place. Before it was possible to adjust dmask without changing the register class of the output instruction, since they were done in separate places. Fix all lane sizes and move all of the optimization into the DAG folding. llvm-svn: 319705	2017-12-04 22:18:27 +00:00
Matt Arsenault	4df1b6ca59	AMDGPU: Use return value of MorphNodeTo llvm-svn: 319704	2017-12-04 22:18:22 +00:00
Daniel Sanders	37451b86e7	Allow similar TargetOpcodes to use inheritance to factor out commonality. NFC. Summary: While implementing atomicrmw in https://reviews.llvm.org/D40092 I found that inheritance is unusable for all the Generic Opcodes in GlobalISel. This is because the whole header is included inside a 'let mayLoad = 0, mayStore = 0 ... in' block. In TableGen, the order of precedence for field assignments is: 1. Values from classes the record inherits from. 2. Values from 'let Name=Value in { ... }' 3. Values from 'let Name=Value;' As such the 'let mayLoad = 0, mayStore = 0, ... in' surrounding the 'include "GenericOpcodes.td"' was overriding any values provided via inheritance. We hadn't noticed this before because we were only using 'let Name=Value;' to specialize opcodes. Fix this by moving the default values to the lowest precedence. This is accomplished by moving the values to a common base class (StandardPseudoInstruction for most TargetOpcodes, and GenericOpcode for GlobalISel specific TargetOpcodes) Reviewers: qcolombet Reviewed By: qcolombet Subscribers: llvm-commits, igorb Differential Revision: https://reviews.llvm.org/D40096 llvm-svn: 319701	2017-12-04 21:40:57 +00:00
Paul Robinson	8a47dbb61c	Re-submit r289925 (Update .debug_line section version to match DWARF version) Set the .debug_line version to match the requested DWARF version, except with a maximum of v4 because we don't support v5 yet. Previously Chromium had issues with this patch; see PR31407. Chromium tool issues have been addressed, so hopefully this will go through this time. Patch by Katya Romanova! Differential Revision: https://reviews.llvm.org/D38002 llvm-svn: 319699	2017-12-04 21:27:46 +00:00
Daniel Sanders	a60df77dd5	[globalisel][tablegen] Tests for r319691 I forgot to 'svn add' the test files. llvm-svn: 319698	2017-12-04 21:14:34 +00:00
Hans Wennborg	218aa3c4c3	DAG: Follow-up to r319692: check that the truncates' inputs have the same type MatchRotate assumes the types of LHS and RHS are equal, which is always the case when they come from an OR node, but here we're getting them from two different TRUNC nodes, so we have to check the types. llvm-svn: 319695	2017-12-04 20:48:50 +00:00
Hans Wennborg	c50acb9936	DAG: Match truncated rotation (PR35487) If the truncation has been pushed past the or-node, look through it and truncate afterwards. Differential revision: https://reviews.llvm.org/D40792 llvm-svn: 319692	2017-12-04 20:39:57 +00:00
Daniel Sanders	2a3d1acd34	[globalisel][tablegen] Split atomic load/store into separate opcode and enable for AArch64. This patch splits atomics out of the generic G_LOAD/G_STORE and into their own G_ATOMIC_LOAD/G_ATOMIC_STORE. This is a pragmatic decision rather than a necessary one. Atomic load/store has little implementation in common with non-atomic load/store. They tend to be handled very differently throughout the backend. It also has the nice side-effect of slightly improving the common-case performance at ISel since there's no longer a need for an atomicity check in the matcher table. All targets have been updated to remove the atomic load/store check from the G_LOAD/G_STORE path. AArch64 has also been updated to mark G_ATOMIC_LOAD/G_ATOMIC_STORE legal. There is one issue with this patch though which also affects the extending loads and truncating stores. The rules only match when an appropriate G_ANYEXT is present in the MIR. For example, (G_ATOMIC_STORE (G_TRUNC:s16 (G_ANYEXT:s32 (G_ATOMIC_LOAD:s16 X)))) will match but: (G_ATOMIC_STORE (G_ATOMIC_LOAD:s16 X)) will not. This shouldn't be a problem at the moment, but as we get better at eliminating extends/truncates we'll likely start failing to match in some cases. The current plan is to fix this in a patch that changes the representation of extending-load/truncating-store to allow the MMO to describe a different type to the operation. llvm-svn: 319691	2017-12-04 20:39:32 +00:00
Hiroshi Yamauchi	c483f325bf	Move splitIndirectCriticalEdges() to BasicBlockUtils.h. Summary: Move splitIndirectCriticalEdges() from CodeGenPrepare to BasicBlockUtils.h so that it can be called from other places. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40750 llvm-svn: 319689	2017-12-04 20:36:01 +00:00
Matthias Braun	5728705fa1	Add missing triple args to tests llvm-svn: 319686	2017-12-04 20:08:28 +00:00
Haicheng Wu	247fddf382	[ConstantFold] Support vector index when factoring out GEP index into preceding dimensions Follow-up of r316824. This patch supports the vector type for both current and previous index when factoring out the current one into the previous one. Differential Revision: https://reviews.llvm.org/D39556 llvm-svn: 319683	2017-12-04 19:56:33 +00:00
Sanjoy Das	859d430a7c	[SCEV] Use a "Discovered" set instead of a "Visited" set; NFC Suggested by Max Kazantsev in https://reviews.llvm.org/D39361 llvm-svn: 319679	2017-12-04 19:22:01 +00:00
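
Note on bcce892345 (big-endian CombineConsecutiveLoads): the address check that commit describes can be illustrated with a small model. This is only a sketch in Python, not the DAGCombiner code; the helper names `load`, `build_pair`, and `loads_are_consecutive` are hypothetical and exist only for this illustration. It assumes, as the commit message states, that element 0 of a BUILD_PAIR holds the least significant half.

```python
def load(mem: bytes, addr: int, size: int, big_endian: bool) -> int:
    """Model a load of `size` bytes at `addr` with the target's byte order."""
    order = "big" if big_endian else "little"
    return int.from_bytes(mem[addr:addr + size], order)


def build_pair(elt0: int, elt1: int, half_bits: int) -> int:
    """Model BUILD_PAIR: element 0 is the least significant half."""
    return elt0 | (elt1 << half_bits)


def loads_are_consecutive(elt0_addr: int, elt1_addr: int,
                          half_bytes: int, big_endian: bool) -> bool:
    """Hypothetical check: may the two half loads become one wide load?"""
    if big_endian:
        # Big endian: the most significant half (elt 1) sits at the lower address.
        return elt0_addr == elt1_addr + half_bytes
    # Little endian: the least significant half (elt 0) sits at the lower address.
    return elt1_addr == elt0_addr + half_bytes


def combine_matches_wide_load(mem: bytes, big_endian: bool) -> bool:
    """Compare one 8-byte load at address 0 against a pair of 4-byte loads."""
    wide = load(mem, 0, 8, big_endian)
    elt0_addr, elt1_addr = (4, 0) if big_endian else (0, 4)
    assert loads_are_consecutive(elt0_addr, elt1_addr, 4, big_endian)
    pair = build_pair(load(mem, elt0_addr, 4, big_endian),
                      load(mem, elt1_addr, 4, big_endian), 32)
    return pair == wide
```

With the address order swapped for big-endian targets, the pair of narrow loads reproduces the wide load in both byte orders, which is the property the commit relies on.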

157413 Commits