llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00

Author	SHA1	Message	Date
Sander de Smalen	5074e446e7	[AArch64][SVE] Allocate locals that are scalable vectors. This patch adds a target interface to set the StackID for a given type, which allows scalable vectors (e.g. `<vscale x 16 x i8>`) to be assigned a 'sve-vec' StackID, so it is allocated in the SVE area of the stack frame. Reviewers: ostannard, efriedma, rengolin, cameron.mcinally Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D70080	2019-11-13 09:45:24 +00:00
Simon Tatham	6aa69a8001	[ARM,MVE] Use VMOV.{S8,S16} for sign-extended extractelement. MVE includes instructions that extract an 8- or 16-bit lane from a vector and sign-extend it into the output 32-bit GPR. `ARMInstrMVE.td` already included isel patterns to select those instructions in response to the `ARMISD::VGETLANEs` selection-DAG node type. But `ARMISD::VGETLANEs` was never actually generated, because the code that creates it was conditioned on NEON only. It's an easy fix to enable the same code for integer MVE, and now IR that sign-extends the result of an extractelement (whether explicitly or as part of the function call ABI) will use `vmov.s8` instead of `vmov.u8` followed by `sxtb`. Reviewers: SjoerdMeijer, dmgreen, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70132	2019-11-13 09:08:41 +00:00
joanlluch	a1334aac0e	[TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (4) Summary: Replaces ``` unsigned getShiftAmountThreshold(EVT VT) ``` by ``` bool shouldAvoidTransformToShift(EVT VT, unsigned amount) ``` thus giving more flexibility for targets to decide whether particular shift amounts must be considered expensive or not. Updates the MSP430 target with a custom implementation. This continues D69116, D69120, D69326 and updates them, so all of them must be committed before this. Existing tests apply, a few more have been added. Reviewers: asl, spatel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70042	2019-11-13 09:23:08 +01:00
Craig Topper	e9ebc959cb	[X86] Remove setOperationAction for FP_TO_SINT v8i16. This is no longer needed after widening legalization as we custom legalize v8i8 ourselves. Added entries to the cost model, but bumped the cost slightly to account for the truncate shuffle that wasn't costed before.	2019-11-12 22:45:52 -08:00
Francesco Petrogalli	d054b2dd05	[VFABI] Add LLVM internal mangling for vector functions. Summary: This patch adds a custom ISA for vector functions for internal use in LLVM. The <isa> token is set to "_LLVM_", and it is not attached to any specific instruction Vector ISA, or Vector Function ABI. The ISA is used as a token for handling Vector Function ABI-style vectorization for those vector functions that are not directly associated to any existing Vector Function ABI (for example, some of the vector functions exposed by TargetLibraryInfo). The demangling function for this ISA in a Vector Function ABI context is set to be the same as the common one shared between X86 and AArch64. Reviewers: jdoerfert, sdesmalen, simoll Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70089	2019-11-13 03:26:39 +00:00
Matt Arsenault	ee4c001c6a	AMDGPU: Extend add x, (ext setcc) combine to sub This is the same as the add case, but inverts the operation type. This avoids regressions in a future patch.	2019-11-13 07:13:58 +05:30
Matt Arsenault	bf9d9a1180	AMDGPU: Switch backend default max workgroup size to 1024 Previously this would default to 256, not the maximum supported size of 1024. Using a maximum lower than the hardware maximum requires language runtimes to enforce this limit for correctness, which no language has correctly done. Switch the default to the conservatively correct maximum, and force frontends to opt-in to the more optimal 256 default maximum. I don't really understand why the changes in occupancy-levels.ll increased the computed occupancy, which I expected to decrease. I'm not sure if these tests should be forcing the old maximum.	2019-11-13 07:11:02 +05:30
Matt Arsenault	5cfd953988	AMDGPU: Reduce reported maximum group size to 1024 While some targets allow encoding 2048, this was never tested or supported.	2019-11-13 06:34:28 +05:30
Alina Sbirlea	a76fef4322	[GlobalsAA] Reenable test.	2019-11-12 16:53:28 -08:00
Alina Sbirlea	6f9d9bcfe8	Temporarily disable test.	2019-11-12 15:57:51 -08:00
Eric Christopher	ec6661f8aa	Temporarily Revert "Reapply [LVI] Normalize pointer behavior" as it has broken Python 3.6. Reverting for now to figure out whether the problem is in Python or the compiler. This reverts commit 885a05f48a5d320946c89590b73a764e5884fe4f.	2019-11-12 15:51:51 -08:00
Craig Topper	335135422b	[X86] Don't consider v64i1 as a legal type unless v64i8 is also a legal type. This avoids some nasty issues with argument passing and lowering of arbitrary v64i8 shuffles.	2019-11-12 14:56:02 -08:00
Craig Topper	90fb1e427b	[X86] Only pass v64i8/v32i16 as v16i32 on non-avx512bw targets if the v16i32 type won't be split by prefer-vector-width=256 Otherwise just let the v64i8/v32i16 types be split to v32i8/v16i16. In reality this shouldn't happen because it means we have a 512-bit vector argument, but min-legal-vector-width says a value less than 512. But a 512-bit argument should have been factored into the preferred vector width.	2019-11-12 14:56:01 -08:00
Yonghong Song	9cf279eaff	[BPF] generate BTF_KIND_VARs for all non-static globals Enable generation of BTF_KIND_VARs for non-static default-section globals, which was not allowed previously. Modified the existing test case to accommodate the new change. Also removed unused linkage enum members VAR_GLOBAL_TENTATIVE and VAR_GLOBAL_EXTERNAL. Differential Revision: https://reviews.llvm.org/D70145	2019-11-12 14:34:08 -08:00
Alina Sbirlea	28990514f5	[GlobalsAA] Restrict ModRef result if any internal method has its address taken. Summary: If there are any internal methods whose address was taken, conclude there is nothing known in relation of any other internal method and a global. Reviewers: nlopes, sanjoy.google Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69690	2019-11-12 14:24:56 -08:00
Alina Sbirlea	bbaa6235bb	[GVNHoist] Preserve AAResults. Resolves PR38906, PR40898.	2019-11-12 14:10:04 -08:00
Evandro Menezes	4d92249e81	[AArch64] Update for Exynos Fix the modeling for loads and stores using the register offset addressing mode.	2019-11-12 14:37:41 -06:00
Evandro Menezes	2acbd313a1	[AArch64] Fix addressing mode predicates Fix predicates related to the register offset addressing mode.	2019-11-12 14:37:28 -06:00
Fangrui Song	61e55d5d11	[llvm-objcopy][COFF] Implement --redefine-sym and --redefine-syms The parsing error tests in ELF/redefine-symbols.test are not specific to ELF. Move them to redefine-symbols.test. Add COFF/redefine-symbols.test for COFF specific tests. Also fix the documentation regarding --redefine-syms: the old and new names are separated by whitespace, not an equals sign. Reviewed By: mstorsjo Differential Revision: https://reviews.llvm.org/D70036	2019-11-12 11:28:00 -08:00
Peter Collingbourne	9930bdcc8f	ARM: Don't emit R_ARM_NONE relocations to compact unwinding decoders in .ARM.exidx on Android. These relocations are specified by the ARM EHABI (section 6.3). As I understand it, their purpose is to accommodate unwinder implementations that wish to reduce code size by placing the implementations of the compact unwinding decoders in a separate translation unit, and using extern weak symbols to refer to them from the main unwinder implementation, so that they are only linked when something in the binary needs them in order to unwind. However, neither of the unwinders used on Android (libgcc, LLVM libunwind) use this technique, and in fact emitting these relocations ends up being counterproductive to code size because they cause a copy of the unwinder to be statically linked into most binaries, regardless of whether it is actually needed. Furthermore, these relocations create circular dependencies (between libc and the unwinder) in cases where the unwinder is dynamically linked and libc contains compact unwind info. Therefore, deviate from the EHABI here and stop emitting these relocations on Android. Differential Revision: https://reviews.llvm.org/D70027	2019-11-12 10:52:59 -08:00
Michael Liao	db1604061b	Fix build with shared libraries. NFC. - Dependent components need linking directly.	2019-11-12 13:40:35 -05:00
Krzysztof Parzyszek	a7dcb8c305	[Hexagon] Update PS_aligna with max stack alignment once isel completes	2019-11-12 11:47:29 -06:00
Julian Lettner	95339f1d20	[lit] Better/earlier errors for empty runs Fail early, when we discover no tests at all, or filter out all of them. There is also `--allow-empty-runs` to suppress the error when all tests are filtered out, which allows workflows like `LIT_FILTER=abc ninja check-all`. Apparently `check-all` invokes lit multiple times if certain projects are enabled, which would produce unwanted "empty runs". Specify via `LIT_OPTS=--allow-empty-runs`. There are 3 causes for empty runs: 1) No tests discovered. This is always an error. Fix test suite config or command line. 2) All tests filtered out. This is an error by default, but can be suppressed via `--allow-empty-runs`. Should prevent accidentally passing empty runs, but allow the workflow above. 3) The number of shards is greater than the number of tests. Currently, this is never an error. Personally, I think we should consider making this an error by default; if this happens, you are doing something wrong. I added a warning but did not change the behavior, since this warrants more discussion. Reviewed By: atrick, jdenny Differential Revision: https://reviews.llvm.org/D70105	2019-11-12 09:11:36 -08:00
Sanjay Patel	0d90bb492f	[SLP] add test for miscompile with reduction (PR43948); NFC	2019-11-12 11:36:41 -05:00
Krzysztof Parzyszek	fe9991c322	[Hexagon] Fix vector spill expansion to use proper alignment 1. Add pseudos PS_vloadrv_ai and PS_vstorerv_ai: those are now used for single vector registers in loadRegFromStackSlot (and store...). 2. Remove pseudos PS_vloadrwu_ai and PS_vstorerwu_ai. The alignment is now checked when expanding spill pseudos (both in frame lowering and in expand-post-ra-pseudos), and a proper instruction is generated. 3. Update MachineMemOperands when dealigning vector spill slots. 4. Return vector predicate registers in getCallerSavedRegs.	2019-11-12 09:43:21 -06:00
Krzysztof Parzyszek	573d28992d	[Hexagon] Convert stack object offsets to int64, NFC This will print [SP-56] instead of [SP+4294967240].	2019-11-12 09:43:21 -06:00
Krzysztof Parzyszek	34d2e6a2af	[Hexagon] Handle stack realignment in hexagon-vextract	2019-11-12 09:43:21 -06:00
Krzysztof Parzyszek	0ce37ce8eb	[Hexagon] Require PS_aligna whenever variable-sized objects are present	2019-11-12 09:43:21 -06:00
Jinsong Ji	477ae4e17b	[PowerPC] Remove allow-deprecated-dag-overlap and fix broken tests Summary: This was found during review of https://reviews.llvm.org/D67088. CHECK-DAG is non-overlapping after https://reviews.llvm.org/D47106. -allow-deprecated-dag-overlap was introduced to temporarily accept the old behavior, but it actually hides some broken tests, e.g. `test/CodeGen/PowerPC/swaps-le-1.ll`: the codegen has changed, but the CHECK-DAG still passes because overlap is allowed. This patch removes the deprecated option and fixes the broken tests. Reviewers: #powerpc, hfinkel, nemanjai, steven.zhang, shchenz Reviewed By: shchenz Subscribers: shchenz, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69733	2019-11-12 15:18:54 +00:00
Tom Weaver	30144fca09	[DBG][OPT] Attempt to salvage or undef debug info when removing trivially deletable instructions in the Reassociate Expression pass. Reviewed By: aprantl, vsk Differential revision: https://reviews.llvm.org/D69943	2019-11-12 15:17:04 +00:00
Jinsong Ji	ca8d41d144	[PowerPC][NFC]Fix typo in desc for enable-ppc-prefetching	2019-11-12 14:46:57 +00:00
Florian Hahn	2d3ee56e42	[Examples] Add IRTransformations directory to examples. This patch adds a new IRTransformations directory to llvm/examples/. This is intended to serve as a new home for example transformations/analysis code used by various tutorials. If LLVM_BUILD_EXAMPLES is enabled, the ExamplesIRTransforms library is linked into the opt binary and the example passes become available. To start off with, it contains the CFG simplifications used in the IR part of the 'Getting Started With LLVM: Basics' tutorial at the US LLVM Developers Meeting 2019. Reviewers: paquette, jfb, meikeb, lhames, kbarton Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D69416	2019-11-12 14:14:48 +00:00
Alex Denisov	ed10a34931	Mark llvm::ConstantExpr::getAsInstruction as const Summary: getAsInstruction is the only non-const member method. It is impossible to enforce const-correctness because of it. Reviewers: jmolloy, majnemer Reviewed By: jmolloy Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70113	2019-11-12 14:24:12 +01:00
Florian Hahn	380e3e814f	[AArch64ExpandPseudos] Preserve renamable state when expanding MOVi64 & co. If the MOVi operand was renamable, the operands of the expanded instructions are also renamable. Reviewers: thegameg, samparker, zatrazz Reviewed By: thegameg Differential Revision: https://reviews.llvm.org/D70061	2019-11-12 11:29:04 +00:00
Diana Picus	003518d973	[InstCombine] Skip scalable vectors in combineLoadToOperationType Don't try to canonicalize loads to scalable vector types to loads of integers. This removes one assertion when trying to use a TypeSize as a parameter to DataLayout::isLegalInteger. It does not handle the second part of the function (which looks at bitcasts). This patch also contains a NFC fix for Load Analysis, where a variable initialization that would cause the same assertion is moved closer to its use. This allows us to run the new test for InstCombine without having to teach LocationSize to play nicely with scalable vectors. Differential Revision: https://reviews.llvm.org/D70075	2019-11-12 12:27:09 +01:00
Simon Pilgrim	f9dd23df58	[X86] Cleanup prefixes + regenerate for fp-intrinsics-fma.ll	2019-11-12 11:24:00 +00:00
Simon Pilgrim	0afb0b590c	FileCheckPattern::FindRegexVarEnd - make helper function static. NFC Fixes cppcheck warning.	2019-11-12 11:14:19 +00:00
Simon Pilgrim	4f01b8cba4	[X86] Add PR39464 addcarry/subborrow test cases Additional coverage for D70079	2019-11-12 11:14:18 +00:00
Florian Hahn	5f133744e9	[LoopInterchange] Only skip PHIs with incoming values from the inner loop. Currently we have limited support for outer loops with multiple basic blocks after the inner loop exit. But the current checks for creating PHIs for loop exit values only assumes the header and latches of the outer loop. It is better to just skip incoming values defined in the original inner loops. Those are handled earlier. Reviewers: efriedma, mcrosier Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D70059	2019-11-12 10:30:51 +00:00
Pavel Labath	6350896386	DWARFDebugLoclists: add location list "interpretation" logic Summary: This patch extracts the logic for computing the "absolute" locations, which was partially present in the debug_loclists dumper, completes it, and moves it into a separate function. This makes it possible to later reuse the same logic for uses other than dumping. The dumper is changed to reuse the location list interpreter, and its format is changed somewhat. In "verbose" mode it prints the "raw" value of a location list, the interpreted location (if available) and the expression itself. In non-verbose mode it prints only one of the location forms: it prefers the interpreted form, but falls back to the "raw" format if interpretation is not possible (for instance, because we were not given a base address, or the resolution of indirect addresses failed). This patch also undoes some of the changes made in D69672, namely the part about making all functions static. The main reason for this is that I learned that the original approach (dumping only fully resolved locations) meant that it was impossible to rewrite one of the existing tests. To make that possible (and make the "inline location" dump work in more cases), I now reuse the same dumping mechanism as is used for section-based dumping. As this required having more objects know about the various location lists classes, it seemed like a good idea to create an interface abstracting the difference between them. Therefore, I now create a DWARFLocationTable class, which will serve as a base class for the location list classes. DWARFDebugLoclists is made to inherit from that. DWARFDebugLoc will follow. Another positive effect of this change is that section-based dumping code will not need to use templates (as originally envisioned), and that the argument lists of the dumping functions become shorter. Reviewers: dblaikie, probinson, JDevlieghere, aprantl, SouraVX Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70081	2019-11-12 10:40:13 +01:00
David Zarzycki	a3f2e48ddd	[X86] Add more add/sub carry tests Preparation for: https://reviews.llvm.org/D70079 https://reviews.llvm.org/D70077	2019-11-12 11:36:59 +02:00
Daniil Suchkov	f01bc2a4a4	[NFC][InstCombine] Add tests that show a number of canonicalization opportunities Reviewers: spatel, RKSimon, lebedev.ri, apilipenko Reviewed-By: apilipenko Tags: #llvm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D68263	2019-11-12 15:43:29 +07:00
Tim Renouf	11ee47cb13	MCP: Fixed bug with dest overlapping copy source In MachineCopyPropagation, when propagating the source of a copy into the operand of a later instruction, bail if a destination overlaps (partly defines) the copy source. If the instruction where the substitution is happening is also a copy, allowing the propagation confuses the tracking mechanism. Differential Revision: https://reviews.llvm.org/D69953 Change-Id: Ic570754f878f2d91a4a50a9bdcf96fbaa240726d	2019-11-12 08:18:11 +00:00
Craig Topper	d453b74bfe	[X86] Add fptosi test to fp-intrinsics.ll	2019-11-11 23:55:12 -08:00
Craig Topper	60037a5cd9	[X86] Update stale comment. NFC	2019-11-11 23:55:12 -08:00
Mikael Holmen	9ef711ef65	[VFABI] Remove unused variables in testcase, fix buildbot E.g. the buildbot at http://lab.llvm.org:8011/builders/ppc64le-lld-multistage-test/builds/7259/steps/build-stage2-unified-tree/logs/stdio failed with /home/buildbots/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/unittests/Transforms/Utils/VFABIUtils.cpp:50:22: error: unused variable 'FnAttrs' [-Werror,-Wunused-variable] const AttributeSet FnAttrs = Attrs.getFnAttributes(); ^ 1 error generated.	2019-11-12 08:28:12 +01:00
Georgii Rymar	6350ceb0f2	[llvm-readelf/llvm-readobj][test] - Convert elf-linker-options.ll to use YAML. This converts elf-linker-options.ll to use yaml2obj instead of llc, and improves and cleans it up a bit. This opens the way to adding additional tests for the broken cases. Differential revision: https://reviews.llvm.org/D70004	2019-11-12 10:08:06 +03:00
Georgii Rymar	edf395d3e2	[yaml2obj/obj2yaml] - Add support for SHT_LLVM_LINKER_OPTIONS sections. SHT_LLVM_LINKER_OPTIONS section contains pairs of null-terminated strings. This patch adds support for them. Differential revision: https://reviews.llvm.org/D69895	2019-11-12 09:55:20 +03:00
Hideto Ueno	3e7b5160af	[Attributor] Use must-be-executed-context in align deduction Summary: This patch introduces align attribute deduction for callsite argument, function argument, function returned and floating value based on must-be-executed-context. Reviewers: jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69797	2019-11-12 06:41:19 +00:00
Nick Terrell	639a0c16d7	[Support] Optimize SHA1 implementation * Add inline to the helper functions because gcc-9 won't inline all of them without the hint. I've avoided `__attribute__((always_inline))` because gcc and clang will inline without it, and improves compatibility. * Replace the byte-by-byte copy in update() with endian::readbe32() since perf reports that 1/2 of the time is spent copying into the buffer before this patch. When lld uses --build-id=sha1 it spends 30-45% of CPU in SHA1 depending on the binary (not wall-time since it is parallel). This patch speeds up SHA1 by a factor of 2 on clang-8 and 3 on gcc-6. This leads to a >10% improvement in overall linking time. lld-speed-test benchmarks run on an Intel i9-9900k with Turbo disabled on CPU 0 compiled with clang-9. Stats recorded with `perf stat -r 5`. All inputs are using `--build-id=sha1`. \| Input \| Before (seconds) \| After (seconds) \| \| --- \| --- \| --- \| \| chrome \| 2.14 \| 1.82 (-15%) \| \| chrome-icf \| 2.56 \| 2.29 (-10%) \| \| clang \| 0.65 \| 0.53 (-18%) \| \| clang-fsds \| 0.69 \| 0.58 (-16%) \| \| clang-gdb-index \| 21.71 \| 19.3 (-11%) \| \| gold \| 0.42 \| 0.34 (-19%) \| \| gold-fsds \| 0.431 \| 0.355 (-17%) \| \| linux-kernel \| 0.625 \| 0.575 (-8%) \| \| llvm-as \| 0.045 \| 0.039 (-14%) \| \| llvm-as-fsds \| 0.035 \| 0.039 (-11%) \| \| mozilla \| 11.3 \| 9.8 (-13%) \| \| mozilla-gc \| 11.84 \| 10.36 (-12%) \| \| mozilla-O0 \| 8.2 \| 5.84 (-28%) \| \| scylla \| 5.59 \| 4.52 (-19%) \| Reviewed By: ruiu, MaskRay Differential Revision: https://reviews.llvm.org/D69295	2019-11-11 22:14:28 -08:00
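
The SHA1 commit above (639a0c16d7) replaces a byte-by-byte copy in update() with word-sized big-endian reads. The following is only a rough sketch of that idea, not the code from the patch: `readBE32` and `loadBlock` are hypothetical names chosen for illustration (the commit itself uses `endian::readbe32()`), and the 16-word block layout is an assumption based on SHA-1's 64-byte block size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not LLVM's endian::readbe32): assemble one 32-bit
// big-endian word from four bytes, independent of host endianness.
inline std::uint32_t readBE32(const std::uint8_t *p) {
  return (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
         (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
}

// Hypothetical helper: fill a 16-word block buffer from 64 input bytes,
// one 32-bit word per iteration instead of one byte per iteration.
inline void loadBlock(const std::uint8_t *data, std::uint32_t (&block)[16]) {
  assert(data != nullptr && "loadBlock needs 64 readable bytes");
  for (std::size_t i = 0; i < 16; ++i)
    block[i] = readBE32(data + 4 * i);
}
```

A caller would pass a pointer to at least 64 readable bytes and then run the compression rounds over `block`. Whether this form is faster than a byte loop depends on the compiler and target; the timings in the commit message are the author's measurements and are not reproduced by this sketch.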


188088 Commits