llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00

Author	SHA1	Message	Date
Ahmed Bougacha	b9e69e57c8	[AArch64] Set AddPristinesAndCSRs to expandCMP_SWAP LivePhysRegs. We run after PEI. Found via inspection; no obvious testcase. Follow-up to r266339. llvm-svn: 267780	2016-04-27 20:33:05 +00:00
Ahmed Bougacha	e8bff14c32	[AArch64] Set correct successors in CMPXCHG pseudo expansion. transferSuccessors() would make LoadCmpBB a successor of DoneBB, whereas it should be a successor of the original MBB. Follow-up to r266339. Unfortunately, it's tricky to catch this in the verifier. llvm-svn: 267779	2016-04-27 20:33:02 +00:00
Gerolf Hoflehner	19cd041163	[DAGCombiner] Follow coding convention for function name (NFC) llvm-svn: 267745	2016-04-27 17:27:16 +00:00
Matthew Simpson	8b365897a4	Add parentheses to silence buildbot warning llvm-svn: 267734	2016-04-27 16:25:04 +00:00
Matthew Simpson	995eceaf0c	[TTI] Add hook for vector extract with extension This change adds a new hook for estimating the cost of vector extracts followed by zero- and sign-extensions. The motivating example for this change is the SMOV and UMOV instructions on AArch64. These instructions move data from vector to general purpose registers while performing the corresponding extension (sign-extend for SMOV and zero-extend for UMOV) at the same time. For these operations, TargetTransformInfo can assume the extensions are free and only report the cost of the vector extract. The SLP vectorizer has been updated to make use of the new hook. Differential Revision: http://reviews.llvm.org/D18523 llvm-svn: 267725	2016-04-27 15:20:21 +00:00
Ahmed Bougacha	db9e64109d	[CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI. Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606	2016-04-26 21:15:30 +00:00
Craig Topper	b7db006017	[AArch64] Expand v1i64 and v2i64 ctlz. The default is legal, which results in 'Cannot select' errors. llvm-svn: 267522	2016-04-26 05:26:51 +00:00
Junmo Park	69d7e17058	Remove MinLatency in SchedMachineModel. NFC. Summary: We don't use MinLatency any more since r184032. Reviewers: atrick, hfinkel, mcrosier Differential Revision: http://reviews.llvm.org/D19474 llvm-svn: 267502	2016-04-26 00:37:46 +00:00
Andrew Kaylor	3825400f62	Add optimization bisect opt-in calls for AArch64 passes Differential Revision: http://reviews.llvm.org/D19394 llvm-svn: 267479	2016-04-25 21:58:52 +00:00
Quentin Colombet	7f8c56085e	Re-apply r267206 with a fix for the encoding problem: when the immediate of log2(Mask) is smaller than 32, we must use the 32-bit variant because the 64-bit variant cannot encode it. Therefore, set the subreg part accordingly. [AArch64] Fix optimizeCondBranch logic. The opcode for the optimized branch does not depend on the size of the active bits in the AND masks, but on the AND opcode itself. Indeed, we need to use an X or W variant based on the AND variant, not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs! This fixes the last make check verifier issues for AArch64: PR27479. llvm-svn: 267465	2016-04-25 20:54:08 +00:00
Nick Lewycky	e703258f62	Fix typo in comment. NFC llvm-svn: 267354	2016-04-24 17:55:57 +00:00
Gerolf Hoflehner	f601ce3709	[MachineCombiner] Support for floating-point FMA on ARM64 (re-commit r267098) The original patch caused crashes because it could dereference a null pointer for SelectionDAGTargetInfo for targets that do not define it. Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The target decides whether the machine combiner should optimize for throughput or latency. - Supports fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267328	2016-04-24 05:14:01 +00:00
Renato Golin	d3f349208c	Revert "[AArch64] Fix optimizeCondBranch logic." This reverts commit r267206, as it broke self-hosting on AArch64. llvm-svn: 267294	2016-04-23 19:30:52 +00:00
Quentin Colombet	dfaf1211a5	[AArch64] Fix optimizeCondBranch logic. The opcode for the optimized branch does not depend on the size of the active bits in the AND masks, but on the AND opcode itself. Indeed, we need to use an X or W variant based on the AND variant, not based on whether the mask fits into the related variant. Otherwise, we may end up using the W variant of the optimized branch for 64-bit register inputs! This fixes the last make check verifier issues for AArch64: PR27479. llvm-svn: 267206	2016-04-22 20:09:58 +00:00
Quentin Colombet	e89cf71564	[AArch64] When creating MRS instruction, make sure the destination register is declared as a definition. This fixes the machine verifier error for CodeGen/AArch64/nzcv-save.ll. llvm-svn: 267185	2016-04-22 18:46:17 +00:00
Quentin Colombet	e4d08c7e23	[AArch64][AdvSIMDScalar] Update the kill flags correctly. We used to simply set the kill flags to true when transforming a scalar instruction to a vector one. SrcScalar1 = copy SrcVector1 ... = opScalar SrcScalar1 => SrcScalar1 = copy SrcVector1 ... = opVector SrcVector1<kill> This is obviously wrong. The proper update consists in: 1. Propagate the kill status from the copy to the new opVector 2. Reset the kill status on the copy, since the live-range of SrcVector1 got extended. This fixes some of the machine verifier errors for AArch64 with make check. llvm-svn: 267180	2016-04-22 18:09:14 +00:00
Daniel Sanders	ffe901fc26	Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64 It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others. llvm-svn: 267127	2016-04-22 09:37:26 +00:00
Gerolf Hoflehner	d63bafa58b	[MachineCombiner] Support for floating-point FMA on ARM64 Evaluates fmul+fadd -> fmadd combines and similar code sequences in the machine combiner. It adds support for float and double similar to the existing integer implementation. The key features are: - DAGCombiner checks whether it should combine greedily or let the machine combiner do the evaluation. This is only supported on ARM64. - It gives preference to throughput over latency: the heuristic used is to combine always in loops. The target decides whether the machine combiner should optimize for throughput or latency. - Supports fmadd, f(n)msub, fmla, fmls patterns - On by default at O3 ffast-math llvm-svn: 267098	2016-04-22 02:15:19 +00:00
Quentin Colombet	e5c2597507	[RegisterBankInfo] Change the API for the verify methods. Return bool instead of void so that it is natural to put the calls into asserts. llvm-svn: 267033	2016-04-21 18:34:43 +00:00
Evgeny Astigeevich	4405b74d21	[AArch64][CodeGen] Fix of PR27158: incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr AArch64InstrInfo::optimizeCompareInstr has bug PR27158 which causes generation of incorrect code. A compare instruction is substituted with another instruction which does not produce the same flags as the original compare instruction. This patch contains: 1. Fix of the bug. 2. A regression test in MIR. 3. A new test to check that SUBS is replaced by SUB. Differential Revision: http://reviews.llvm.org/D18838 llvm-svn: 266969	2016-04-21 08:54:08 +00:00
Marcin Koscielnicki	0919e582e6	[AArch64] [ARM] Make a target-independent llvm.thread.pointer intrinsic. Both AArch64 and ARM support llvm.<arch>.thread.pointer intrinsics that just return the thread pointer. I have a pending patch that does the same for SystemZ (D19054), and there are many more targets that could benefit from one. This patch merges the ARM and AArch64 intrinsics into a single target independent one that will also be used by subsequent targets. Differential Revision: http://reviews.llvm.org/D19098 llvm-svn: 266818	2016-04-19 20:51:05 +00:00
Tim Shen	3a75cd4bf9	[SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD With this change, ideally IR pass can always generate llvm.stackguard call to get the stack guard; but for now there are still IR form stack guard customizations around (see getIRStackGuard()). Future SSP customization should go through LOAD_STACK_GUARD. There is a behavior change: stack guard values are not CSEed anymore, since we should never reuse the value in case that it has been spilled (and corrupted). See ssp-guard-spill.ll. This also cause the change of stack size and codegen in X86 and AArch64 test cases. Ideally we'd like to know if the guard created in llvm.stackprotector() gets spilled or not. If the value is spilled, discard the value and reload stack guard; otherwise reuse the value. This can be done by teaching register allocator to know how to rematerialize LOAD_STACK_GUARD and force a rematerialization (which seems hard), or check for spilling in expandPostRAPseudo. It only makes sense when the stack guard is a global variable, which requires more instructions to load. Anyway, this seems to go out of the scope of the current patch. llvm-svn: 266806	2016-04-19 19:40:37 +00:00
Mehdi Amini	9ff867f98c	[NFC] Header cleanup Removed some unused headers, replaced some headers with forward class declarations. Found using simple scripts like this one: clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' \| xargs grep -L 'IndexedMap[<]' \| xargs grep -n --color=auto 'IndexedMap' Patch by Eugene Kosov <claprix@yandex.ru> Differential Revision: http://reviews.llvm.org/D19219 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266595	2016-04-18 09:17:29 +00:00
Chad Rosier	8ec20feca4	[AArch64] Add load/store pair instructions to getMemOpBaseRegImmOfsWidth(). This improves AA in the MI scheduler when reasoning about paired instructions. Phabricator Revision: http://reviews.llvm.org/D17098 PR26358 llvm-svn: 266462	2016-04-15 18:09:10 +00:00
Geoff Berry	56afefb9c8	[AArch64] Add MMOs to callee-save load/store instructions. Summary: Without MMOs, the callee-save load/store instructions were treated as volatile by the MI post-RA scheduler and AArch64LoadStoreOptimizer. Reviewers: t.p.northover, mcrosier Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17661 llvm-svn: 266439	2016-04-15 15:16:19 +00:00
Jun Bum Lim	ad7ab4cf46	[MachineScheduler] Add support for store clustering Perform store clustering just like load clustering. This change adds StoreClusterMutation in machine-scheduler. To control StoreClusterMutation, added enableClusterStores() in TargetInstrInfo.h. This is enabled only on AArch64 for now. This change also adds support for unscaled stores which were not handled in getMemOpBaseRegImmOfs(). llvm-svn: 266437	2016-04-15 14:58:38 +00:00
Craig Topper	ab8ae0e21d	Use MVT instead of EVT to remove a bunch of unnecessary calls to getSimpleVT. llvm-svn: 266414	2016-04-15 06:20:21 +00:00
Tom Stellard	63209a899d	[GlobalISel] Move GISelAccessor class into public headers Reviewers: qcolombet Subscribers: joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19120 llvm-svn: 266348	2016-04-14 17:45:38 +00:00
Tom Stellard	c2eabf0b58	[GlobalISel] Coding style and whitespace fixes Reviewers: qcolombet Subscribers: joker.eph, llvm-commits, vkalintiris Differential Revision: http://reviews.llvm.org/D19119 llvm-svn: 266342	2016-04-14 17:23:33 +00:00
Tim Northover	d6721e4e7e	AArch64: expand cmpxchg after regalloc at -O0. FastRegAlloc works only at the basic-block level and spills all live-out registers. Unfortunately for a stack-based cmpxchg near the spill slots, this can perpetually clear the exclusive monitor, which means the cmpxchg will never succeed. I believe the only way to handle this within LLVM is by expanding the loop post-regalloc. We don't want this in general because it severely limits the optimisations that can be done, so we limit this to -O0 compilations. It's an ugly hack, and about the one good point in the whole mess is that we can treat all cmpxchg operations in the most naive way possible (seq_cst, no clrex faff) without affecting correctness. Should fix PR25526. llvm-svn: 266339	2016-04-14 17:03:29 +00:00
Matthias Braun	ba8f42fa2c	TargetLowering: Factor out common code for tail call eligibility checking; NFC llvm-svn: 266270	2016-04-14 01:10:42 +00:00
Matthias Braun	40d6ea2b7b	AArch64: Use a callee save register for swiftself parameters It is very likely that the swiftself parameter is alive throughout most functions, so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callee's parameter. Differential Revision: http://reviews.llvm.org/D19007 llvm-svn: 266251	2016-04-13 21:43:16 +00:00
Evandro Menezes	076242a2eb	[AArch64] Disable LDP/STP for quads Disable LDP/STP for quads on Exynos M1 as they are not as efficient as pairs of regular LDR/STR. Patch by Abderrazek Zaafrani <a.zaafrani@samsung.com>. llvm-svn: 266223	2016-04-13 18:31:45 +00:00
Tim Northover	f9f3caeceb	AArch64: don't create instructions that write to xzr/wzr twice. These are unpredictable even on AArch64. Patch by Yichao Yu. llvm-svn: 266206	2016-04-13 16:25:39 +00:00
Evandro Menezes	de85c02fba	[AArch64] Fuse AES{D,E}/AESMC for Exynos M1. (NFC) llvm-svn: 266144	2016-04-12 22:42:36 +00:00
Matthias Braun	c7cce2e862	AArch64: Drive-by cleanup llvm-svn: 266035	2016-04-12 02:16:13 +00:00
Manman Ren	e6636caf43	Swift Calling Convention: swifterror target support. Differential Revision: http://reviews.llvm.org/D18716 llvm-svn: 265997	2016-04-11 21:08:06 +00:00
Tim Shen	8cac1d5c28	[SSP] Remove llvm.stackprotectorcheck. This is a cleanup patch for SSP support in LLVM. There is no functional change. llvm.stackprotectorcheck is not needed, because SelectionDAG isn't actually lowering it in SelectBasicBlock; rather, it adds check code in FinishBasicBlock, ignoring the position where the intrinsic is inserted (See FindSplitPointForStackProtector()). llvm-svn: 265851	2016-04-08 21:26:31 +00:00
Quentin Colombet	2aebd873f1	[AArch64] Fix a typo in the register class to register bank mapping. For GPR family we want the GPR register bank, not FPR! llvm-svn: 265743	2016-04-07 23:10:14 +00:00
Quentin Colombet	fe5d2f01f8	[AArch64] Get rid of some GlobalISel ifdefs. llvm-svn: 265725	2016-04-07 21:24:40 +00:00
Quentin Colombet	539cebbb04	[AArch64] gcc does not like literals without quotes, even in preprocessor macros. llvm-svn: 265720	2016-04-07 20:49:15 +00:00
Quentin Colombet	ec63533e26	[AArch64][CallLowering] Do not build the API if GlobalISel is not built. This gets rid of some ifdefs and dummy implementations that were here just to fill the blanks. llvm-svn: 265719	2016-04-07 20:47:51 +00:00
Quentin Colombet	17164e059e	[GlobalISel] Add RegBankSelect hooks into the pass pipeline. Now, RegBankSelect will happen after the IRTranslation and the target may optionally add additional passes in between. llvm-svn: 265716	2016-04-07 20:27:33 +00:00
Quentin Colombet	2313f65666	[RegisterBank] Rename RegisterBank::contains into RegisterBank::covers. llvm-svn: 265695	2016-04-07 17:09:39 +00:00
Quentin Colombet	29f0c6f517	[AArch64] Teach RegisterBankInfo about the CC register bank. We need to cover each register class with a register bank. llvm-svn: 265629	2016-04-07 00:39:29 +00:00
Quentin Colombet	be98b33b17	[AArch64] Teach RegisterBankInfo about the mapping of register classes on register banks. llvm-svn: 265626	2016-04-07 00:14:30 +00:00
JF Bastien	f4f5b32f44	NFC: make AtomicOrdering an enum class Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602	2016-04-06 21:19:33 +00:00
James Y Knight	82e732e95b	Put quotes around #error string. GCC reports "missing terminating ' character", even when it's being skipped by preprocessing. llvm-svn: 265590	2016-04-06 19:52:32 +00:00
Quentin Colombet	f79a85c911	[AArch64] Change the CMake to avoid building GlobalISel related APIs when GISel is not built. The positive side effects are: - We do not have to define dummy implementations - We do not have to do weird gymnastics to avoid link issues (like missing constructor or vtable for the base classes) llvm-svn: 265570	2016-04-06 17:38:12 +00:00
Quentin Colombet	c5fc893533	[AArch64] Teach the subtarget how to get to the RegisterBankInfo. Rework the access to GlobalISel APIs to contain how much of the APIs we need to access for the final executable to build when GlobalISel is not built. This prevents massive usage of ifdefs in various places. Now, all the GlobalISel ifdefs will be happening only in AArch64TargetMachine.cpp. llvm-svn: 265567	2016-04-06 17:26:03 +00:00
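
Note on f4f5b32f44 (r265602): the message describes turning AtomicOrdering into an enum class and, as a follow-up, deleting the relational operators so that orderings cannot be compared by accident. A minimal sketch of that idea follows. The `Ordering` enum, its enumerators, and `hasAcquireSemantics` are hypothetical stand-ins chosen for illustration and are not the names or values used in LLVM.

```cpp
#include <cstdint>

// Hypothetical stand-in for an atomic-ordering enum (not llvm::AtomicOrdering).
// As an enum class, its values no longer convert implicitly to integers.
enum class Ordering : std::uint8_t { Relaxed, Acquire, Release, SeqCst };

// Deleting the relational operators, as the commit message proposes for its
// follow-up, turns an accidental "A < B" into a compile-time error.
bool operator<(Ordering, Ordering) = delete;
bool operator>(Ordering, Ordering) = delete;
bool operator<=(Ordering, Ordering) = delete;
bool operator>=(Ordering, Ordering) = delete;

// Hypothetical named predicate: callers state the property they need instead
// of relying on the numeric order of the enumerators.
constexpr bool hasAcquireSemantics(Ordering O) {
  return O == Ordering::Acquire || O == Ordering::SeqCst;
}
```

With these declarations, an expression such as `Ordering::Acquire < Ordering::SeqCst` fails to compile, while equality comparison and the named predicate keep working.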


1567 Commits