llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 22:12:57 +02:00

Author	SHA1	Message	Date
Eric Christopher	85fb5acbfc	Use the class version of getPointerTy rather than getting back to ourselves via a call through the DAG. llvm-svn: 274721	2016-07-07 01:49:59 +00:00
Eric Christopher	0c6a162975	Use the class definition for useSoftFloat. llvm-svn: 274720	2016-07-07 01:49:57 +00:00
Eric Christopher	21a19f0bb6	Rename argument for consistency. llvm-svn: 274717	2016-07-07 01:08:23 +00:00
Eric Christopher	c576377764	Remove the plumbing for isDarwinABI from EmitTailCallLoadFPAndRetAddr. llvm-svn: 274716	2016-07-07 01:08:21 +00:00
Eric Christopher	5863323c9b	Use the MachineFunction that we've already queried for in the function. llvm-svn: 274715	2016-07-07 01:08:19 +00:00
Eric Christopher	44a2dbfcdb	Remove the plumbing for isDarwinABI from the PrepareTailCall hierarchy. llvm-svn: 274714	2016-07-07 01:08:17 +00:00
Eric Christopher	a5e10dee36	Remove the plumbing of 64-bitness from PrepareTailCall and functions called by it. llvm-svn: 274711	2016-07-07 00:39:32 +00:00
Eric Christopher	f4b60b9df5	Sink call to get the MachineFunction into EmitTailCallStoreFPAndRetAddr and remove the argument. llvm-svn: 274710	2016-07-07 00:39:30 +00:00
Eric Christopher	516f19cc78	Remove unnecessary subtarget parameters in PPCTargetLowering. llvm-svn: 274709	2016-07-07 00:39:27 +00:00
Sanjay Patel	1079b801df	fix typo; NFC llvm-svn: 274636	2016-07-06 16:42:46 +00:00
Nemanja Ivanovic	f7fafe4ce2	[PowerPC] - Legalize vector types by widening instead of integer promotion This patch corresponds to review: http://reviews.llvm.org/D20443 It changes the legalization strategy for illegal vector types from integer promotion to widening. This only applies for vectors with elements of width that is a multiple of a byte since we have hardware support for vectors with 1, 2, 3, 8 and 16 byte elements. Integer promotion for vectors is quite expensive on PPC due to the sequence of breaking apart the vector, extending the elements and reconstituting the vector. Two of these operations are expensive. This patch causes between minor and major improvements in performance on most benchmarks. There are very few benchmarks whose performance regresses. These regressions can be handled in a subsequent patch with a DAG combine (similar to how this patch handles int -> fp conversions of illegal vector types). llvm-svn: 274535	2016-07-05 09:22:29 +00:00
Duncan P. N. Exon Smith	6e80950911	CodeGen: Use MachineInstr& in TargetLowering, NFC This is a mechanical change to make TargetLowering API take MachineInstr& (instead of MachineInstr), since the argument is expected to be a valid MachineInstr. In one case, changed a parameter from MachineInstr to MachineBasicBlock::iterator, since it was used as an insertion point. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. llvm-svn: 274287	2016-06-30 22:52:52 +00:00
Rafael Espindola	5d2ec2042d	Delete unused includes. NFC. llvm-svn: 274225	2016-06-30 12:19:16 +00:00
Duncan P. N. Exon Smith	193410d6d7	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. llvm-svn: 274189	2016-06-30 00:01:54 +00:00
Rafael Espindola	05f9468ad4	Drop support for creating $stubs. They are created by ld64 since OS X 10.5. llvm-svn: 274130	2016-06-29 14:59:50 +00:00
Nick Lewycky	d48e0385a1	NFC. Fix popular typo in comment 'deferencing' --> 'dereferencing'. Bonus changes, * placement in X86ISelLowering and 'exerce' -> 'exercise' in test. llvm-svn: 273984	2016-06-28 01:45:05 +00:00
Rafael Espindola	050532596b	Move shouldAssumeDSOLocal to Target. Should fix the shared library build. llvm-svn: 273958	2016-06-27 23:15:57 +00:00
Rafael Espindola	62f38708ac	Use the isPositionIndependent predicate. NFC. llvm-svn: 273875	2016-06-27 14:05:43 +00:00
Rafael Espindola	036a01c6e4	Simplify getLabelAccessInfo. It now takes a IsPIC flag instead of computing and returning it. llvm-svn: 273871	2016-06-27 12:56:02 +00:00
Rafael Espindola	ac2178402c	Refactor duplicated code. NFC. llvm-svn: 273595	2016-06-23 18:43:06 +00:00
Rafael Espindola	6c4a0b7a05	Use shouldAssumeDSOLocal. With this it handle -fPIE. llvm-svn: 273499	2016-06-22 22:09:17 +00:00
Rafael Espindola	c74cdbfe37	Extract a few variables to make 'if' smaller. NFC. llvm-svn: 273497	2016-06-22 21:56:34 +00:00
Krzysztof Parzyszek	2e1948c27f	[SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee The setCallee function will set the number of fixed arguments based on the size of the argument list. The FixedArgs parameter was often explicitly set to 0, leading to a lack of consistent value for non- vararg functions. Differential Revision: http://reviews.llvm.org/D20376 llvm-svn: 273403	2016-06-22 12:54:25 +00:00
NAKAMURA Takumi	24b157d37a	Reformat blank lines. llvm-svn: 273131	2016-06-20 01:05:15 +00:00
NAKAMURA Takumi	f81f554f6c	Trailing whitespace. llvm-svn: 273130	2016-06-20 00:49:20 +00:00
NAKAMURA Takumi	01c29da292	Untabify. llvm-svn: 273129	2016-06-20 00:37:41 +00:00
Davide Italiano	94fa1b1aa7	[Codegen] Change PICLevel. We convert `Default` to `NotPIC` so that target independent code can reason about this correctly. Differential Revision: http://reviews.llvm.org/D21394 llvm-svn: 273024	2016-06-17 18:07:14 +00:00
Benjamin Kramer	6de42f77e0	[PPC] Strength-reduce SmallVectors into arrays. No functionality change intended. llvm-svn: 272999	2016-06-17 13:15:10 +00:00
Benjamin Kramer	e80783f62f	Pass DebugLoc and SDLoc by const ref. This used to be free, copying and moving DebugLocs became expensive after the metadata rewrite. Passing by reference eliminates a ton of track/untrack operations. No functionality change intended. llvm-svn: 272512	2016-06-12 15:39:02 +00:00
Hal Finkel	ff8397dabb	[PowerPC] Fix a DAG replacement bug in PPCTargetLowering::DAGCombineExtBoolTrunc While promoting nodes in PPCTargetLowering::DAGCombineExtBoolTrunc, it is possible for one of the nodes to be replaced by another. To make sure we do not visit the deleted nodes, and to make sure we visit the replacement nodes, use a list of HandleSDNodes to track the to-be-promoted nodes during the promotion process. The same fix has been applied to the analogous code in PPCTargetLowering::DAGCombineTruncBoolExt. Fixes PR26985. llvm-svn: 269272	2016-05-12 04:00:56 +00:00
Nemanja Ivanovic	286a9532e8	[Power9] Add support for -mcpu=pwr9 in the back end This patch corresponds to review: http://reviews.llvm.org/D19683 Simply adds the bits for being able to specify -mcpu=pwr9 to the back end. llvm-svn: 268950	2016-05-09 18:54:58 +00:00
Strahinja Petrovic	e297a43f7c	[PowerPC] fix register alignment for long double type This patch fixes register alignment for long double type in soft float mode. Before this patch alignment was 8 and this patch changes it to 4. Differential Revision: http://reviews.llvm.org/D18034 llvm-svn: 268909	2016-05-09 12:27:39 +00:00
Nemanja Ivanovic	d6a1215edc	[PowerPC] Generate VSX version of splat word This patch corresponds to review: http://reviews.llvm.org/D18592 It allows the PPC back end to generate the xxspltw instruction where we previously only emitted vspltw. llvm-svn: 268516	2016-05-04 16:04:02 +00:00
Guozhi Wei	194cccd63b	[PPC] Enable shuffling of VSX vectors This patch fixes PR27078 by enabling shuffling of vectors if VSX is available. llvm-svn: 268064	2016-04-29 17:00:54 +00:00
Craig Topper	945a4cd524	[CodeGen] Default CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to Expand in TargetLoweringBase. This is what the majority of the targets want and removes a bunch of code. Set it to Legal explicitly in the few cases where that's the desired behavior. llvm-svn: 267853	2016-04-28 03:34:31 +00:00
Ahmed Bougacha	db9e64109d	[CodeGen] Add getBuildVector and getSplatBuildVector helpers. NFCI. Differential Revision: http://reviews.llvm.org/D17176 llvm-svn: 267606	2016-04-26 21:15:30 +00:00
Marcin Koscielnicki	704c818d77	[PowerPC] Add support for llvm.thread.pointer Differential Revision: http://reviews.llvm.org/D19304 llvm-svn: 267546	2016-04-26 10:37:22 +00:00
Chuang-Yu Cheng	da7d0bf651	[ppc64] Reenable sibling call optimization on ppc64 since fixed tsan library tail-call issue print-stack-trace.cc test failure of compiler-rt has been fixed by r266869 (http://reviews.llvm.org/D19148), so reenable sibling call optimization on ppc64 Reviewers: nemanjai kbarton llvm-svn: 267527	2016-04-26 07:38:24 +00:00
Tim Shen	a12f27cb73	[PPC, SSP] Support PowerPC Linux stack protection. llvm-svn: 266809	2016-04-19 20:14:52 +00:00
Nirav Dave	a36a5aeca3	Fix typing on generated LXV2DX/STXV2DX instructions [PPC] Previously when casting generic loads to LXV2DX/ST instructions we would leave the original load return type in place allowing for an assertion failure when we merge two equivalent LXV2DX nodes with different types. This fixes PR27350. Reviewers: nemanjai Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19133 llvm-svn: 266438	2016-04-15 15:01:38 +00:00
Chuang-Yu Cheng	6e4b4f696f	CXX_FAST_TLS calling convention: performance improvement for PPC64 This is the same change on PPC64 as r255821 on AArch64. I have even borrowed his commit message. The access function has a short entry and a short exit, the initialization block is only run the first time. To improve the performance, we want to have a short frame at the entry and exit. We explicitly handle most of the CSRs via copies. Only the CSRs that are not handled via copies will be in CSR_SaveList. Frame lowering and prologue/epilogue insertion will generate a short frame in the entry and exit according to CSR_SaveList. The majority of the CSRs will be handled by register allcoator. Register allocator will try to spill and reload them in the initialization block. We add CSRsViaCopy, it will be explicitly handled during lowering. 1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target supports it for the given machine function and the function has only return exits). We also call TLI->initializeSplitCSR to perform initialization. 2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to virtual registers at beginning of the entry block and copies from virtual registers to CSRsViaCopy at beginning of the exit blocks. 3> we also need to make sure the explicit copies will not be eliminated. Author: Tom Jablin (tjablin) Reviewers: hfinkel kbarton cycheng http://reviews.llvm.org/D17533 llvm-svn: 265781	2016-04-08 12:04:32 +00:00
JF Bastien	f4f5b32f44	NFC: make AtomicOrdering an enum class Summary: In the context of http://wg21.link/lwg2445 C++ uses the concept of 'stronger' ordering but doesn't define it properly. This should be fixed in C++17 barring a small question that's still open. The code currently plays fast and loose with the AtomicOrdering enum. Using an enum class is one step towards tightening things. I later also want to tighten related enums, such as clang's AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI' enum). This change touches a few lines of code which can be improved later, I'd like to keep it as NFC for now as it's already quite complex. I have related changes for clang. As a follow-up I'll add: bool operator<(AtomicOrdering, AtomicOrdering) = delete; bool operator>(AtomicOrdering, AtomicOrdering) = delete; bool operator<=(AtomicOrdering, AtomicOrdering) = delete; bool operator>=(AtomicOrdering, AtomicOrdering) = delete; This is separate so that clang and LLVM changes don't need to be in sync. Reviewers: jyknight, reames Subscribers: jyknight, llvm-commits Differential Revision: http://reviews.llvm.org/D18775 llvm-svn: 265602	2016-04-06 21:19:33 +00:00
Ehsan Amiri	ee69c81e47	[PPC] Use VSX/FP Facility integer load when an integer load's only users are conversion to FP http://reviews.llvm.org/D18405 When the integer value loaded is never used directly as integer we should use VSX or Floating Point Facility integer loads and avoid extra direct move llvm-svn: 265593	2016-04-06 20:12:29 +00:00
Chuang-Yu Cheng	a5a77bb95c	[ppc64] Temporary disable sibling call optimization on ppc64 due to breaking test case r265506 breaks print-stack-trace.cc test case of compiler-rt in bootstrap test. http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/1708 llvm-svn: 265528	2016-04-06 10:48:36 +00:00
Chuang-Yu Cheng	de5217e247	[ppc64] Enable sibling call optimization on ppc64 ELFv1/ELFv2 abi This patch enable sibling call optimization on ppc64 ELFv1/ELFv2 abi, and add a couple of test cases. This patch also passed llvm/clang bootstrap test, and spec2006 build/run/result validation. Original issue: https://llvm.org/bugs/show_bug.cgi?id=25617 Great thanks to Tom's (tjablin) help, he contributed a lot to this patch. Thanks Hal and Kit's invaluable opinions! Reviewers: hfinkel kbarton http://reviews.llvm.org/D16315 llvm-svn: 265506	2016-04-06 02:04:38 +00:00
Hal Finkel	f7a5afdf0a	[PowerPC] Add a late MI-level pass for QPX load/splat simplification Chapter 3 of the QPX manual states that, "Scalar floating-point load instructions, defined in the Power ISA, cause a replication of the source data across all elements of the target register." Thus, if we have a load followed by a QPX splat (from the first lane), the splat is redundant. This adds a late MI-level pass to remove the redundant splats in some of these cases (specifically when both occur in the same basic block). This optimization is scheduled just prior to post-RA scheduling. It can't happen before anything that might replace the load with some already-computed quantity (i.e. store-to-load forwarding). llvm-svn: 265047	2016-03-31 20:39:41 +00:00
Hal Finkel	fb163aaea6	[PowerPC] Load two floats directly instead of using one 64-bit integer load When dealing with complex<float>, and similar structures with two single-precision floating-point numbers, especially when such things are being passed around by value, we'll sometimes end up loading both float values by extracting them from one 64-bit integer load. It looks like this: t13: i64,ch = load<LD8[%ref.tmp]> t0, t6, undef:i64 t16: i64 = srl t13, Constant:i32<32> t17: i32 = truncate t16 t18: f32 = bitcast t17 t19: i32 = truncate t13 t20: f32 = bitcast t19 The problem, especially before the P8 where those bitcasts aren't legal (and get expanded via the stack), is that it would have been better to use two floating-point loads directly. Here we add a target-specific DAGCombine to do just that. In short, we turn: ld 3, 0(5) stw 3, -8(1) rldicl 3, 3, 32, 32 stw 3, -4(1) lfs 3, -4(1) lfs 0, -8(1) into: lfs 3, 4(5) lfs 0, 0(5) llvm-svn: 264988	2016-03-31 02:56:05 +00:00
Hal Finkel	82fdd00490	[PowerPC] Refactor popcnt[dw] target features Instead of using two feature bits, one to indicate the availability of the popcnt[dw] instructions, and another to indicate whether or not they're fast, use a single enum. This allows more consistent control via target attribute strings, and via Clang's command line. llvm-svn: 264690	2016-03-29 01:36:01 +00:00
Hal Finkel	9df9f25590	[PowerPC] On the A2, popcnt[dw] are very slow The A2 cores support the popcntw/popcntd instructions, but they're microcoded, and slower than our default software emulation. Specifically, popcnt[dw] take approximately 74 cycles, whereas our software emulation takes only 24-28 cycles. I've added a new target feature to indicate a slow popcnt[dw], instead of just removing the existing target feature from the a2/a2q processor models, because: 1. This allows us to return more accurate information via the TTI interface (I recognize that this currently makes no practical difference) 2. Is hopefully easier to understand (it allows the core's features to match its manual while still having the desired effect). llvm-svn: 264600	2016-03-28 17:52:08 +00:00
Eric Christopher	6ddb84b162	Finish the incomplete 'd' inline asm constraint support for PPC by making sure we give it a register and mark it as a register constraint. llvm-svn: 264340	2016-03-24 21:04:52 +00:00

1 2 3 4 5 ...

1114 Commits