llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Simon Pilgrim	12dfcfe0b1	[X86][SSE] Add SSE_SHUFP OpndItins Update multi-classes to take the scheduling OpndItins instead of hard coding it. Will be reused in the AVX512 equivalents. llvm-svn: 319249	2017-11-28 23:09:18 +00:00
Simon Pilgrim	4850e200a0	[X86] Test clflushopt intrinsic on 32 and 64-bit targets llvm-svn: 319247	2017-11-28 23:04:42 +00:00
Simon Pilgrim	9e6ca21eca	[X86][SSE] Add SSE_UNPCK/SSE_PUNPCK OpndItins Update multi-classes to take the scheduling OpndItins instead of hard coding it. Will be reused in the AVX512 equivalents. llvm-svn: 319245	2017-11-28 22:55:08 +00:00
Simon Pilgrim	3e9942c3b8	[X86][SSE] Use SSE_PACK OpndItins in PACKSS/PACKUS instruction definitions Update multi-classes to take the scheduling OpndItins instead of hard coding it. SSE_PACK will be reused in the AVX512 equivalents. llvm-svn: 319243	2017-11-28 22:47:45 +00:00
Adam Nemet	473eb94bec	Remove this test After r319235, we no longer generate this remark. llvm-svn: 319242	2017-11-28 22:39:38 +00:00
Simon Pilgrim	79c0ba6d94	Fix VS2017 narrowing conversion warning. NFCI llvm-svn: 319240	2017-11-28 22:32:43 +00:00
Craig Topper	a16a5768e1	[X86] Remove unused variable. llvm-svn: 319239	2017-11-28 22:28:23 +00:00
Adam Nemet	4a462917d8	Demote this opt remark to DEBUG. From a random opt-stat output: Top 10 remarks: tailcallelim/tailcall 53% inline/AlwaysInline 13% gvn/LoadClobbered 13% inline/Inlined 8% inline/TooCostly 2% inline/NoDefinition 2% licm/LoadWithLoopInvariantAddressInvalidated 2% licm/Hoisted 1% asm-printer/InstructionCount 1% prologepilog/StackSize 1% llvm-svn: 319235	2017-11-28 22:11:00 +00:00
Craig Topper	6b120a1adf	[X86] Remove code from combineUIntToFP that tried to favor UINT_TO_FP if legal when zero extending from vXi8/vX816. The UINT_TO_FP is immediately converted to SINT_TO_FP when the node is re-evaluated because we'll detect that the sign bit is zero. llvm-svn: 319234	2017-11-28 22:08:51 +00:00
Craig Topper	0daab52721	[X86] Remove custom lowering for uint_to_fp from vXi8/vXi16. We have a DAG combine that uses a zero extend that should prevent this from ever occurring now. llvm-svn: 319233	2017-11-28 22:08:48 +00:00
Daniel Sanders	ce08391be8	[globalisel][tablegen] Add support for importing G_ATOMIC_CMPXCHG, G_ATOMICRMW_* rules from SelectionDAG. GIM_CheckNonAtomic has been replaced by GIM_CheckAtomicOrdering to allow it to support a wider range of orderings. This has then been used to import patterns using nodes such as atomic_cmp_swap, atomic_swap, and atomic_load_*. llvm-svn: 319232	2017-11-28 22:07:05 +00:00
Adrian Prantl	ee6ecd60f5	SROA: Don't create variable fragments that are outside of the variable. An alloca may be larger than a variable that is described to be stored there. Don't create a dbg.value for fragments that are outside of the variable. This fixes PR35447. https://bugs.llvm.org/show_bug.cgi?id=35447 llvm-svn: 319230	2017-11-28 21:30:38 +00:00
Don Hinton	57f9a6b164	[cmake] Pass LLVM_USE_LINKER flag when building host tools, e.g., LLVM_OPTIMIZED_TABLEGEN=ON, and not crosscompiling. Differential Revision: https://reviews.llvm.org/D39734 llvm-svn: 319228	2017-11-28 21:23:30 +00:00
Alexey Bataev	6450563023	[SLP] Additional test for PR35354, NFC. llvm-svn: 319224	2017-11-28 20:48:24 +00:00
Mandeep Singh Grang	8acd316e73	[Hexagon] Use stable sort for HexagonShuffler to remove non-deterministic ordering Summary: This fixes failures in the following tests uncovered by D39245: LLVM :: CodeGen/Hexagon/args.ll LLVM :: CodeGen/Hexagon/constp-extract.ll LLVM :: CodeGen/Hexagon/expand-condsets-basic.ll LLVM :: CodeGen/Hexagon/gp-rel.ll LLVM :: CodeGen/Hexagon/packetize_cond_inst.ll LLVM :: CodeGen/Hexagon/simple_addend.ll LLVM :: CodeGen/Hexagon/swp-stages4.ll LLVM :: CodeGen/Hexagon/swp-vmult.ll LLVM :: CodeGen/Hexagon/swp-vsum.ll LLVM :: MC/Hexagon/align.s LLVM :: MC/Hexagon/asmMap.s LLVM :: MC/Hexagon/dis-duplex-p0.s LLVM :: MC/Hexagon/double-vector-producer.s LLVM :: MC/Hexagon/inst_select.ll LLVM :: MC/Hexagon/instructions/j.s Reviewers: colinl, kparzysz, adasgupt, slarin Reviewed By: kparzysz Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D40227 llvm-svn: 319223	2017-11-28 20:48:10 +00:00
Daniel Sanders	ded67ff6bb	[aarch64][globalisel] Add missing tests from r319216 llvm-svn: 319220	2017-11-28 20:27:59 +00:00
Sean Fertile	4a7b507243	[PowerPC] Allow tail calls of fastcc functions from C CallingConv functions. Allow fastcc callees to be tail-called from ccc callers. Differential Revision: https://reviews.llvm.org/D40355 llvm-svn: 319218	2017-11-28 20:25:58 +00:00
Daniel Sanders	cd1fb2fd55	[aarch64][globalisel] Define G_ATOMIC_CMPXCHG and G_ATOMICRMW_* and make them legal The IRTranslator cannot generate these instructions at the moment so there's no issue with not having implemented ISel for them yet. D40092 will add G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMICRMW_* to the IRTranslator and a further patch will add support for lowering G_ATOMIC_CMPXCHG_WITH_SUCCESS into G_ATOMIC_CMPXCHG with an external success check via the `Lower` action. The separation of G_ATOMIC_CMPXCHG_WITH_SUCCESS and G_ATOMIC_CMPXCHG is to import SelectionDAG rules while still supporting targets that prefer to custom lower the original LLVM-IR-like operation. llvm-svn: 319216	2017-11-28 20:21:15 +00:00
Mandeep Singh Grang	4adb0a62b6	[SelectionDAG] Make sorting predicate stronger to remove non-deterministic ordering Summary: Recommitting this with the correct sorting predicate. The Low field of Clusters is a ConstantInt and cannot be directly compared. So we needed to invoke slt (signed less than) to compare correctly. This fixes failures in the following tests uncovered by D39245: LLVM :: CodeGen/ARM/ifcvt3.ll LLVM :: CodeGen/ARM/switch-minsize.ll LLVM :: CodeGen/X86/switch.ll LLVM :: CodeGen/X86/switch-bt.ll LLVM :: CodeGen/X86/switch-density.ll Reviewers: hans, fhahn Reviewed By: hans Subscribers: aemerson, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D40541 llvm-svn: 319210	2017-11-28 19:55:54 +00:00
Simon Pilgrim	8b94bfe4a5	[X86][SSE] Add SSE_HADDSUB/SSE_PABS/SSE_PALIGN OpndItins Update multi-classes to take the scheduling OpndItins instead of hard coding it. Will be reused in the AVX512 equivalents. llvm-svn: 319209	2017-11-28 19:39:47 +00:00
Craig Topper	219d029650	[X86] In lowerVectorShuffleAsElementInsertion, if were able to find a scalar i8 or i16 and need to zero extend it, make sure we use a vXi32 type of the full vector width. Previously, this was hardcoded to v4i32, but if the input type is 256 bits we need to use v8i32. Fixes PR35443 llvm-svn: 319208	2017-11-28 19:25:45 +00:00
Francis Visoiu Mistrih	4000378d4b	[CodeGen] Fix doxygen \file comment style llvm-svn: 319207	2017-11-28 19:23:39 +00:00
Francis Visoiu Mistrih	708b5d7b59	[CodeGen] Fix doxygen llvm-svn: 319206	2017-11-28 19:15:46 +00:00
Sanjay Patel	e6f05aab66	[InstCombine] auto-generate complete test checks; NFC llvm-svn: 319205	2017-11-28 19:13:23 +00:00
Krzysztof Parzyszek	1d202524c0	[Hexagon] Make sure to zero-extend bytes before building a vector llvm-svn: 319204	2017-11-28 19:13:17 +00:00
Sanjay Patel	aa4475e9e8	[InstCombine] auto-generate complete test checks; NFC llvm-svn: 319203	2017-11-28 19:07:28 +00:00
Daniel Sanders	17578579a6	[mir] Print/Parse both MOLoad and MOStore when they occur together. Summary: They're not always mutually exclusive. read-modify-write atomics are both at the same time. One example of this is the SWP instructions on AArch64. Another example is GlobalISel's G_ATOMICRMW_* generic instructions which will be added in a later patch. Reviewers: arphaman, aemerson Reviewed By: aemerson Subscribers: aemerson, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D40157 llvm-svn: 319202	2017-11-28 18:57:02 +00:00
Rafael Espindola	fc4818e12a	Fix non assert build warnings. llvm-svn: 319200	2017-11-28 18:50:08 +00:00
Hans Wennborg	5508a81574	EntryExitInstrumenter: set DebugLocs on the inserted call instructions (PR35412) Apparently the verifier requires that inlineable calls in a function with debug info have debug locations. llvm-svn: 319199	2017-11-28 18:44:26 +00:00
Zachary Turner	4202759d3b	[CodeView] Refactor / Rewrite TypeSerializer and TypeTableBuilder. The motivation behind this patch is that future directions require us to be able to compute the hash value of records independently of actually using them for de-duplication. The current structure of TypeSerializer / TypeTableBuilder being a single entry point that takes an unserialized type record, and then hashes and de-duplicates it is not flexible enough to allow this. At the same time, the existing TypeSerializer is already extremely complex for this very reason -- it tries to be too many things. In addition to serializing, hashing, and de-duplicating, ti also supports splitting up field list records and adding continuations. All of this functionality crammed into this one class makes it very complicated to work with and hard to maintain. To solve all of these problems, I've re-written everything from scratch and split the functionality into separate pieces that can easily be reused. The end result is that one class TypeSerializer is turned into 3 new classes SimpleTypeSerializer, ContinuationRecordBuilder, and TypeTableBuilder, each of which in isolation is simple and straightforward. A quick summary of these new classes and their responsibilities are: - SimpleTypeSerializer : Turns a non-FieldList leaf type into a series of bytes. Does not do any hashing. Every time you call it, it will re-serialize and return bytes again. The same instance can be re-used over and over to avoid re-allocations, and in exchange for this optimization the bytes returned by the serializer only live until the caller attempts to serialize a new record. - ContinuationRecordBuilder : Turns a FieldList-like record into a series of fragments. Does not do any hashing. Like SimpleTypeSerializer, returns references to privately owned bytes, so the storage is invalidated as soon as the caller tries to re-use the instance. Works equally well for LF_FIELDLIST as it does for LF_METHODLIST, solving a long-standing theoretical limitation of the previous implementation. - TypeTableBuilder : Accepts sequences of bytes that the user has already serialized, and inserts them by de-duplicating with a hash table. For the sake of convenience and efficiency, this class internally stores a SimpleTypeSerializer so that it can accept unserialized records. The same is not true of ContinuationRecordBuilder. The user is required to create their own instance of ContinuationRecordBuilder. Differential Revision: https://reviews.llvm.org/D40518 llvm-svn: 319198	2017-11-28 18:33:17 +00:00
Simon Pilgrim	456966cb64	[X86][X87] Tag FP_TO_INT_IN_MEM pseudos with hasNoSchedulingInfo We don't need scheduling info for pseudos llvm-svn: 319197	2017-11-28 18:10:29 +00:00
Francis Visoiu Mistrih	2f8e427cd6	[CodeGen] Separate MachineOperand implementation from MachineInstr Move the implementation to its own file. Differential Revision: https://reviews.llvm.org/D40419 llvm-svn: 319194	2017-11-28 17:58:43 +00:00
Francis Visoiu Mistrih	5ce551fce8	[CodeGen] Cleanup MachineOperand * clang-format * move doxygen from the implementation to headers * remove duplicate doxygen llvm-svn: 319193	2017-11-28 17:58:38 +00:00
Konstantin Zhuravlyov	ec13d639b3	AMDGPU: Add num spilled s/vgprs to metadata This was requested by tools. Differential Revision: https://reviews.llvm.org/D40321 llvm-svn: 319192	2017-11-28 17:51:08 +00:00
Adam Nemet	206e1448f6	Add opt-viewer testing Detects whether we have the Python modules (pygments, yaml) required by opt-viewer and hooks this up to REQUIRES. This fixes https://bugs.llvm.org/show_bug.cgi?id=34129 (the lack of opt-viewer testing). It's also related to https://github.com/apple/swift/pull/12938 and the idea is to expose LLVM_HAVE_OPT_VIEWER_MODULES to the Swift cmake. Differential Revision: https://reviews.llvm.org/D40202 llvm-svn: 319188	2017-11-28 17:26:28 +00:00
Francis Visoiu Mistrih	961f3df27b	[CodeGen] Print register names in lowercase in both MIR and debug output As part of the unification of the debug format and the MIR format, always print registers as lowercase. * Only debug printing is affected. It now follows MIR. Differential Revision: https://reviews.llvm.org/D40417 llvm-svn: 319187	2017-11-28 17:15:09 +00:00
Dan Gohman	07539b2a74	[WebAssembly] Support bitcasted function addresses with varargs. Generalize FixFunctionBitcasts to handle varargs functions. This in particular fixes the case where clang bitcasts away a varargs when calling a K&R-style function. This avoids interacting with tricky ABI details because it operates at the LLVM IR level before varargs ABI details are exposed. This fixes PR35385. llvm-svn: 319186	2017-11-28 17:15:03 +00:00
Matt Arsenault	7cb9a92f9b	DAG: Legalize truncstores to illegal int types Truncate to a legal int type, and produce a new truncstore from a narrower type. llvm-svn: 319185	2017-11-28 17:11:30 +00:00
Simon Pilgrim	fd18f19969	[X86][X87] Tag FTST x87 instruction scheduler class Looking through Agner, FTST is very similar to generic float compare behaviour, so I've added them to the existing IIC_FCOMI (WriteFAdd) tags. llvm-svn: 319184	2017-11-28 16:57:20 +00:00
Sanjay Patel	d0b175cd85	[InstCombine] add tests from D39421 to show current transforms; NFC llvm-svn: 319182	2017-11-28 16:40:30 +00:00
Francis Visoiu Mistrih	e550dd7457	[Support] Add unit test for printLowerCase Add test case for the function added in r319171. llvm-svn: 319177	2017-11-28 16:11:56 +00:00
Don Hinton	921e884d63	[cmake] Remove redundant call to cmake when building host tools. Summary: Remove the redundant, config-time call to cmake when building host tools for cross compiles or optimized tablegen.. The config-time call to cmake is redundant because it will always get called again when the CONFIGURE_LLVM_${target_name} target fires at build-time. This speeds up initial configuration, but has no affect on build behavior. Reviewers: beanz Reviewed By: beanz Subscribers: mgorny, llvm-commits Differential Revision: https://reviews.llvm.org/D40229 llvm-svn: 319176	2017-11-28 16:08:57 +00:00
Simon Pilgrim	f02308ab75	[X86][X87] Tag FABS/FCHS/FSQRT/FSIN/FCOS x87 instruction scheduler classes Atom's FABS/FCHS/FSQRT latencies taken from Agner. Note: I just added FSIN and FCOS to the existing IIC_FSINCOS itinerary, which is actually a more costly instruction. llvm-svn: 319175	2017-11-28 15:03:42 +00:00
Jonas Paulsson	52efcc56d4	Use getStoreSize() in various places instead of 'BitSize >> 3'. This is needed for cases when the memory access is not as big as the width of the data type. For instance, storing i1 (1 bit) would be done in a byte (8 bits). Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a size of 0, which for instance makes alias analysis return NoAlias even when it shouldn't. There are no tests as this was done as a follow-up to the bugfix for the case where this was discovered (r318824). This handles more similar cases. Review: Björn Petterson https://reviews.llvm.org/D40339 llvm-svn: 319173	2017-11-28 14:44:32 +00:00
Simon Pilgrim	10fb5e6983	[X86][X86] Add some x87 schedule tests Still missing some instructions: mainly loads/stores/system ops, all flagged as TODO. llvm-svn: 319172	2017-11-28 14:35:52 +00:00
Francis Visoiu Mistrih	bea7cec05c	[Support] Merge toLower / toUpper implementations Merge the ones from StringRef and StringExtras. llvm-svn: 319171	2017-11-28 14:22:27 +00:00
Francis Visoiu Mistrih	eba849e869	[CodeGen] Rename functions PrintReg* to printReg* LLVM Coding Standards: Function names should be verb phrases (as they represent actions), and command-like function should be imperative. The name should be camel case, and start with a lower case letter (e.g. openFile() or isFoo()). Differential Revision: https://reviews.llvm.org/D40416 llvm-svn: 319168	2017-11-28 12:42:37 +00:00
Simon Pilgrim	eb7d8e4750	[X86][3DNow] Add instruction itinerary and scheduling classes for femms/prefetch/prefetchw llvm-svn: 319167	2017-11-28 12:37:35 +00:00
Peter Smith	4d40e9cf91	[ARM][AArch64] Workaround ARM/AArch64 peculiarity in clearing icache. Certain ARM implementations treat icache clear instruction as a memory read, and CPU segfaults on trying to clear cache on !PROT_READ page. We workaround this in Memory::protectMappedMemory by adding PROT_READ to affected pages, clearing the cache, and then setting desired protection. This fixes "AllocationTests/MappedMemoryTest.***/3" unit-tests on affected hardware. Reviewers: psmith, zatrazz, kristof.beyls, lhames Reviewed By: lhames Subscribers: llvm-commits, krytarowski, peter.smith, jgreenhalgh, aemerson, rengolin Patch by maxim-kuvrykov! Differential Revision: https://reviews.llvm.org/D40423 llvm-svn: 319166	2017-11-28 12:34:05 +00:00
Chandler Carruth	d600be3a1d	Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable. The core idea is to (re-)introduce some redundancies where their cost is hidden by the cost of materializing immediates for constant operands of PHI nodes. When the cost of the redundancies is covered by this, avoiding materializing the immediate has numerous benefits: 1) Less register pressure 2) Potential for further folding / combining 3) Potential for more efficient instructions due to immediate operand As a motivating example, consider the remarkably different cost on x86 of a SHL instruction with an immediate operand versus a register operand. This pattern turns up surprisingly frequently, but is somewhat rarely obvious as a significant performance problem. The pass is entirely target independent, but it does rely on the target cost model in TTI to decide when to speculate things around the PHI node. I've included x86-focused tests, but any target that sets up its immediate cost model should benefit from this pass. There is probably more that can be done in this space, but the pass as-is is enough to get some important performance on our internal benchmarks, and should be generally performance neutral, but help with more extensive benchmarking is always welcome. One awkward part is that this pass has to be scheduled after everything that can eliminate these kinds of redundancies. This includes SimplifyCFG, GVN, etc. I'm open to suggestions about better places to put this. We could in theory make it part of the codegen pass pipeline, but there doesn't really seem to be a good reason for that -- it isn't "lowering" in any sense and only relies on pretty standard cost model based TTI queries, so it seems to fit well with the "optimization" pipeline model. Still, further thoughts on the pipeline position are welcome. I've also only implemented this in the new pass manager. If folks are very interested, I can try to add it to the old PM as well, but I didn't really see much point (my use case is already switched over to the new PM). I've tested this pretty heavily without issue. A wide range of benchmarks internally show no change outside the noise, and I don't see any significant changes in SPEC either. However, the size class computation in tcmalloc is substantially improved by this, which turns into a 2% to 4% win on the hottest path through tcmalloc for us, so there are definitely important cases where this is going to make a substantial difference. Differential revision: https://reviews.llvm.org/D37467 llvm-svn: 319164	2017-11-28 11:32:31 +00:00

1 2 3 4 5 ...

157152 Commits