llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00

Author	SHA1	Message	Date
Reid Kleckner	0558b04f57	Revert "[X86] Elide references to _chkstk for dynamic allocas" This reverts commit r262370. It turns out there is code out there that does sequences of allocas greater than 4K: http://crbug.com/591404 The goal of this change was to improve the code size of inalloca call sequences, but we got tangled up in the mess of dynamic allocas. Instead, we should come back later with a separate MI pass that uses dominance to optimize the full sequence. This should also be able to remove the often unneeded stacksave/stackrestore pairs around the call. llvm-svn: 262505	2016-03-02 19:20:59 +00:00
Matthias Braun	5ca7af07c9	ARM: Introduce conservative load/store optimization mode Most of the time ARM has the CCR.UNALIGN_TRP bit set to false which means that unaligned loads/stores do not trap and even extensive testing will not catch these bugs. However the multi/double variants are not affected by this bit and will still trap. In effect a more aggressive load/store optimization will break existing (bad) code. These bugs do not necessarily manifest in the broken code where the misaligned pointer is formed but often later in perfectly legal code where it is accessed. This means recompiling system libraries (which have no alignment bugs) with a newer compiler will break existing applications (with alignment bugs) that worked before. So (under protest) I implemented this safe mode which limits the formation of multi/double operations to cases that are not affected by user code (stack operations like spills/reloads) or cases where the normal operations trap anyway (floating point load/stores). It is disabled by default. Differential Revision: http://reviews.llvm.org/D17015 llvm-svn: 262504	2016-03-02 19:20:00 +00:00
Justin Bogner	fc205457a6	SelectionDAG: Use correctly sized allocation functions for SDNodes The placement new calls here were all calling the allocation function in RecyclingAllocator/Recycler for SDNode, instead of the function for the specific subclass we were constructing. Since this particular allocator always overallocates it more or less worked, but would hide what we're actually doing from any memory tools. Also, if you tried to change this allocator so something like a BumpPtrAllocator or MallocAllocator, the compiler would crash horribly all the time. Part of llvm.org/PR26808. llvm-svn: 262500	2016-03-02 19:01:11 +00:00
Geoff Berry	8e0ed2340c	[AArch64] Enable non-leaf frame pointer elimination. Summary: This change enables frame pointer elimination in non-leaf functions. The -fomit-frame-pointer option still needs to be used when compiling via clang (or an equivalent method of not setting the 'no-frame-pointer-elim*' function attributes if generating llvm IR via some other method) to take advantage of this optimization. This change should be NFC when compiling via clang without -fomit-frame-pointer. Reviewers: t.p.northover Subscribers: aemerson, rengolin, tberghammer, qcolombet, llvm-commits, danalbert, mcrosier, srhines Differential Revision: http://reviews.llvm.org/D17730 llvm-svn: 262495	2016-03-02 17:58:31 +00:00
Chandler Carruth	e597ed0112	[AA] Hoist the logic to reformulate various AA queries in terms of other parts of the AA interface out of the base class of every single AA result object. Because this logic reformulates the query in terms of some other aspect of the API, it would easily cause O(n^2) query patterns in alias analysis. These could in turn be magnified further based on the number of call arguments, and then further based on the number of AA queries made for a particular call. This ended up causing problems for Rust that were actually noticable enough to get a bug (PR26564) and probably other places as well. When originally re-working the AA infrastructure, the desire was to regularize the pattern of refinement without losing any generality. While I think it was successful, that is clearly proving to be too costly. And the cost is needless: we gain no actual improvement for this generality of making a direct query to tbaa actually be able to re-use some other alias analysis's refinement logic for one of the other APIs, or some such. In short, this is entirely wasted work. To the extent possible, delegation to other API surfaces should be done at the aggregation layer so that we can avoid re-walking the aggregation. In fact, this significantly simplifies the logic as we no longer need to smuggle the aggregation layer into each alias analysis (or the TargetLibraryInfo into each alias analysis just so we can form argument memory locations!). However, we also have some delegation logic inside of BasicAA and some of it even makes sense. When the delegation logic is baking in specific knowledge of aliasing properties of the LLVM IR, as opposed to simply reformulating the query to utilize a different alias analysis interface entry point, it makes a lot of sense to restrict that logic to a different layer such as BasicAA. So one aspect of the delegation that was in every AA base class is that when we don't have operand bundles, we re-use function AA results as a fallback for callsite alias results. This relies on the IR properties of calls and functions w.r.t. aliasing, and so seems a better fit to BasicAA. I've lifted the logic up to that point where it seems to be a natural fit. This still does a bit of redundant work (we query function attributes twice, once via the callsite and once via the function AA query) but it is exactly twice here, no more. The end result is that all of the delegation logic is hoisted out of the base class and into either the aggregation layer when it is a pure retargeting to a different API surface, or into BasicAA when it relies on the IR's aliasing properties. This should fix the quadratic query pattern reported in PR26564, although I don't have a stand-alone test case to reproduce it. It also seems general goodness. Now the numerous AAs that don't need target library info don't carry it around and depend on it. I think I can even rip out the general access to the aggregation layer and only expose that in BasicAA as it is the only place where we re-query in that manner. However, this is a non-trivial change to the AA infrastructure so I want to get some additional eyes on this before it lands. Sadly, it can't wait long because we should really cherry pick this into 3.8 if we're going to go this route. Differential Revision: http://reviews.llvm.org/D17329 llvm-svn: 262490	2016-03-02 15:56:53 +00:00
Michael Zuckerman	823b8e16d6	[LLVM][AVX512]PSRAWI Change imm8 to int. Differential Revision: http://reviews.llvm.org/D17705 llvm-svn: 262480	2016-03-02 12:05:07 +00:00
Simon Pilgrim	f9f7ca4f85	[X86][SSE] Lower 128-bit MOVDDUP with existing VBROADCAST mechanisms We have a number of useful lowering strategies for VBROADCAST instructions (both from memory and register element 0) which the 128-bit form of the MOVDDUP instruction can make use of. This patch tweaks lowerVectorShuffleAsBroadcast to enable it to broadcast 2f64 args using MOVDDUP as well. It does require a slight tweak to the lowerVectorShuffleAsBroadcast mechanism as the existing MOVDDUP lowering uses isShuffleEquivalent which can match binary shuffles that can lower to (unary) broadcasts. Differential Revision: http://reviews.llvm.org/D17680 llvm-svn: 262478	2016-03-02 11:43:05 +00:00
Nikolay Haustov	7908fea386	Revert "[AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields" Build failure with clang. llvm-svn: 262477	2016-03-02 11:16:56 +00:00
Nikolay Haustov	c360acc67e	Revert "[AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler." Build failure with clang. llvm-svn: 262475	2016-03-02 10:54:21 +00:00
Nikolay Haustov	9479911a82	[AMDGPU] Using table-driven amd_kernel_code_t field parser in assembler. complementary patch to table-driven amd_kernel_code_t field parser/printer utility. lit tests passed. Patch by: Valery Pykhtin Differential Revision: http://reviews.llvm.org/D17151 llvm-svn: 262474	2016-03-02 10:36:30 +00:00
Nikolay Haustov	0f9d70887b	[AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields This is going to be used in .hsatext disassembler and can be used in current assembler parser (lit tests passed on parsing). Code using this helpers isn't included in this patch. Benefits: unified approach fast field name lookup on parsing Later I would like to enhance some of the field naming/syntax using this code. Patch by: Valery Pykhtin Differential Revision: http://reviews.llvm.org/D17150 llvm-svn: 262473	2016-03-02 10:36:25 +00:00
Dmitry Vyukov	d77444bc90	libfuzzer: fix compiler warnings - unused sigaction/setitimer result (used in assert) - unchecked fscanf return value - signed/unsigned comparison llvm-svn: 262472	2016-03-02 09:54:40 +00:00
Craig Topper	e377f9a96e	[X86] Remove unnecessary call to isReg from emitter's DestMem handling for VEX prefix. The operand is always a register. NFC llvm-svn: 262468	2016-03-02 07:32:45 +00:00
Craig Topper	fba583ca7a	[X86] Make X86MCCodeEmitter::DetermineREXPrefix locate operands more like how VEX prefix handling does. llvm-svn: 262467	2016-03-02 07:32:43 +00:00
David Majnemer	572acaa24c	[X86] Permit reading of the FLAGS register without it being previously defined We modeled the RDFLAGS{32,64} operations as "using" {E,R}FLAGS. While technically correct, this is not be desirable for folks who want to examine aspects of the FLAGS register which are not related to computation like whether or not CPUID is a valid instruction. Differential Revision: http://reviews.llvm.org/D17782 llvm-svn: 262465	2016-03-02 06:46:52 +00:00
Craig Topper	a32ceccef8	[X86] Remove assertion I accidentally left in. llvm-svn: 262464	2016-03-02 06:35:22 +00:00
Craig Topper	05f010fdc4	[X86] Be more structured about how we capture the register number when it is encoded in bits 7:4 of the immediate. For some instructions the register is not the last operand and the immediate handling had to detect this and hardcode the index to find it. It also required CurOp to be pointing at the last operand handled in the Form switch whereas for any instruction it would be pointing at the next operand. Now we just capture the value in the Form switch when we know exactly where it is and the CurOp pointer can behave normally. llvm-svn: 262462	2016-03-02 06:06:18 +00:00
Sanjoy Das	ce8e76bbc5	[SCEV] Minor naming, braces cleanup; NFC llvm-svn: 262459	2016-03-02 04:52:22 +00:00
Craig Topper	f43237f0ae	[X86] Use MCPhysReg and uint16_t for static arrays of registers and opcodes respectively should reduce size tiny bit. NFC llvm-svn: 262458	2016-03-02 04:42:31 +00:00
Matt Arsenault	b2f04b3a87	AMDGPU: Fix bug 26659. Fix checking the same instruction twice instead of the second branch that uses vccz. I don't think this matters currently because s_branch_vccnz is always used currently. llvm-svn: 262457	2016-03-02 04:12:39 +00:00
Matt Arsenault	f18c3f7466	AMDGPU: Cleanup suggested in bug 23960 llvm-svn: 262456	2016-03-02 04:05:14 +00:00
Matt Arsenault	ce793849bc	Bug 20810: Use report_fatal_error instead of unreachable llvm-svn: 262455	2016-03-02 03:33:55 +00:00
Sanjoy Das	097e45c701	Add a comment with a rational for the unusual code structure llvm-svn: 262454	2016-03-02 02:56:29 +00:00
Sanjoy Das	f24997d09c	Qualify getRangeForAffineAR with this-> for MSVC llvm-svn: 262453	2016-03-02 02:44:08 +00:00
George Burgess IV	872ab3815a	Attempt to fix ASAN failure in a MemorySSA test. llvm-svn: 262452	2016-03-02 02:35:04 +00:00
Sanjoy Das	ab9ef01214	Perturb code in an attempt to appease MSVC For some reason MSVC seems to think I'm calling getConstant() from a static context. Try to avoid this issue by explicitly specifying 'this->' (though I'm not confident that this will actually work). llvm-svn: 262451	2016-03-02 02:34:20 +00:00
Sanjoy Das	5ad27c32eb	More code permutation to appease MSVC llvm-svn: 262449	2016-03-02 02:15:42 +00:00
Sanjoy Das	559acdcb2c	Remove "auto" to appease the MSVC bots llvm-svn: 262448	2016-03-02 01:59:37 +00:00
Matt Arsenault	ceef9d2175	DAGCombiner: Make sure an integer is being truncated llvm-svn: 262446	2016-03-02 01:36:51 +00:00
Sanjay Patel	afbc4f1551	revert r262424 because there's a clang test for AArch64 that checks -O3 asm output that is broken by this change llvm-svn: 262440	2016-03-02 01:04:09 +00:00
Sanjoy Das	88f19f877b	[SCEV] Make getRange smarter around selects Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}" by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q}) instead. The latter can be easier to compute precisely in cases like "{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in such cases the latter form simplifies to [0,N+1) union [0,N+1). llvm-svn: 262438	2016-03-02 00:57:54 +00:00
Sanjoy Das	dc721f4ae2	[SCEV] Extract out a getRangeForAffineAR; NFC Pure code-motion change. Will be used later in making getRange more clever. llvm-svn: 262437	2016-03-02 00:57:39 +00:00
Sanjay Patel	7266aa16e2	[InstCombine] convert 'isPositive' and 'isNegative' vector comparisons to shifts (PR26701) As noted in the code comment, I don't think we can do the same transform that we do for scalar integers comparisons to vector integers comparisons because it might pessimize the general case. Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT for integer vectors. But we should now recognize all the variants of this construct and produce the optimal code for the cases shown in: https://llvm.org/bugs/show_bug.cgi?id=26701 llvm-svn: 262424	2016-03-01 23:55:18 +00:00
Dehao Chen	9581e2d337	Perform InstructioinCombiningPass before SampleProfile pass. Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17742 llvm-svn: 262419	2016-03-01 22:53:02 +00:00
Kostya Serebryany	96af1208c1	[libFuzzer] deprecate exit_on_first flag llvm-svn: 262417	2016-03-01 22:33:14 +00:00
Kostya Serebryany	d5755334e5	[libFuzzer] add generic signal handlers so that libFuzzer can report at least something if ASan is not handlig the signals for us. Remove abort_on_timeout flag. llvm-svn: 262415	2016-03-01 22:19:21 +00:00
Colin LeMahieu	7f4d873f79	[NFC] Convert tabs to spaces. llvm-svn: 262411	2016-03-01 22:05:03 +00:00
Matthias Braun	9215ed1dcf	AArch64: Reenable CompleteModel for A53, A57 and Kryo models The fixes in r262393 completed them as well. llvm-svn: 262408	2016-03-01 21:55:35 +00:00
Colin LeMahieu	f2ca033f39	[Hexagon] Modifying r262258 to only be in effect in the hand assembler path, not the integrated assembler. llvm-svn: 262400	2016-03-01 21:37:41 +00:00
Matt Arsenault	cf419f3ccc	DAGCombiner: Turn truncate of a bitcasted vector to an extract On AMDGPU where operations i64 operations are often bitcasted to v2i32 and back, this pattern shows up regularly where it breaks some expected combines on i64, such as load width reducing. This fixes some test failures in a future commit when i64 loads are changed to promote. llvm-svn: 262397	2016-03-01 21:31:53 +00:00
Rafael Espindola	39a5050ba2	Add LLVMBuild for ObjectYAML. Should fix the DBUILD_SHARED_LIBS bots. llvm-svn: 262396	2016-03-01 21:29:33 +00:00
Jacques Pienaar	ea54ef2b77	[lanai] Add ELF enum value and relocations. Add ELF enum value and relocations for Lanai backed. General Lanai backend discussion on llvm-dev thread "[RFC] Lanai backend" (http://lists.llvm.org/pipermail/llvm-dev/2016-February/095118.html). Differential Revision: http://reviews.llvm.org/D17008 llvm-svn: 262394	2016-03-01 21:21:42 +00:00
Matthias Braun	ca99f60d07	AArch64: Add missing schedinfo, check completeness for cyclone This adds some missing generic schedule info definitions, enables completeness checking for cyclone and fixes a typo uncovered by that. Differential Revision: http://reviews.llvm.org/D17748 llvm-svn: 262393	2016-03-01 21:20:31 +00:00
Kit Barton	4d7130da2e	[Power9] Implement new vector compare, extract, insert instructions This change implements the following vector operations: - Vector Compare Not Equal - vcmpneb(.) vcmpneh(.) vcmpnew(.) - vcmpnezb(.) vcmpnezh(.) vcmpnezw(.) - Vector Extract Unsigned - vextractub vextractuh vextractuw vextractd - vextublx vextubrx vextuhlx vextuhrx vextuwlx vextuwrx - Vector Insert - vinsertb vinserth vinsertw vinsertd 26 instructions. Phabricator: http://reviews.llvm.org/D15916 llvm-svn: 262392	2016-03-01 20:51:57 +00:00
Sanjay Patel	ed5fa039a5	[x86] use getBitcast() This isn't quite NFC because some of the SDLocs may change which could cause scheduling differences. But no regression tests are affected and there is no functional change intended. llvm-svn: 262391	2016-03-01 20:47:02 +00:00
David Blaikie	ad15e89ef0	Fix some warnings a bit harder/different This is an alternate fix to 262378 and a fix to a pessimizing-move warning. llvm-svn: 262390	2016-03-01 20:41:17 +00:00
Geoff Berry	c28f755d4b	Revert "[AArch64] Fix isLegalAddImmediate() to return true for valid negative values." Revert r262248 in an attempt to fix the clang-native-aarch64-full bot and to investigate a performance regression in SingleSource/Benchmarks/CoyoteBench/huffbench llvm-svn: 262388	2016-03-01 20:28:52 +00:00
Vasileios Kalintiris	dd324fafb5	Revert "[mips] Promote the result of SETCC nodes to GPR width." This reverts commit r262316. It seems that my change breaks an out-of-tree chromium buildbot, so I'm reverting this in order to investigate the situation further. llvm-svn: 262387	2016-03-01 20:25:43 +00:00
Kit Barton	977c191475	New file to track implementation status of new POWER9 instructions llvm-svn: 262386	2016-03-01 20:19:43 +00:00
Matthias Braun	6958800987	TableGen: Check scheduling models for completeness TableGen checks at compiletime that for scheduling models with "CompleteModel = 1" one of the following holds: - Is marked with the hasNoSchedulingInfo flag - The instruction is a subclass of Sched - There are InstRW definitions in the scheduling model Typical steps necessary to complete a model: - Ensure all pseudo instructions that are expanded before machine scheduling (usually everything handled with EmitYYY() functions in XXXTargetLowering). - If a CPU does not support some instructions mark the corresponding resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }". - Add missing scheduling information. Differential Revision: http://reviews.llvm.org/D17747 llvm-svn: 262384	2016-03-01 20:03:21 +00:00
Justin Lebar	f7a86812e3	[NVPTX] Annotate param loads/stores as mayLoad/mayStore. Summary: Tablegen was unable to determine that param loads/stores were actually reading or writing from memory. I think this isn't a problem in practice for param stores, because those occur in a block right before we make our call. But param loads don't have to at the very beginning of a function, so should be annotated as mayLoad so we don't incorrectly optimize them. Reviewers: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D17471 llvm-svn: 262381	2016-03-01 19:44:22 +00:00
Justin Lebar	18e0e2943c	[NVPTX] Remove workaround for tablegen crash in NVPTXInstrInfo.td. Summary: Looks like this was caused by a typo. Reviewers: jholewinski Subscribers: jholewinski, llvm-commits, tra Differential Revision: http://reviews.llvm.org/D17357 llvm-svn: 262380	2016-03-01 19:44:20 +00:00
Reid Kleckner	fffc44b291	Fix -Wnon-virtual-dtor warnings llvm-svn: 262378	2016-03-01 19:39:54 +00:00
Owen Anderson	999a9f171c	Fix an issue where fast math flags were dropped during scalarization. Most portions of InstCombine properly propagate fast math flags, but apparently the vector scalarization section was overlooked. llvm-svn: 262376	2016-03-01 19:35:52 +00:00
Sanjoy Das	fb2b660171	[SCEV] Minor cleanup: rename method, C++11'ify; NFC llvm-svn: 262374	2016-03-01 19:28:01 +00:00
Justin Lebar	881bba8e1c	[NVPTX] Use different, convergent MIs for convergent calls. Summary: Calls sometimes need to be convergent. This is already handled at the LLVM IR level, but it also needs to be handled at the MI level. Ideally we'd propagate convergence from instructions, down through the selection DAG, and into MIs. But this is Hard, and would affect optimizations in the SDNs -- right now only SDNs with two operands have any flags at all. Instead, here's a much simpler hack: Add new opcodes for NVPTX for convergent calls, and generate these when lowering convergent LLVM calls. Reviewers: jholewinski Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17423 llvm-svn: 262373	2016-03-01 19:24:03 +00:00
Justin Lebar	49fc12b62a	[NVPTX] Nix hack used to emit '{' and '}' for NVPTX calls. Summary: Tablegen understands backslash as an escape char; that's sufficient. Reviewers: jholewinski Subscribers: llvm-commits, tra, jholewinski Differential Revision: http://reviews.llvm.org/D17432 llvm-svn: 262372	2016-03-01 19:24:00 +00:00
Justin Lebar	53267ab7b3	[NVPTX] Reformat NVPTXInstrInfo.td, and add additional comments. Summary: Also simplify some of the embedded C++ logic. No functional changes. Reviewers: jholewinski Subscribers: llvm-commits, tra, jholewinski Differential Revision: http://reviews.llvm.org/D17354 llvm-svn: 262371	2016-03-01 19:23:30 +00:00
David Majnemer	3003d6c7c0	[X86] Elide references to _chkstk for dynamic allocas The _chkstk function is called by the compiler to probe the stack in an order consistent with Windows' expectations. However, it is possible to elide the call to _chkstk and manually adjust the stack pointer if we can prove that the allocation is fixed size and smaller than the probe size. This shrinks chrome.dll, chrome_child.dll and chrome.exe by a cummulative ~133 KB. Differential Revision: http://reviews.llvm.org/D17679 llvm-svn: 262370	2016-03-01 19:20:23 +00:00
Rafael Espindola	12fd371a27	Move ObjectYAML code to a new library. It is only ever used by obj2yaml and yaml2obj. No point in linking it everywhere. llvm-svn: 262368	2016-03-01 19:15:06 +00:00
Sanjay Patel	97d933d122	fix function names; NFC llvm-svn: 262367	2016-03-01 19:14:09 +00:00
David Majnemer	fc5944990a	[Verifier] Don't abort on invalid cleanuprets Code in visitEHPadPredecessors assume a little too much about the validity of a cleanupret with an invalid cleanuppad operand. llvm-svn: 262364	2016-03-01 18:59:50 +00:00
Easwaran Raman	ecdd424b03	Fix breakage caused by r262360. llvm-svn: 262363	2016-03-01 18:59:11 +00:00
Daniel Berlin	07e4e63d22	Add the beginnings of an update API for preserving MemorySSA Summary: This adds the beginning of an update API to preserve MemorySSA. In particular, this patch adds a way to remove memory SSA accesses when instructions are deleted. It also adds relevant unit testing infrastructure for MemorySSA's API. (There is an actual user of this API, i will make that diff dependent on this one. In practice, a ton of opt passes remove memory instructions, so it's hopefully an obviously useful API :P) Reviewers: hfinkel, reames, george.burgess.iv Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D17157 llvm-svn: 262362	2016-03-01 18:46:54 +00:00
Simon Atanasyan	1c312f2b01	[DebugInfo] Dump CIE augmentation data as a list of hex bytes CIE augmentation data might contain non-printable characters. The patch prints the data as a list of hex bytes. Differential Revision: http://reviews.llvm.org/D17759 llvm-svn: 262361	2016-03-01 18:38:05 +00:00
Easwaran Raman	578ce7baf1	Metadata support for profile summary. This adds support to convert ProfileSummary object to Metadata and create a ProfileSummary object from metadata. This would allow attaching profile summary information to Module allowing optimization passes to use it. llvm-svn: 262360	2016-03-01 18:30:58 +00:00
Matt Arsenault	807567a0a9	DAGCombiner: Turn extract of bitcasted integer into truncate This reduces the number of bitcast nodes and generally cleans up the DAG when bitcasting between integers and vectors everywhere. llvm-svn: 262358	2016-03-01 18:01:37 +00:00
Matt Arsenault	4b1c813bd6	Add isScalarInteger helper to EVT/MVT llvm-svn: 262357	2016-03-01 18:01:28 +00:00
Changpeng Fang	929a348e60	AMDGPU/SI: Implement DS_PERMUTE/DS_BPERMUTE Instruction Definitions and Intrinsics Summary: This patch impleemnts DS_PERMUTE/DS_BPERMUTE instruction definitions and intrinsics, which are new since VI. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D17614 llvm-svn: 262356	2016-03-01 17:51:23 +00:00
Kostya Serebryany	306353eaf8	[libFuzzer] remove FuzzerSanitizerOptions.cpp llvm-svn: 262354	2016-03-01 17:46:32 +00:00
Michael Zuckerman	71f617e26e	[LLVM][AVX512] PSRL{DI\|QI} Change imm8 to int Differential Revision: http://reviews.llvm.org/D17713 llvm-svn: 262353	2016-03-01 17:46:32 +00:00
Hans Wennborg	f725e062bd	[X86] Check that attribute parameters match for tail calls (PR26590) In the code below on 32-bit targets, x would previously get forwarded to g() without sign-extension to 32 bits as required by the parameter attribute. void g(signed short); void f(unsigned short x) { g(x); } llvm-svn: 262352	2016-03-01 17:45:23 +00:00
Sanjay Patel	d47daeaf1d	fix documentation comments; NFC llvm-svn: 262351	2016-03-01 17:25:35 +00:00
Petar Jovanovic	0302dd0999	Revert "calculate builtin_object_size if argument is a removable pointer" Revert r262337 as "check-llvm ubsan" step failed on sanitizer-x86_64-linux-fast buildbot. llvm-svn: 262349	2016-03-01 16:50:08 +00:00
Sanjay Patel	74bcf66c0f	function names start with a lowercase letter; NFC llvm-svn: 262347	2016-03-01 16:17:48 +00:00
Nikolay Haustov	fe5ca2dc8c	[AMDGPU] Remove unused disassembler code. llvm-svn: 262346	2016-03-01 16:02:40 +00:00
Rafael Espindola	c165498992	Refactor duplicated code for linking with pthread. llvm-svn: 262344	2016-03-01 15:54:40 +00:00
Nikolay Haustov	15d5943a07	[AMDGPU] Fix build warnings. llvm-svn: 262338	2016-03-01 14:50:59 +00:00
Petar Jovanovic	a1fb751763	calculate builtin_object_size if argument is a removable pointer This patch fixes calculating correct value for builtin_object_size function when pointer is used only in builtin_object_size function call and never after that. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D17337 llvm-svn: 262337	2016-03-01 14:39:55 +00:00
Nikolay Haustov	e2718bc744	[AMDGPU] Disassembler code refactored + error messages. Idea behind this change is to make code shorter and as much common for all targets as possible. Let's even accept more code than is valid for a particular target, leaving it for the assembler to sort out. 64bit instructions decoding added. Error\warning messages on unrecognized instructions operands added, InstPrinter allowed to print invalid operands helping to find invalid/unsupported code. The change is massive and hard to compare with previous version, so it makes sense just to take a look on the new version. As a bonus, with a few TD changes following, it disassembles the majority of instructions. Currently it fully disassembles >300K binary source of some blas kernel. Previous TODOs were saved whenever possible. Patch by: Valery Pykhtin Differential Revision: http://reviews.llvm.org/D17720 llvm-svn: 262332	2016-03-01 13:57:29 +00:00
Petr Pavlu	324853823b	[LTO] Fix error reporting from lto_module_create_in_local_context() Function lto_module_create_in_local_context() would previously rely on the default LLVMContext being created for it by LTOModule::makeLTOModule(). This context exits the program on error and is not arranged to update sLastStringError in tools/lto/lto.cpp. Function lto_module_create_in_local_context() now creates an LLVMContext by itself, sets it up correctly to its needs and then passes it to LTOModule::createInLocalContext() which takes ownership of the context and keeps it present for the lifetime of the returned LTOModule. Function LTOModule::makeLTOModule() is modified to take a reference to LLVMContext (instead of a pointer) and no longer creates a default context when nullptr is passed to it. Method LTOModule::createInContext() that takes a pointer to LLVMContext is removed because it allows to pass a nullptr to it. Instead LTOModule::createFromBuffer() (that takes a reference to LLVMContext) should be used. Differential Revision: http://reviews.llvm.org/D17715 llvm-svn: 262330	2016-03-01 13:13:49 +00:00
Michael Zuckerman	c05422513f	[AVX512][PSRAQ][PSRAD] Change imm8 to int. Differential Revision: http://reviews.llvm.org/D17692 llvm-svn: 262320	2016-03-01 11:36:23 +00:00
Amjad Aboud	557cf6fe56	Disallow generating vzeroupper before return instruction (iret) in interrupt handler function. This resolves https://llvm.org/bugs/show_bug.cgi?id=26412 Differential Revision: http://reviews.llvm.org/D17542 llvm-svn: 262319	2016-03-01 11:32:03 +00:00
Simon Atanasyan	bcec242b5a	[MC][YAML] Rangify the loop. NFC llvm-svn: 262317	2016-03-01 10:11:27 +00:00
Vasileios Kalintiris	e8b2910afd	[mips] Promote the result of SETCC nodes to GPR width. Summary: This patch modifies the existing comparison, branch, conditional-move and select patterns, and adds new ones where needed. Also, the updated SLT{u,i,iu} set of instructions generate a GPR width result. The majority of the code changes in the Mips back-end fix the wrong assumption that the result of SETCC nodes always produce an i32 value. The changes in the common code path account for the fact that in 64-bit MIPS targets, i1 is promoted to i32 instead of i64. Reviewers: dsanders Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D10970 llvm-svn: 262316	2016-03-01 10:08:01 +00:00
Nikolay Haustov	b8a1824976	[TableGen] AsmMatcher: Skip optional operands in the midle of instruction if it is not present Previosy, if actual instruction have one of optional operands then other optional operands listed before this also should be presented. For example instruction v_fract_f32 v0, v1, mul:2 have one optional operand - OMod and do not have optional operand clamp. Previously this was not allowed because clamp is listed before omod in AsmString: string AsmString = "v_fract_f32$vdst, $src0_modifiers$clamp$omod"; Making this work required some hacks (both OMod and Clamp match classes have same PredicateMethod). Now, if MatchInstructionImpl meets formal optional operand that is not presented in actual instruction it skips this formal operand and tries to match current actual operand with next formal. Patch by: Sam Kolton Review: http://reviews.llvm.org/D17568 [AMDGPU] Assembler: Check immediate types for several optional operands in predicate methods With this change you should place optional operands in order specified by asm string: clamp -> omod offset -> glc -> slc -> tfe Fixes for several tests. Depends on D17568 Patch by: Sam Kolton Review: http://reviews.llvm.org/D17644 llvm-svn: 262314	2016-03-01 08:34:43 +00:00
Nikolay Haustov	dd568e520a	AsmParser: Fix nested .irp/.irpc Count .irp/.irpc in parseMacroLikeBody similar to .rept Update tests. Review: http://reviews.llvm.org/D17707 llvm-svn: 262313	2016-03-01 08:18:28 +00:00
Craig Topper	ec7d7b8389	[X86] Centralize the masking of TSFlags with FormMask into a variable earlier so we can stop masking in multiple places. NFC llvm-svn: 262312	2016-03-01 07:15:59 +00:00
Craig Topper	a90bbbbd14	[X86] Localize a temporary variable into the cases its need in. NFC llvm-svn: 262310	2016-03-01 06:42:48 +00:00
Craig Topper	a60b62369e	[X86] Be consistent about using pre/post increment/decrement in nearby code. NFC llvm-svn: 262309	2016-03-01 06:42:46 +00:00
Craig Topper	f449a9bf2a	[X86] Combine some initialization code with variable declaration and comments. NFC llvm-svn: 262301	2016-03-01 05:42:16 +00:00
Matt Arsenault	4978a6e2f1	LegalizeDAG: Use correct ptr type when expanding unaligned load/store This fixes regressions exposed in existing AMDGPU tests in a future commit when all loads are custom lowered. llvm-svn: 262299	2016-03-01 05:13:35 +00:00
Matt Arsenault	cb99e5ce0d	AMDGPU: Don't emit build_pair during udivrem legalization Technically you aren't supposed to emit these after type legalization for some reason, and we use vector extracts of bitcasted integers as the canonical way to do this. llvm-svn: 262298	2016-03-01 05:06:05 +00:00
Matt Arsenault	6b8a75fa0b	AMDGPU: Don't use estimated stack size when we know the real stack size llvm-svn: 262297	2016-03-01 04:58:20 +00:00
Matt Arsenault	8238b5c22c	AMDGPU: Set HasExtractBitInsn This currently does not have the control over the bitwidth, and there are missing optimizations to reduce the integer to 32-bit if it can be. But in most situations we do want the sinking to occur. llvm-svn: 262296	2016-03-01 04:58:17 +00:00
David Majnemer	f7676c3bd3	[WinEH] Allocate the registration node before the catch objects The CatchObjOffset is relative to the end of the EH registration node for 32-bit x86 WinEH targets. A special sentinel value, 0, is used to indicate that no catch object should be initialized. This means that a catch object allocated immediately before the registration node would be assigned a CatchObjOffset of 0, leading the runtime to believe that a catch object should not be initialized. To handle this, allocate the registration node prior to any other frame object. This will ensure that catch objects will not be allocated before the registration node. This fixes PR26757. Differential Revision: http://reviews.llvm.org/D17689 llvm-svn: 262294	2016-03-01 04:30:16 +00:00
David Majnemer	54603e7ab8	[Verifier] Diagnose when unwinding out of cycles of blocks Generally speaking, this can only happen with unreachable code. However, neglecting to check for this condition would lead us to loop forever. llvm-svn: 262284	2016-03-01 01:19:05 +00:00
Adam Nemet	a977036086	[LAA] Add missing debug output llvm-svn: 262279	2016-03-01 00:50:08 +00:00
Sanjay Patel	aee22b5eed	[x86, InstCombine] transform more x86 masked loads to LLVM intrinsics Continuation of: http://reviews.llvm.org/rL262269 llvm-svn: 262273	2016-02-29 23:59:00 +00:00
Adam Nemet	5ed52a9810	[LLE] Fix a comment llvm-svn: 262270	2016-02-29 23:21:12 +00:00

1 2 3 4 5 ...

87801 Commits