llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-25 14:02:52 +02:00

Author	SHA1	Message	Date
Michael Kuperstein	6cb752cf86	Fix test from r242886 to use the right triple. llvm-svn: 242889	2015-07-22 11:19:22 +00:00
Michael Kuperstein	80699ec16e	[X86] Add .intel_syntax noprefix directive to intel-syntax x86 asm output Patch by: michael.zuckerman@intel.com Differential Revision: http://reviews.llvm.org/D11223 llvm-svn: 242886	2015-07-22 10:49:44 +00:00
Michael Kuperstein	809b1c325f	Fix mem2reg to correctly handle allocas only used in a single block Currently, a load from an alloca that is used in a single block and is not preceded by a store is replaced by undef. This is not always correct if the single block is inside a loop. Fix the logic so that: 1) If there are no stores in the block, replace the load with an undef, as before. 2) If there is a store (regardless of where it is in the block w.r.t the load), bail out, and let the rest of mem2reg handle this alloca. Patch by: gil.rapaport@intel.com Differential Revision: http://reviews.llvm.org/D11355 llvm-svn: 242884	2015-07-22 10:29:29 +00:00
Kuba Brecka	cc9246c4cd	[asan] Improve moving of non-instrumented allocas In r242510, non-instrumented allocas are now moved into the first basic block. This patch limits that to only move allocas that are present after the first instrumented one (i.e. only move allocas up). A testcase was updated to show behavior in these two cases. Without the patch, an alloca could be moved down, and could cause an invalid IR. Differential Revision: http://reviews.llvm.org/D11339 llvm-svn: 242883	2015-07-22 10:25:38 +00:00
Elena Demikhovsky	be2ecab469	AVX-512: Added intrinsics for VCVT* instructions. All SKX forms. All VCVT instructions for float/double/int/long types. Differential Revision: http://reviews.llvm.org/D11343 llvm-svn: 242877	2015-07-22 08:56:00 +00:00
Chen Li	ca56183986	[LoopUnswitch] Code refactoring to separate trivial loop unswitch and non-trivial loop unswitch in processCurrentLoop() Summary: The current code in LoopUnswitch::processCurrentLoop() mixes trivial loop unswitch and non-trivial loop unswitch together. It goes over all basic blocks in the loop and checks if a condition is trivial or non-trivial unswitch condition. However, trivial unswitch condition can only occur in the loop header basic block (where it controls whether or not the loop does something at all). This refactoring separates trivial loop unswitch and non-trivial loop unswitch. Before going over all basic blocks in the loop, it checks if the loop header contains a trivial unswitch condition. If so, unswitch it. Otherwise, go over all blocks like before but don't check trivial condition any more since they are not possible to be in the other blocks. This code has no functionality change. Reviewers: meheff, reames, broune Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11276 llvm-svn: 242873	2015-07-22 05:26:29 +00:00
Jingyue Wu	45a0757122	[BranchFolding] do not iterate the aliases of virtual registers Summary: MCRegAliasIterator only works for physical registers. So, do not run it on virtual registers. With this issue fixed, we can resurrect the BranchFolding pass in NVPTX backend. Reviewers: jholewinski, bkramer Subscribers: henryhu, meheff, llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D11174 llvm-svn: 242871	2015-07-22 04:16:52 +00:00
Chandler Carruth	c205d0f3c2	[SROA] Fix a nasty pile of bugs to do with big-endian, different alloca types and loads, loads or stores widened past the size of an alloca, etc. This started off with a bug report about big-endian behavior with bitfields and loads and stores to a { i32, i24 } struct. An initial attempt to fix this was sent for review in D10357, but that didn't really get to the root of the problem. The core issue was that canConvertValue and convertValue in SROA were handling different bitwidth integers by doing a zext of the integer. It wouldn't do a trunc though, only a zext! This would in turn lead SROA to form an i24 load from an i24 alloca, zext it to i32, and then use it. This would at least produce the wrong value for big-endian systems. One of my many false starts here was to correct the computation for big-endian systems by shifting. But this doesn't actually work because the original code has a 64-bit store to the entire 8 bytes, and a 32-bit load of the last 4 bytes, and because the alloc size is 8 bytes, we can't lose that last (least significant if bigendian) byte! The real problem here is that we're forming an i24 load in SROA which is actually not sufficiently wide to load all of the necessary bits here. The source has an i32 load, and SROA needs to form that as well. The straightforward way to do this is to disable the zext logic in canConvertValue and convertValue, forcing us to actually load all 32-bits. This seems like a really good change, but it in turn breaks several other parts of SROA. First in the chain of knock-on failures, we had places where we were doing integer-widening promotion even though some of the integer loads or stores extended past the end of the alloca's memory! There was even a comment about preventing this, but it only prevented the case where the type had a different bit size from its store size. So I added checks to handle the cases where we actually have a widened load or store and to avoid trying to special integer widening promotion in those cases. Second, we actually rely on the ability to promote in the face of loads past the end of an alloca! This is important so that we can (for example) speculate loads around PHI nodes to do more promotion. The bits loaded are garbage, but as long as they aren't used and the alignment is suitable high (which it wasn't in the test case!) this is "fine". And we can't stop promoting here, lots of things stop working well if we do. So we need to add specific logic to handle the extension (and truncation) case, but only where that extension or truncation are over bytes that are outside the alloca's allocated storage and thus totally bogus to load or store. And of course, once we add back this correct handling of extension or truncation, we need to correctly handle bigendian systems to avoid re-introducing the exact bug that started us off on this chain of misery in the first place, but this time even more subtle as it only happens along speculated loads atop a PHI node. I've ported an existing test for PHI speculation to the big-endian test file and checked that we get that part correct, and I've added several more interesting big-endian test cases that should help check that we're getting this correct. Fun times. llvm-svn: 242869	2015-07-22 03:32:42 +00:00
Frederic Riss	f2d69e9ab6	[dsymutil] Implement ODR uniquing for C++ code. This optimization allows the DWARF linker to reuse definition of types it has emitted in previous CUs rather than reemitting them in each CU that references them. The size and link time gains are huge. For example when linking the DWARF for a debug build of clang, this generates a ~150M dwarf file instead of a ~700M one (the numbers date back a bit and must not be totally accurate these days). As with all the other parts of the llvm-dsymutil codebase, the goal is to keep bit-for-bit compatibility with dsymutil-classic. The code is littered with a lot of FIXMEs that should be addressed once we can get rid of the compatibility goal. llvm-svn: 242847	2015-07-21 22:41:43 +00:00
Alex Lorenz	d54c1a6d40	MIR Serialization: Start serializing the CFI operands with .cfi_def_cfa_offset. This commit begins serialization of the CFI index machine operands by serializing one kind of CFI instruction - the .cfi_def_cfa_offset instruction. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242845	2015-07-21 22:28:27 +00:00
Jingyue Wu	98a3974775	[MDA] change BlockScanLimit into a command line option. Summary: In the benchmark (https://github.com/vetter/shoc) we are researching, the duplicated load is not eliminated because MemoryDependenceAnalysis hit the BlockScanLimit. This patch changes it into a command line option instead of a hardcoded value. Patched by Xuetian Weng. Test Plan: test/Analysis/MemoryDependenceAnalysis/memdep-block-scan-limit.ll Reviewers: jingyue, reames Subscribers: reames, llvm-commits Differential Revision: http://reviews.llvm.org/D11366 llvm-svn: 242842	2015-07-21 21:50:39 +00:00
Bruno Cardoso Lopes	5962ea6ef1	[AsmPrinter] Check for valid constants in handleIndirectSymViaGOTPCRel Check whether BaseCst is valid before extracting a GlobalValue. This fixes PR24163. Patch by David Majnemer. llvm-svn: 242840	2015-07-21 21:45:42 +00:00
Michael J. Spencer	9eab0b93a6	[Object][ELF] Handle files with no section header string table. llvm-svn: 242839	2015-07-21 21:40:33 +00:00
Bill Schmidt	76e220a5ef	[PPC64LE] More vector swap optimization TLC This makes one substantive change and a few stylistic changes to the VSX swap optimization pass. The substantive change is to permit LXSDX and LXSSPX instructions to participate in swap optimization computations. The previous change to insert a swap following a SUBREG_TO_REG widening operation makes this almost trivial. I experimented with also permitting STXSDX and STXSSPX instructions. This can be done using similar techniques: we could insert a swap prior to a narrowing COPY operation, and then permit these stores to participate. I prototyped this, but discovered that the pattern of a narrowing COPY followed by an STXSDX does not occur in any of our test-suite code. So instead, I added commentary indicating that this could be done. Other TLC: - I changed SH_COPYSCALAR to SH_COPYWIDEN to more clearly indicate the direction of the copy. - I factored the insertion of swap instructions into a separate function. Finally, I added a new test case to check that the scalar-to-vector loads are working properly with swap optimization. llvm-svn: 242838	2015-07-21 21:40:17 +00:00
Reid Kleckner	29702bfe5f	Re-land 242726 to use RAII to do cleanup The LooksLikeCodeInBug11395() codepath was returning without clearing the ProcessedAllocas cache. llvm-svn: 242809	2015-07-21 17:40:14 +00:00
Arnold Schwaighofer	dbca53379c	MergeFunc: Transfer the callee's attributes when replacing a direct caller We insert a bitcast which obfuscates the getCalledFunction for the utility function which looks up attributes from the called function. Losing ABI changing parameter attributes is a bad thing. rdar://21516488 llvm-svn: 242807	2015-07-21 17:07:07 +00:00
Alex Lorenz	e961fcfaa4	MIR Serialization: Serialize the external symbol machine operands. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242806	2015-07-21 16:59:53 +00:00
Nico Weber	7ded368332	Revert 242726, it broke ASan on OS X. llvm-svn: 242792	2015-07-21 15:48:53 +00:00
Karthik Bhat	bceb74b45e	Constfold trunc,rint,nearbyint,ceil and floor using APFloat A patch by Chakshu Grover! This patch allows constfolding of trunc,rint,nearbyint,ceil and floor intrinsics using APFloat class. Differential Revision: http://reviews.llvm.org/D11144 llvm-svn: 242763	2015-07-21 08:52:23 +00:00
Igor Breger	5441f451cc	AVX512: Implemented VPMADDUBSW and VPMADDWD instruction, Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D11351 llvm-svn: 242761	2015-07-21 07:11:28 +00:00
Akira Hatanaka	cc98d1ef43	[ARM] Define subtarget feature "reserve-r9", which is used to decide whether register r9 should be reserved. This recommits r242737, which broke bots because the number of subtarget features went over the limit of 64. This change is needed because we cannot use a backend option to set cl::opt "arm-reserve-r9" when doing LTO. Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to reserve r9 should make changes to add subtarget feature "reserve-r9" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11320 llvm-svn: 242756	2015-07-21 01:42:02 +00:00
Matthias Braun	ef0a4f20fd	ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code Re-apply of r241928 which had to be reverted because of the r241926 revert. This commit factors out common code from MergeBaseUpdateLoadStore() and MergeBaseUpdateLSMultiple() and introduces a new function MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a strd/ldrd instruction into an strd/ldrd instruction with writeback where possible. Differential Revision: http://reviews.llvm.org/D10676 llvm-svn: 242743	2015-07-21 00:19:01 +00:00
Matthias Braun	7c07a54d9a	ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2 Re-apply r241926 with an additional check that r13 and r15 are not used for LDRD/STRD. See http://llvm.org/PR24190. This also already includes the fix from r241951. Differential Revision: http://reviews.llvm.org/D10623 llvm-svn: 242742	2015-07-21 00:18:59 +00:00
Akira Hatanaka	35e9895d47	Revert r242737. This caused builds to fail with the following error message: error:Too many subtarget features! Bump MAX_SUBTARGET_FEATURES. llvm-svn: 242740	2015-07-20 23:51:12 +00:00
Akira Hatanaka	b6d87c3ef4	[ARM] Define subtarget feature "reserve-r9", which is used to decide whether register r9 should be reserved. This change is needed because we cannot use a backend option to set cl::opt "arm-reserve-r9" when doing LTO. Out-of-tree projects currently using cl::opt option "-arm-reserve-r9" to reserve r9 should make changes to add subtarget feature "reserve-r9" to the IR. rdar://problem/21529937 Differential Revision: http://reviews.llvm.org/D11320 llvm-svn: 242737	2015-07-20 23:21:30 +00:00
Matthias Braun	e87c09c013	Revert "ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2" This reverts commit r241926. This caused http://llvm.org/PR24190 llvm-svn: 242735	2015-07-20 23:17:20 +00:00
Matthias Braun	186006cfa8	Revert "ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code" This reverts commit r241928. This caused http://llvm.org/PR24190 llvm-svn: 242734	2015-07-20 23:17:16 +00:00
JF Bastien	ead0e16c6e	Targets: commonize some stack realignment code This patch does the following: * Fix FIXME on `needsStackRealignment`: it is now shared between multiple targets, implemented in `TargetRegisterInfo`, and isn't `virtual` anymore. This will break out-of-tree targets, silently if they used `virtual` and with a build error if they used `override`. * Factor out `canRealignStack` as a `virtual` function on `TargetRegisterInfo`, by default only looks for the `no-realign-stack` function attribute. Multiple targets duplicated the same `needsStackRealignment` code: - Aarch64. - ARM. - Mips almost: had extra `DEBUG` diagnostic, which the default implementation now has. - PowerPC. - WebAssembly. - x86 almost: has an extra `-force-align-stack` option, which the default implementation now has. The default implementation of `needsStackRealignment` used to just return `false`. My current patch changes the behavior by simply using the above shared behavior. This affects: - AMDGPU - BPF - CppBackend - MSP430 - NVPTX - Sparc - SystemZ - XCore - Out-of-tree targets This is a breaking change! `make check` passes. The only implementation of the `virtual` function (besides the slight difference in x86) was Hexagon (which did `MF.getFrameInfo()->getMaxAlignment() > 8`), and potentially some out-of-tree targets. Hexagon now uses the default implementation. `needsStackRealignment` was being overwritten in `<Target>GenRegisterInfo.inc`, to return `false` as the default also did. That was odd and is now gone. Reviewers: sunfish Subscribers: aemerson, llvm-commits, jfb Differential Revision: http://reviews.llvm.org/D11160 llvm-svn: 242727	2015-07-20 22:51:32 +00:00
Reid Kleckner	d30055fb9f	Don't try to instrument allocas used by outlined SEH funclets Summary: Arguments to llvm.localescape must be static allocas. They must be at some statically known offset from the frame or stack pointer so that other functions can access them with localrecover. If we ever want to instrument these, we can use more indirection to recover the addresses of these local variables. We can do it during clang irgen or with the asan module pass. Reviewers: eugenis Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11307 llvm-svn: 242726	2015-07-20 22:49:44 +00:00
Matthias Braun	4828912a6a	AArch64: Add additional Cyclone macroop fusion opportunities Related to rdar://19205407 Differential Revision: http://reviews.llvm.org/D10746 llvm-svn: 242724	2015-07-20 22:34:47 +00:00
Matthias Braun	37785e3077	MachineScheduler: Restrict macroop fusion to data-dependent instructions. Before creating a schedule edge to encourage MacroOpFusion check that: - The predecessor actually writes a register that the branch reads. - The predecessor has no successors in the ScheduleDAG so we can schedule it in front of the branch. This avoids skewing the scheduling heuristic in cases where macroop fusion cannot happen. Differential Revision: http://reviews.llvm.org/D10745 llvm-svn: 242723	2015-07-20 22:34:44 +00:00
Quentin Colombet	e9061eb73e	[ARM] Refactor the prologue/epilogue emission to be more robust. This is the first step toward supporting shrink-wrapping for this target. The changes could be summarized by these items: - Expand the tail-call return as part of the expand pseudo pass. - Get rid of the assumptions that the epilogue is the exit block: * Do not assume which registers are free in the epilogue. (This indirectly improve the lowering of the code for the segmented stacks, see the test cases.) * Take into account that the basic block can be empty. Related to <rdar://problem/20821730> llvm-svn: 242714	2015-07-20 21:42:14 +00:00
Jingyue Wu	b0d6c8585a	[NVPTX] make load on global readonly memory to use ldg Summary: As described in [1], ld.global.nc may be used to load memory by nvcc when __restrict__ is used and compiler can detect whether read-only data cache is safe to use. This patch will try to check whether ldg is safe to use and use them to replace ld.global when possible. This change can improve the performance by 18~29% on affected kernels (ratt_kernel and rwdot_kernel) in S3D benchmark of shoc [2]. Patched by Xuetian Weng. [1] http://docs.nvidia.com/cuda/kepler-tuning-guide/#read-only-data-cache [2] https://github.com/vetter/shoc Test Plan: test/CodeGen/NVPTX/load-with-non-coherent-cache.ll Reviewers: jholewinski, jingyue Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D11314 llvm-svn: 242713	2015-07-20 21:28:54 +00:00
Rafael Espindola	8dc741d1bf	Simplify iterating over the dynamic section and report broken ones. llvm-svn: 242712	2015-07-20 21:23:29 +00:00
Krzysztof Parzyszek	9a22bbea14	[Hexagon] Generate MUX from conditional transfers when dot-new not possible llvm-svn: 242711	2015-07-20 21:23:25 +00:00
Alex Lorenz	2f98a55682	MIR Serialization: Initial serialization of machine constant pools. This commit implements the initial serialization of machine constant pools and the constant pool index machine operands. The constant pool is serialized using a YAML sequence of YAML mappings that represent the constant values. The target-specific constant pool items aren't serialized by this commit. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242707	2015-07-20 20:51:18 +00:00
Sanjoy Das	8a13b0e25b	[ImplicitNullChecks] Work with implicit defs. Summary: This change generalizes the implicit null checks pass to work with instructions that don't have any explicit register defs. This lets us use X86's `cmp` against memory as faulting load instructions. Reviewers: reames, JosephTremoulet Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11286 llvm-svn: 242703	2015-07-20 20:31:39 +00:00
Alex Lorenz	96c6a7e8a7	MIR Parser: Add support for quoted named global value operands. This commit extends the machine instruction lexer and implements support for the quoted global value tokens. With this change the syntax for the global value identifier tokens becomes identical to the syntax for the global identifier tokens from the LLVM's assembly language. Reviewers: Duncan P. N. Exon Smith llvm-svn: 242702	2015-07-20 20:31:01 +00:00
Rafael Espindola	1c58b24a44	Remove Elf_Rela_Iter and Elf_Rel_Iter. Use just the pointers and check for invalid relocation sections. llvm-svn: 242700	2015-07-20 20:07:50 +00:00
Chad Rosier	69ba87e018	[AArch64] Change EON pattern to match more often. Phabricator: http://reviews.llvm.org/D11359 Patch by Geoff Berry <gberry@codeaurora.org> llvm-svn: 242694	2015-07-20 18:42:27 +00:00
Bill Schmidt	56989d09e1	Add missing test for r242296 (vec_sld) llvm-svn: 242680	2015-07-20 15:43:21 +00:00
Rafael Espindola	3323a6b994	Report errors on invalid virtual addresses. llvm-svn: 242676	2015-07-20 14:45:03 +00:00
Tom Stellard	836373703a	AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops Summary: The MUBUF addr64 bit has been removed on VI, so we must use FLAT instructions when the pointer is stored in VGPRs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11067 llvm-svn: 242673	2015-07-20 14:28:41 +00:00
Rafael Espindola	0fb99d1c9a	Simplify iterating over program headers and detect corrupt ones. We now use a simple pointer and have range loops. llvm-svn: 242669	2015-07-20 13:35:33 +00:00
Vasileios Kalintiris	3e5853048c	[mips] Added support for the ERETNC instruction. Summary: This required adding the instruction predicate HasMips32r5. Patch by Scott Egerton. Reviewers: dsanders, vkalintiris Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11136 llvm-svn: 242666	2015-07-20 12:28:56 +00:00
Rafael Espindola	04bc262b8d	llvm-readobj: Handle invalid references to the string table. llvm-svn: 242658	2015-07-20 03:38:17 +00:00
Rafael Espindola	d3ff9d7446	Move CHECKs closer to the RUN line. llvm-svn: 242657	2015-07-20 03:31:25 +00:00
Rafael Espindola	7c36fd7751	llvm-readobj: call exit(1) on error. llvm-readobj exists for testing llvm. We can safely stop the program the first time we know the input is corrupted. This is in preparation for making it handle a few more broken files. llvm-svn: 242656	2015-07-20 03:23:55 +00:00
Arnold Schwaighofer	a066a47d7f	Revert "MergeFuncs: Transfer the function parameter attributes to the call site" It is okay to not transfer parameter attributes. This reverts commit r242558. llvm-svn: 242646	2015-07-19 19:30:43 +00:00
Simon Pilgrim	4477808227	[X86][SSE] Tidied up vector CTLZ/CTTZ. NFCI. llvm-svn: 242645	2015-07-19 17:09:43 +00:00


31061 Commits