llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 22:42:46 +02:00

Author	SHA1	Message	Date
Michael Zuckerman	ae040817a7	[X86] Add support for mmword memory operand size for Intel-syntax x86 assembly Differential Revision: http://reviews.llvm.org/D12151 llvm-svn: 245835	2015-08-24 10:26:54 +00:00
Oliver Stannard	3875250459	Add DAG optimisation for FP16_TO_FP The FP16_TO_FP node only uses the bottom 16 bits of its input, so the following pattern can be optimised by removing the AND: (FP16_TO_FP (AND op, 0xffff)) -> (FP16_TO_FP op) This is a common pattern for ARM targets when functions have __fp16 arguments, as they are passed as floats (so that they get passed in the correct registers), but then bitcast and truncated to ignore the top 16 bits. llvm-svn: 245832	2015-08-24 09:47:45 +00:00
Scott Douglass	2a6e523fef	[ARM] Use AEABI helpers for i64 div and rem Differential Revision: http://reviews.llvm.org/D12232 llvm-svn: 245830	2015-08-24 09:17:18 +00:00
Scott Douglass	abc35dc1e3	[ARM] Refactor LowerDivRem before adding LowerREM (nfc) Differential Revision: http://reviews.llvm.org/D12230 llvm-svn: 245829	2015-08-24 09:17:11 +00:00
Michael Zuckerman	4f0060b27e	first commit to llvm llvm-svn: 245825	2015-08-24 07:48:50 +00:00
Mehdi Amini	490bf85c83	Require Dominator Tree For SROA, improve compile-time TL-DR: SROA is followed by EarlyCSE which requires the DominatorTree. There is no reason not to require it up-front for SROA. Some history is necessary to understand why we ended-up here. r123437 switched the second (Legacy)SROA in the optimizer pipeline to use SSAUpdater in order to avoid recomputing the costly DominanceFrontier. The purpose was to speed-up the compile-time. Later r123609 removed the need for the DominanceFrontier in (Legacy)SROA. Right after, some cleanup was made in r123724 to remove any reference to the DominanceFrontier. SROA existed in two flavors: SROA_SSAUp and SROA_DT (the latter replacing SROA_DF). The second argument of `createScalarReplAggregatesPass` was renamed from `UseDomFrontier` to `UseDomTree`. I believe this is where a mistake was made. The pipeline was not updated and the call site was still: PM->add(createScalarReplAggregatesPass(-1, false)); At that time, SROA was immediately followed in the pipeline by EarlyCSE which already required the DominatorTree. Not requiring the DominatorTree in SROA didn't save anything, but unfortunately it was lost at this point. When the new SROA Pass was introduced in r163965, I believe the goal was to have an exact replacement of the existing SROA, this bug slipped through. You can see currently: $ echo "" | clang -x c++ -O3 -c - -mllvm -debug-pass=Structure ... ... FunctionPass Manager SROA Dominator Tree Construction Early CSE After this patch: $ echo "" | clang -x c++ -O3 -c - -mllvm -debug-pass=Structure ... ... FunctionPass Manager Dominator Tree Construction SROA Early CSE This improves the compile time from 88s to 23s for PR17855. https://llvm.org/bugs/show_bug.cgi?id=17855 And from 113s to 12s for PR16756 https://llvm.org/bugs/show_bug.cgi?id=16756 Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D12267 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245820	2015-08-23 22:15:49 +00:00
Sanjay Patel	6a117030dd	remove FIXME; fixed by r245733 llvm-svn: 245819	2015-08-23 20:43:25 +00:00
David Majnemer	8cec64183b	[IR] Cleanup EH instructions a little bit Just a cosmetic change, no functionality change is intended. llvm-svn: 245818	2015-08-23 19:22:31 +00:00
Simon Pilgrim	72d01e9f8c	[DAGCombiner] Fold CONCAT_VECTORS of bitcasted EXTRACT_SUBVECTOR Minor generalization of D12125 - peek through any bitcast to the original vector that we're extracting from. llvm-svn: 245814	2015-08-23 15:22:14 +00:00
Davide Italiano	aab4507d18	[llvm-readobj/ELF] Factor out common code. llvm-svn: 245813	2015-08-23 14:06:40 +00:00
Frederic Riss	65d4f1a00e	[dwarfdump] Do not apply relocations in mach-o files if there is no LoadedObjectInfo. Not only do we not need to do anything to read correct values from the object files, but the current logic actually wrongly applies twice the section base address when there is no LoadedObjectInfo passed to the DWARFContext creation (as the added test shows). Simply do not apply any relocations on the mach-o debug info if there is no load offset to apply. llvm-svn: 245807	2015-08-23 04:44:21 +00:00
Frederic Riss	8899fdf520	[dsymutil] Remove old ODR uniquing tests These tests have been obsoleted by the refactored versions introduced in the previous commit. llvm-svn: 245804	2015-08-23 02:38:37 +00:00
Frederic Riss	14240ed492	[dsymutil] Refactor ODR uniquing tests to be more readable. This patch adds all the refactored tests in new files, the old tests will be removed by a followup commit. Thanks to D. Blaikie for all the feedback. llvm-svn: 245803	2015-08-23 02:38:29 +00:00
Joseph Tremoulet	1be12637b1	[LangRef] Fix sphinx warning Fix invalid inline literal introduced in r245797 llvm-svn: 245801	2015-08-23 01:04:12 +00:00
Mehdi Amini	3cb178c5fa	Add missing break in AArch64DAGToDAGISel::Select() switch case Reported by coverity. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245800	2015-08-23 00:42:57 +00:00
Mehdi Amini	a112343baa	Do not use dyn_cast<> after isa<> Reported by coverity. From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 245799	2015-08-23 00:27:57 +00:00
Joseph Tremoulet	56089ea65e	[WinEH] Require token linkage in EH pad/ret signatures Summary: WinEHPrepare is going to require that cleanuppad and catchpad produce values of token type which are consumed by any cleanupret or catchret exiting the pad. This change updates the signatures of those operators to require/enforce that the type produced by the pads is token type and that the rets have an appropriate argument. The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and similarly for `CleanupReturnInst`/`CleanupPadInst`). To accommodate that restriction, this change adds a notion of an operator constraint to both LLParser and BitcodeReader, allowing appropriate sentinels to be constructed for forward references and appropriate error messages to be emitted for illegal inputs. Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad predecessor must have no other predecessors; this ensures that WinEHPrepare will see the expected linear relationship between sibling catches on the same try. Lastly, remove some superfluous/vestigial casts from instruction operand setters operating on BasicBlocks. Reviewers: rnk, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12108 llvm-svn: 245797	2015-08-23 00:26:33 +00:00
David Blaikie	6c8cdd903c	Update test case so it passes the verifier Some debug info was drastically out of date, from the days where we used to emit a list of length one (with a single null entry) rather than an empty list (or, more recently, no list at all) for list fields that have no elements. llvm-svn: 245796	2015-08-22 22:38:44 +00:00
David Blaikie	db75149683	Verifier: Don't crash on null entries in debug info retained types list There was already a good error path for this. Added a test for it & made a minor code change to ensure the error path was actually reached, rather than crashing before we got that far. llvm-svn: 245795	2015-08-22 22:36:40 +00:00
Davide Italiano	ee5050500d	[llvm-readobj] Test --macho-data-in-code option. As added bonus this converts an existing test from macho-dump to llvm-readobj. Only 66 to go. llvm-svn: 245791	2015-08-22 20:30:56 +00:00
Jingyue Wu	2549889c79	[NVPTX] Allow undef value as global initializer Summary: __shared__ variable may now emit undef value as initializer, do not throw error on that. Test Plan: test/CodeGen/NVPTX/global-addrspace.ll Patch by Xuetian Weng Reviewers: jholewinski, tra, jingyue Subscribers: llvm-commits, jholewinski Differential Revision: http://reviews.llvm.org/D12242 llvm-svn: 245785	2015-08-22 05:40:26 +00:00
NAKAMURA Takumi	5f184b4bb7	[CMake] add_llvm_external_project: Just warn about nonexistent directories. These entries were generated accidentally. llvm-svn: 245783	2015-08-22 05:11:02 +00:00
NAKAMURA Takumi	4cfca08317	[CMake] Make LLVM_EXTERNAL__SOURCE_DIR consistent against older buildsites. If corresponding in-tree subdirectory exists, just ignore LLVM_EXTERNAL stuff. Otherwise, set LLVM_TOOL__BUILD ON/OFF properly according to LLVM_EXTERNAL_. This makes it easier to walk among old revisions without deleting CMakeCache.txt. Before r242059, LLVM_EXTERNAL_* was working like; if(EXISTS ${_SOURCE_DIR}/CMakeLists.txt) set(_BUILD ON CACHE) if(_BUILD is ON) add_subdirectory(_SOURCE_DIR) endif() endif() llvm-svn: 245782	2015-08-22 04:53:52 +00:00
Peter Collingbourne	772d0abe3e	LTO: Maintain target triple, FeatureStr and CGOptLevel in the module or LTOCodeGenerator. This makes it easier to create new TargetMachines on demand. llvm-svn: 245781	2015-08-22 02:25:53 +00:00
Matt Arsenault	42bf1dc33c	AMDGPU: Allow specifying different opcode on VI for SMRD/SMEM Although the basic s_load_* instructions happen to use the same opcode, some of the special case SMRD instructions have different opcodes. llvm-svn: 245775	2015-08-22 00:54:31 +00:00
Matt Arsenault	3784a7252a	AMDGPU: Improve accuracy of instruction rates for some FP instructions llvm-svn: 245774	2015-08-22 00:50:41 +00:00
Matt Arsenault	2211a78780	AMDGPU: Use DFS to avoid second loop over function llvm-svn: 245772	2015-08-22 00:43:38 +00:00
Matt Arsenault	c8ff6e4f0e	AMDGPU: Make sure to run verifier after SIFixSGPRLiveRanges llvm-svn: 245769	2015-08-22 00:19:34 +00:00
Matt Arsenault	12b207c6f3	AMDGPU: Improve debug printing in SIFixSGPRLiveRanges llvm-svn: 245768	2015-08-22 00:19:25 +00:00
Matt Arsenault	d80c9718c1	AMDGPU: Move CI instructions into CIInstructions.td There are still a couple of CI patterns left in SIInstructions. llvm-svn: 245767	2015-08-22 00:16:34 +00:00
Matt Arsenault	30e4b51f0a	AMDGPU: Minor cleanups to help with f16 support The main change is inverting the condition for the operand class classes so that VT.Size == 16 uses VGPR_32 instead of 64. llvm-svn: 245764	2015-08-21 23:49:51 +00:00
JF Bastien	91f6300e91	Improve the determinism of MergeFunctions Summary: Merge functions previously relied on unsigned comparisons of pointer values to order functions. This caused observable non-determinism in the compiler for large bitcode programs. Basically, opt -mergefuncs program.bc | md5sum produces different hashes when run repeatedly on the same machine. Differing output was observed on three large bitcodes, but it was less frequent on the smallest file. It is possible that this only manifests on the large inputs, hence remaining undetected until now. This patch fixes this by removing (almost, see below) all places where comparisons between pointers are used to order functions. Most of these changes are local, but the comparison of global values requires assigning an identifier to each local in the order it is visited. This is very similar to the way the comparison function identifies Value's defined within a function. Because the order of visiting the functions and their subparts is deterministic, the identifiers assigned to the globals will be as well, and the order of functions will be deterministic. With these changes, there is no more observed non-determinism. There are also only minor slowdowns (negligible to 4%) compared to the baseline, which is likely a result of the fact that global comparisons involve hash lookups and not just pointer comparisons. The one caveat so far is that programs containing BlockAddress constants can still be non-deterministic. It is not clear what the right solution is here. In particular, even if the global numbers are used to order by function, we still need a way to order the BasicBlock's. Unfortunately, we cannot just bail out and fail to order the functions or consider them equal, because we require a total order over functions. Note that programs with BlockAddress constants are relatively rare, so the impact of leaving this in is minor as long as this pass is opt-in. Author: jrkoenig Reviewers: nlewycky, jfb, dschuff Subscribers: jevinskie, llvm-commits, chapuni Differential revision: http://reviews.llvm.org/D12168 llvm-svn: 245762	2015-08-21 23:27:24 +00:00
Adam Nemet	40fc6ceb9d	[LAA] Hold bounds via ValueHandles during SCEV expansion SCEV expansion can invalidate previously expanded values. For example in SCEVExpander::ReuseOrCreateCast, if we already have the requested cast value but it's not at the desired location, a new cast is inserted and the old cast will be invalidated. Therefore, when expanding the bounds for the pointers, a later entry can invalidate the IR value for an earlier one. The fix is to store a value handle rather than the value itself. The newly added test has a more detailed description of how the bug triggers. This bug can have a negative but potentially highly variable performance impact in Loop Distribution. Because one of the bound values was invalidated and is an undef expression now, InstCombine is free to transform the array overlap check: Start0 <= End1 && Start1 <= End0 into: Start0 <= End1 So depending on the runtime location of the arrays, we would detect a conflict and fall back on the original loop of the versioned loop. Also tested compile time with SPEC2006 LTO bc files. llvm-svn: 245760	2015-08-21 23:19:57 +00:00
Tyler Nowicki	0df99a252e	Standardized 'failed' to 'Failed' in LoopVectorizationRequirements. llvm-svn: 245759	2015-08-21 23:03:24 +00:00
Alex Lorenz	c757bb60d6	MIRLangRef: Add 'MIR Testing Guide' section. llvm-svn: 245757	2015-08-21 22:58:33 +00:00
Peter Collingbourne	db573c5328	LTO: Change signature of LTOCodeGenerator::setCodePICModel() to take a Reloc::Model. This allows us to remove a bunch of code in LTOCodeGenerator and llvm-lto and has the side effect of improving error handling in the libLTO C API. llvm-svn: 245756	2015-08-21 22:57:17 +00:00
Tom Stellard	c3f6130f41	AMDGPU/SI: Better handle s_wait insertion We can wait on either VM, EXP or LGKM. The waits are independent. Without this patch, a wait inserted because of one of them would also wait for all the previous others. This patch makes s_wait only wait for the ones we need for the next instruction. Here's an example of the subtle perf reduction this patch solves: This is without the patch: buffer_load_format_xyzw v[8:11], v0, s[44:47], 0 idxen buffer_load_format_xyzw v[12:15], v0, s[48:51], 0 idxen s_load_dwordx4 s[44:47], s[8:9], 0xc s_waitcnt lgkmcnt(0) buffer_load_format_xyzw v[16:19], v0, s[52:55], 0 idxen s_load_dwordx4 s[48:51], s[8:9], 0x10 s_waitcnt vmcnt(1) buffer_load_format_xyzw v[20:23], v0, s[44:47], 0 idxen The s_waitcnt vmcnt(1) is useless. The reason it is added is because the last buffer_load_format_xyzw needs s[44:47], which was issued by the first s_load_dwordx4. It waits for all VM before that call to have finished. Internally, 3 counters (for VM, EXP and LGKM) are updated after every instruction. For example buffer_load_format_xyzw will increase the VM counter, and s_load_dwordx4 the LGKM one. Without the patch, for every defined register, the current 3 counters are stored, and are used to know how long to wait when an instruction needs the register. Because of that, the s[44:47] counter includes that to use the register you need to wait for the previous buffer_load_format_xyzw. Instead this patch stores only the counters that matter for the register, and puts zero for the other ones, since we don't need any wait for them. Patch by: Axel Davy Differential Revision: http://reviews.llvm.org/D11883 llvm-svn: 245755	2015-08-21 22:47:27 +00:00
Sanjoy Das	ead6e3fe61	Re-apply r245635, "[InstCombine] Transform A & (L - 1) u< L --> L != 0" The original checkin was buggy, this change has a fix. Original commit message: [InstCombine] Transform A & (L - 1) u< L --> L != 0 Summary: This transform is never a pessimization at the IR level (since it replaces an `icmp` with another), and has potential payoffs: 1. It may make the `icmp` fold away or become loop invariant. 2. It may make the `A & (L - 1)` computation dead. This shows up in Java, in range checks generated by array accesses of the form `a[i & (a.length - 1)]`. Reviewers: reames, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12210 llvm-svn: 245753	2015-08-21 22:22:37 +00:00
David Blaikie	9b832d73d7	Range-for-ify some things in GlobalMerge llvm-svn: 245752	2015-08-21 22:19:06 +00:00
David Blaikie	bdffc32d98	[opaque pointer types] Fix a few easy places in GlobalMerge that were accessing value types through pointee types llvm-svn: 245746	2015-08-21 22:00:44 +00:00
Alex Lorenz	fd8e770627	MIR Serialization: Serialize the pointer IR expression values in the machine memory operands. llvm-svn: 245745	2015-08-21 21:54:12 +00:00
Vedant Kumar	5213c133c7	[ARM] Fix MachO CPU Subtype selection Differential Revision: http://reviews.llvm.org/D12040 llvm-svn: 245744	2015-08-21 21:52:48 +00:00
Alex Lorenz	58c9b18bbf	MIRParser: Split the 'parseIRConstant' method into two methods. NFC. One variant of this method can be reused when parsing the quoted IR pointer expressions in the machine memory operands. llvm-svn: 245743	2015-08-21 21:48:22 +00:00
David Blaikie	a1c619f61c	[opaque pointer types] Push the passing of value types up from Function/GlobalVariable to GlobalObject (coming next, pushing this up into GlobalValue, so it can store the value type directly) llvm-svn: 245742	2015-08-21 21:35:28 +00:00
Hal Finkel	8f05a818d7	[PowerPC] PPCVSXFMAMutate should not segfault on undef input registers When PPCVSXFMAMutate would look at the input addend register, it would get its input value number. This would fail, however, if the register was undef, causing a segfault. Don't segfault (just skip such FMA instructions). Fixes the test case from PR24542 (although that may have been over-reduced). llvm-svn: 245741	2015-08-21 21:34:24 +00:00
Alex Lorenz	6fab7d4ea6	AsmParser: Save and restore the parsing state for types using SlotMapping. This commit extends the 'SlotMapping' structure and includes mappings for named and numbered types in it. The LLParser is extended accordingly to fill out those mappings at the end of module parsing. This information is useful when we want to parse standalone constant values at a later stage using the 'parseConstantValue' method. The constant values can be constant expressions, which can contain references to types. In order to parse such constant values, we have to restore the internal named and numbered mappings for the types in LLParser, otherwise the parser will report a parsing error. Therefore, this commit also introduces a new method called 'restoreParsingState' to LLParser, which uses the slot mappings to restore some of its internal parsing state. This commit is required to serialize constant value pointers in the machine memory operands for the MIR format. Reviewers: Duncan P. N. Exon Smith llvm-svn: 245740	2015-08-21 21:32:39 +00:00
Bruno Cardoso Lopes	0dc7d0fbc7	[LVI] Use a SmallVector instead of SmallPtrSet. NFC llvm-svn: 245739	2015-08-21 21:18:26 +00:00
Alex Lorenz	745ed2743c	MIRLangRef: Describe the syntax for the immediate operands, register values, register operands and register flags. llvm-svn: 245738	2015-08-21 21:17:01 +00:00
Alex Lorenz	5f32ed0eca	MIR Serialization: Print MCSymbol operands. This commit allows the MIR printer to print the MCSymbol machine operands. Unfortunately they can't be parsed at this time. I will create a bug that will track the fact that the MCSymbol operands can't be parsed yet. llvm-svn: 245737	2015-08-21 21:12:44 +00:00
Simon Pilgrim	a6ae3df1c8	Line endings fix. llvm-svn: 245736	2015-08-21 21:09:51 +00:00
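
The r245753 entry above states the rewrite `A & (L - 1) u< L --> L != 0` without showing why it holds. The following is a minimal sketch, not taken from the commit, that checks the equivalence exhaustively for an assumed 8-bit unsigned width. The names `BITS`, `original`, `transformed` and `holds_for_all_inputs` are hypothetical and exist only for this illustration; this is not how InstCombine itself is implemented.

```python
# Illustrative brute-force check of the identity from r245753,
# modelled on 8-bit unsigned integers (the width is an assumption).
BITS = 8
MASK = (1 << BITS) - 1


def original(a, l):
    """A & (L - 1) u< L, with L - 1 wrapping around like an unsigned subtract."""
    return (a & ((l - 1) & MASK)) < l


def transformed(a, l):
    """The replacement comparison: L != 0 (A is no longer needed)."""
    return l != 0


def holds_for_all_inputs():
    """Compare both forms for every pair of 8-bit values."""
    return all(
        original(a, l) == transformed(a, l)
        for a in range(MASK + 1)
        for l in range(MASK + 1)
    )
```

The intuition: when `L` is zero, `L - 1` wraps to all ones, so the left side is `A u< 0`, which is always false. When `L` is non-zero, `A & (L - 1)` can never exceed `L - 1`, so the comparison is always true.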


120902 Commits
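
The r245760 entry describes an overlap check `Start0 <= End1 && Start1 <= End0` being weakened to `Start0 <= End1` once one bound became undef. A minimal sketch of why that matters, using plain integers for the bounds; the function names and the sample addresses are hypothetical and only illustrate the two predicates quoted in the commit message.

```python
def ranges_overlap(start0, end0, start1, end1):
    """Full check from the commit message: both conjuncts are required."""
    return start0 <= end1 and start1 <= end0


def weakened_check(start0, end0, start1, end1):
    """The check left after the second conjunct was dropped."""
    return start0 <= end1


# Disjoint arrays, with array 0 placed below array 1 in memory.
disjoint_low_first = (0, 50, 100, 200)
# The same arrays with their positions swapped.
disjoint_high_first = (100, 200, 0, 50)

full_low = ranges_overlap(*disjoint_low_first)    # no conflict reported
weak_low = weakened_check(*disjoint_low_first)    # spurious conflict reported
weak_high = weakened_check(*disjoint_high_first)  # no conflict reported
```

With the weakened predicate, whether a spurious conflict is reported depends on which array happens to sit lower in memory, which matches the "highly variable performance impact" the commit message mentions.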