llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Mehdi Amini	ea195a382e	Remove every uses of getGlobalContext() in LLVM (but the C API) At the same time, fixes InstructionsTest::CastInst unittest: yes you can leave the IR in an invalid state and exit when you don't destroy the context (like the global one), no longer now. This is the first part of http://reviews.llvm.org/D19094 From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266379	2016-04-14 21:59:01 +00:00
Matt Arsenault	2dfb6d03c5	AMDGPU: Run SIFoldOperands after PeepholeOptimizer PeepholeOptimizer cleans up redundant copies, which makes the operand folding more effective. shader-db stats: Totals: SGPRS: 34200 -> 34336 (0.40 %) VGPRS: 22118 -> 21655 (-2.09 %) Code Size: 632144 -> 633460 (0.21 %) bytes LDS: 11 -> 11 (0.00 %) blocks Scratch: 10240 -> 11264 (10.00 %) bytes per wave Max Waves: 8822 -> 8918 (1.09 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 7704 -> 7840 (1.77 %) VGPRS: 5169 -> 4706 (-8.96 %) Code Size: 234444 -> 235760 (0.56 %) bytes LDS: 2 -> 2 (0.00 %) blocks Scratch: 0 -> 1024 (0.00 %) bytes per wave Max Waves: 1188 -> 1284 (8.08 %) Wait states: 0 -> 0 (0.00 %) Increases: SGPRS: 35 (0.01 %) VGPRS: 1 (0.00 %) Code Size: 59 (0.02 %) LDS: 0 (0.00 %) Scratch: 1 (0.00 %) Max Waves: 48 (0.02 %) Wait states: 0 (0.00 %) Decreases: SGPRS: 26 (0.01 %) VGPRS: 54 (0.02 %) Code Size: 68 (0.03 %) LDS: 0 (0.00 %) Scratch: 0 (0.00 %) Max Waves: 4 (0.00 %) Wait states: 0 (0.00 %) llvm-svn: 266378	2016-04-14 21:58:24 +00:00
Matt Arsenault	61abb9daf9	AMDGPU: Directly emit m0 initialization with s_mov_b32 Currently what comes out of instruction selection is a register initialized to -1, and then copied to m0. MachineCSE doesn't consider copies, but we want these to be CSEed. This isn't much of a problem currently, because SIFoldOperands is run immediately after. This avoids regressions when SIFoldOperands is run later from leaving all copies to m0. llvm-svn: 266377	2016-04-14 21:58:15 +00:00
Matt Arsenault	e73cb153a7	AMDGPU: Fold bitcasts of scalar constants to vectors This cleans up some messes since the individual scalar components can be CSEed. llvm-svn: 266376	2016-04-14 21:58:07 +00:00
Geoff Berry	f26cedc3ed	[ScheduleDAGInstrs] Re-factor for based on review feedback. NFC. Summary: Re-factor some code to improve clarity and style based on review comments from http://reviews.llvm.org/D18093. Reviewers: MatzeB, mcrosier Subscribers: MatzeB, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19128 llvm-svn: 266372	2016-04-14 21:31:07 +00:00
Renato Golin	bf4b959200	[ARM] Adding IEEE-754 SIMD detection to loop vectorizer Some SIMD implementations are not IEEE-754 compliant, for example ARM's NEON. This patch teaches the loop vectorizer to only allow transformations of loops that either contain no floating-point operations or have enough allowance flags supporting lack of precision (ex. -ffast-math, Darwin). For that, the target description now has a method which tells us if the vectorizer is allowed to handle FP math without falling into unsafe representations, plus a check on every FP instruction in the candidate loop to check for the safety flags. This commit makes LLVM behave like GCC with respect to ARM NEON support, but it stops short of fixing the underlying problem: sub-normals. Neither GCC nor LLVM have a flag for allowing sub-normal operations. Before this patch, GCC only allows it using unsafe-math flags and LLVM allows it by default with no way to turn it off (short of not using NEON at all). As a first step, we push this change to make it safe and in sync with GCC. The second step is to discuss a new sub-normal's flag on both communitues and come up with a common solution. The third step is to improve the FastMath flags in LLVM to encode sub-normals and use those flags to restrict NEON FP. Fixes PR16275. llvm-svn: 266363	2016-04-14 20:42:18 +00:00
Sanjay Patel	1d8eaf1ecc	[InstCombine] remove constant by inverting compare + logic (PR27105) https://llvm.org/bugs/show_bug.cgi?id=27105 We can check if all bits outside of a constant mask are set with a single constant. As noted in the bug report, although this form should be considered the canonical IR, backends may want to transform this into an 'andn' / 'andc' comparison against zero because that could be a single machine instruction. Differential Revision: http://reviews.llvm.org/D18842 llvm-svn: 266362	2016-04-14 20:17:40 +00:00
Dehao Chen	dab7c40296	Fix null pointer access for discriminator assignment. Summary: This fixes the buildbot failure. Reviewers: dnovillo, davidxl Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19129 llvm-svn: 266360	2016-04-14 19:46:38 +00:00
Tom Stellard	383448325b	AMDGPU: Add skeleton GlobalIsel implementation Summary: This adds the necessary target code to be able to run the ir translator. Lowering function arguments and returns is a nop and there is no support for RegBankSelect. Reviewers: arsenm, qcolombet Subscribers: arsenm, joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19077 llvm-svn: 266356	2016-04-14 19:09:28 +00:00
Dehao Chen	5e2f05be2a	Update discriminator assignment algorithm to handle nested call correctly. Summary: Add discriminator for nested call correctly. Reviewers: davidxl, dnovillo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19127 llvm-svn: 266354	2016-04-14 18:37:18 +00:00
Reid Kleckner	2c00bf1830	Sink DI metadata usage out of MachineInstr.h and MachineInstrBuilder.h MachineInstr.h and MachineInstrBuilder.h are very popular headers, widely included across all LLVM backends. It turns out that there only a handful of TUs that actually care about DI operands on MachineInstrs. After this change, touching DebugInfoMetadata.h and rebuilding llc only needs 112 actions instead of 542. llvm-svn: 266351	2016-04-14 18:29:59 +00:00
Davide Italiano	8301c52067	[ValueMapper] Range-loopify to improve readability. NFC. llvm-svn: 266350	2016-04-14 18:07:32 +00:00
Jacques Pienaar	7870fedd7e	[lanai] Add custom lowering for SRL_PARTS i32. llvm-svn: 266349	2016-04-14 17:59:22 +00:00
Tom Stellard	63209a899d	[GlobalISel] Move GISelAccessor class into public headers Reviewers: qcolombet Subscribers: joker.eph, vkalintiris, llvm-commits Differential Revision: http://reviews.llvm.org/D19120 llvm-svn: 266348	2016-04-14 17:45:38 +00:00
Nicolai Haehnle	bbd04b8c49	[DivergenceAnalysis] Treat PHI with incoming undef as constant Summary: If a PHI has an incoming undef, we can pretend that it is equal to one non-undef, non-self incoming value. This is particularly relevant in combination with the StructurizeCFG pass, which introduces PHI nodes with undefs. Previously, this lead to branch conditions that were uniform before StructurizeCFG to become non-uniform afterwards, which confused the SIAnnotateControlFlow pass. This fixes a crash when Mesa radeonsi compiles a shader from dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_dynamic_vertex Reviewers: arsenm, tstellarAMD, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19013 llvm-svn: 266347	2016-04-14 17:42:47 +00:00
Nicolai Haehnle	25eef7cc0f	[StructurizeCFG] Annotate branches that were treated as uniform Summary: This fully solves the problem where the StructurizeCFG pass does not consider the same branches as uniform as the SIAnnotateControlFlow pass. The patch in D19013 helps with this problem, but is not sufficient (and, interestingly, causes a "regression" with one of the existing test cases). No tests included here, because tests in D19013 already cover this. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19018 llvm-svn: 266346	2016-04-14 17:42:35 +00:00
Nicolai Haehnle	014104e476	AMDGPU: Remove SIFixSGPRLiveRanges pass Summary: This pass is unnecessary and overly conservative. It was motivated by situations like def %vreg0:SGPR_32 ... if-block: .. def %vreg1:SGPR_32 ... else-block: ... use %vreg0:SGPR_32 ... and similar situations with uses after the non-uniform control flow, where we are not allowed to assign %vreg0 and %vreg1 to the same physical register, even though in the original, thread/workitem-based CFG, it looks like the live ranges of these registers do not overlap. However, by the time register allocation runs, we have moved to a wave-based CFG that accurately represents the fact that the wave may run through both the if- and the else-block. So the live ranges of %vreg0 and %vreg1 already overlap even without the SIFixSGPRLiveRanges pass. In addition to proving this change correct, I have tested it with Piglit and a small number of other tests. Reviewers: arsenm, tstellarAMD Subscribers: MatzeB, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19041 llvm-svn: 266345	2016-04-14 17:42:29 +00:00
Nicolai Haehnle	141bf141e2	AMDGPU: change a redundant if () to an assert(). NFC Summary: I've been carrying this change around with me for a while, because the if () managed to confuse me while following the code. All callers ensure that the assertion holds. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19042 llvm-svn: 266344	2016-04-14 17:42:18 +00:00
Tom Stellard	c2eabf0b58	[GlobalISel] Coding style and whitespace fixes Reviewers: qcolombet Subscribers: joker.eph, llvm-commits, vkalintiris Differential Revision: http://reviews.llvm.org/D19119 llvm-svn: 266342	2016-04-14 17:23:33 +00:00
Tim Northover	d6721e4e7e	AArch64: expand cmpxchg after regalloc at -O0. FastRegAlloc works only at the basic-block level and spills all live-out registers. Unfortunately for a stack-based cmpxchg near the spill slots, this can perpetually clear the exclusive monitor, which means the cmpxchg will never succeed. I believe the only way to handle this within LLVM is by expanding the loop post-regalloc. We don't want this in general because it severely limits the optimisations that can be done, so we limit this to -O0 compilations. It's an ugly hack, and about the one good point in the whole mess is that we can treat all cmpxchg operations in the most naive way possible (seq_cst, no clrex faff) without affecting correctness. Should fix PR25526. llvm-svn: 266339	2016-04-14 17:03:29 +00:00
Jacques Pienaar	77c93881e7	[lanai] Add areMemAccessesTriviallyDisjoint, getMemOpBaseRegImmOfs and getMemOpBaseRegImmOfsWidth. Summary: Add getMemOpBaseRegImmOfsWidth to enable determining independence during MiSched. Reviewers: eliben, majnemer Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18903 llvm-svn: 266338	2016-04-14 16:47:42 +00:00
Tom Stellard	c0b7282ebc	AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit Summary: For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use less registers, as we have to run more waves per SIMD. This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults, to 256, as that has no wave restrictions. Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug. Reviewers: mareko, arsenm, tstellarAMD, nhaehnle Subscribers: FireBurn, kerberizer, llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D18340 Patch By: Bas Nieuwenhuizen llvm-svn: 266337	2016-04-14 16:27:07 +00:00
Tom Stellard	b9654f5566	AMDGPU/SI: Use the correct scratch wave offset register for shaders. Summary: The code previously always used s1 as it was using the user + system SGPR information for compute kernels. This is incorrect for Mesa shaders though, The register should be the next SGPR after all user and system SGPR's. We use that Mesa adds arguments for all input and system SGPR's and take the next available SGPR for the scratch wave offset register. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewers: mareko, arsenm, nhaehnle, tstellarAMD Subscribers: qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18941 Patch By: Bas Nieuwenhuizen llvm-svn: 266336	2016-04-14 16:27:03 +00:00
Betul Buyukkurt	f4233962c1	[PGO] Do not attach VP metadata if value count at site is 0 [NFC] llvm-svn: 266335	2016-04-14 16:25:45 +00:00
Silviu Baranga	83a7a9c1e0	[SCEV][LAA] Add tests for SCEV expression transformations performed during LAA Summary: Add a print method to Predicated Scalar Evolution which prints all interesting transformations done by PSE. Loop Access Analysis will now print this as part of the analysis output. We now use this to check the exact expression transformations that were done by PSE in LAA. The additional checking also acts as white-box testing for the getAsAddRec method. Reviewers: anemet, sanjoy Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18792 llvm-svn: 266334	2016-04-14 16:08:45 +00:00
Simon Dardis	46f0e9cf63	Summary: Alias 'jic $reg, 0' to 'jrc $reg' and 'jialc $reg, 0' to 'jalrc $reg' like binutils. This patch was previous committed as r266055 as seemed to have caused some spurious test failures. They did not reappear after further local testing. llvm-svn: 266301	2016-04-14 13:43:17 +00:00
Igor Kudrin	dde32a28b6	[Coverage] Update testing methods to support more than two files Differential Revision: http://reviews.llvm.org/D18757 llvm-svn: 266289	2016-04-14 10:43:37 +00:00
Vasileios Kalintiris	1ea0fc57ba	[mips] Remove duplicate tests and add missing prefixes for *-LABEL checks. NFC. Summary: The only difference between the removed tests and the pre-existing ones, is the materialization of the zero constant, which shouldn't matter for these cases. Reviewers: dsanders, sdardis Subscribers: dsanders, sdardis, llvm-commits Differential Revision: http://reviews.llvm.org/D18693 llvm-svn: 266285	2016-04-14 09:13:13 +00:00
Igor Kudrin	0f6ff5a6b3	[Coverage] Avoid unnecessary copying of std::vector Approved by: Justin Bogner <mail@justinbogner.com> Differential Revision: http://reviews.llvm.org/D18756 llvm-svn: 266284	2016-04-14 09:10:00 +00:00
Adam Nemet	aa40b369a6	Revert "Support arbitrary addrspace pointers in masked load/store intrinsics" This reverts commit r266086. It breaks the LTO build of gcc in SPEC2000. llvm-svn: 266282	2016-04-14 08:47:17 +00:00
Mehdi Amini	341c55e737	ThinLTO: linkonce compile-time optimization, do not bother when there is only one input file From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266281	2016-04-14 08:46:22 +00:00
David Majnemer	d2ed420815	[CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only one of the inputs is NaN: -NUM will return the non-NaN argument while -NAN would return NaN. It is desirable to lower to @llvm.{min,max}num to -NAN if they don't have a native instruction for -NUM. Notably, ARMv7 NEON's vmin has the -NAN semantics. N.B. Of course, it is only safe to do this if the intrinsic call is marked nnan. llvm-svn: 266279	2016-04-14 07:13:24 +00:00
Mehdi Amini	6bf091d940	Do not use getGlobalContext()... ever. This code was creating a new type in the global context, regardless of which context the user is sitting in, what can possibly go wrong? From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 266275	2016-04-14 04:36:40 +00:00
Matt Arsenault	489a8fbeea	AMDGPU: Implement canonicalize Also add generic DAG node for it. llvm-svn: 266272	2016-04-14 01:42:16 +00:00
Matthias Braun	ba8f42fa2c	TargetLowering: Factor out common code for tail call eligibility checking; NFC llvm-svn: 266270	2016-04-14 01:10:42 +00:00
George Burgess IV	01989e03ff	[CFLAA] Fix up code style a bit. NFC. llvm-svn: 266262	2016-04-13 23:27:37 +00:00
Sanjay Patel	4fd6cc9cca	[ppc] add tests to show potential andc optimization llvm-svn: 266261	2016-04-13 23:23:30 +00:00
Tim Northover	3d26dcef22	ARM: override cost function to re-enable ConstantHoisting (& fix it). At some point, ARM stopped getting any benefit from ConstantHoisting because the pass called a different variant of getIntImmCost. Reimplementing the correct variant revealed some problems, however: + ConstantHoisting was modifying switch statements. This is simply invalid, the cases must remain integer constants no matter the notional cost. + ConstantHoisting was mangling alloca instructions in the entry block. These should be handled by FrameLowering, so constants actually have a cost of 0. Worse, the resulting bitcasts meant they became dynamic allocas. rdar://25707382 llvm-svn: 266260	2016-04-13 23:08:27 +00:00
Amaury Sechet	9e1c3df5ca	Revert "Add LLVMGetAttrKindIDInContext in the C API in order to facilitate migration away from LLVMAttribute" This reverts commit 0bcfd95c268bcb180a525e1837e84475df8acdc7. llvm-svn: 266259	2016-04-13 23:01:39 +00:00
Duncan P. N. Exon Smith	9437b219c2	ValueMapper: Resolve cycles on the new nodes Fix a major bug from r265456. Although it's now much rarer, ValueMapper sometimes has to duplicate cycles. The might-transitively-reference-a-temporary counts don't decrement on their own when there are cycles, and you need to call MDNode::resolveCycles to fix it. r265456 was checking the input nodes to see if they were unresolved. This is useless; they should never be unresolved. Instead we should check the output nodes and resolve cycles on them. llvm-svn: 266258	2016-04-13 22:54:01 +00:00
Amaury Sechet	7c6da5b716	Add LLVMGetAttrKindIDInContext in the C API in order to facilitate migration away from LLVMAttribute Summary: LLVMAttribute has outlived its utility and is becoming a problem for C API users that what to use all the LLVM attributes. In order to help moving away from LLVMAttribute in a smooth manner, this diff introduce LLVMGetAttrKindIDInContext, which can be used instead of the enum values. Reviewers: Wallbraker, whitequark, joker.eph, echristo Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D18749 llvm-svn: 266257	2016-04-13 22:51:40 +00:00
Reid Kleckner	17e6fe60ac	[IR] Optimize memory usage of Metadata on MSVC An unsigned 2 bit bitfield takes 4 bytes in MSVC. Instead of a bitfield, just use an unsigned char. We can go back to a bitfield when someone implements the TODO of exposing and reusing the remaining 6 bits. llvm-svn: 266256	2016-04-13 22:46:06 +00:00
Davide Italiano	e98adefed7	[llvm-lto] Uniform error handling. NFC. llvm-svn: 266255	2016-04-13 22:08:26 +00:00
Matthias Braun	1f20820c09	ARM: Use a callee save register for the swiftself parameter. It is very likely that the swiftself parameter is alive throughout most functions function so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callees parameter. Differential Revision: http://reviews.llvm.org/D18901 llvm-svn: 266253	2016-04-13 21:43:25 +00:00
Matthias Braun	fc08c62acd	X86: Use a callee save register for the swiftself parameter. It is very likely that the swiftself parameter is alive throughout most functions function so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callees parameter. Differential Revision: http://reviews.llvm.org/D18902 llvm-svn: 266252	2016-04-13 21:43:21 +00:00
Matthias Braun	40d6ea2b7b	AArch64: Use a callee save registers for swiftself parameters It is very likely that the swiftself parameter is alive throughout most functions function so putting it into a callee save register should avoid spills for the callers with only a minimum amount of extra spills in the callees. Currently the generated code is correct but unnecessarily spills and reloads arguments passed in callee save registers, I will address this in upcoming patches. This also adds a missing check that for tail calls the preserved value of the caller must be the same as the callees parameter. Differential Revision: http://reviews.llvm.org/D19007 llvm-svn: 266251	2016-04-13 21:43:16 +00:00
Davide Italiano	54816bb00e	[llvm-lto] clang-format before working on this file. llvm-svn: 266250	2016-04-13 21:41:35 +00:00
Easwaran Raman	07c58a3800	Return immediately from analyzeCall if analyzeBlock returns false. This is part of the patch reviewed at http://reviews.llvm.org/D17584 llvm-svn: 266249	2016-04-13 21:20:22 +00:00
Kevin Enderby	3bd0231fd8	Start to add real error messages for malformed Mach-O files. And update the existing test cases in test/Object/macho-invalid.test to use llvm-objdump with the -macho option to produce these error messages and stop producing the generic "Invalid data was encountered while parsing the file" message. Working from the beginning of the file, if the mach header is too large for the size of the file and then if the load commands that follow extend past the end of the file these two errors now generate correct error messages. Both of these have existing test cases in test/Object/macho-invalid.test . But the first with macho-invalid-header it will never trigger the error message "mach header extends past the end of the file" using any of the llvm tools as they all use identify_magic() which rejects files with the correct magic number that are too small in size. So I tested this by hacking that code and seeing the error message down in parseHeader() really does happen. So in case there is ever code in llvm that directly calls createMachOObjectFile() this error message will be correctly produced. The second error message of "load commands extends past the end of the file" is triggered by a number of existing tests cases in test/Object/macho-invalid.test . Also other tests trigger different error messages now like "ilocalsym plus nlocalsym in LC_DYSYMTAB load command extends past the end of the symbol table". There are two existing test cases that still get the "Invalid data was encountered ..." error messages that I will tackle next. But they will involve a bit of pluming an Expect<...> up through the call stack and I want to do those as separate changes. FYI, for those test cases that were trying to test specific errors that now get different errors I’ll fix those in follow on changes and create new test cases for those so they test the error they were meant to test. llvm-svn: 266248	2016-04-13 21:17:58 +00:00
JF Bastien	c9dcb33a23	NFC mergefunc: const correctness Some of the comparators were const others weren't making it annoying to add new comparators which call existing ones. llvm-svn: 266247	2016-04-13 21:12:21 +00:00

1 2 3 4 5 ...

130039 Commits