llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2025-02-01 05:01:59 +01:00

Author	SHA1	Message	Date
Hal Finkel	c9890f4fe1	[BDCE] Add a bit-tracking DCE pass BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the "aggressive DCE" pass), with the added capability to track dead bits of integer valued instructions and remove those instructions when all of the bits are dead. Currently, it does not actually do this all-bits-dead removal, but rather replaces the instruction's uses with a constant zero, and lets instcombine (and the later run of ADCE) do the rest. Because we essentially get a run of ADCE "for free" while tracking the dead bits, we also do what ADCE does and removes actually-dead instructions as well (this includes instructions newly trivially dead because all bits were dead, but not all such instructions can be removed). The motivation for this is a case like: int __attribute__((const)) foo(int i); int bar(int x) { x |= (4 & foo(5)); x |= (8 & foo(3)); x |= (16 & foo(2)); x |= (32 & foo(1)); x |= (64 & foo(0)); x |= (128& foo(4)); return x >> 4; } As it turns out, if you order the bit-field insertions so that all of the dead ones come last, then instcombine will remove them. However, if you pick some other order (such as the one above), the fact that some of the calls to foo() are useless is not locally obvious, and we don't remove them (without this pass). I did a quick compile-time overhead check using sqlite from the test suite (Release+Asserts). BDCE took ~0.4% of the compilation time (making it about twice as expensive as ADCE). I've not looked at why yet, but we eliminate instructions due to having all-dead bits in: External/SPEC/CFP2006/447.dealII/447.dealII External/SPEC/CINT2006/400.perlbench/400.perlbench External/SPEC/CINT2006/403.gcc/403.gcc MultiSource/Applications/ClamAV/clamscan MultiSource/Benchmarks/7zip/7zip-benchmark llvm-svn: 229462	2015-02-17 01:36:59 +00:00
Hal Finkel	0e48ce380c	Specify arch in test/CodeGen/X86/float-conv-elim.ll This test was failing on non-x86 hosts because it specified a cpu of x86_64, but not an architecture. x86_64 is obviously not a valid cpu on all architectures. llvm-svn: 229460	2015-02-17 00:11:19 +00:00
Hal Finkel	a9011331c4	[PowerPC] Support non-direct-sub/superclass VSX copies Our register allocation has become better recently, it seems, and is now starting to generate cross-block copies into inflated register classes. These copies are not transformed into subregister insertions/extractions by the PPCVSXCopy class, and so need to be handled directly by PPCInstrInfo::copyPhysReg. The code to do this was almost there, but not quite: it was unnecessarily restricting itself to only the direct sub/super-register-class case (not copying between, for example, something in VRRC and the lower-half of VSRC which are super-registers of F8RC). Triggering this behavior manually is difficult; I'm including two bugpoint-reduced test cases from the test suite. llvm-svn: 229457	2015-02-16 23:46:30 +00:00
Cameron McInally	857d405820	[AVX512] Make 512b vector floating point rounds legal on AVX512. llvm-svn: 229445	2015-02-16 22:15:42 +00:00
Simon Pilgrim	e9778d75a1	[X86][SSE] Add SSE MOVQ instructions to SSEPackedInt domain Patch to explicitly add the SSE MOVQ (rr,mr,rm) instructions to SSEPackedInt domain - prevents a number of costly domain switches. Differential Revision: http://reviews.llvm.org/D7600 llvm-svn: 229439	2015-02-16 21:50:56 +00:00
Mehdi Amini	d44ae390cb	SelectionDAG: fold (fp_to_u/sint (s/uint_to_fp)) here too Update SPARC tests to match. From: Fiona Glaser <fglaser@apple.com> llvm-svn: 229438	2015-02-16 21:47:58 +00:00
Mehdi Amini	ce11b626e7	InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val)) Fixes radar 15486701. From: Fiona Glaser <fglaser@apple.com> llvm-svn: 229437	2015-02-16 21:47:54 +00:00
Mehdi Amini	1468b597ea	Tests: reformat sitofp.ll and use FileCheck From: Fiona Glaser <fglaser@apple.com> llvm-svn: 229436	2015-02-16 21:47:50 +00:00
Craig Topper	b6e168f770	[X86] Remove the multiply by 8 that goes into the shift constant for X86ISD::VSHLDQ and X86ISD::VSRLDQ. This simplifies the pattern matching in isel and allows these nodes to become the patterns embedded in the instruction. llvm-svn: 229431	2015-02-16 20:52:07 +00:00
David Majnemer	4da38e22ad	ConstantFold: Properly fold GEP indices wider than i64 llvm-svn: 229420	2015-02-16 19:10:02 +00:00
Andrew Trick	e7964c82c7	AArch64: Safely handle the incoming sret call argument. This adds a safe interface to the machine independent InputArg struct for accessing the index of the original (IR-level) argument. When a non-native return type is lowered, we generate the hidden machine-level sret argument on-the-fly. Before this fix, we were representing this argument as OrigArgIndex == 0, which is an outright lie. In particular this crashed in the AArch64 backend where we actually try to access the type of the original argument. Now we use a sentinel value for machine arguments that have no original argument index. AArch64, ARM, Mips, and PPC now check for this case before accessing the original argument. Fixes <rdar://19792160> Null pointer assertion in AArch64TargetLowering llvm-svn: 229413	2015-02-16 18:10:47 +00:00
James Molloy	317bb7473f	[LoopReroll] Relax some assumptions a little. We won't find a root with index zero in any loop that we are able to reroll. However, we may find one in a non-rerollable loop, so bail gracefully instead of failing hard. llvm-svn: 229406	2015-02-16 17:02:00 +00:00
James Molloy	382c1caece	[LoopReroll] Don't crash on dead code If a PHI has no users, don't crash; bail gracefully. This shouldn't happen often, but we can make no guarantees that previous passes didn't leave dead code around. llvm-svn: 229405	2015-02-16 17:01:52 +00:00
Chandler Carruth	d44ede78e3	[x86] Add a generic unpack-targeted lowering technique. This can be used to generically lower blends and is particularly nice because it is available from SSE2 onward. This removes a lot of the remaining domain crossing blends in SSE2 code. I'm hoping to replace some of the "interleaved" lowering hacks with something closer to this which should be more principled. First, this needs to learn how to detect and use other interleavings besides that of the natural type provided. That will be a follow-up patch though. llvm-svn: 229378	2015-02-16 12:28:18 +00:00
Chandler Carruth	3816521c2e	[x86] Switch this test to use checks generated by my update script. NFC llvm-svn: 229377	2015-02-16 12:23:22 +00:00
Michael Kuperstein	70d8ac0a8e	Fix quoting of #pragma comment for MS compat, LLVM part. For #pragma comment(linker, ...) MSVC expects the comment string to be quoted, but for #pragma comment(lib, ...) the compiler itself quotes the library name. Since this distinction disappears by the time the directive reaches the backend, move quoting for the "lib" version to the frontend. Differential Revision: http://reviews.llvm.org/D7652 llvm-svn: 229375	2015-02-16 11:57:17 +00:00
Chandler Carruth	358c1db65e	[x86] Add initial basic support for forming blends of v16i8 vectors. This blend instruction is ... really lame. The register usage is insane. As a consequence this is probably only barely better than 2 pshufbs followed by a por, and that mostly because it only has to read from a single memory location. However, this doesn't fix as much as I kind of expected, so more to go. Pretty sure that the ordering and delegation of v16i8 is just really, really bad. llvm-svn: 229373	2015-02-16 10:58:23 +00:00
Chandler Carruth	db7a8ca276	[x86] Add some more test cases for i8 vector blends. llvm-svn: 229372	2015-02-16 10:51:49 +00:00
David Majnemer	64cd3dbda5	IR: SrcTy == DstTy doesn't imply that a cast is valid Cast validity depends on the cast's kind, not just its types. llvm-svn: 229366	2015-02-16 09:37:35 +00:00
David Majnemer	4f5d97ee4f	AsmParser: extractvalue requires at least one index operand llvm-svn: 229365	2015-02-16 09:18:13 +00:00
David Majnemer	7f40c08dca	AsmParser: Make sure GlobalVariables have sane types llvm-svn: 229364	2015-02-16 08:41:08 +00:00
David Majnemer	b5464fbff9	AsmParser: Reject alloca with function type llvm-svn: 229363	2015-02-16 08:38:03 +00:00
David Majnemer	9580d6a824	Verifier: Diagnose module flags which have null ID operands llvm-svn: 229361	2015-02-16 08:14:22 +00:00
Craig Topper	b3a29e8067	[X86] Add support for lowering shuffles to 256-bit PALIGNR instruction. llvm-svn: 229359	2015-02-16 06:29:06 +00:00
Craig Topper	988e9c859c	[X86] Remove some hard tab characters from tests. llvm-svn: 229358	2015-02-16 06:29:02 +00:00
David Majnemer	3ae73b8e74	DebugInfo: Don't crash if 'Debug Info Version' has a strange value llvm-svn: 229356	2015-02-16 06:04:53 +00:00
David Majnemer	271992a42e	DataLayout: Validate that the pref alignment is at least the ABI align llvm-svn: 229355	2015-02-16 05:41:55 +00:00
David Majnemer	18f3685387	DataLayout: Report when the datalayout type alignment/width is too large llvm-svn: 229354	2015-02-16 05:41:53 +00:00
David Majnemer	2b452a1df4	IR: Properly return nullptr when getAggregateElement is out-of-bounds We didn't properly handle the out-of-bounds case for ConstantAggregateZero and UndefValue. This would manifest as a crash when the constant folder was asked to fold a load of a constant global whose struct type has no operands. This fixes PR22595. llvm-svn: 229352	2015-02-16 04:02:09 +00:00
Chandler Carruth	a34a4a834e	[x86] Teach the 128-bit vector shuffle lowering routines to take advantage of the existence of a reasonable blend instruction. The 256-bit vector shuffle lowering has leveraged the general technique of decomposed shuffles and blends for quite some time, but this never made it back into the 128-bit code, and there are a large number of patterns where this is substantially better. For example, this removes almost all domain crossing in vector shuffles that involve some blend and some permutation with SSE4.1 and later. See the massive reduction in 'shufps' for integer test cases in this commit. This isn't perfect yet for a few reasons: 1) The v8i16 shuffle lowering continues to plague me. We don't always form an unpack-based blend when that would be better. But the wins pretty drastically outstrip the losses here. 2) The v16i8 shuffle lowering is just a disaster here. I never went and implemented blend support here for some terrible reason. I'll do that next probably. I've not updated it for now. More variations on this technique are coming as well -- we don't shuffle-into-unpack or shuffle-into-palignr, both of which would also be profitable. Note that some test cases grow significantly in the number of instructions, but I expect to actually be faster. We use pshufd+pshufd+blendw instead of a single shufps, but the pshufd's are very likely to pipeline well (two ports on most modern intel chips) and the blend is a very fast instruction. The domain switch penalty will essentially always be more than a blend instruction, which is the only increase in tree height. llvm-svn: 229350	2015-02-16 01:52:02 +00:00
Chandler Carruth	572fa3dbba	[x86] Clean up a few test cases with the update script. NFC llvm-svn: 229349	2015-02-16 01:39:50 +00:00
Filipe Cabecinhas	e4564d63bb	[Bitcode reader] Fix a few assertions when reading invalid files Summary: When creating {insert,extract}value instructions from a BitcodeReader, we weren't verifying the fields were valid. Bugs found with afl-fuzz Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7325 llvm-svn: 229345	2015-02-16 00:03:11 +00:00
Simon Pilgrim	9849f0e2cd	Added (still inefficient) shuffle test case for PR21138 llvm-svn: 229321	2015-02-15 18:21:39 +00:00
Simon Pilgrim	dc017ce5f6	Added some test cases of missed opportunities to use unpckl/unpckh shuffles llvm-svn: 229313	2015-02-15 15:07:45 +00:00
Simon Pilgrim	ddbf019542	[X86][AVX2] vpslldq/vpsrldq byte shifts for AVX2 This patch refactors the existing lowerVectorShuffleAsByteShift function to add support for 256-bit vectors on AVX2 targets. It also fixes a tablegen issue that prevented the lowering of vpslldq/vpsrldq vec256 instructions. Differential Revision: http://reviews.llvm.org/D7596 llvm-svn: 229311	2015-02-15 13:19:52 +00:00
Chandler Carruth	eb48186ef8	[x86] Add the test case from PR22412, we now get this right even with the new vector shuffle legality. llvm-svn: 229310	2015-02-15 12:45:05 +00:00
Chandler Carruth	2fac6b1c98	[x86] Teach the decomposed shuffle/blend lowering to use an early blend when that will allow it to lower with a single permute instead of multiple permutes. It tries to detect when it will only have to do a single permute in either case to maximize folding of loads and such. This cuts a lot of the avx2 shuffle permute counts in half. =] llvm-svn: 229309	2015-02-15 12:42:15 +00:00
Chandler Carruth	0b684c6980	[SDAG] Teach the SelectionDAG to canonicalize vector shuffles of splats directly into blends of the splats. These patterns show up even very late in the vector shuffle lowering where we don't have any chance for DAG combining to kick in, and blending is a tremendously simpler operation to model. By coercing the shuffle into a blend we can much more easily match and lower shuffles of splats. Immediately with this change there are significantly more blends being matched in the x86 vector shuffle lowering. llvm-svn: 229308	2015-02-15 12:18:12 +00:00
Chandler Carruth	1d8146cec4	[x86] Stop shuffling zero vectors. =] I was somewhat surprised this pattern really came up, but it does. It seems better to just directly handle it than try to special case every place where we end up forming a shuffle that devolves to a shuffle of a zero vector. llvm-svn: 229301	2015-02-15 10:34:52 +00:00
Chandler Carruth	5c0c778648	[x86] When splitting 256-bit vectors into 128-bit vectors, don't extract subvectors from buildvectors. That doesn't really make any sense and it breaks all of the down-stream matching of buildvectors to cleverly lower shuffles. With this, we now get the shift-based lowering of 256-bit vector shuffles with AVX1 when we split them into 128-bit vectors. We also do much better on the zero-extension patterns, although there remains quite a bit of room for improvement here. llvm-svn: 229299	2015-02-15 10:12:02 +00:00
Michael Kuperstein	c9049f9057	gold-plugin: fix test to allow default visibility on local symbols GNU ld sets default, not hidden, visibility on local symbols. Having default or hidden visibility on local symbols makes no difference in run-time behavior. Patch by: H.J. Lu <hjl.tools@gmail.com> llvm-svn: 229297	2015-02-15 09:32:30 +00:00
Chandler Carruth	65b90c2fa8	[x86] Update some tests with the latest version of my script and llc. This mostly adds some shuffle decode comments and cleans up indentation. llvm-svn: 229296	2015-02-15 09:26:15 +00:00
Chandler Carruth	8b98b98e16	[x86] Add a slight variation on some of the other generic shuffle lowerings -- one which decomposes into an initial blend followed by a permute. Particularly on newer chips, blends are handled independently of shuffles and so this is much less bottlenecked on the single port that floating point shuffles are executed with on Intel. I'll be adding this lowering to a bunch of other code paths in subsequent commits to handle still more places where we can effectively leverage blends when they're available in the ISA. llvm-svn: 229292	2015-02-15 08:26:30 +00:00
Craig Topper	be42a00218	[X86] Add assembly parser support for mnemonic aliases for AVX-512 vpcmp instructions. llvm-svn: 229287	2015-02-15 07:13:48 +00:00
Chandler Carruth	85a4ad9fb4	[x86] Add a test case for PR22390 which was a dup of PR22377 and fixed by r229285. This is a nice different test case though, so I'd like to have the extra testing of these kinds of patterns. llvm-svn: 229286	2015-02-15 07:05:50 +00:00
Chandler Carruth	6153ae3921	[x86] Fix PR22377, a regression with the new vector shuffle legality test. This was just a matter of the DAG combine for vector shuffles being too aggressive. This is a bit of a grey area, but I think generally if we can re-use intermediate shuffles, we should. Certainly, given the test cases I have available, this seems like the right call. llvm-svn: 229285	2015-02-15 07:01:10 +00:00
Chandler Carruth	635ad2f50d	[x86] Switch a collection of tests explicitly to the new vector shuffle legality test (essentially, everything is legal). I'm planning to make this the default shortly, but I'd like to fix a collection of the bugs it exposes first, and this will let me easily test them. It also showcases both the improvements and a few of the regressions triggered by the change. The biggest improvements by far are the significantly reduced shuffling and domain crossing in the combining test case. The biggest regressions are missing some clever blending patterns. llvm-svn: 229284	2015-02-15 06:37:21 +00:00
Chandler Carruth	83f63dfef3	[x86] Remove the now-default-on flag for the new vector shuffle lowering strategy from a bunch of tests. llvm-svn: 229283	2015-02-15 06:20:51 +00:00
Craig Topper	e9ad59aeaf	[X86] Add assembler predicates for the rest of the AVX512 feature flags. This makes the assembly matching consistent across all AVX512 instructions. Without this we were allowing some AVX512 instructions to be parsed always, but not the foundation instructions. llvm-svn: 229280	2015-02-15 04:54:55 +00:00
David Blaikie	9f95d71989	FileCheck-ize a test to make it easier to migrate to typeless pointers llvm-svn: 229278	2015-02-15 04:14:00 +00:00
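
The motivating case in the BDCE commit (c9890f4fe1) can be worked through by hand. The following is a minimal sketch, not the pass itself: `foo` is given a hypothetical body so the code can run, `x` is made `unsigned` so the shift is well defined, and the helper names `demanded_through_lshr`, `insertion_is_dead` and `bar_pruned` are illustrative assumptions, not LLVM APIs. It shows that a right shift by 4 never reads bits 0 to 3 of its operand, so the insertions masked with 4 and 8 cannot affect the result.

```c
#include <stdint.h>

/* Hypothetical stand-in for the side-effect-free foo() declared in the
   commit message; any pure function would do. */
int foo(int i) { return (i * 0x55) ^ 0x3c; }

/* The function from the commit message (x made unsigned, an assumption
   of this sketch, so that the right shift is well defined). */
unsigned bar(unsigned x) {
    x |= (4 & foo(5));
    x |= (8 & foo(3));
    x |= (16 & foo(2));
    x |= (32 & foo(1));
    x |= (64 & foo(0));
    x |= (128 & foo(4));
    return x >> 4;
}

/* Bits of the operand that can influence (operand >> amount), given the
   bits of the result that are demanded. Assumes amount < 32. */
uint32_t demanded_through_lshr(uint32_t demanded_result, unsigned amount) {
    return demanded_result << amount;
}

/* An insertion x |= (mask & v) contributes nothing when none of the
   mask's bits are demanded downstream. */
int insertion_is_dead(uint32_t mask, uint32_t demanded) {
    return (mask & demanded) == 0;
}

/* bar() with the two dead insertions (masks 4 and 8) removed by hand. */
unsigned bar_pruned(unsigned x) {
    x |= (16 & foo(2));
    x |= (32 & foo(1));
    x |= (64 & foo(0));
    x |= (128 & foo(4));
    return x >> 4;
}
```

With every result bit demanded, `demanded_through_lshr(0xFFFFFFFFu, 4)` leaves bits 0 to 3 undemanded, so `insertion_is_dead` holds for masks 4 and 8 but not for 16 and above, and `bar` and `bar_pruned` agree on every input.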

28613 Commits