llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-23 19:23:23 +01:00

Author	SHA1	Message	Date
Chandler Carruth	a6a8ccfa74	[x86] Add some tests for specific patterns of lane-flips combined with in-lane shuffles that aren't always handled well by the current vector shuffle lowering. No functionality change yet, that will follow in a subsequent commit. llvm-svn: 221938	2014-11-13 22:49:44 +00:00
Rafael Espindola	77ea46e343	Move calls to push_back out of readAbbreviated(Literal|Field). These functions always return a single value and not all callers want to push them into a SmallVector. llvm-svn: 221934	2014-11-13 22:29:02 +00:00
Reid Kleckner	aa6f2355ba	Avoid usage of char16_t as MSVC "14" doesn't appear to support it Fixes the MSVC "14" build. llvm-svn: 221932	2014-11-13 22:09:56 +00:00
David Blaikie	9aaa1dd2e1	Fix nested namespace with decltype to hopefully work with MSVC Build failed here: http://lab.llvm.org:8011/builders/lld-x86_64-win7/builds/14629/steps/build_Lld/logs/stdio So I'm taking a shot in the dark that MSVC (whatever version that is) can't cope with nested name specifiers with a decltype prefix. llvm-svn: 221931	2014-11-13 21:56:57 +00:00
Rafael Espindola	21641494c7	Make a few helper functions static. NFC. llvm-svn: 221930	2014-11-13 21:54:59 +00:00
David Blaikie	6d609b52ab	Use unique_ptr to handle ownership of TreePatterns in CodeGenDAGPatterns::PatternFragments We might be able to use unique_ptr to handle ownership of the TreePatternNodes too - looking into that next. llvm-svn: 221928	2014-11-13 21:40:02 +00:00
Aditya Nandakumar	4d9c1ff994	We can get the TLOF from the TargetMachine - so constructor no longer requires TargetLoweringObjectFile to be passed. llvm-svn: 221926	2014-11-13 21:29:21 +00:00
Chad Rosier	9c48d1f2b8	[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE. Phabricator Revision: http://reviews.llvm.org/D6103 Patch by "Balaram Makam" <bmakam@codeaurora.org>! llvm-svn: 221924	2014-11-13 21:17:58 +00:00
Juergen Ributzka	fb67fff9cb	[FastISel][AArch64] Don't bail during simple GEP instruction selection. The generic FastISel code would bail, because it can't emit a sign-extend for AArch64. This copies the code over and uses AArch64 specific emit functions. This is not ideal and 'computeAddress' should handle this, so it can fold the address computation into the memory operation. I plan to clean up 'computeAddress' anyway, so I will add that in a future commit. Related to rdar://problem/18962471. llvm-svn: 221923	2014-11-13 20:50:44 +00:00
Matt Arsenault	d222be0dd7	R600/SI: Use s_movk_i32 llvm-svn: 221922	2014-11-13 20:44:23 +00:00
Matt Arsenault	95ce2351ba	R600/SI: Fix definition for s_cselect_b32 These were directly using the old base instruction class, and specifying the wrong register classes for operands. The operands can be the other special inputs besides SGPRs. The op name was also being directly used for the asm string, so this was printed without any operands. llvm-svn: 221921	2014-11-13 20:23:36 +00:00
Matt Arsenault	2ef28e2234	R600: Fix assert on empty function If a function is just an unreachable, this would hit a "this is not a MachO target" assertion because of setting HasSubsectionViaSymbols. llvm-svn: 221920	2014-11-13 20:07:40 +00:00
Rui Ueyama	3ed9ac191d	Un-break the big-endian buildbots llvm-svn: 221919	2014-11-13 20:07:06 +00:00
Matt Arsenault	8f277a520f	R600: Error on initializer for LDS. Also give a proper error for other address spaces. llvm-svn: 221917	2014-11-13 19:56:13 +00:00
Matt Arsenault	5487619f0a	R600/SI: Get rid of FCLAMP_SI pseudo It's not necessary. Also use complex patterns to allow src modifier usage. llvm-svn: 221916	2014-11-13 19:49:04 +00:00
David Majnemer	8e7c697444	Object, Mach-O: Refactor and clean code up Don't assert if we can return an error code, reuse existing functionality like is64Bit(). llvm-svn: 221915	2014-11-13 19:48:56 +00:00
Roman Divacky	a2f5d7b42b	Use -Wcast-qual in cmake builds, not only autoconf ones. llvm-svn: 221913	2014-11-13 19:45:27 +00:00
Matt Arsenault	41886925dc	R600/SI: Allow commuting with src2_modifiers llvm-svn: 221911	2014-11-13 19:26:50 +00:00
Matt Arsenault	a919d1ac8d	R600/SI: Allow commuting some 3 op instructions e.g. v_mad_f32 a, b, c -> v_mad_f32 b, a, c This simplifies matching v_madmk_f32. This looks somewhat surprising, but it appears to be OK to do this. We can commute src0 and src1 in all of these instructions, and that's all that appears to matter. llvm-svn: 221910	2014-11-13 19:26:47 +00:00
Rafael Espindola	58a1843cf7	Return word_t from read. This removes the need for a special Read64. llvm-svn: 221909	2014-11-13 18:44:53 +00:00
Tim Northover	8c9fca07bd	ARM: allow constpool entry to be moved to the user's block in all cases. Normally entries can only move to a lower address, but when that wasn't viable, the user's block was considered anyway. Unfortunately, it went via createNewWater which wasn't designed to handle the case where there's already an island after the block. Unfortunately, the test we have is slow and fragile, and I couldn't reduce it to anything sane even with the @llvm.arm.space intrinsic. The test change here is recreating the previous one after the change. rdar://problem/18545506 llvm-svn: 221905	2014-11-13 17:58:53 +00:00
Tim Northover	a52b6a893f	ARM: avoid duplicating branches during constant islands. We were using a naive heuristic to determine whether a basic block already had an unconditional branch at the end. This mostly corresponded to reality (assuming branches got optimised) because there's not much point in a branch to the next block, but could go wrong. llvm-svn: 221904	2014-11-13 17:58:51 +00:00
Tim Northover	5807daa475	ARM: add @llvm.arm.space intrinsic for testing ConstantIslands. Creating tests for the ConstantIslands pass is very difficult, since it depends on precise layout details. Having the ability to precisely inject a number of bytes into the stream helps greatly. llvm-svn: 221903	2014-11-13 17:58:48 +00:00
Rafael Espindola	671a029836	Fix the other build system. llvm-svn: 221901	2014-11-13 17:12:19 +00:00
Rafael Espindola	c534aef78e	Fix a regression on the disassembling C API. The fix is easy. Unfortunately, we had 0 tests, so adding one was somewhat complicated. Thanks to Kevin Enderby for the report. llvm-svn: 221899	2014-11-13 16:52:07 +00:00
Colin LeMahieu	00e705c691	[Hexagon] NFC Renaming reserved identifier. llvm-svn: 221898	2014-11-13 16:36:30 +00:00
Chad Rosier	164ded07a4	[Reassociate] Update comment. NFC. llvm-svn: 221894	2014-11-13 15:40:20 +00:00
Rafael Espindola	d5e7318740	Simplify code a bit. NFC. Thanks to Sean Silva for the suggestion. llvm-svn: 221892	2014-11-13 14:45:22 +00:00
Rafael Espindola	84a83a3faa	Small optimization: once the size is known, we don't have to call fillCurWord. llvm-svn: 221891	2014-11-13 14:37:51 +00:00
Aaron Ballman	f9b64b77a6	Fixing -Wtype-limits warnings with the asserts (the expression would always evaluate to true). Also fixing a -Wcast-qual warning, where the cast expression isn't required. llvm-svn: 221888	2014-11-13 13:55:13 +00:00
Aaron Ballman	f737ce7028	Fixing some sign comparison warnings from MSVC; NFC. llvm-svn: 221887	2014-11-13 13:39:49 +00:00
Duncan P. N. Exon Smith	90fbefa579	IR: Create the Metadata class This will become the root of a new class hierarchy separate from `Value`. As a first step, stick it between `Value` and `MDNode`. This is part of PR21532. llvm-svn: 221886	2014-11-13 13:17:47 +00:00
Elena Demikhovsky	9da1df2e58	AVX-512: SINT_TO_FP cost model and some bugfixes Checked some corner cases, for example translation of <8 x i1> to <8 x double> llvm-svn: 221883	2014-11-13 11:46:16 +00:00
David Majnemer	0b6910895b	Object, COFF: Refactor code to get relocation iterators No functional change intended. llvm-svn: 221880	2014-11-13 09:50:18 +00:00
Hal Finkel	d9aee5d51a	OCAMLFLAGS can contain =, don't use = with sed Like HOST_LDFLAGS, etc. OCAMLFLAGS can contain =, so use ! as the substitution separator instead of = (otherwise, sed might error). llvm-svn: 221879	2014-11-13 09:29:30 +00:00
Aditya Nandakumar	b93fb292df	This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine so that different subtargets could share the TLOF effectively llvm-svn: 221878	2014-11-13 09:26:31 +00:00
Hal Finkel	f6f1c7d61b	Revert r219432 - "Revert "[BasicAA] Revert "Revert r218714 - Make better use of zext and sign information.""" Let's try this again... This reverts r219432, plus a bug fix. Description of the bug in r219432 (by Nick): The bug was using AllPositive to break out of the loop; if the loop break condition i != e is changed to i != e && AllPositive then the test_modulo_analysis_with_global test I've added will fail as the Modulo will be calculated incorrectly (as the last loop iteration is skipped, so Modulo isn't updated with its Scale). Nick also adds this comment: ComputeSignBit is safe to use in loops as it takes into account phi nodes, and the == EK_ZeroEx check is safe in loops as, no matter how the variable changes between iterations, zero-extensions will always guarantee a zero sign bit. The isValueEqualInPotentialCycles check is therefore definitely not needed as all the variable analysis holds no matter how the variables change between loop iterations. And this patch also adds another enhancement to GetLinearExpression - basically to convert ConstantInts to Offsets (see test_const_eval and test_const_eval_scaled for the situations this improves). Original commit message: This reverts r218944, which reverted r218714, plus a bug fix. Description of the bug in r218714 (by Nick): The original patch forgot to check if the Scale in VariableGEPIndex flipped the sign of the variable. The BasicAA pass iterates over the instructions in the order they appear in the function, and so BasicAliasAnalysis::aliasGEP is called with the variable it first comes across as parameter GEP1. Adding a %reorder label puts the definition of %a after %b so aliasGEP is called with %b as the first parameter and %a as the second. 
aliasGEP later calculates that %a == %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) - ignoring that %idxprom is scaled by -1 here led the patch to incorrectly conclude that %a > %b. Revised patch by Nick White, thanks! Thanks to Lang for isolating the bug. Slightly modified by me to add an early exit from the loop and avoid unnecessary, but expensive, function calls. Original commit message: Two related things: 1. Fixes a bug when calculating the offset in GetLinearExpression. The code previously used zext to extend the offset, so negative offsets were converted to large positive ones. 2. Enhance aliasGEP to deduce that, if the difference between two GEP allocations is positive and all the variables that govern the offset are also positive (i.e. the offset is strictly after the higher base pointer), then locations that fit in the gap between the two base pointers are NoAlias. Patch by Nick White! llvm-svn: 221876	2014-11-13 09:16:54 +00:00
David Majnemer	7366e9380f	Object, COFF: Increase code reuse Split getObject's smarts into checkOffset, use this to replace the handwritten check in getSectionContents. Similarly, replace checks in section_rel_begin/section_rel_end with getNumberOfRelocations. No functionality change intended. llvm-svn: 221873	2014-11-13 08:46:37 +00:00
David Majnemer	4c60a9e21b	llvm-readobj: relocAddressLess could potentially lie On error conditions, relocAddressLess might claim that a value is less than itself. Instead, abort llvm-readobj. No functionality change intended. llvm-svn: 221872	2014-11-13 07:54:05 +00:00
David Majnemer	0c18289151	llvm-readobj, COFF: Remove an unused variable printRelocation doesn't use the section contents. No functionality change intended. llvm-svn: 221871	2014-11-13 07:42:13 +00:00
David Majnemer	854aeb4e3e	Object, COFF: getRelocationSymbol shouldn't assert lib/Object is supposed to be robust to malformed object files. Don't assert if we don't have a symbol table. I'll try to come up with a test case later. llvm-svn: 221870	2014-11-13 07:42:11 +00:00
David Majnemer	c6acbb53c8	Object, COFF: Cleanup some code in getSectionName Use StringRef::startswith to tidy up some code, no functionality change intended. llvm-svn: 221869	2014-11-13 07:42:09 +00:00
David Majnemer	249b5acb93	Object, COFF: Fix some theoretical bugs getObject didn't consider the case where a pointer came before the start of the object file. No test is included, trying to come up with something reasonable. llvm-svn: 221868	2014-11-13 07:42:07 +00:00
David Majnemer	996caa7a82	Object, COFF: Clean up formatting in hasExtendedRelocations No functionality change intended. llvm-svn: 221867	2014-11-13 07:42:05 +00:00
Rafael Espindola	ad6b4ce7dc	Read 64 bits at a time in the bitcode reader. The reading of 64 bit values could still be optimized, but at least this cuts down on the number of virtual calls to fetch more data. llvm-svn: 221865	2014-11-13 07:23:22 +00:00
NAKAMURA Takumi	477bb4ae3d	Update \param(s) in MemoryObject::readBytes(). [-Wdocumentation] llvm-svn: 221863	2014-11-13 04:56:41 +00:00
Chandler Carruth	2a4813287b	[x86] Teach the vector shuffle lowering to make a more nuanced decision between splitting a vector into 128-bit lanes and recombining them vs. decomposing things into single-input shuffles and a final blend. This handles a large number of cases in AVX1 where the cross-lane shuffles would be much more expensive to represent even though we end up with a fast blend at the root. Instead, we can do a better job of shuffling in a single lane and then inserting it into the other lanes. This fixes the remaining bits of Halide's regression captured in PR21281 for AVX1. However, the bug persists in AVX2 because I've made this change reasonably conservative. The cases where it makes sense in AVX2 to split into 128-bit lanes are much more rare because we can often do full permutations across all elements of the 256-bit vector. However, the particular test case in PR21281 is an example of one of the rare cases where it is always better to work in a single 128-bit lane. I'm going to try to teach the logic to detect and form the good code even in AVX2 next, but it will need to use a separate heuristic. Finally, there is one pesky regression here where we previously would craftily use vpermilps in AVX1 to shuffle both high and low halves at the same time. We no longer pull that off, and not for any really good reason. Ultimately, I think this is just another missing nuance to the selection heuristic that I'll try to add in afterward, but this change already seems strictly worth doing considering the magnitude of the improvements in common matrix math shuffle patterns. As always, please let me know if this causes a surprising regression for you. llvm-svn: 221861	2014-11-13 04:06:10 +00:00
Rui Ueyama	c478cf6240	llvm-readobj: Print out address table when dumping COFF delay-import table llvm-svn: 221855	2014-11-13 03:22:54 +00:00
Frederic Riss	67f2855174	Add an assert and a test that verify r221709's fix. llvm-svn: 221854	2014-11-13 03:20:23 +00:00
Chandler Carruth	d0c20aee06	[x86] Don't form overly fragmented blends when splitting and re-combining shuffles because nothing was available in the wider vector type. The key observation (which I've put in the comments for future maintainers) is that at this point, no further combining is really possible. And so even though these shuffles trivially could be combined, we need to actually do that as we produce them when producing them this late in the lowering. This fixes another (huge) part of the Halide vector shuffle regressions. As it happens, this was already well covered by the tests, but I hadn't noticed how bad some of these got. The specific patterns that turn directly into unpckl/h patterns were occurring many times in common vector processing code. There are still more problems here sadly, but I'm trying to incrementally tease them apart and it looks like this is the core of the problem in the splitting logic. There is some chance of regression here, you can see it in the test changes. Specifically, where we stop forming pshufb in some cases, it is possible that pshufb was in fact faster. Intel "says" that pshufb is slower than the instruction sequences replacing it. llvm-svn: 221852	2014-11-13 02:42:08 +00:00
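
Commit f6f1c7d61b (llvm-svn: 221876) notes that GetLinearExpression previously used zext to extend the offset, so negative offsets were converted to large positive ones. The following is a minimal Python sketch of that effect, assuming a 32-bit offset widened to a larger integer. The helper names `zext` and `sext` are illustrative only and are not LLVM APIs.

```python
def zext(value: int, from_bits: int) -> int:
    """Zero-extend: keep the low from_bits bits and fill the upper bits with 0."""
    return value & ((1 << from_bits) - 1)


def sext(value: int, from_bits: int) -> int:
    """Sign-extend: replicate bit (from_bits - 1) into the upper bits."""
    low = value & ((1 << from_bits) - 1)
    sign_bit = 1 << (from_bits - 1)
    return low - (1 << from_bits) if low & sign_bit else low


offset = -4  # a negative offset held in 32 bits (assumed width)
print(zext(offset, 32))  # 4294967292: the sign is lost
print(sext(offset, 32))  # -4: the sign is preserved
```

Any analysis that compares offsets after zero-extension would see the negative offset as a very large positive one, which is the kind of miscalculation the commit message describes.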

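Commit 4c60a9e21b (llvm-svn: 221872) says that on error conditions relocAddressLess might claim that a value is less than itself. The sketch below illustrates why that matters for a "less than" comparator. It does not reproduce the llvm-readobj code: `get_address`, `less_buggy`, and `less_strict` are hypothetical names, and raising an exception merely stands in for aborting the tool.

```python
from typing import Optional


def get_address(reloc: dict) -> Optional[int]:
    """Hypothetical lookup: returns None when the address cannot be read."""
    return reloc.get("address")


def less_buggy(a: dict, b: dict) -> bool:
    """Treats a failed lookup as 'less', so less_buggy(x, x) can be True."""
    addr_a, addr_b = get_address(a), get_address(b)
    if addr_a is None or addr_b is None:
        return True
    return addr_a < addr_b


def less_strict(a: dict, b: dict) -> bool:
    """Refuses to compare on error instead of returning a misleading answer."""
    addr_a, addr_b = get_address(a), get_address(b)
    if addr_a is None or addr_b is None:
        raise ValueError("cannot read relocation address")
    return addr_a < addr_b


broken = {"name": "reloc_without_address"}
print(less_buggy(broken, broken))  # True: an element compares less than itself
```

A comparator used for sorting is expected to be irreflexive, so failing loudly on the error path is one way to avoid handing a sort routine an inconsistent ordering.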

109884 Commits