llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 04:52:54 +02:00

Author	SHA1	Message	Date
Chandler Carruth	72c329e5b3	[SROA] Fix two total think-os in r225061 that should have been caught on a +asserts bootstrap, but my bootstrap had asserts off. Oops. Anyways, in some places it is reasonable to cast (as a sanity check) the pointer operand to a load or store to an instruction within SROA -- namely when the pointer operand is expected to be derived from an alloca, and thus always an instruction. However, the pre-splitting code also deals with loads and stores to non-alloca pointers and there we need to just use the Value*. Nothing about the code relied on the instruction cast, it was only there essentially as an invariant assertion. Remove the two that don't actually hold. This should fix the proximate issue in PR22080, but I'm also doing an asserts bootstrap myself to see if there are other issues lurking. I'll craft a reduced test case in a moment, but I wanted to get the tree healthy as quickly as possible. llvm-svn: 225068	2015-01-01 23:26:16 +00:00
Hal Finkel	de1be0f87c	[PowerPC] use UINT64_C instead of ul Attempting to fix PR22078 (building on 32-bit systems) by replacing my careless use of 1ul to be a uint64_t constant with UINT64_C(1). llvm-svn: 225066	2015-01-01 19:33:59 +00:00
Michael Gottesman	1ce5923146	Revert "Just use a using directive in SmallMapVector instead of inheriting from MapVector itself." This reverts commit r225059. I think MSVC 2012 has a problem with this. This is an attempt to fix one of the MSVC 2012 bots. llvm-svn: 225065	2015-01-01 13:54:05 +00:00
Chandler Carruth	d5912b0090	Revert r225053: Add an ArrayRef upcasting constructor from ArrayRef<U> -> ArrayRef<T> where T is a base of U. This appears to have broken at least the windows build bots due to compile errors in the predicate that didn't simply suppress the overload. I'm not sure what the fix is, and the bots have been broken for a long time now so I'm just reverting until Michael can figure out a fix. llvm-svn: 225064	2015-01-01 13:01:25 +00:00
Chandler Carruth	b27ce0e2ce	[SROA] Switch to using a more direct debug logging technique in one part of my new load and store splitting, and fix a bug where it logged a totally irrelevant slice rather than the actual slice in question. The logging here previously worked because we used to place new slices onto the back of the core sequence, but that caused other problems. I updated the actual code to store new slices in their own vector but didn't update the logging. There isn't a good way to reuse the logging any more, and frankly it wasn't needed. We can directly log this bit more easily. llvm-svn: 225063	2015-01-01 12:56:47 +00:00
Chandler Carruth	4a7c5492f8	[SROA] Fix formatting with clang-format which I managed to fail to do prior to committing r225061. Sorry for that. llvm-svn: 225062	2015-01-01 12:01:03 +00:00
Chandler Carruth	e46230af0c	[SROA] Teach SROA how to much more intelligently handle split loads and stores. When there are accesses to an entire alloca with an integer load or store as well as accesses to small pieces of the alloca, SROA splits up the large integer accesses. In order to do that, it uses bit math to merge the small accesses into large integers. While this is effective, it produces insane IR that can cause significant problems in the rest of the optimizer: - It can cause load and store mismatches with GVN on the non-alloca side where we end up loading an i64 (or some such) rather than loading specific elements that are stored. - We can't always get rid of the integer bit math, which is why we can't always fix the loads and stores to work well with GVN. - This is especially bad when we have operations that mix poorly with integer bit math such as floating point operations. - It will block things like the vectorizer which might be able to handle the scalar stores that underlie the aggregate. At the same time, we can't just directly split up these loads and stores in all cases. If there is actual integer arithmetic involved on the values, then using integer bit math is actually the perfect lowering because we can often combine it heavily with the surrounding math. The solution this patch provides is to find places where SROA is partitioning aggregates into small elements, and look for splittable loads and stores that it can split all the way to some other adjacent load and store. These are uniformly the cases where failing to split the loads and stores hurts the optimizer that I have seen, and I've looked extensively at the code produced both from more and less aggressive approaches to this problem. However, it is quite tricky to actually do this in SROA. We may have loads and stores to the same alloca, or other complex patterns that are hard to handle. This complexity leads to the somewhat subtle algorithm implemented here.
We have to do this entire process as a separate pass over the partitioning of the alloca, and split up all of the loads prior to splitting the stores so that we can handle safely the cases of overlapping, including partially overlapping, loads and stores to the same alloca. We also have to reconstitute the post-split slice configuration so we can avoid iterating again over all the alloca uses (the slow part of SROA). But we also have to ensure that when we split up loads and stores to other allocas, we do re-iterate over them in SROA to adapt to the more refined partitioning now required. With this, I actually think we can fix a long-standing TODO in SROA where I avoided splitting as many loads and stores as probably should be splittable. This limitation historically mitigated the fallout of all the bad things mentioned above. Now that we have more intelligent handling, I plan to remove the FIXME and more aggressively mark integer loads and stores as splittable. I'll do that in a follow-up patch to help with bisecting any fallout. The net result of this change should be more fine-grained and accurate scalars being formed out of aggregates. At the very least, Clang now generates perfect code for this high-level test case using std::complex<float>: #include <complex> void g1(std::complex<float> &x, float a, float b) { x += std::complex<float>(a, b); } void g2(std::complex<float> &x, float a, float b) { x -= std::complex<float>(a, b); } void foo(const std::complex<float> &x, float a, float b, std::complex<float> &x1, std::complex<float> &x2) { std::complex<float> l1 = x; g1(l1, a, b); std::complex<float> l2 = x; g2(l2, a, b); x1 = l1; x2 = l2; } This code isn't just hypothetical either. It was reduced out of the hot inner loops of essentially every part of the Eigen math library when using std::complex<float>. 
Those loops would consistently and pervasively hop between the floating point unit and the integer unit due to bit math extraction and insertion of floating point values that were "stored" in a 64-bit integer register around the loop backedge. So far, this change has passed a bootstrap and I have done some other testing and so far, no issues. That doesn't mean there won't be though, so I'll be prepared to help with any fallout. If you see performance swings in particular, please let me know. I'm very curious what all the impact of this change will be. Stay tuned for the follow-up to also split more integer loads and stores. llvm-svn: 225061	2015-01-01 11:54:38 +00:00
Michael Gottesman	05fbb043f5	Just use a using directive in SmallMapVector instead of inheriting from MapVector itself. llvm-svn: 225059	2015-01-01 08:05:41 +00:00
Hal Finkel	93997c9aa6	[PowerPC] Improve instruction selection bit-permuting operations (64-bit) This is the second installment of improvements to instruction selection for "bit permutation" instruction sequences. r224318 added logic for instruction selection for 32-bit bit permutation sequences, and this adds lowering for 64-bit sequences. The 64-bit sequences are more complicated than the 32-bit ones because: a) the 64-bit versions of the 32-bit rotate-and-mask instructions work by replicating the lower 32-bits of the value-to-be-rotated into the upper 32 bits -- and integrating this into the cost modeling for the various bit group operations is non-trivial b) unlike the 32-bit instructions in 32-bit mode, the rotate-and-mask instructions cannot, in one instruction, specify the mask starting index, the mask ending index, and the rotation factor. Also, forming arbitrary 64-bit constants is more complicated than in 32-bit mode because the number of instructions necessary is value dependent. Plus, support for 'late masking' was added: it is sometimes more efficient to treat the overall value as if it had no mandatory zero bits when planning the bit-group insertions, and then mask them in at the very end. Unfortunately, as the structure of the bit groups is different in the two cases, the more feasible implementation technique was to generate both instruction sequences, and then pick the shorter one. And finally, we now generate reasonable code for i64 bswap: rldicl 5, 3, 16, 0 rldicl 4, 3, 8, 0 rldicl 6, 3, 24, 0 rldimi 4, 5, 8, 48 rldicl 5, 3, 32, 0 rldimi 4, 6, 16, 40 rldicl 6, 3, 48, 0 rldimi 4, 5, 24, 32 rldicl 5, 3, 56, 0 rldimi 4, 6, 40, 16 rldimi 4, 5, 48, 8 rldimi 4, 3, 56, 0 vs. what we used to produce: li 4, 255 rldicl 5, 3, 24, 40 rldicl 6, 3, 40, 24 rldicl 7, 3, 56, 8 sldi 8, 3, 8 sldi 10, 3, 24 sldi 12, 3, 40 rldicl 0, 3, 8, 56 sldi 9, 4, 32 sldi 11, 4, 40 sldi 4, 4, 48 andi. 5, 5, 65280 andis. 6, 6, 255 andis. 
7, 7, 65280 sldi 3, 3, 56 and 8, 8, 9 and 4, 12, 4 and 9, 10, 11 or 6, 7, 6 or 5, 5, 0 or 3, 3, 4 or 7, 9, 8 or 4, 6, 5 or 3, 3, 7 or 3, 3, 4 which is 12 instructions, instead of 25, and seems optimal (at least in terms of code size). llvm-svn: 225056	2015-01-01 02:53:29 +00:00
Michael Gottesman	2c8ecfdec0	Add 2x constructors for TinyPtrVector, one that takes in one element and the other that takes in an ArrayRef<EltTy> Currently one can only construct an empty TinyPtrVector. These are just missing elements of the API. llvm-svn: 225055	2014-12-31 23:33:24 +00:00
Michael Gottesman	b2e01905b0	Add a SmallMapVector class that is a MapVector with a Map of SmallDenseMap and a Vector of SmallVector. llvm-svn: 225054	2014-12-31 23:33:21 +00:00
Michael Gottesman	166915c263	Add an ArrayRef upcasting constructor from ArrayRef<U> -> ArrayRef<T> where T is a base of U. llvm-svn: 225053	2014-12-31 23:33:18 +00:00
Sanjay Patel	657e61ccbf	InstCombine: fsub nsz 0, X ==> fsub nsz -0.0, X Some day the backend may handle instruction-level fast math flags and make this transform unnecessary, but it's still better practice to use the canonical representation of fneg when possible (use a -0.0). This is a partial fix for PR20870 ( http://llvm.org/bugs/show_bug.cgi?id=20870 ). See also http://reviews.llvm.org/D6723. Differential Revision: http://reviews.llvm.org/D6731 llvm-svn: 225050	2014-12-31 22:14:05 +00:00
Rafael Espindola	13ff8033c2	Add r224985 back with a fix. The issue was that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. Original message: Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 225048	2014-12-31 17:19:34 +00:00
Colin LeMahieu	f2dcb1dbfa	Reverting 225045 and 225043 and XFAIL multiline.ll on hexagon llvm-svn: 225047	2014-12-31 17:14:35 +00:00
Rafael Espindola	96643089c0	Add a test for the recent compiler-rt build failure. llvm-svn: 225046	2014-12-31 16:58:05 +00:00
Colin LeMahieu	ca60c4bd6c	[Hexagon] Removing assertion to appease buildbot until I can reproduce the problem llvm-svn: 225045	2014-12-31 16:20:00 +00:00
Rafael Espindola	afd829c72b	Revert "Remove doesSectionRequireSymbols." This reverts commit r224985. I am investigating why it made an Apple bot unhappy. llvm-svn: 225044	2014-12-31 16:06:48 +00:00
Colin LeMahieu	fe239ffbc6	[Hexagon] Changing an llvm_unreachable to an assertion and returning 0. Relocations aren't implemented yet but we don't need to abort for this in release builds. llvm-svn: 225043	2014-12-31 15:57:38 +00:00
Craig Topper	ec0329dc7b	[X86] Update disassembler tests for absolute move instructions to check the encodings. This provides testing for r225036. 64-bit mode is still broken. llvm-svn: 225037	2014-12-31 07:24:23 +00:00
Craig Topper	f189a728be	[X86] Fix disassembly of absolute moves to work correctly in 16 and 32-bit modes with all 4 combinations of OpSize and AdSize prefixes being present or not. llvm-svn: 225036	2014-12-31 07:07:31 +00:00
Craig Topper	c24a544cda	[x86] Simplify detection of jcxz/jecxz/jrcxz in disassembler. llvm-svn: 225035	2014-12-31 07:07:11 +00:00
David Majnemer	83939d3744	InstCombine: try to transform A-B < 0 into A < B We are allowed to move the 'B' to the right hand side if we can prove there is no signed overflow and if the comparison itself is signed. llvm-svn: 225034	2014-12-31 04:21:41 +00:00
Alexey Samsonov	4caafc0c9a	Revert "merge consecutive stores of extracted vector elements" This reverts commit r224611. This change causes crashes in X86 DAG->DAG Instruction Selection. llvm-svn: 225031	2014-12-31 00:40:28 +00:00
Colin LeMahieu	664727ddb9	[Hexagon] Adding accumulating add/sub, doubleword logic-not variants, doubleword bitfield extract, word parity, accumulating multiplies with saturation. llvm-svn: 225024	2014-12-31 00:08:34 +00:00
David Blaikie	78b05cc1ac	Fix a test case to not depend on asm comment syntax, so as to be portable Too many different comment characters - instead of trying to account for them all, instead disable the comments and just check for end-of-line instead. llvm-svn: 225020	2014-12-30 23:33:55 +00:00
David Blaikie	bcfb368a62	Generalize even further, for ARM comment syntax (@) llvm-svn: 225019	2014-12-30 23:23:58 +00:00
Colin LeMahieu	ce5a9848a5	[Hexagon] Adding double-logic on predicate instructions. llvm-svn: 225018	2014-12-30 23:22:39 +00:00
David Blaikie	ca09b563e4	Generalize test case to handle different asm syntax (# or // comments) llvm-svn: 225017	2014-12-30 23:21:57 +00:00
Colin LeMahieu	d9937c62e9	[Hexagon] Adding newvalue compare and jumps. llvm-svn: 225015	2014-12-30 23:04:21 +00:00
Peter Collingbourne	6ef4919520	RTDyldMemoryManager.cpp: Make the reference to __morestack weak. This fixes the DSO build for now. Eventually we should develop some other mechanism to make this work correctly with DSOs. llvm-svn: 225014	2014-12-30 22:52:33 +00:00
David Blaikie	3ab1d163f5	DebugInfo: Omit is_stmt from line table entries on the same line. GCC does this for non-zero discriminators and since GCC doesn't produce column info, that was the only place it comes up there. For LLVM, since we can emit discriminators and/or column info, it makes more sense to invert the condition and just test for changes in line number. This should resolve at least some of the GDB 7.5 test suite failures created by recent Clang changes that increase the location fidelity (which, since Clang defaults to including column info on Linux by default created a bunch of cases that confused GDB). In theory we could do this better/differently by grouping actual source statements together in a similar manner to the way lexical scopes are handled but given that GDB isn't really in a position to consume that (& users are probably somewhat used to different lines being different 'statements') this seems the safest and cheapest change. (I'm concerned that doing this 'right' would bloat the debugloc data even further - something Duncan's working hard to address) llvm-svn: 225011	2014-12-30 22:47:13 +00:00
Colin LeMahieu	4d12863d57	[Hexagon] Adding postincrement register newvalue stores. llvm-svn: 225010	2014-12-30 22:34:08 +00:00
Colin LeMahieu	e11e421bc5	[Hexagon] Removing old newvalue store variants. Adding postincrement immediate newvalue stores. llvm-svn: 225009	2014-12-30 22:28:31 +00:00
Zoran Jovanovic	a9daa0cdb9	[mips][microMIPS] Relocate with symbol for micromips symbols Differential Revision: http://reviews.llvm.org/D6796 llvm-svn: 225008	2014-12-30 22:04:16 +00:00
Colin LeMahieu	a76ddd9ae4	[Hexagon] Adding indexed store new-value variants. llvm-svn: 225007	2014-12-30 22:00:26 +00:00
Colin LeMahieu	ef54aa0778	[Hexagon] Adding indexed store of immediates. llvm-svn: 225006	2014-12-30 21:01:38 +00:00
Colin LeMahieu	4a47613bb1	[Hexagon] Adding indexed stores. llvm-svn: 225005	2014-12-30 20:42:23 +00:00
Peter Collingbourne	adf669ef17	x86_64: Fix calls to __morestack under the large code model. Under the large code model, we cannot assume that __morestack lives within 2^31 bytes of the call site, so we cannot use pc-relative addressing. We cannot perform the call via a temporary register, as the rax register may be used to store the static chain, and all other suitable registers may be either callee-save or used for parameter passing. We cannot use the stack at this point either because __morestack manipulates the stack directly. To avoid these issues, perform an indirect call via a read-only memory location containing the address. This solution is not perfect, as it assumes that the .rodata section is laid out within 2^31 bytes of each function body, but this seems to be sufficient for JIT. Differential Revision: http://reviews.llvm.org/D6787 llvm-svn: 225003	2014-12-30 20:05:19 +00:00
Kostya Serebryany	18d0a59ccc	[asan] change _sanitizer_cov_module_init to accept int* instead of int** llvm-svn: 224999	2014-12-30 19:29:28 +00:00
Michael Kuperstein	a000cb8396	[COFF] Don't try to add quotes to already quoted linker directives If a linker directive is already quoted, don't try to quote it again, otherwise it creates a mess. This pops up in places like: #pragma comment(linker,"\"/foo bar'\"") Differential Revision: http://reviews.llvm.org/D6792 llvm-svn: 224998	2014-12-30 19:23:48 +00:00
Colin LeMahieu	be9ae58d93	[Hexagon] Adding reg-reg indexed load forms. llvm-svn: 224997	2014-12-30 18:58:47 +00:00
Peter Collingbourne	46637a934e	The __morestack function is only available on i386 and x86_64 architectures. llvm-svn: 224994	2014-12-30 18:22:06 +00:00
Peter Collingbourne	05b567a80e	Make the __morestack function available to the JIT memory manager under Linux. This function's implementation lives in libgcc, a static library, so we need to expose it explicitly, like the other such functions. Differential Revision: http://reviews.llvm.org/D6788 llvm-svn: 224993	2014-12-30 18:06:52 +00:00
Colin LeMahieu	0b193a8b1c	[Hexagon] Dropping old combine instructions without encodings. llvm-svn: 224992	2014-12-30 17:53:54 +00:00
Colin LeMahieu	c9924ffc90	[Hexagon] Adding compare byte/halfword reg-reg/reg-imm forms. Adding compare to general register reg-imm form. llvm-svn: 224991	2014-12-30 17:39:24 +00:00
Colin LeMahieu	300c89d245	[Hexagon] Updating constant extender def, adding alu-not instructions, compare to general register, and inverted compares. llvm-svn: 224989	2014-12-30 15:44:17 +00:00
Elena Demikhovsky	d6e3f2ad88	Some code improvements in Masked Load/Store. No functional changes. llvm-svn: 224986	2014-12-30 14:28:14 +00:00
Rafael Espindola	1db8d30b1f	Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. llvm-svn: 224985	2014-12-30 13:13:27 +00:00
Elena Demikhovsky	1f674acd12	reverted prev commit (it was a mistake) llvm-svn: 224984	2014-12-30 10:17:21 +00:00
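
The r225034 entry in the log above (InstCombine: A-B < 0 into A < B) depends on the subtraction having no signed overflow. The following is a minimal sketch of why that precondition matters. It models 8-bit integers by hand and is not the InstCombine code itself; the helper names (`wrapping_sub_i8`, `sub_has_no_signed_overflow_i8`, `rewrite_is_safe_when_no_overflow`) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>

// Wrapping 8-bit subtraction, modelling a subtraction that is allowed to
// overflow. Done in unsigned arithmetic to avoid signed-overflow undefined
// behaviour in C++.
int wrapping_sub_i8(int a, int b) {
    unsigned diff = (static_cast<unsigned>(a) - static_cast<unsigned>(b)) & 0xFFu;
    // Reinterpret the low 8 bits as a two's-complement signed value.
    return diff >= 0x80u ? static_cast<int>(diff) - 256 : static_cast<int>(diff);
}

// True when the exact difference a - b fits in 8 signed bits, i.e. the
// subtraction does not overflow. Inputs are assumed to be in [-128, 127].
bool sub_has_no_signed_overflow_i8(int a, int b) {
    int exact = a - b;  // exact in int for 8-bit inputs
    return exact >= -128 && exact <= 127;
}

// Exhaustively compares the original form (a - b < 0) with the rewritten
// form (a < b) over all 8-bit pairs whose subtraction does not overflow.
bool rewrite_is_safe_when_no_overflow() {
    for (int a = -128; a <= 127; ++a)
        for (int b = -128; b <= 127; ++b)
            if (sub_has_no_signed_overflow_i8(a, b) &&
                (wrapping_sub_i8(a, b) < 0) != (a < b))
                return false;
    return true;
}
```

With overflow permitted, the two forms can disagree: for a = -128 and b = 1 the wrapped difference is 127, which is not negative, even though a < b holds. That is the kind of case the no-overflow proof rules out.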

111178 Commits