llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 13:02:52 +02:00

Author	SHA1	Message	Date
Pavel Chupin	3b0d0d5928	Fix lld-x86_64-win7 Build #11969 llvm-svn: 215097	2014-08-07 11:09:59 +00:00
Chandler Carruth	74dc5ae1d0	[x86] Fix another miscompile found through fuzz testing the new vector shuffle lowering. This is closely related to the previous one. Here we failed to use the source offset when swapping in the other case -- where we end up swapping the final shuffle. The cause of this bug is a bit different: I simply wasn't thinking about the fact that this mask is actually a slice of a wide mask and thus has numbers that need SourceOffset applied. Simple fix. Would be even more simple with an algorithm-y thing to use here, but correctness first. =] llvm-svn: 215095	2014-08-07 10:37:35 +00:00
Chandler Carruth	c19c208775	[x86] Fix another miscompile in the new vector shuffle lowering found via the fuzz tester. Here I missed an offset when round-tripping a value through a shuffle mask. I got it right 2 lines below. See a problem? I do. ;] I'll probably be adding a little "swap" algorithm which accepts a range and two values and swaps those values where they occur in the range. Don't really have a name for it, let me know if you do. llvm-svn: 215094	2014-08-07 10:14:27 +00:00
Chandler Carruth	4d35998980	[x86] Fix another miscompile in the new vector shuffle lowering found through the new fuzzer. This one is great: bad operator precedence led the modulus to happen at the wrong point. All the asserts didn't fire because there were usually the right values past the end of the 4 element region we were looking at. Probably could have gotten a crash here with ASan + fuzzing, but the correctness tests pinpointed this really nicely. llvm-svn: 215092	2014-08-07 09:45:02 +00:00
Pavel Chupin	7f4a227354	[x32] Use ebp/esp as frame and stack pointer Summary: Since pointers are 32-bit on x32 we can use ebp and esp as frame and stack pointer. Some operations like PUSH/POP and CFI_INSTRUCTION still require 64-bit register, so using 64-bit MachineFramePtr where required. X86_64 NaCl uses 64-bit frame/stack pointers, however it's been found that both isTarget64BitLP64 and isTarget64BitILP32 are true for NaCl. Addressing this issue here as well by making isTarget64BitLP64 false. Also mark hasReservedSpillSlot unreachable on X86. See inlined comments. Test Plan: Add one new simple test and upgrade 2 existing with x32 target case. Reviewers: nadav, dschuff Subscribers: llvm-commits, zinovy.nis Differential Revision: http://reviews.llvm.org/D4617 llvm-svn: 215091	2014-08-07 09:41:19 +00:00
Chandler Carruth	291bc5d9a4	[x86] Fix a miscompile in the new shuffle lowering found through the new fuzz testing. The function which tested for adjacency did what it said on the tin, but when I called it, I wanted it to do something more thorough: I wanted to know if the pairs of shuffle elements were adjacent and started at 0 mod 2. In one place I had the decency to try to test for this, but in the other it was completely skipped, miscompiling this test case. Fix this by making the helper actually do what I wanted it to do everywhere I called it (and removing the now redundant code in one place). I really dislike the name "canWidenShuffleElements" for this predicate. If anyone can come up with a better name, please let me know. The other name I thought about was "canWidenShuffleMask" but is it really widening the mask to reduce the number of lanes shuffled? I don't know. Naming things is hard. llvm-svn: 215089	2014-08-07 08:11:31 +00:00
Pete Cooper	5e1b7f85dc	Update Tablegen documents given that binary literals are now sized llvm-svn: 215088	2014-08-07 05:47:13 +00:00
Pete Cooper	cbc13312c3	Update BitRecTy::convertValue to allow if expressions with bit values on both sides of the if llvm-svn: 215087	2014-08-07 05:47:10 +00:00
Pete Cooper	5d88ea715c	Change the { } expression in tablegen to accept sized binary literals which are not just 0 and 1. It also allows nested { } expressions, as now that they are sized, we can pull bits from the nested value. In the current behaviour, everything in { } must have been convertible to a single bit. However, now that binary literals are sized, it's useful to be able to initialize a range of bits. So, for example, it's now possible to do bits<8> x = { 0, 1, { 0b1001 }, 0, 0b0 } llvm-svn: 215086	2014-08-07 05:47:07 +00:00
Pete Cooper	8cac65e882	Change BitsInit to inherit from TypedInit. This is useful in a later patch where binary literals such as 0b000 will become BitsInit values instead of IntInit values. llvm-svn: 215085	2014-08-07 05:47:04 +00:00
Pete Cooper	5e735d5967	Change TableGen so that binary literals such as 0b001 are now sized. Instead of these becoming an integer literal internally, they now become bits<n> values. Prior to this change, 0b001 was 1 bit long. This is confusing as clearly the user gave 3 bits. This new type holds both the literal value and the size, and so can ensure sizes match on initializers. For example, this used to be legal bits<1> x = 0b00; but now it must be written as bits<2> x = 0b00; llvm-svn: 215084	2014-08-07 05:47:00 +00:00
Pete Cooper	91540288e1	TableGen: Change { } to only accept bits<n> entries when n == 1. Prior to this change, it was legal to do something like bits<2> opc = { 0, 1 }; bits<2> opc2 = { 1, 0 }; bits<2> a = { opc, opc2 }; This involved silently dropping bits from opc and opc2 which is very hard to debug. Now the above test would be an error. Having tested with an assert, none of LLVM/clang was relying on this behaviour. Thanks to Adam Nemet for the above test. llvm-svn: 215083	2014-08-07 05:46:57 +00:00
Pete Cooper	4afa5aa1cc	Fix a whole bunch of binary literals which were the wrong size. All were being silently zero extended to the correct width. The commit after this changes { } and 0bxx literals to be of type bits<n> and not int. This means we need to write exactly the right number of bits, and not rely on the values being silently zero extended for us. llvm-svn: 215082	2014-08-07 05:46:54 +00:00
Chandler Carruth	c4749e70d4	Add an option to the shuffle fuzzer that lets you fuzz exclusively within a single bit-width of vectors. This is particularly useful for when you know you have bugs in a certain area and want to find simpler test cases than those produced by an open-ended fuzzing that ends up legalizing the vector in addition to shuffling it. llvm-svn: 215056	2014-08-07 04:49:54 +00:00
Bill Wendling	533008dec7	Use the minor number for the revision numbers. llvm-svn: 215055	2014-08-07 04:21:45 +00:00
Chandler Carruth	d62229a440	Add a vector shuffle fuzzer. This is a python script which for a given seed generates a random sequence of random shuffles of a random vector width. It embeds this into a function and emits a main function which calls the test routine and checks that the results (where defined) match the obvious results. I'll be using this to drive out miscompiles from the new vector shuffle logic now that it is clean of any crashes I can find with llvm-stress. Note, my python skills are very poor. Sorry if this is terrible code, and feel free to tell me how I should write this or just patch it as necessary. The tests generated try to be very portable and use boring C routines. It technically will mis-declare the C routines and pass 32-bit integers to parameters that expect 64-bit integers. If someone wants to fix this and has less terrible ideas of how to do it, I'm all ears. Fortunately, this "just works" for x86. =] llvm-svn: 215054	2014-08-07 04:13:51 +00:00
Justin Bogner	c3838694ad	DebugInfo: Make a test more portable mach-o doesn't like sections without segments, and elf is perfectly happy with commas in section names, so use a Darwin-like section name. Suggestion by Eric Christopher. llvm-svn: 215052	2014-08-07 03:47:28 +00:00
Saleem Abdulrasool	37f9f1e4f7	MC: split Win64EHUnwindEmitter into a shared streamer This changes Win64EHEmitter into a utility WinEH UnwindEmitter that can be shared across multiple architectures and a target specific bit which is overridden (Win64::UnwindEmitter). This enables sharing the section selection code across X86 and the intended use in ARM for emitting unwind information for Windows on ARM. llvm-svn: 215050	2014-08-07 02:59:41 +00:00
Quentin Colombet	e540e9c357	[X86][SchedModel] Fixed missing/wrong scheduling model found by code inspection. Source: Agner Fog's Instruction tables. Related to <rdar://problem/15607571> llvm-svn: 215045	2014-08-07 00:20:44 +00:00
Kevin Enderby	b81ef24fcb	Add the -mcpu= option to llvm-objdump for use with the disassemblers. Also make the disassembler created with the Mach-O parser (the -m option) pick up the Target specific attributes specified with -mattr option. llvm-svn: 215032	2014-08-06 23:24:41 +00:00
Reid Kleckner	f0567dde14	MC X86: Accept ".att_syntax prefix" and diagnose noprefix Fixes PR18916. I don't think we need to implement support for either hybrid syntax. Nobody should write Intel assembly with '%' prefixes on their registers or AT&T assembly without them. llvm-svn: 215031	2014-08-06 23:21:13 +00:00
David Blaikie	a8c5d79f89	Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."" This reverts commit r214761. Revert while Reid investigates & provides a reproduction for an assertion failure for this on Windows. llvm-svn: 214999	2014-08-06 22:30:12 +00:00
Sanjay Patel	8cd2aae34c	fix typo llvm-svn: 214995	2014-08-06 21:08:38 +00:00
Yaron Keren	baaa4b7845	getNewMemBuffer memsets the buffer to zeros, so the caller doesn't have to initialize it. llvm-svn: 214994	2014-08-06 20:59:09 +00:00
Sanjay Patel	ec1cb09c1f	Fix a test that has no checks. X86 doesn't have fneg, so check for xor. Differential Revision: http://reviews.llvm.org/D4812 llvm-svn: 214992	2014-08-06 20:45:30 +00:00
Matt Arsenault	feb78a527b	R600: Cleanup fadd and fsub tests llvm-svn: 214991	2014-08-06 20:27:55 +00:00
Rui Ueyama	c76432d7c2	Revert "r214897 - Remove dead zero store to calloc initialized memory" It broke msan. llvm-svn: 214989	2014-08-06 19:30:38 +00:00
Eric Christopher	4a1cdb2ba7	Remove the target machine from CCState. Previously it was only used to get the subtarget and that's accessible from the MachineFunction now. This helps clear the way for smaller changes where getting a subtarget will require passing in a MachineFunction/Function as well. llvm-svn: 214988	2014-08-06 18:45:26 +00:00
Adrian Prantl	9270b95c79	Improve performance of calculateDbgValueHistory. In r210492 the logic of calculateDbgValueHistory was changed to end register variable live ranges at the end of MBB conditionally on whether the register was clobbered by the function body. This requires an initial scan of all the operands of the function to collect all clobbered registers. In a second pass over all instructions, we compare this set with the set of clobbered registers for the current MachineInstruction. This modification incurred a compilation time regression on some benchmarks: the debug info emission phase takes ~10% more time. While a small performance hit is unavoidable due to the initial scan requirement, we can improve the situation by not creating too many temporary sets and just using lambdas to work directly on the result of the initial scan. Fixes <rdar://problem/17884104> Patch by Frederic Riss! llvm-svn: 214987	2014-08-06 18:41:24 +00:00
Adrian Prantl	84367341ee	Cleanup collectChangingRegs The handling of the epilogue is best expressed as an early exit and there is no reason to look for register defs in DbgValue MIs. Patch by Frederic Riss! llvm-svn: 214986	2014-08-06 18:41:19 +00:00
David Blaikie	d1250e60a0	DebugInfo: Fix ranges+gmlt test case to actually exercise the gmlt situation. Originally this test case tested the specified behavior (that -gmlt would not produce DW_AT_ranges and that when no CU DW_AT_ranges were produced, no debug_ranges section (not even an empty list) would be produced) but then the ranges emission code was improved not to create ranges of a single element (instead favoring high_pc/low_pc) and so this test case no longer exercised the -gmlt portion of the behavior. This caused me some confusion when reading the comments and trying to update this test case for future changes to -gmlt. I've made this test resilient to those changes (by using the {{DW_TAG\|NULL}} pattern to block the end of the attribute search at the end of the CU's attribute list without mandating that it must (or must not) be followed by another tag (the future changes to -gmlt should produce no subprograms in this CU)) Fix the test case to have two functions in distinct sections to force the use of DW_AT_ranges. llvm-svn: 214985	2014-08-06 18:24:19 +00:00
Reid Kleckner	ba5fdafabe	Add a triple to this test to get the right IR mangling llvm-svn: 214982	2014-08-06 18:09:15 +00:00
Reid Kleckner	bec7530633	Don't count inreg params when mangling fastcall functions This is consistent with MSVC. llvm-svn: 214981	2014-08-06 18:09:04 +00:00
Reid Kleckner	e340c78d47	Round up the size of byval arguments to MinAlign Otherwise we can end up with an argument frame size that is not a multiple of stack slot size, which is very awkward. This fixes PR20547, which was a bug in x86_64 Sys V vararg handling. However, it's much easier to test this with x86 callee-cleanup functions, which previously ended in "retl $6" instead of "retl $8". This does affect behavior of all backends, but it presumably fixes the same bug in all of them. llvm-svn: 214980	2014-08-06 17:57:23 +00:00
Duncan P. N. Exon Smith	dcded783d2	UseListOrder: Use std::vector I initially used a `SmallVector<>` for `UseListOrder::Shuffle`, which was a silly choice. When I realized my error I quickly rolled a custom data structure. This commit simplifies it to a `std::vector<>`. Now that I've had a chance to measure performance, this data structure isn't part of a bottleneck, so the additional complexity is unnecessary. This is part of PR5680. llvm-svn: 214979	2014-08-06 17:36:08 +00:00
Chad Rosier	0214f513b9	[AArch64] Add a few isTarget* API to AArch64 Subtarget. llvm-svn: 214977	2014-08-06 16:56:58 +00:00
Chad Rosier	ecaa43dc0e	Add test case omitted in r214974. llvm-svn: 214975	2014-08-06 16:06:41 +00:00
Chad Rosier	8291c24b09	[AArch64] Fix OS ABI flag for aarch64-linux-gnu target. For triple aarch64-linux-gnu we were incorrectly setting IRIX. For triple aarch64 we are correctly setting SYSV. Patch by Ana Pazos <apazos@codeaurora.org>. llvm-svn: 214974	2014-08-06 16:05:02 +00:00
Sanjay Patel	d5cb9b68e1	use register iterators that include self to reduce code duplication in CriticalAntiDepBreaker This patch addresses 2 FIXME comments that I added to CriticalAntiDepBreaker while fixing PR20020. Initialize an MCSubRegIterator and an MCRegAliasIterator to include the self reg. Assuming that works as advertised, there should be no functional difference with this patch, just less code. Also, remove the associated asserts - we're setting those values just before, so the asserts don't do anything meaningful. Differential Revision: http://reviews.llvm.org/D4566 llvm-svn: 214973	2014-08-06 15:58:15 +00:00
Robert Khasanov	970483b673	[AVX512] Added load/store instructions to Register2Memory opcode tables. Added lowering tests for load/store. Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com> llvm-svn: 214972	2014-08-06 15:40:34 +00:00
James Molloy	7518c61a09	[AArch64] Add a testcase for r214957. llvm-svn: 214965	2014-08-06 13:31:32 +00:00
James Molloy	0127d12e19	Add a new option -run-slp-after-loop-vectorization. This swaps the order of the loop vectorizer and the SLP/BB vectorizers. It is disabled by default so we can do performance testing - ideally we want to change to having the loop vectorizer running first, and the SLP vectorizer using its leftovers instead of the other way around. llvm-svn: 214963	2014-08-06 12:56:19 +00:00
Tim Northover	0028a9a97d	ARM: do not generate BLX instructions on Cortex-M CPUs. Particularly on MachO, we were generating "blx _dest" instructions on M-class CPUs, which don't actually exist. They happen to get fixed up by the linker into valid "bl _dest" instructions (which is why such a massive issue has remained largely undetected), but we shouldn't rely on that. llvm-svn: 214959	2014-08-06 11:13:14 +00:00
Tim Northover	7abd4db81b	ARM-MachO: materialize callee address correctly on v4t. llvm-svn: 214958	2014-08-06 11:13:06 +00:00
James Molloy	2bbe86fab0	[AArch64] Conditional selects are expensive on out-of-order cores. Specifically Cortex-A57. This probably applies to Cyclone too but I haven't enabled it for that as I can't test it. This gives ~4% improvement on SPEC 174.vpr, and ~1% in 471.omnetpp. llvm-svn: 214957	2014-08-06 10:42:18 +00:00
Chandler Carruth	2a63640957	[x86] Fix two independent miscompiles in the process of getting the same test case to actually generate correct code. The primary miscompile fixed here is that we weren't correctly handling in-place elements in one half of a single-input v8i16 shuffle when moving a dword of elements from that half to the other half. Sometimes, we would clobber the in-place elements in forming the dword to move across halves. The fix to this involves forcibly marking the in-place inputs even when there is no need to gather them into a dword, and to much more carefully re-arrange the elements when grouping them into a dword to move across halves. With these two changes we would generate correct shuffles for the test case, but found another miscompile. There are also some random perturbations of the generated shuffle pattern in SSE2. It looks like a wash; more instructions in some cases, fewer in others. The second miscompile would corrupt the results into nonsense. This is a buggy pattern in one of the added DAG combines. Mapping elements through a PSHUFD when pairing redundant half-shuffles is much harder than this code makes it out to be -- it requires reasoning about all of where the input is used in the PSHUFD, not just one part of where it is used. Plus, we can't combine a half shuffle into a PSHUFD but the code didn't guard against it. I think this was just a bad idea and I've just removed that aspect of the combine. No tests regress as a consequence so seems OK. llvm-svn: 214954	2014-08-06 10:16:36 +00:00
Chandler Carruth	8d598f2f29	[x86] Switch to a formulation of a for loop that is much more obviously not corrupting the mask by mutating it more times than intended. No functionality changed (the results were non-overlapping so the old version "worked" but was non-obvious). llvm-svn: 214953	2014-08-06 10:16:33 +00:00
Adam Nemet	aea49d4d5f	[X86] Fixes commit r214890 to match the posted patch This was another fallout from my local rebase where something went wrong :( llvm-svn: 214951	2014-08-06 07:13:12 +00:00
Matt Arsenault	7d4ad478b1	Correct comment llvm-svn: 214945	2014-08-06 00:44:25 +00:00
Peter Collingbourne	129ba2ac92	[dfsan] Try not to create too many additional basic blocks in functions which already have a large number of blocks. Works around a performance issue with the greedy register allocator. llvm-svn: 214944	2014-08-06 00:33:40 +00:00
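
The sized binary literal rules described in the TableGen commits above (llvm-svn 215082 through 215088) can be illustrated with a small model. This is a rough Python sketch, not TableGen's implementation: the names `Bits`, `parse_binary_literal`, `concat`, and `init_bits` are hypothetical, and the choice to treat the first element of `{ }` as the most significant bit is an assumption made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bits:
    """A value paired with an explicit width, loosely modelled on bits<n>."""
    value: int
    width: int


def parse_binary_literal(text: str) -> Bits:
    """Parse '0b1001' into a sized value: the width is the number of digits written."""
    if not text.startswith("0b") or len(text) == 2:
        raise ValueError(f"not a binary literal: {text!r}")
    digits = text[2:]
    return Bits(int(digits, 2), len(digits))


def concat(*parts) -> Bits:
    """Model a { } expression: join sized parts, nested results included.

    Plain 0 and 1 count as single bits. Assumption: the first part ends up
    in the most significant position.
    """
    value, width = 0, 0
    for part in parts:
        if isinstance(part, int):
            if part not in (0, 1):
                raise ValueError("a bare integer in { } must be 0 or 1")
            part = Bits(part, 1)
        value = (value << part.width) | part.value
        width += part.width
    return Bits(value, width)


def init_bits(declared_width: int, init: Bits) -> Bits:
    """Reject initializers whose width differs from the declared bits<n> width."""
    if init.width != declared_width:
        raise ValueError(
            f"bits<{declared_width}> cannot be initialized from {init.width} bits"
        )
    return init


# bits<8> x = { 0, 1, { 0b1001 }, 0, 0b0 }: widths 1 + 1 + 4 + 1 + 1 = 8
x = init_bits(8, concat(0, 1, concat(parse_binary_literal("0b1001")),
                        0, parse_binary_literal("0b0")))
```

Under this model `bits<2> x = 0b00` is accepted and `bits<1> x = 0b00` raises an error, which mirrors the behaviour change described in llvm-svn 215084.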

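
The byval rounding described in llvm-svn 214980 comes down to rounding a size up to a multiple of an alignment. A minimal sketch of that arithmetic follows; the function name `round_up_to_multiple` is hypothetical, and the 4-byte slot size is an assumption inferred from the "retl $6" versus "retl $8" example in the commit message, not taken from the LLVM source.

```python
def round_up_to_multiple(size: int, align: int) -> int:
    """Round size up to the nearest multiple of align (align must be positive)."""
    if size < 0 or align <= 0:
        raise ValueError("size must be non-negative and align must be positive")
    return (size + align - 1) // align * align


# A 6-byte byval argument with an assumed 4-byte stack slot occupies 8 bytes,
# so a callee-cleanup function would pop 8 bytes rather than 6.
padded = round_up_to_multiple(6, 4)
```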
106485 Commits