llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-19 11:02:59 +02:00

Author	SHA1	Message	Date
Simon Pilgrim	47c949cc50	Tidied up TRUNC combine code. NFC. Make use of DAG.getBitcast and use clang-format to reduce number of lines (and make it more readable). llvm-svn: 258644	2016-01-23 21:50:40 +00:00
Justin Lebar	67acdea900	[CUDA] Die gracefully when trying to output an LLVM alias. Summary: Previously, we would just output "foo = bar" in the assembly, and then ptxas would choke. Now we die before emitting any invalid code. Reviewers: echristo Subscribers: jholewinski, llvm-commits, jhen, tra Differential Revision: http://reviews.llvm.org/D16490 llvm-svn: 258638	2016-01-23 21:12:20 +00:00
Justin Lebar	cc870065cb	[CUDA] Make empty parameter lists in nvptx function decls easier to read. Summary: Before: .func (.param .b32 func_retval0) _ZL21__nvvm_reflect_anchorv( ) { After: .func (.param .b32 func_retval0) _ZL21__nvvm_reflect_anchorv() { Reviewers: bkramer Subscribers: llvm-commits, tra, jhen, echristo, jholewinski Differential Revision: http://reviews.llvm.org/D16512 llvm-svn: 258637	2016-01-23 21:12:17 +00:00
Benjamin Kramer	820941c5f2	Don't check if a list is empty with ilist::size. ilist::size() is O(n) while ilist::empty() is O(1) llvm-svn: 258636	2016-01-23 20:58:09 +00:00
Kostya Serebryany	0c11655f17	[libFuzzer] add -abort_on_timeout option llvm-svn: 258631	2016-01-23 19:34:19 +00:00
Akira Hatanaka	dfd425a3de	[Bitcode] Insert the darwin wrapper at the beginning of a file when the target is macho. It looks like the check for macho was accidentally dropped in r132959. I don't have a test case, but I'll add one if anyone knows how this can be tested. llvm-svn: 258627	2016-01-23 16:02:10 +00:00
Aaron Ballman	ab11289986	Silence a -Wparentheses warning; NFC. llvm-svn: 258626	2016-01-23 15:42:21 +00:00
Simon Pilgrim	37a0ceb144	Added missing comment. NFC. llvm-svn: 258624	2016-01-23 14:38:02 +00:00
Simon Pilgrim	0e48f0e5bb	[X86][SSE] Remove INSERTPS dependencies from unreferenced operands. If the INSERTPS zeroes out all the referenced elements from either of the 2 input vectors (and the input is not already UNDEF), then set that input to UNDEF to reduce dependencies. llvm-svn: 258622	2016-01-23 13:37:07 +00:00
Haicheng Wu	9d77533d54	[LIR] Add support for structs and hand unrolled loops Now LIR can turn the following code into memset: typedef struct foo { int a; int b; } foo_t; void bar(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; ++i) { f[i].a = 0; f[i].b = 0; } } void test(foo_t *f, unsigned n) { for (unsigned i = 0; i < n; i += 2) { f[i] = 0; f[i+1] = 0; } } llvm-svn: 258620	2016-01-23 06:52:41 +00:00
Matthias Braun	f692315c07	Inline variable into assert Seems like some compilers still give unused variable warnings for bool var = ...; (void)var; so I have to inline the variable. llvm-svn: 258619	2016-01-23 06:49:29 +00:00
NAKAMURA Takumi	c47ea12e00	AArch64ISelLowering.cpp: Fix a warning. [-Wunused-variable] llvm-svn: 258618	2016-01-23 06:34:59 +00:00
Junmo Park	d7f46a4f6c	Remove extra whitespace. NFC. llvm-svn: 258617	2016-01-23 06:34:36 +00:00
David Majnemer	f62478a34a	[PruneEH] Don't try to insert a terminator after another terminator LLVM's BasicBlock has a single terminator, it is not valid to have two. llvm-svn: 258616	2016-01-23 06:00:44 +00:00
Manuel Jacob	865681354b	Put space after pointer type in test. NFC. llvm-svn: 258615	2016-01-23 05:47:34 +00:00
Matt Arsenault	fe8ee22547	AMDGPU: Remove more unused intrinsics Replace tests with lrp with basic IR expansion llvm-svn: 258612	2016-01-23 05:42:38 +00:00
David Majnemer	7a3addc91c	[PruneEH] FuncletPads must not have undef operands Instead of RAUW with undef, replace the first non-token instruction with unreachable. This fixes PR26263. llvm-svn: 258611	2016-01-23 05:41:29 +00:00
David Majnemer	09858a3961	[PruneEH] Unify invoke and call handling in DeleteBasicBlock No functionality change is intended. llvm-svn: 258610	2016-01-23 05:41:27 +00:00
David Majnemer	0728f4a41f	[PruneEH] Reuse code from removeUnwindEdge PruneEH had functionality identical to removeUnwindEdge. Consolidate around removeUnwindEdge. No functionality change is intended. llvm-svn: 258609	2016-01-23 05:41:22 +00:00
Matt Arsenault	38b08addbb	AMDGPU: Move amdgcn intrinsic handling into SITargetLowering llvm-svn: 258608	2016-01-23 05:32:20 +00:00
Matt Arsenault	f305746857	AMDGPU: Remove IntrNoMem from llvm.SI.sendmsg This has side effects. llvm-svn: 258607	2016-01-23 05:32:18 +00:00
Matt Arsenault	c3ec12b749	AMDGPU: Remove Feature64BitPtr This is a leftover from AMDIL that doesn't do anything and doesn't belong here. llvm-svn: 258606	2016-01-23 05:32:14 +00:00
Matthias Braun	0892910f16	AArch64ISel: Fix ccmp code selection matching deep expressions. Some of the conditions necessary to produce ccmp sequences were only checked in recursive calls to emitConjunctionDisjunctionTree() after some of the earlier expressions were already built. Move all checks over to isConjunctionDisjunctionTree() so they are all checked before we start emitting instructions. Also rename some variables to better reflect their usage. llvm-svn: 258605	2016-01-23 04:05:22 +00:00
Matthias Braun	da14179563	AArch64ISelLowering: Reduce maximum recursion depth of isConjunctionDisjunctionTree() This function will exhibit exponential runtime (2**n) so we should rather use a lower limit. llvm-svn: 258604	2016-01-23 04:05:18 +00:00
Matthias Braun	a0ca239a6c	Fix wrong indentation llvm-svn: 258603	2016-01-23 04:05:16 +00:00
Derek Schuff	0558165991	[WebAssembly] Fix RegNumbering for the stack pointer Previously it failed to add NumArgRegs to the offset and so clobbered an already-used register. Now just start the numbering after the arg regs and don't duplicate the add. Test coverage for this coming shortly with the implementation of byval. llvm-svn: 258597	2016-01-23 01:20:43 +00:00
Kostya Serebryany	548cef831b	[libFuzzer] add more fields to DictionaryEntry to count the number of uses and successes llvm-svn: 258589	2016-01-22 23:55:14 +00:00
David Majnemer	a2ed036c0a	[WinEH] Let cleanups post-dominated by unreachable get executed Cleanups in C++ are a little weird. They are only guaranteed to be reliably executed if, and only if, there is a viable catch handler which can handle the exception. This means that reachability of a cleanup is lexically determined by it being nested within a try-block which unwinds to a catch. It cannot be reasoned about by examining the control flow edges leaving a cleanup. Usually this is not a problem. It becomes a problem when there are no edges out of a cleanup because we believed that code post-dominated by the cleanup is dead. In LLVM's case, this code is what informs the personality routine about the presence of a suitable catch handler. However, the lack of edges to that catch handler makes the handler become unreachable which causes us to remove it. By removing the handler, the cleanup becomes unreachable. Instead, inject a catch-all handler with every cleanup that has no unwind edges. This will allow us to properly unwind the stack. This fixes PR25997. llvm-svn: 258580	2016-01-22 23:20:43 +00:00
Kevin Enderby	9b924af8f7	Fix the code that leads to the incorrect trigger of the report_fatal_error() in MachOObjectFile::getSymbolByIndex() when a Mach-O file has a symbol table load command but the number of symbols is zero. The code in MachOObjectFile::symbol_begin_impl() should not be assuming there is a symbol at index 0, in cases where there is no symbol table load command or the count of symbols is zero. So I also fixed that. And needed to fix MachOObjectFile::symbol_end_impl() to also do the same thing for no symbol table or one with zero entries. The code in MachOObjectFile::getSymbolByIndex() should trigger the report_fatal_error() for programmatic errors for any index when there is no symbol table load command and not return the end iterator. So also fixed that. Note there is no test case as this is a programmatic error. The test case using the file macho-invalid-bad-symbol-index has a symbol table load command whose number of symbols (nsyms) is zero. Which was incorrectly testing the bad triggering of the report_fatal_error() in MachOObjectFile::getSymbolByIndex(). This test case is an invalid Mach-O file but not for that reason. It appears this Mach-O file used to have an nsyms value of 11, and what makes this Mach-O file invalid is the counts and indexes into the symbol table of the dynamic load command are now invalid because the number of symbol table entries (nsyms) is now zero. Which can be seen with the existing llvm-objdump: % llvm-objdump -private-headers macho-invalid-bad-symbol-index … Load command 4 cmd LC_SYMTAB cmdsize 24 symoff 4216 nsyms 0 stroff 4392 strsize 144 Load command 5 cmd LC_DYSYMTAB cmdsize 80 ilocalsym 0 nlocalsym 8 (past the end of the symbol table) iextdefsym 8 (greater than the number of symbols) nextdefsym 2 (past the end of the symbol table) iundefsym 10 (greater than the number of symbols) nundefsym 1 (past the end of the symbol table) ... 
And the native darwin tools generate an error for this file: % nm macho-invalid-bad-symbol-index nm: object: macho-invalid-bad-symbol-index truncated or malformed object (ilocalsym plus nlocalsym in LC_DYSYMTAB load command extends past the end of the symbol table) I added new checks for the indexes and sizes for these in the constructor of MachOObjectFile. And added comments for what would be proper diagnostic messages. And changed the test case using macho-invalid-bad-symbol-index to test for the new error now produced. Also added a test with a valid Mach-O file with a symbol table load command where the number of symbols is zero that shows the report_fatal_error() is not called. llvm-svn: 258576	2016-01-22 22:49:55 +00:00
Ivan Krasin	ce1bcd8c31	Use std::piecewise_constant_distribution instead of ad-hoc binary search. Summary: Fix the issue with the most recently discovered unit receiving much less attention. Note: this is the second attempt (prev: r258473). Now, libc++ build is fixed. Reviewers: aizatsky, kcc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16487 llvm-svn: 258571	2016-01-22 22:28:27 +00:00
Weiming Zhao	e73494fde9	Fix LivePhysRegs::addLiveOuts Summary: The test for returnBB was flipped, which may cause the ARM ld/st opt pass to use callee-saved regs in returnBB when shrink-wrap is used. Reviewers: t.p.northover, apazos, MatzeB Subscribers: mcrosier, zzheng, aemerson, llvm-commits, rengolin Differential Revision: http://reviews.llvm.org/D16434 llvm-svn: 258569	2016-01-22 22:21:34 +00:00
Sanjay Patel	6ef10254c1	fix typos; NFC llvm-svn: 258567	2016-01-22 22:09:41 +00:00
Matt Arsenault	3913a77bb9	AMDGPU: Add new name for barrier intrinsic llvm-svn: 258558	2016-01-22 21:30:43 +00:00
Matt Arsenault	7a5e15697d	AMDGPU: Rename intrinsics to use amdgcn prefix The intrinsic target prefix should match the target name as it appears in the triple. This is not yet complete, but gets most of the important ones. llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled for compatibility for now. llvm-svn: 258557	2016-01-22 21:30:34 +00:00
Sergei Larin	7b219abac0	Make sure that any new and optimized objects created during GlobalOPT copy all the attributes from the base object. Summary: Make sure that any new and optimized objects created during GlobalOPT copy all the attributes from the base object. A good example of improper behavior in the current implementation is section information associated with the GlobalObject. If a section was set for it, and GlobalOpt is creating/modifying a new object based on this one (often copying the original name), without this change new object will be placed in a default section, resulting in inappropriate properties of the new variable. The argument here is that if customer specified a section for a variable, any changes to it that compiler does should not cause it to change that section allocation. Moreover, any other properties worth representation in copyAttributesFrom() should also be propagated. Reviewers: jmolloy, joker-eph, joker.eph Subscribers: slarin, joker.eph, rafael, tobiasvk, llvm-commits Differential Revision: http://reviews.llvm.org/D16074 llvm-svn: 258556	2016-01-22 21:18:20 +00:00
Sanjay Patel	ada0c1bc05	function names start with a lowercase letter; NFC llvm-svn: 258552	2016-01-22 21:11:47 +00:00
Sanjoy Das	26d6272ad2	[PlaceSafepoints] Introduce a -spp-no-statepoints flag Summary: This change adds a `-spp-no-statepoints` flag to PlaceSafepoints that bypasses the code that wraps newly introduced polls and existing calls in gc.statepoint. With `-spp-no-statepoints` enabled, PlaceSafepoints effectively becomes a safepoint poll insertion pass. The eventual goal is to "constant fold" this option, along with `-rs4gc-use-deopt-bundles` to `true`, once clients using gc.statepoint are okay doing so. Reviewers: pgavlin, reames, JosephTremoulet Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16439 llvm-svn: 258551	2016-01-22 21:02:55 +00:00
Xinliang David Li	42a10f05d4	[PGO] Remove use of static variable. /NFC Make the variable a member of the writer trait object owned now by the writer. Also use a different generator interface to pass the infoObject from the writer. llvm-svn: 258544	2016-01-22 20:25:56 +00:00
Xinliang David Li	960dc56746	Revert 258486 -- for a better fix coming soon llvm-svn: 258538	2016-01-22 19:53:31 +00:00
Matt Arsenault	2b88adb9bd	AMDGPU: Fix crash with invariant markers The promote alloca pass didn't handle these intrinsics and crashed. These intrinsics should accept any address space, but for now just erase them to avoid breaking. llvm-svn: 258537	2016-01-22 19:47:54 +00:00
Jingyue Wu	90a1a65026	[NVPTX] expand mul_lohi to mul_lo and mul_hi Summary: Fixes PR26186. Reviewers: grosser, jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D16479 llvm-svn: 258536	2016-01-22 19:47:26 +00:00
Ahmed Bougacha	7980e233f5	[AArch64] Simplify emitConditionalCompare calls. NFC. Now that both callsites are identical, we can simplify the prototype and make it easier to reason about the 2-CC case. llvm-svn: 258534	2016-01-22 19:43:57 +00:00
Ahmed Bougacha	1c71a2aac6	[AArch64] Lower 2-CC FCCMPs (one/ueq) using AND'ed CCs. The current behavior is incorrect, as the two CCs returned by changeFPCCToAArch64CC, intended to be OR'ed, are instead used in an AND ccmp chain. Consider: define i32 @t(float %a, float %b, float %c, float %d, i32 %e, i32 %f) { %cc1 = fcmp one float %a, %b %cc2 = fcmp olt float %c, %d %and = and i1 %cc1, %cc2 %r = select i1 %and, i32 %e, i32 %f ret i32 %r } Assuming (%a < %b) and (%c < %d); we used to do: fcmp s0, s1 # nzcv <- 1000 orr w8, wzr, #0x1 # w8 <- 1 csel w9, w8, wzr, mi # w9 <- 1 csel w8, w8, w9, gt # w8 <- 1 fcmp s2, s3 # nzcv <- 1000 cset w9, mi # w9 <- 1 tst w8, w9 # (w8 & w9) == 1, so: nzcv <- 0000 csel w0, w0, w1, ne # w0 <- w0 We now do: fcmp s2, s3 # nzcv <- 1000 fccmp s0, s1, #0, mi # mi, so: nzcv <- 1000 fccmp s0, s1, #8, le # !le, so: nzcv <- 1000 csel w0, w0, w1, pl # !pl, so: w0 <- w1 In other words, we transformed: (c < d) && ((a < b) || (a > b)) into: (c < d) && (a u>= b) && (a u<= b) whereas, per De Morgan's, we wanted: (c < d) && !((a u>= b) && (a u<= b)) Note that this problem doesn't occur in the test-suite. changeFPCCToAArch64CC produces disjunct CCs; here, one -> mi/gt. We can't represent that in the fccmp chain; it can't express arbitrary OR sequences, as one comment explains: In general we can create code for arbitrary "... (and (and A B) C)" sequences. We can also implement some "or" expressions, because "(or A B)" is equivalent to "not (and (not A) (not B))" and we can implement some negation operations. [...] However there is no way to negate the result of a partial sequence. 
Instead, introduce changeFPCCToANDAArch64CC, which produces the conjunct cond codes: - (a one b) == ((a olt b) || (a ogt b)) == ((a ord b) && (a une b)) - (a ueq b) == ((a uno b) || (a oeq b)) == ((a ule b) && (a uge b)) Note that, at first, one might think that, when PushNegate is true, we should use the disjunct CCs, in effect doing: (a || b) = !(!a && !(b)) = !(!a && !(b1 || b2)) <- changeFPCCToAArch64CC(b, b1, b2) = !(!a && !b1 && !b2) However, we can take advantage of the fact that the CC is already negated, which lets us avoid special-casing PushNegate and doing the simpler to reason about: (a || b) = !(!a && (!b)) = !(!a && (b1 && b2)) <- changeFPCCToANDAArch64CC(!b, b1, b2) = !(!a && b1 && b2) This makes both emitConditionalCompare cases behave identically, and produces correct ccmp sequences for the 2-CC fcmps. llvm-svn: 258533	2016-01-22 19:43:54 +00:00
Ahmed Bougacha	3a901cfda8	[AArch64] Assert that CCMP isel didn't fail inconsistently. We verify that the op tree is eligible for CCMP emission in isConjunctionDisjunctionTree, but it's also possible that emitConjunctionDisjunctionTree fails later. The initial check is useful, as it avoids building nodes that will get discarded. Still, make sure that inconsistencies don't happen with an assert. llvm-svn: 258532	2016-01-22 19:43:43 +00:00
Sanjoy Das	a81b52c690	[RS4GC] Use OB_deopt instead of "deopt" llvm-svn: 258529	2016-01-22 19:20:40 +00:00
Krzysztof Parzyszek	7ec3ade80f	[Hexagon] Use general purpose registers to spill pred/mod registers into Patch by Tobias Edler Von Koch. llvm-svn: 258527	2016-01-22 19:15:58 +00:00
Matt Arsenault	8d0283f1a9	AMDGPU: Fix getArchTypePrefix llvm-svn: 258525	2016-01-22 19:09:12 +00:00
Matt Arsenault	fdfc9419b0	AMDGPU: Rename some r600 intrinsics to use correct TargetPrefix These ones aren't directly emitted by mesa and inserted by a pass. llvm-svn: 258523	2016-01-22 19:00:09 +00:00
Matt Arsenault	2e8073cc66	AMDGPU: Remove unused R600 intrinsics llvm-svn: 258522	2016-01-22 18:52:14 +00:00
David Majnemer	0533388931	[WinEH] Make collectFuncletMembers non-recursive Use a worklist for the pre-order DFS instead of using recursion. No functionality change is intended. llvm-svn: 258521	2016-01-22 18:49:50 +00:00
Kevin Enderby	d50c4b11ba	Fix MachOObjectFile::getSymbolName() to not call report_fatal_error() but to return object_error::parse_failed. Then made the code in llvm-nm do for Mach-O files what is done in the darwin native tools which is to print "bad string index" for bad string indexes. Updated the error message in the llvm-objdump test, and added tests to show llvm-nm prints "bad string index" and a test to print the actual bad string index value which in this case is 0xfe000002 when printing the fields as raw hex. llvm-svn: 258520	2016-01-22 18:47:14 +00:00
Matt Arsenault	720ce3fd59	AMDGPU: Change control flow intrinsics to use amdgcn prefix These aren't supposed to be used outside of the backend, so there aren't any users to worry about. llvm-svn: 258516	2016-01-22 18:42:55 +00:00
Matt Arsenault	d4ad2318d1	AMDGPU: Don't use separate mulhu/mulhs Pats llvm-svn: 258515	2016-01-22 18:42:49 +00:00
Matt Arsenault	351925633e	AMDGPU: Remove random TGSI intrinsic I don't think this was ever used. llvm-svn: 258514	2016-01-22 18:42:44 +00:00
Matt Arsenault	21c6e6f537	AMDGPU: Remove AMDGPU.fract intrinsic Mesa doesn't use this, and this is pattern matched already from fsub x, (ffloor x) llvm-svn: 258513	2016-01-22 18:42:38 +00:00
Xinliang David Li	16253b4d49	[PGO] eliminate use of static variable llvm-svn: 258486	2016-01-22 05:48:40 +00:00
JF Bastien	050cf771fb	NFC WebAssembly: update links I got a vanity URL, and moved the github waterfall repo. llvm-svn: 258484	2016-01-22 04:21:49 +00:00
Dan Gohman	46980bada3	[SelectionDAG] Fold more offsets into GlobalAddresses This reapplies r258296 and r258366, and also fixes an existing bug in SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the offset in a GlobalAddressSDNode, which is uncovered by those patches. llvm-svn: 258482	2016-01-22 03:57:34 +00:00
Manuel Jacob	714fa41ac7	Replace Type::getInt32Ty() and comparison by isIntegerTy(32). NFC. llvm-svn: 258480	2016-01-22 03:30:27 +00:00
Ivan Krasin	7b4522dc59	Revert r258473 as it's breaking the build with libc++ Reviewers: kcc Differential Revision: http://reviews.llvm.org/D16441 llvm-svn: 258479	2016-01-22 03:21:52 +00:00
Eduard Burtescu	a868f6e2ac	[opaque pointer types] [NFC] DataLayout::getIndexedOffset: take source element type instead of pointer type and rename to getIndexedOffsetInType. Summary: Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16282 llvm-svn: 258478	2016-01-22 03:08:27 +00:00
Eduard Burtescu	cfc72ec986	[opaque pointer types] [NFC] FindAvailableLoadedValue: take LoadInst instead of just the pointer. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16422 llvm-svn: 258477	2016-01-22 01:51:51 +00:00
Eduard Burtescu	636d36b9c9	[opaque pointer types] [NFC] gep_type_{begin,end} now take source element type and address space. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16436 llvm-svn: 258474	2016-01-22 01:33:43 +00:00
Ivan Krasin	db4009626d	Use std::piecewise_constant_distribution instead of ad-hoc binary search. Summary: Fix the issue with the most recently discovered unit receiving much less attention. Note: I had to change the seed for one test to make it pass. Alternatively, the number of runs could be increased. I believe that the average time of 'foo' discovery is not increased, just seed=1 was particularly convenient for the previous PRNG scheme used. Reviewers: aizatsky, kcc Subscribers: llvm-commits, kcc Differential Revision: http://reviews.llvm.org/D16419 llvm-svn: 258473	2016-01-22 01:32:34 +00:00
Eduard Burtescu	0effa1afdd	[opaque pointer types] [NFC] Add an explicit type argument to ConstantFoldLoadFromConstPtr. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16418 llvm-svn: 258472	2016-01-22 01:17:26 +00:00
Pirama Arumuga Nainar	2e5b2b3d41	Do not lower VSETCC if operand is an f16 vector Summary: SETCC with f16 vectors has OperationAction set to Expand but still gets lowered to FCM* intrinsics based on its result type. This patch skips lowering of VSETCC if the operand is an f16 vector. v4 and v8 tests included. Reviewers: ab, jmolloy Subscribers: srhines, llvm-commits Differential Revision: http://reviews.llvm.org/D15361 llvm-svn: 258471	2016-01-22 01:16:57 +00:00
Reid Kleckner	2ddae2ea6b	Revert "[SelectionDAG] Fold more offsets into GlobalAddresses" This reverts r258296 and the follow up r258366. With this change, we miscompiled the following program on Windows: #include <string> #include <iostream> static const char kData[] = "asdf jkl;"; int main() { std::string s(kData + 3, sizeof(kData) - 3); std::cout << s << '\n'; } llvm-svn: 258465	2016-01-22 01:09:29 +00:00
Kostya Serebryany	f7155b3e82	[libFuzzer] don't do expensive memmem if the result will not be used llvm-svn: 258462	2016-01-22 01:04:58 +00:00
Teresa Johnson	2a387148a1	[ThinLTO] Do metadata linking during batch function importing Summary: Since we are currently not doing incremental importing there is no need to link metadata as a postpass. The module linker will only link in the imported subroutines due to the functionality added by r256003. (Note that the metadata postpass linking functionality is still used by llvm-link, and may be needed here in the future if a more incremental strategy is adopted.) Reviewers: joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16424 llvm-svn: 258458	2016-01-22 00:15:53 +00:00
Eduard Burtescu	42b3bd4662	[opaque pointer types] [NFC] Take advantage of get{Source,Result}ElementType when folding GEPs. Summary: Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16302 llvm-svn: 258456	2016-01-21 23:42:06 +00:00
Sanjay Patel	ef7cae166d	move function definitions so we don't need separate declarations ; NFCI llvm-svn: 258455	2016-01-21 23:38:43 +00:00
Sanjay Patel	ff5da390f5	[LibCallSimplifier] refactor FP function signature checks ; NFCI Use the helper function added in r258428. The check should really be hoisted to the caller of all of these optimize* functions, but that's another step. llvm-svn: 258446	2016-01-21 22:58:01 +00:00
Sanjay Patel	7c9dc49b45	avoid variable shadowing; NFC llvm-svn: 258445	2016-01-21 22:41:16 +00:00
Sanjay Patel	4a76c00379	remove unnecessary variable; NFC llvm-svn: 258444	2016-01-21 22:31:18 +00:00
Reid Kleckner	4439f8e4ca	Avoid unnecessary stack realignment in musttail thunks with SSE2 enabled The X86 musttail implementation finds register parameters to forward by running the calling convention algorithm until a non-register location is returned. However, assigning a vector memory location has the side effect of increasing the function's stack alignment. We shouldn't increase the stack alignment when we are only looking for register parameters, so this change conditionalizes it. llvm-svn: 258442	2016-01-21 22:23:22 +00:00
Simon Pilgrim	6f240f4b49	[X86][SSE] Improve i16 splatting shuffles Better handling of the annoying pshuflw/pshufhw ops which only shuffle lower/upper halves of a vector. Added vXi16 unary shuffle support for cases where i16 elements (from the same half of the source) are being splatted to the whole of one of the halves. This avoids the general lowering case which must shuffle the 32-bit elements first - meaning that we used to end up with unnecessary duplicate pshuflw/pshufhw shuffles. Note this has the side effect of a lot of SSSE3 test cases no longer needing to use PSHUFB, as it falls below the 3 op combine threshold for when PSHUFB is typically worth it. I've raised PR26183 to discuss if the threshold should be changed and whether we need to make it more specific to the target CPU. Differential Revision: http://reviews.llvm.org/D14901 llvm-svn: 258440	2016-01-21 22:07:41 +00:00
Lang Hames	373875b04a	[RuntimeDyld][AArch64] Add support for the MachO ARM64_RELOC_SUBTRACTOR reloc. llvm-svn: 258438	2016-01-21 21:59:50 +00:00
David L Kreitzer	28ea778709	Fix for two constant propagation problems in GVN with the assume intrinsic instruction. Patch by Yuanrui Zhang. Differential Revision: http://reviews.llvm.org/D16100 llvm-svn: 258435	2016-01-21 21:32:35 +00:00
Kevin Enderby	a1e729dabc	Fix MachOObjectFile::getSymbolSection() to not call report_fatal_error() but to return object_error::parse_failed. Then made the code in llvm-nm do for Mach-O files what is done in the darwin native tools which is to print "(?,?)" or just "s" for bad section indexes. Also added a test to show it prints the bad section index of "42" when printing the fields as raw hex. llvm-svn: 258434	2016-01-21 21:13:27 +00:00
Sanjay Patel	1087b8fb2a	[LibCallSimplifier] don't get fooled by a fake fmin() This is similar to the bug/fix: https://llvm.org/bugs/show_bug.cgi?id=26211 http://reviews.llvm.org/rL258325 The fmin() test case reveals another bug caused by sloppy code duplication. It will crash without this patch because fp128 is a valid floating-point type, but we would think that we had matched a function that used doubles. The new helper function can be used to replace similar checks that are used in several other places in this file. llvm-svn: 258428	2016-01-21 20:19:54 +00:00
David Majnemer	4981d2326a	[InstCombine] Simplify (x >> y) <= x This commit extends the patterns recognised by InstSimplify to also handle (x >> y) <= x in the same way as (x /u y) <= x. The missing optimisation was found investigating why LLVM did not optimise away bound checks in a binary search: https://github.com/rust-lang/rust/pull/30917 Patch by Andrea Canciani! Differential Revision: http://reviews.llvm.org/D16402 llvm-svn: 258422	2016-01-21 18:55:54 +00:00
Chad Rosier	b9cedfb407	Partially revert "Add command line options to force function/loop alignments." This partially reverts r256571 in favor of the solution in r258409. llvm-svn: 258421	2016-01-21 18:49:15 +00:00
Rong Xu	69b08ad25b	[PGO] Passmanagerbuilder change that enables IR level PGO instrumentation This patch includes the passmanagerbuilder change that enables IR level PGO instrumentation. It adds two passmanagerbuilder options: -profile-generate=<profile_filename> and -profile-use=<profile_filename>. The new options are primarily for debugging purposes. Reviewers: davidxl, silvas Differential Revision: http://reviews.llvm.org/D15828 llvm-svn: 258420	2016-01-21 18:28:59 +00:00
Adam Nemet	2628dc52d2	[TTI] Add getCacheLineSize Summary: And use it in PPCLoopDataPrefetch.cpp. @hfinkel, please let me know if your preference would be to preserve the ppc-loop-prefetch-cache-line option in order to be able to override the value of TTI::getCacheLineSize for PPC. Reviewers: hfinkel Subscribers: hulx2000, mcrosier, mssimpso, hfinkel, llvm-commits Differential Revision: http://reviews.llvm.org/D16306 llvm-svn: 258419	2016-01-21 18:28:36 +00:00
Rong Xu	6c08b3c582	[PGO] IR level instrumentation of indirect call value profiling This patch adds the instrumentation for indirect call value profiling. It finds all the indirect call-sites and generates instrprof_value_profile intrinsic calls. A new opt level option -disable-vp is introduced to disable this instrumentation. Reviewers: davidxl, betulb, vsk Differential Revision: http://reviews.llvm.org/D16016 llvm-svn: 258417	2016-01-21 18:11:44 +00:00
Sanjay Patel	9447739046	make helper functions static; NFCI llvm-svn: 258416	2016-01-21 18:01:57 +00:00
Manuel Jacob	4638a59057	Undo r258163 "Move part of an if condition into an assertion. NFC." This undoes the change made in r258163. The assertion fails if `Ptr` is of a vector type. The previous code doesn't look completely correct either, so I'll investigate this more. llvm-svn: 258411	2016-01-21 17:36:14 +00:00
Geoff Berry	8cf2ae77e2	[BlockPlacement] Add option to align all non-fall-through blocks. Summary: This option is being added for testing purposes. Reviewers: mcrosier Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16410 llvm-svn: 258409	2016-01-21 17:25:52 +00:00
Matthew Simpson	d8f9568a4c	Revert "[SLP] Truncate expressions to minimum required bit width" This reverts commit r258404. llvm-svn: 258408	2016-01-21 17:17:20 +00:00
Teresa Johnson	1a124a3e27	Use early return to simplify code (NFC) Follow on to r258405. llvm-svn: 258407	2016-01-21 17:16:53 +00:00
Vedant Kumar	28de1d0a47	[GCOV] Avoid emitting profile arcs for module and skeleton CUs Do not emit profile arc files and note files for module and skeleton CU's. Our users report seeing unexpected .gcda and .gcno files in their projects when using gcov-style profiling with modules or frameworks. The unwanted files come from these modules. This is not very helpful for end-users. Further, we've seen reports of instrumented programs crashing while writing these files out (due to I/O failures). rdar://problem/22838296 Reviewed-by: aprantl Differential Revision: http://reviews.llvm.org/D15997 llvm-svn: 258406	2016-01-21 17:04:42 +00:00
Teresa Johnson	29f701563c	[ThinLTO] Avoid unnecessary hash lookups during metadata linking (NFC) Replace sequences of count() followed by operator[] with either find() or insert(), depending on the context. llvm-svn: 258405	2016-01-21 16:46:40 +00:00
Matthew Simpson	14b16e7ee1	[SLP] Truncate expressions to minimum required bit width This change attempts to produce vectorized integer expressions in bit widths that are narrower than their scalar counterparts. The need for demotion arises especially on architectures in which the small integer types (e.g., i8 and i16) are not legal for scalar operations but can still be used in vectors. Like similar work done within the loop vectorizer, we rely on InstCombine to perform the actual type-shrinking. We use the DemandedBits analysis and ComputeNumSignBits from ValueTracking to determine the minimum required bit width of an expression. Differential revision: http://reviews.llvm.org/D15815 llvm-svn: 258404	2016-01-21 16:31:55 +00:00
Scott Egerton	18d74225d6	[mips] Allowed dla instructions on 32-bit architectures. Summary: This is now the same as the behaviour of the GNU assembler. This was done as it is required in order to build the Linux kernel with the integrated assembler enabled. Reviewers: dsanders, vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D13594 llvm-svn: 258400	2016-01-21 15:11:01 +00:00
Igor Breger	73167c5d63	AVX512: Masked move intrinsic implementation. Implemented intrinsics for the following instructions (reg move): VMOVDQU8/16, VMOVDQA32/64, VMOVAPS/PD. Differential Revision: http://reviews.llvm.org/D16316 llvm-svn: 258398	2016-01-21 14:18:11 +00:00
Michael Zuckerman	42638cd6f9	[AVX512] Adding VPERMT2B and VPERMI2B Intrinsics Differential Revision: http://reviews.llvm.org/D16398 llvm-svn: 258397	2016-01-21 13:36:01 +00:00
Krzysztof Parzyszek	50ddbf8f06	PR26172: unnecessary indirection in HexagonCopyToCombine.cpp llvm-svn: 258395	2016-01-21 12:45:17 +00:00
Marina Yatsina	a348761e3d	[X86] - Removing warning on legal cases caused by commit r258132 There's an overloading of the "movsd" and "cmpsd" instructions, e.g. movsd can be either "Move Data from String to String" or "Move or Merge Scalar Double-Precision Floating-Point Value". The former should produce warnings when parsing a memory operand that is not ESI/EDI, but the latter should not. Fixed the code to produce warnings only after making sure we're dealing with the first case. Expanded the tests of the produced warnings + fixed RUN line of the test so that it would check both stdout and stderr Differential Revision: http://reviews.llvm.org/D16359 llvm-svn: 258393	2016-01-21 11:37:06 +00:00
Manuel Jacob	f125133498	Change ConstantFoldInstOperands to take Instruction instead of opcode and type. NFC. Summary: The previous form, taking opcode and type, is moved to an internal helper and the new form, taking an instruction, is a wrapper around this helper. Although this is a slight cleanup on its own, the main motivation is to refactor the constant folding API to ease migration to opaque pointers. This will be follow-up work. Reviewers: eddyb Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D16383 llvm-svn: 258391	2016-01-21 06:33:22 +00:00
Manuel Jacob	90bf7d59dc	Introduce ConstantFoldCastOperand function and migrate some callers of ConstantFoldInstOperands to use it. NFC. Summary: Although this is a slight cleanup on its own, the main motivation is to refactor the constant folding API to ease migration to opaque pointers. This will be follow-up work. Reviewers: eddyb Subscribers: zzheng, dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D16380 llvm-svn: 258390	2016-01-21 06:31:08 +00:00
Manuel Jacob	80737df614	Introduce ConstantFoldBinaryOpOperands function and migrate some callers of ConstantFoldInstOperands to use it. NFC. Summary: Although this is a slight cleanup on its own, the main motivation is to refactor the constant folding API to ease migration to opaque pointers. This will be follow-up work. Reviewers: eddyb Subscribers: dblaikie, llvm-commits Differential Revision: http://reviews.llvm.org/D16378 llvm-svn: 258389	2016-01-21 06:26:35 +00:00
Tom Stellard	a693ddcf2d	AMDGPU/SI: Pass whether to use the SI scheduler via Target Attribute Summary: Currently the SI scheduler can be selected via command line option, but it turned out it would be better if it was selectable via a Target Attribute. This patch adds "si-scheduler" attribute to the backend. Reviewers: tstellarAMD, echristo Subscribers: echristo, arsenm Differential Revision: http://reviews.llvm.org/D16192 llvm-svn: 258386	2016-01-21 04:28:34 +00:00
David Majnemer	3271ade5e4	Rename MCLineEntry to MCDwarfLineEntry MCLineEntry gives the impression that it is generic MC machinery. However, it is specific to DWARF. llvm-svn: 258381	2016-01-21 01:59:03 +00:00
Kostya Serebryany	0b3db26b9a	[libFuzzer] don't use std::vector in one more hot path llvm-svn: 258380	2016-01-21 01:52:14 +00:00
Andrew Wilkins	7e6233eaac	[GlobalISel] make library an optional component Summary: Mark the LLVMGlobalISel library as optional in LLVMBuild.txt, since the library is only built if LLVM_BUILD_GLOBAL_ISEL is set. Without doing this, llvm-config includes the library in the list of components regardless of whether it's built, and then will error out when asked for the library names/paths. Reviewers: qcolombet Subscribers: joker.eph, llvm-commits, vkalintiris Differential Revision: http://reviews.llvm.org/D16386 llvm-svn: 258379	2016-01-21 01:41:03 +00:00
Mike Aizatsky	d01a744fd9	[libfuzzer] use %p for printing addresses llvm-svn: 258370	2016-01-21 00:02:09 +00:00
Rafael Espindola	d6b426313f	Remove redundant argument. It is already a member variable. llvm-svn: 258369	2016-01-21 00:00:53 +00:00
Dan Gohman	2536ce124f	[SelectionDAG] Fix constant offset folding to avoid commuting non-commutative operators. This fixes a miscompile in MultiSource/Benchmarks/MiBench/consumer-lame introduced in r258296. llvm-svn: 258366	2016-01-20 23:16:59 +00:00
Chad Rosier	0c74edaf2c	MachineScheduler: Add a command line option to disable post scheduler. llvm-svn: 258364	2016-01-20 23:08:32 +00:00
Chad Rosier	176047fcfd	MachineScheduler: Honor optnone functions in the pre-ra scheduler. llvm-svn: 258363	2016-01-20 22:38:25 +00:00
Rafael Espindola	0ca658f6d3	Simplify the logic. NFC. Found while reviewing the change for PR26152. llvm-svn: 258362	2016-01-20 22:38:23 +00:00
Sanjay Patel	7980a4a5f4	don't repeat function names in comments; NFC llvm-svn: 258360	2016-01-20 22:24:38 +00:00
David Blaikie	2c20fc8a28	Orc: Simplify lambda by using std::set's initializer_list ctor llvm-svn: 258359	2016-01-20 22:24:26 +00:00
George Burgess IV	c1d01b0bd9	Fix typo in an error string. NFC. llvm-svn: 258357	2016-01-20 22:15:23 +00:00
Evgeniy Stepanov	6c4fc98afa	Fix PR26152. Fix the condition for when the new global takes over the name of the existing one to be the negation of the condition for the new global to get internal linkage. llvm-svn: 258355	2016-01-20 22:05:50 +00:00
Evgeniy Stepanov	6bc3fa852c	Fix build warning. error: field 'CCMgr' will be initialized after field 'IndirectStubsMgr' [-Werror,-Wreorder] : DL(TM.createDataLayout()), CCMgr(std::move(CCMgr)), llvm-svn: 258354	2016-01-20 22:02:07 +00:00
Tom Stellard	1f5cbb2395	AMDGPU/SI: Promote i1 SETCC operations Summary: While working on uniform branching, I've hit a few cases where we emit i1 SETCC operations. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16233 llvm-svn: 258352	2016-01-20 21:48:24 +00:00
Matt Arsenault	fd19a3f663	AMDGPU: Fix old comments that mention AMDIL llvm-svn: 258350	2016-01-20 21:22:21 +00:00
Matt Arsenault	8442657275	AMDGPU: Remove AMDGPU.trunc intrinsic llvm-svn: 258348	2016-01-20 21:05:53 +00:00
Matt Arsenault	ef8d51e654	AMDGPU: Remove AMDIL.fraction intrinsic llvm-svn: 258347	2016-01-20 21:05:49 +00:00
Matt Arsenault	d256fbf59a	AMDGPU: Remove AMDIL.round.nearest intrinsic llvm-svn: 258346	2016-01-20 21:05:40 +00:00
Quentin Colombet	8d3acd6266	[GlobalISel] Add the proper cmake plumbing. This patch adds the necessary plumbing to cmake to build the sources related to GlobalISel. To build the sources related to GlobalISel, we need to add -DBUILD_GLOBAL_ISEL=ON. By default, this is OFF, thus GlobalISel sources will not impact people that do not explicitly opt-in. Differential Revision: http://reviews.llvm.org/D15983 llvm-svn: 258344	2016-01-20 20:58:56 +00:00
Matt Arsenault	c463ea6063	AMDGPU: Remove abs intrinsic llvm-svn: 258343	2016-01-20 20:58:29 +00:00
Matt Arsenault	97bf24516f	AMDGPU: Remove min/max intrinsics This removes support for mesa 11.0.x llvm-svn: 258342	2016-01-20 20:50:19 +00:00
Sanjoy Das	b0b3d4c99d	Add a "gc-transition" operand bundle Summary: This adds a new kind of operand bundle to LLVM denoted by the `"gc-transition"` tag. Inputs to `"gc-transition"` operand bundle are lowered into the "transition args" section of `gc.statepoint` by `RewriteStatepointsForGC`. This removes the last bit of functionality that was unsupported in the deopt bundle based code path in `RewriteStatepointsForGC`. Reviewers: pgavlin, JosephTremoulet, reames Subscribers: sanjoy, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16342 llvm-svn: 258338	2016-01-20 19:50:25 +00:00
Simon Atanasyan	65cb28f51e	[llvm-readobj][ELF] Teach llvm-readobj to show arch specific ELF section's flags Some architecture specific ELF section flags might have the same value (for example SHF_X86_64_LARGE and SHF_HEX_GPREL) and we have to check machine architectures to select an appropriate set of possible flags. The patch selects architecture specific flags into separate arrays `ElfxxxSectionFlags` and combines `ElfSectionFlags` and `ElfxxxSectionFlags` before pass to the `StreamWriter::printFlags()` method. Differential Revision: http://reviews.llvm.org/D16269 llvm-svn: 258334	2016-01-20 19:15:18 +00:00
Sanjay Patel	bc36d3b932	fix typo; NFC llvm-svn: 258332	2016-01-20 18:59:48 +00:00
Sanjay Patel	b8733f574f	fix formatting; NFC llvm-svn: 258330	2016-01-20 18:59:16 +00:00
Rafael Espindola	a193e28818	Accept subtractions involving a weak symbol. When a symbol S shows up in an expression in assembly there are two possible interpretations * The expression is referring to the value of S in this file. * The expression is referring to the value after symbol resolution. In the first case the assembler can reason about the value and try to produce a relocation. In the second case, that is only possible if the symbol cannot be preempted. Assemblers are not very consistent about which interpretation gets used. This changes MC to agree with GAS in the case of an expression of the form "Sym - WeakSym". llvm-svn: 258329	2016-01-20 18:57:48 +00:00
Sanjay Patel	3635b71b45	[LibCallSimplifier] don't get fooled by a fake sqrt() The test case will crash without this patch because the subsequent call to hasUnsafeAlgebra() assumes that the call instruction is an FPMathOperator (ie, returns an FP type). This part of the function signature check was omitted for the sqrt() case, but seems to be in place for all other transforms. Before: http://reviews.llvm.org/rL257400 ...we would have needlessly continued execution in optimizeSqrt(), but the bug was harmless because we'd eventually fail some other check and return without damage. This should fix: https://llvm.org/bugs/show_bug.cgi?id=26211 Differential Revision: http://reviews.llvm.org/D16198 llvm-svn: 258325	2016-01-20 17:41:14 +00:00
Lang Hames	a62c027701	[Orc] Fix a use-after-move bug in the Orc C-bindings stack. llvm-svn: 258324	2016-01-20 17:39:52 +00:00
Sanjay Patel	a1fa737ea0	80-cols; NFC llvm-svn: 258323	2016-01-20 16:41:43 +00:00
Keith Walker	a38818ea04	Write AArch64 big endian data fixup entries as BE. There was support for writing the AArch64 big endian data fixup entries in the .eh_frame section in BE. This is changed to write all such fixup entries in BE with no restriction on the section. This is similar to the existing support for fixup entries for ARM. A test is added to check the length field in the .debug_line section as this is an example of where such a fixup occurs. Differential Revision: http://reviews.llvm.org/D16064 llvm-svn: 258320	2016-01-20 15:59:14 +00:00
Tom Stellard	874368ba8b	Correctly initialize SIAnnotateControlFlow Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16304 llvm-svn: 258319	2016-01-20 15:48:27 +00:00
Michael Zuckerman	852ec66515	[AVX512] Adding VPERMB Intrinsics Differential Revision: http://reviews.llvm.org/D16296 llvm-svn: 258316	2016-01-20 15:24:56 +00:00
Marina Yatsina	b35ff9ba57	Fixing bug in rL258132: [X86] Adding support for missing variations of X86 string related instructions There was a bug in my rL258132 because there's an overloading of the "movsd" and "cmpsd" instructions, e.g. movsd can be either "Move Data from String to String" (the case I wanted to handle) or "Move or Merge Scalar Double-Precision Floating-Point Value" (the case that causes the asserts). Added code for escaping the unfamiliar scenarios and falling back to old behaviour. Also changed the asserts to llvm_unreachable. llvm-svn: 258312	2016-01-20 14:03:47 +00:00
Krzysztof Parzyszek	d1195ef11e	Proper handling of diamond-like cases in if-conversion If converter was somewhat careless about "diamond" cases, where there was no join block, or in other words, where the true/false blocks did not have analyzable branches. In such cases, it was possible for it to remove (needed) branches, resulting in a loss of entire basic blocks. Differential Revision: http://reviews.llvm.org/D16156 llvm-svn: 258310	2016-01-20 13:14:52 +00:00
Igor Breger	866bd3ac74	AVX512: Store (MOVNTPD, MOVNTPS, MOVNTDQ) using non-temporal hint intrinsic implementation. Differential Revision: http://reviews.llvm.org/D16350 llvm-svn: 258309	2016-01-20 13:11:47 +00:00
Oliver Stannard	80ad1149bf	[AArch64] Fix two bugs in the .inst directive The AArch64 .inst directive was implemented using EmitIntValue, which resulted in both $x and $d (code and data) mapping symbols being emitted at the same address. This fixes it to only emit the $x mapping symbol. EmitIntValue also emits the value in big-endian order when targeting big-endian systems, but instructions are always emitted in little-endian order for AArch64. Differential Revision: http://reviews.llvm.org/D16349 llvm-svn: 258308	2016-01-20 12:54:31 +00:00
Petr Pavlu	b38595a7e2	[LTO] Fix error reporting when a file passed to libLTO is invalid or non-existent This addresses PR26060 where function lto_module_create() could return nullptr but lto_get_error_message() returned an empty string. The error() call after LTOModule::createFromFile() in llvm-lto is then removed because any error from this function should go through the diagnostic handler in llvm-lto which will exit the program. The error() call was added because this previously did not happen when the file was non-existent. This is fixed by the patch. (The situation that llvm-lto reports an error when the input file does not exist is tested by llvm/tools/llvm-lto/error.ll). Differential Revision: http://reviews.llvm.org/D16106 llvm-svn: 258298	2016-01-20 09:03:42 +00:00
Ivan Krasin	d1b806509e	[Verifier] Fix performance regression for LTO builds Summary: Fix a significant performance regression by introducing GlobalValueVisited field and reusing the map. This is a follow up to r257823 that slowed down linking Chrome with LTO by 2.5x. If you revert this commit, please, also revert r257823. BUG=https://llvm.org/bugs/show_bug.cgi?id=26214 Reviewers: pcc, loladiro, joker.eph Subscribers: krasin1, joker.eph, loladiro, pcc Differential Revision: http://reviews.llvm.org/D16338 llvm-svn: 258297	2016-01-20 08:41:22 +00:00
Dan Gohman	4d1c91fc4d	[SelectionDAG] Fold more offsets into GlobalAddresses SelectionDAG previously missed opportunities to fold constants into GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to create `(add GA+c, y)`. This isn't often visible on targets such as X86 which effectively reassociate adds in their complex address-mode folding logic, however it is currently visible on WebAssembly since it currently has very simple address mode folding code that doesn't reassociate anything. This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses at the same times that it folds constants together, so that it doesn't miss any opportunities to perform such folding. Differential Revision: http://reviews.llvm.org/D16090 llvm-svn: 258296	2016-01-20 07:03:08 +00:00
Dan Gohman	2ff7a96abf	[WebAssembly] Minor code cleanups. NFC. llvm-svn: 258294	2016-01-20 05:54:22 +00:00
Dan Gohman	31b1fe588c	[WebAssembly] Remove the Relooper code, as it is not currently being used. llvm-svn: 258293	2016-01-20 05:50:29 +00:00
Dan Gohman	e0f4353910	[WebAssembly] Don't stackify stores across instructions with side effects. llvm-svn: 258285	2016-01-20 04:21:16 +00:00
Joseph Tremoulet	de5c9a8723	[Inliner/WinEH] Honor implicit nounwinds Summary: Funclet EH tables require that a given funclet have only one unwind destination for exceptional exits. The verifier will therefore reject e.g. two cleanuprets with different unwind dests for the same cleanup, or two invokes exiting the same funclet but to different unwind dests. Because catchswitch has no 'nounwind' variant, and because IR producers are not required to annotate calls which will not unwind as 'nounwind', it is legal to nest a call or an "unwind to caller" catchswitch within a funclet pad that has an unwind destination other than caller; it is undefined behavior for such a call or catchswitch to unwind. Normally when inlining an invoke, calls in the inlined sequence are rewritten to invokes that unwind to the callsite invoke's unwind destination, and "unwind to caller" catchswitches in the inlined sequence are rewritten to unwind to the callsite invoke's unwind destination. However, if such a call or "unwind to caller" catchswitch is located in a callee funclet that has another exceptional exit with an unwind destination within the callee, applying the normal transformation would give that callee funclet multiple unwind destinations for its exceptional exits. There would be no way for EH table generation to determine which is the "true" exit, and the verifier would reject the function accordingly. Add logic to the inliner to detect these cases and leave such calls and "unwind to caller" catchswitches as calls and "unwind to caller" catchswitches in the inlined sequence. This fixes PR26147. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: alexcrichton, llvm-commits Differential Revision: http://reviews.llvm.org/D16319 llvm-svn: 258273	2016-01-20 02:15:15 +00:00
Xinliang David Li	8c526c0741	[PGO] Add a new interface to be used by Indirect Call Promotion llvm-svn: 258271	2016-01-20 01:26:34 +00:00
Eduard Burtescu	f4029d5940	[NFC] Replace several manual GEP loops with gep_type_iterator. Reviewers: dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16335 llvm-svn: 258262	2016-01-20 00:26:52 +00:00
Xinliang David Li	09181f13bd	Function name change /NFC llvm-svn: 258260	2016-01-20 00:24:36 +00:00
Matthias Braun	3f5b3cdbf1	MachineScheduler: Allow independent scheduling of sub register defs Note that this is disabled by default and still requires a patch to handleMove() which is not upstreamed yet. If the TrackLaneMasks policy/strategy is enabled the MachineScheduler will build a schedule graph where definitions of independent subregisters are no longer serialised. Implementation comments: - Without lane mask tracking a sub register def also counts as a use (except for the first one with the read-undef flag set), with lane mask tracking enabled this is no longer the case. - Pressure Diffs were previously maintained per definition of a vreg with the help of the SSA information contained in the LiveIntervals. With lanemask tracking enabled we cannot do this anymore and instead change the pressure diffs for all uses of the vreg as it becomes live/dead. For this changed style to work correctly we ignore uses of instructions that define the same register again: They won't affect register pressure. - With lanemask tracking we remove all read-undef flags from sub register defs when building the graph and re-add them later when all vreg lanes have become dead. Differential Revision: http://reviews.llvm.org/D14969 llvm-svn: 258259	2016-01-20 00:23:32 +00:00
Matthias Braun	895450b36f	RegisterPressure: Make liveness tracking subregister aware Differential Revision: http://reviews.llvm.org/D14968 llvm-svn: 258258	2016-01-20 00:23:26 +00:00
Matthias Braun	e562f1b2e1	LiveInterval: Add utility class to rename independent subregister usage This renaming is necessary to avoid a subregister aware scheduler accidentally creating liveness "holes" which are rejected by the MachineVerifier. Explanation as found in this patch: Helper class that can divide MachineOperands of a virtual register into equivalence classes of connected components. MachineOperands belong to the same equivalence class when they are part of the same SubRange segment or adjacent segments (adjacent in control flow); Different subranges affected by the same MachineOperand belong to the same equivalence class. Example: vreg0:sub0 = ... vreg0:sub1 = ... vreg0:sub2 = ... ... xxx = op vreg0:sub1 vreg0:sub1 = ... store vreg0:sub0_sub1 The example contains 3 different equivalence classes: - One for the (dead) vreg0:sub2 definition - One containing the first vreg0:sub1 definition and its use, but not the second definition! - The remaining class contains all other operands involving vreg0. We provide a utility function here to rename disjunct classes to different virtual registers. Differential Revision: http://reviews.llvm.org/D16126 llvm-svn: 258257	2016-01-20 00:23:21 +00:00
Tom Stellard	936d0be6f9	AMDGPU/SI: Prevent the DAGCombiner from creating setcc with i1 inputs Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15035 llvm-svn: 258256	2016-01-20 00:13:22 +00:00
Sanjoy Das	d2d9b2b709	[MachineSink] Don't break ImplicitNulls Summary: This teaches MachineSink to not sink instructions that might break the implicit null check optimization that runs later. This should not affect frontends that do not use implicit null checks. Reviewers: aadg, reames, hfinkel, atrick Subscribers: majnemer, llvm-commits Differential Revision: http://reviews.llvm.org/D14632 llvm-svn: 258254	2016-01-20 00:06:14 +00:00
Quentin Colombet	2fd9288cb1	[X86] Do not run shrink-wrapping on function with split-stack attribute or HiPE calling convention. The implementation of the related callbacks in the x86 backend for such functions are not ready to deal with a prologue block that is not the entry block of the function. This fixes PR26107, but the longer term solution would be to fix those callbacks. llvm-svn: 258221	2016-01-19 23:29:03 +00:00
David Majnemer	7c946c0aa3	[MC, COFF] Add .reloc support for WinCOFF This adds rudimentary support for a few relocations that we will use for the CodeView debug format. llvm-svn: 258216	2016-01-19 23:05:27 +00:00
Simon Pilgrim	709595fe14	[X86][SSE] Add VZEXT_MOVL target shuffle decoding. Add support for decoding VZEXT_MOVL target shuffle masks, allowing it to be used as a source in target shuffle combines. llvm-svn: 258215	2016-01-19 23:04:56 +00:00
Quentin Colombet	44cd309151	[MachineFunction] Constify getter. NFC. llvm-svn: 258207	2016-01-19 22:31:12 +00:00
Simon Pilgrim	f1e3dd87e3	[X86][SSE] Add INSERTPS target shuffle combines. As vector shuffles can only reference two inputs many (V)INSERTPS patterns end up being split over two target shuffles. This patch adds combines to attempt to combine (V)INSERTPS nodes with input/output nodes that are just zeroing out these additional vector elements. Differential Revision: http://reviews.llvm.org/D16072 llvm-svn: 258205	2016-01-19 22:24:12 +00:00
Lang Hames	f5d5e165bf	[Orc] #undef a MACRO after I'm done with it. Suggested by Philip Reames in review of r257951. Thanks Philip! llvm-svn: 258203	2016-01-19 22:20:21 +00:00
Chad Rosier	1efd881aea	[AArch64] Remove a bunch of useless FIXME comments. llvm-svn: 258193	2016-01-19 21:47:24 +00:00
Dan Gohman	d4e981ce4b	[WebAssembly] Remove an unused data member. NFC. llvm-svn: 258192	2016-01-19 21:31:41 +00:00
Chad Rosier	91ad8191d7	[AArch64] Remove more dead code after r258093. llvm-svn: 258191	2016-01-19 21:27:05 +00:00
Xinliang David Li	1d582a4528	Fix a coverage reading bug function record pointer is not advanced when duplicate entry is found. Test case to be added. llvm-svn: 258188	2016-01-19 21:18:12 +00:00
Lang Hames	526cf61162	[Orc] Refactor ObjectLinkingLayer::addObjectSet to defer loading objects until they're needed. Prior to this patch objects were loaded (via RuntimeDyld::loadObject) when they were added to the ObjectLinkingLayer, but were not relocated and finalized until a symbol address was requested. In the interim, another object could be loaded and finalized with the same memory manager, causing relocation/finalization of the first object to fail (as the first finalization call may have marked the allocated memory for the first object read-only). By deferring the loadObject call (and subsequent memory allocations) until an object file is needed we can avoid prematurely finalizing memory. llvm-svn: 258185	2016-01-19 21:06:38 +00:00
Sanjoy Das	c6887c7e27	[SCEV] Fix PR26207 In some cases, the max backedge taken count can be more conservative than the exact backedge taken count (for instance, because ScalarEvolution::getRange is not control-flow sensitive whereas computeExitLimitFromICmp can be). In these cases, computeExitLimitFromCond (specifically the bit that deals with `and` and `or` instructions) can create an ExitLimit instance with a `SCEVCouldNotCompute` max backedge count expression, but a computable exact backedge count expression. This violates an implicit SCEV assumption: a computable exact BE count should imply a computable max BE count. This change - Makes the above implicit invariant explicit by adding an assert to ExitLimit's constructor - Changes `computeExitLimitFromCond` to be more robust around conservative max backedge counts llvm-svn: 258184	2016-01-19 20:53:51 +00:00
Sanjoy Das	bad46059e0	[SCEV] Use range-for; NFC llvm-svn: 258183	2016-01-19 20:53:46 +00:00
JF Bastien	76215efe3e	WebAssembly: mark known failure caused by r258125 The following test program triggers the assertion: https://github.com/gcc-mirror/gcc/blob/master/gcc/testsuite/gcc.c-torture/execute/20030916-1.c llvm-svn: 258182	2016-01-19 20:53:12 +00:00
Kostya Serebryany	8820217862	[libFuzzer] use std::mt19937 for generating random numbers by default. Fix MyStoll to handle negative values. Use std::any_of instead of std::find_if llvm-svn: 258178	2016-01-19 20:33:57 +00:00
Sanjay Patel	ff21b77f07	getParent()->getParent() == getModule() ; NFC llvm-svn: 258176	2016-01-19 19:58:49 +00:00
Sanjay Patel	73930e2b84	function names start with a lowercase letter; NFC Note: There are no uses of these functions outside of SimplifyLibCalls, so they could be static functions in that file. llvm-svn: 258172	2016-01-19 19:46:10 +00:00
Sanjay Patel	2932dde796	fix formatting; NFC llvm-svn: 258167	2016-01-19 19:17:47 +00:00
Sanjay Patel	1af845b00b	don't repeat documentation comments in implementation file; NFC llvm-svn: 258166	2016-01-19 19:16:10 +00:00
Manuel Jacob	becbd7b6ad	Move part of an if condition into an assertion. NFC. llvm-svn: 258163	2016-01-19 19:04:49 +00:00
Michael Zuckerman	553ef84e85	[AVX512] Adding VPERMT2B and VPERMI2B instruction . Differential Revision: http://reviews.llvm.org/D16297 llvm-svn: 258161	2016-01-19 18:47:02 +00:00
Philip Reames	e0ccb686f9	Revert 258157 According to the build bots, clang is using the Registry class somewhere as well. Will reapply with appropriate clang changes at a later point. llvm-svn: 258159	2016-01-19 18:41:10 +00:00
Sanjay Patel	a2ab3d6165	[LibCallSimplifier] use instruction-level fast-math-flags to shrink calls This is a continuation of adding FMF to call instructions: http://reviews.llvm.org/rL255555 llvm-svn: 258158	2016-01-19 18:38:52 +00:00
Philip Reames	5419400ae8	[GC] Registry initialization and linkage interactions The Registry class constructs a linked list of nodes whose storage is inside static variables and nodes are added via static initializers. The trick is that those static initializers are in both the LLVM code base, and some random plugin that might get loaded in at runtime. The existing code tries to use C++ templates and their ODR rules to get a single definition of the registry for each type, but, experimentally, this doesn't quite work as designed. (Well, the entire structure doesn't. It might not actually be an ODR problem.) Previously, when I tried moving the GCStrategy class (along with its registry) from CodeGen to IR, I ran into a problem where asking the GCStrategyRegistry a question would return inconsistent results depending on whether you asked from CodeGen (where the static initializers still were) or Transforms. My best guess is that this is a result of either a) an order of initialization error, or b) we ended up with two copies of the registry being created. I remember at the time having convinced myself it was probably (b), but I don't have any of my notes around from that investigation any more. See http://reviews.llvm.org/rL226311 for the original patch in question. This patch tries to remove the possibility of (b) above. (a) was already fixed in change 258109. Differential Revision: http://reviews.llvm.org/D16170 llvm-svn: 258157	2016-01-19 18:34:27 +00:00
Rong Xu	a4b335c5a6	[PGO] Create the profile data variable before the lowering This patch creates the profile data variable before lowering the profile intrinsics. Reviewers: davidxl, silvas Differential Revision: http://reviews.llvm.org/D16015 llvm-svn: 258156	2016-01-19 18:29:54 +00:00
Sanjay Patel	a46637dede	[LibCallSimplifier] use instruction-level fast-math-flags to transform pow(x, [small integer]) calls This is a continuation of adding FMF to call instructions: http://reviews.llvm.org/rL255555 As with D15937, the intent of the patch is to preserve the current behavior of the transform except that we use the pow call's 'fast' attribute as a trigger rather than a function-level attribute. The TODO comment notes a potential follow-on patch that would propagate FMF to the new instructions. Differential Revision: http://reviews.llvm.org/D16122 llvm-svn: 258153	2016-01-19 18:15:12 +00:00
Chris Ray	b2705cf351	NFC Test Commit whitespace change in a comment Changed whitespace so comments line up. llvm-svn: 258151	2016-01-19 18:01:20 +00:00
Rafael Espindola	a9f441b419	Use larger write sizes for MCFillFragment. This brings the pr26208 testcase down to 3.2 seconds. Not checking it in since it does create a 4GB .o file. llvm-svn: 258149	2016-01-19 17:47:48 +00:00
Sanjay Patel	76380d0013	remove outdated comment; NFC llvm-svn: 258147	2016-01-19 17:29:22 +00:00
Eduard Burtescu	c55147fcdc	[opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. Summary: GEPOperator: provide getResultElementType alongside getSourceElementType. This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has. GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType. Reviewers: mjacob, dblaikie Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16275 llvm-svn: 258145	2016-01-19 17:28:00 +00:00
Michael Zuckerman	71a84dc5a5	[AVX512] Adding VPERMB instruction Differential Revision: http://reviews.llvm.org/D16294 llvm-svn: 258144	2016-01-19 17:07:43 +00:00
Dan Gohman	fb437bc669	[WebAssembly] Rematerialize constants rather than hold them live in registers. Teach the register stackifier to rematerialize constants that have multiple uses instead of leaving them in registers. In the WebAssembly encoding, it's the same code size to materialize most constants as it is to read a value from a register. llvm-svn: 258142	2016-01-19 16:59:23 +00:00
Rafael Espindola	2f2a76fc6a	Simplify MCFillFragment. The value size was always 1 or 0, so we don't need to store it. In a no asserts build this takes the testcase of pr26208 from 11 to 10 seconds. llvm-svn: 258141	2016-01-19 16:57:08 +00:00
Chad Rosier	a47859252e	Typo. llvm-svn: 258137	2016-01-19 16:50:45 +00:00
Marina Yatsina	adac739033	[X86] Add support for "xlat m8" According to x86 spec "xlat m8" is a legal instruction and it is equivalent to "xlatb". Differential Revision: http://reviews.llvm.org/D15150 llvm-svn: 258135	2016-01-19 16:35:38 +00:00
Manuel Jacob	a2e0ca38ae	Fix constant folding of constant vector GEPs with undef or null as pointer argument. Reviewers: eddyb Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16321 llvm-svn: 258134	2016-01-19 16:34:31 +00:00
Marina Yatsina	d7dac8fde4	[X86] Adding support for missing variations of X86 string related instructions The following are legal according to X86 spec: ins mem, DX outs DX, mem lods mem stos mem scas mem cmps mem, mem movs mem, mem Differential Revision: http://reviews.llvm.org/D14827 llvm-svn: 258132	2016-01-19 15:37:56 +00:00
Manuel Jacob	eacda01c05	Rename Variable `Ptr` to `PtrTy`. NFC. llvm-svn: 258130	2016-01-19 15:21:15 +00:00
Rafael Espindola	b750d2d403	Handle 64 bit offsets. No tests since llvm-mc takes 14 seconds on it. I will try to improve it and then test. Part of pr26208. llvm-svn: 258129	2016-01-19 15:19:08 +00:00
Dan Gohman	d5444c8fc8	[WebAssembly] Disable some WebAssembly-specific optimization passes at -O0. llvm-svn: 258127	2016-01-19 14:55:02 +00:00
Dan Gohman	5556fe04f3	[WebAssembly] Use the templated form of MachineFunction::getSubtarget(). NFC. llvm-svn: 258126	2016-01-19 14:53:19 +00:00
Dan Gohman	e8c29f17af	[WebAssembly] Re-enable loop idiom recognition for memcpy et al. llvm-svn: 258125	2016-01-19 14:49:23 +00:00
Asaf Badouh	19e99238a0	[X86][AVX512]fix dag & add intrinsics for fixupimm cover all width and types (pd/ps/sd/ss) of fixupimm instruction and intrinsics Differential Revision: http://reviews.llvm.org/D16313 llvm-svn: 258124	2016-01-19 14:21:39 +00:00
Philip Reames	4a8129f191	[GC] Lower vectors-of-pointers directly by default This commit changes the default on our lowering of vectors-of-pointers from splitting in RS4GC to reporting them in the final stack map. All of the changes to do so are already in place and tested. Assuming no problems are unearthed in the next week, we will be deleting the old code entirely next Monday. llvm-svn: 258111	2016-01-19 04:18:24 +00:00
Philip Reames	d600d56e01	[GC] Consolidate all built in GCs into a single file [NFC] Combine a bunch of small files into a single, still rather small, file. The primary purpose of this is to get all of the static initializers into a single file so as to have a well defined order of initialization. llvm-svn: 258109	2016-01-19 03:57:18 +00:00
Kelvin Li	67f0e62e45	parseArch() supports more variations of arch names for PowerPC builds llvm-svn: 258103	2016-01-19 00:04:41 +00:00
Tobias Edler von Koch	ef41afb1e8	Add a change accidentally left out from r258100 Also remove an executable bit introduced by r258083. llvm-svn: 258101	2016-01-18 23:35:24 +00:00
Tobias Edler von Koch	c19c96e06f	[LTO] Restore original linkage of externals prior to splitting Summary: This is a companion patch for http://reviews.llvm.org/D16124. Internalized symbols increase the size of strongly-connected components in SCC-based module splitting and thus reduce the amount of parallelism. This patch records the original linkage of non-local symbols prior to internalization and then restores it just before splitting/CodeGen. This is also useful for cases where the linker requires symbols to remain external, for instance, so they can be placed according to linker script rules. It's currently under its own flag (-restore-globals) but should eventually share a common flag with D16124. Reviewers: joker.eph, pcc Subscribers: slarin, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D16229 llvm-svn: 258100	2016-01-18 23:24:54 +00:00
Simon Pilgrim	82a3dcbbfd	Fixed MSVC warning that not all control paths return a value. llvm-svn: 258099	2016-01-18 22:54:46 +00:00
Matt Arsenault	348623d27f	AMDGPU: Reduce 64-bit SRAs llvm-svn: 258096	2016-01-18 22:09:04 +00:00
Matt Arsenault	862bf93c73	AMDGPU: Split 64-bit and of constant up This breaks the tests that were meant for testing 64-bit inline immediates, so move those to shl where they won't be broken up. This should be repeated for the other related bit ops. llvm-svn: 258095	2016-01-18 22:01:13 +00:00
Chad Rosier	8c72479955	[AArch64] Remove unused arguments. NFC. AFAICT, these have been unused since the initial backend import. llvm-svn: 258093	2016-01-18 21:56:40 +00:00
Matt Arsenault	97aeb607e4	AMDGPU: Generalize shl combine Reduce 64-bit shl with constant > 32. We already special cased this for the == 32 case, but this also works for any >= 32 constant. llvm-svn: 258092	2016-01-18 21:55:14 +00:00
Matt Arsenault	e1a6e6ae7f	AMDGPU: Reduce 64-bit lshr by constant to 32-bit 64-bit shifts are very slow on some subtargets. llvm-svn: 258090	2016-01-18 21:43:36 +00:00
Adam Nemet	0049eafd88	[LAA] Include function name in debug output llvm-svn: 258088	2016-01-18 21:16:33 +00:00
Matt Arsenault	1478312f09	AMDGPU: Add subtarget feature for instruction rates llvm-svn: 258085	2016-01-18 21:13:50 +00:00
Simon Pilgrim	269ccdb97d	Fixed MSVC Win64 warning of implicit conversion of 32-bit shift to 64-bits. llvm-svn: 258084	2016-01-18 21:11:19 +00:00
Sergei Larin	72115d5fb6	Add to the split module utility an SCC based method which allows not to globalize any local variables. Summary: Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios. This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols. Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module). Joint development Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org) Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D16124 llvm-svn: 258083	2016-01-18 21:07:13 +00:00
Simon Pilgrim	4c1241282f	[X86][AVX2] Broadcast subvectors AVX2 can only broadcast from the zero'th element of a vector, but if the broadcastable element is the zero'th element of a 128-bit subvector it's advantageous to extract the subvector, broadcast from that and avoid the loading of shuffle mask data that would be needed for VPERMPS/VPERMD. The only exception being when the source type is 4f64 or 4i64 which can use the immediate shuffle VPERMPD/VPERMQ directly. Differential Revision: http://reviews.llvm.org/D16050 llvm-svn: 258081	2016-01-18 20:59:04 +00:00
Krzysztof Parzyszek	c8377aa82b	[Hexagon] Recognize more copy-equivalents in RDF optimizations llvm-svn: 258076	2016-01-18 20:45:51 +00:00
Krzysztof Parzyszek	7ff439af60	[RDF] Improvements to copy propagation - Allow any instruction to define equality between registers. - Keep the DFG updated. llvm-svn: 258075	2016-01-18 20:43:57 +00:00
Krzysztof Parzyszek	fa20b3ad96	[RDF] Improve compile-time performance of dead code elimination llvm-svn: 258074	2016-01-18 20:42:47 +00:00
Krzysztof Parzyszek	a9c1a4bbfd	[RDF] Allow unlinking ref nodes from data-flow chains only llvm-svn: 258073	2016-01-18 20:41:34 +00:00
Craig Topper	7b0b0f6cca	[TableGen] Use FoldingSets instead of DenseMaps to unique UnOpInit, BinOpInit and TernOpInit. This remove the memory needed to store the key for the DenseMap. NFC llvm-svn: 258071	2016-01-18 20:36:06 +00:00
Craig Topper	4b8a0e95c4	[TableGen] Fix an assert I missed in r258063. llvm-svn: 258068	2016-01-18 19:59:05 +00:00
Tom Stellard	a09c2fcc35	TargetLowering: Improve handling of (setcc ([sz]ext x) 0, cc) in SimplifySetCC Summary: When SimplifySetCC sees a setcc node that compares the result of a value extension operation with a constant, it tries to simplify the setcc node by eliminating the extension and shrinking the constant. If shrinking the inputs to setcc is deemed not desirable by the target (e.g. the target does not want a setcc comparing i1 values), then it is still possible to optimize this sequence in some cases. This patch adds the following combines to SimplifySetCC when shrinking setcc inputs is not desirable: (setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc)) (setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, y, !cc)) There are no tests for this yet, but once AMDGPU correctly implements TargetLowering::isTypeDesirableForOp(), this new combine will be exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test. Reviewers: resistor, arsenm Subscribers: jroelofs, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15034 llvm-svn: 258067	2016-01-18 19:55:21 +00:00
Craig Topper	a5ac333af9	[TableGen] Merge the SuperClass Record and SMRange vector into a single vector. This removes the state needed to manage the extra vector thus reducing the size of the Record class. NFC llvm-svn: 258065	2016-01-18 19:52:37 +00:00
Craig Topper	22fa1c4ac6	[TableGen] Allocate the Init pointer array for BitsInit/ListInit after the BitsInit/ListInit object itself. Saves a bit of memory. NFC llvm-svn: 258063	2016-01-18 19:52:24 +00:00
Sanjay Patel	5b6411a86b	combine clauses with same output ; NFCI llvm-svn: 258062	2016-01-18 19:17:58 +00:00
Sanjay Patel	4f9ef4b7f0	use m_OneUse ; NFCI llvm-svn: 258059	2016-01-18 18:36:38 +00:00
Sanjay Patel	08bcf5f0bd	fix variable names, typos ; NFC llvm-svn: 258058	2016-01-18 18:28:09 +00:00
Sanjay Patel	5dcccbe4e7	fix typo; NFC llvm-svn: 258057	2016-01-18 17:50:23 +00:00
Igor Breger	7327a3bf3b	AVX512: Masked store intrinsic implementation. Implemented intrinsics for the following store instructions: VMOVDQU8/16/32/64, VMOVDQA32/64, VMOVAPS/PD, VMOVUPS/PD. Differential Revision: http://reviews.llvm.org/D16271 llvm-svn: 258047	2016-01-18 13:52:57 +00:00
Elena Demikhovsky	0090e74974	Added Cannonlake processor to X86 Target Differential Revision: http://reviews.llvm.org/D16289 llvm-svn: 258046	2016-01-18 13:00:31 +00:00
Igor Breger	74d74d20c2	AVX512: Change v8i1 bitconvert GR8 pattern, remove unnecessary movzbl instruction. Code example, previous implementation: movzbl %dil, %eax kmovw %eax, %k0 New code: kmovw %edi, %k0 Differential Revision: http://reviews.llvm.org/D16287 llvm-svn: 258045	2016-01-18 12:02:45 +00:00
Oliver Stannard	ec1b7475d8	[ARM] Operands for PKHTB alias should be swapped When the shift immediate is zero, PKHTB is an alias for PKHBT, but the order of the input operands needs to be swapped. Differential Revision: http://reviews.llvm.org/D16288 llvm-svn: 258044	2016-01-18 11:56:35 +00:00
Michael Zuckerman	b0cb95e40b	[AVX512] adding AVXVBMI feature flag Fixing typo (avx515) → (avx512). Reviewed over the shoulder by asaf. Differential Revision: http://reviews.llvm.org/D16190 llvm-svn: 258041	2016-01-18 11:12:47 +00:00
Xinliang David Li	7d5bc597d7	[Coverage] move a local var to be BinaryCoverageReader's member The symtab is logically referenced beyond the call to the create method. This change makes sure its lifetime matches that of the reader. llvm-svn: 258036	2016-01-18 06:48:01 +00:00
Junmo Park	d69eaa7d57	Remove extra whitespace. NFC. llvm-svn: 258035	2016-01-18 06:42:51 +00:00
Eduard Burtescu	4fdc6b48ef	Revert assert added in rL258028 as the alloca and OtherPtr types may differ in address space. llvm-svn: 258029	2016-01-18 00:20:34 +00:00
Eduard Burtescu	313153c723	[opaque pointer types] Alloca: use getAllocatedType() instead of getType()->getPointerElementType(). Reviewers: mjacob Subscribers: llvm-commits, dblaikie Differential Revision: http://reviews.llvm.org/D16272 llvm-svn: 258028	2016-01-18 00:10:01 +00:00
Sanjay Patel	8534c989be	fix variable names; NFC llvm-svn: 258027	2016-01-17 23:18:05 +00:00
Sanjay Patel	efea56a88c	fix typos; NFC llvm-svn: 258026	2016-01-17 23:13:48 +00:00
Manuel Jacob	51a8af4316	[opaque pointer types] [breaking-change] [NFC] SimplifyGEPInst: take the source element type of the GEP as an argument. Patch by Eduard Burtescu. Reviewers: dblaikie, mjacob Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D16281 llvm-svn: 258024	2016-01-17 22:46:43 +00:00
Manuel Jacob	67d88f5507	[opaque pointer types] [NFC] CallSite: use getFunctionType() instead of going through PointerType::getElementType. Patch by Eduard Burtescu. Reviewers: dblaikie, mjacob Subscribers: dsanders, llvm-commits, dblaikie Differential Revision: http://reviews.llvm.org/D16273 llvm-svn: 258023	2016-01-17 22:37:39 +00:00
Manuel Jacob	5fe6c9d94d	[NFC] Remove one dead PointerType::getElementType() call. Reviewers: dblaikie, mjacob Subscribers: llvm-commits, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16274 llvm-svn: 258022	2016-01-17 22:28:28 +00:00
Sanjoy Das	aa011535f3	[IndVars] Fix PR25576 `LCSSASafePhiForRAUW` as computed was incorrect -- in cases like these (this exact example does not actually trigger the bug): define i32 @f(i32 %n, i1* %c) { entry: br label %outer.loop outer.loop: br label %inner.loop inner.loop: %iv = phi i32 [ 0, %outer.loop ], [ %iv.inc, %inner.loop ] %iv.inc = add nuw nsw i32 %iv, 1 %tc = udiv i32 %n, 13 %be.cond = icmp ult i32 %iv, %tc br i1 %be.cond, label %inner.loop, label %inner.exit inner.exit: %iv.lcssa = phi i32 [ %iv, %inner.loop ] %outer.be.cond = load volatile i1, i1* %c br i1 %outer.be.cond, label %outer.loop, label %leave leave: %iv.lcssa.lcssa = phi i32 [ %iv.lcssa, %inner.exit ] ret i32 %iv.lcssa.lcssa } `LCSSASafePhiForRAUW` is true for `%iv.lcssa` when re-rewriting the exit value of `%iv` for `%inner.loop` to `%tc` (this can happen due to `SCEVExpander::findExistingExpansion`), but the RAUW breaks LCSSA. To fix this, instead of computing `SafePhi` with special logic, decide the safety of RAUW directly via `replacementPreservesLCSSAForm`. llvm-svn: 258016	2016-01-17 18:12:52 +00:00
Sanjoy Das	2c8efc2a82	[IndVars] Use emplace_back; NFC llvm-svn: 258015	2016-01-17 18:12:48 +00:00
Michael Zuckerman	65f549e895	[AVX512] adding AVXVBMI feature flag The feature flag is for VPERMB,VPERMI2B,VPERMT2B and VPMULTISHIFTQB instructions. More about the instructions can be found in: https://software.intel.com/sites/default/files/managed/07/b7/319433-023.pdf Differential Revision: http://reviews.llvm.org/D16190 llvm-svn: 258012	2016-01-17 13:42:12 +00:00
Artur Pilipenko	de18f1640e	Fix buildbot failure introduced by 258010. Remove local variables that became unused. llvm-svn: 258011	2016-01-17 12:59:40 +00:00
Artur Pilipenko	bb5abf9eb3	Push isDereferenceableAndAlignedPointer down into isSafeToLoadUnconditionally Reviewed By: reames Differential Revision: http://reviews.llvm.org/D16226 llvm-svn: 258010	2016-01-17 12:35:29 +00:00
Igor Breger	908122c363	AVX512: Use MemIntrinsicSDNode to implement load/store intrinsic. Differential Revision: http://reviews.llvm.org/D16184 llvm-svn: 258009	2016-01-17 12:10:24 +00:00
Michael Zuckerman	04a3249a24	[AVX512] Adding VPERMW/D/Q VPERMPS/D Intrinsics Differential Revision: http://reviews.llvm.org/D16189 llvm-svn: 258008	2016-01-17 11:33:29 +00:00
Michael Zuckerman	365c9dfcf3	[AVX512] Adding VPERMQ VPERMPD Intrinsics Differential Revision: http://reviews.llvm.org/D16194 llvm-svn: 258006	2016-01-17 08:32:14 +00:00
Simon Pilgrim	5698ba2abd	[X86][AVX] Enable extraction of upper 128-bit subvectors for 'half undef' shuffle lowering Added support for the extraction of the upper 128-bit subvectors for lower/upper half undef shuffles if it would reduce the number of extractions/insertions or avoid loads of AVX2 permps/permd shuffle masks. Minor follow up to D15477. llvm-svn: 258000	2016-01-16 22:30:20 +00:00
Manuel Jacob	e6438acb66	GlobalValue: use getValueType() instead of getType()->getPointerElementType(). Reviewers: mjacob Subscribers: jholewinski, arsenm, dsanders, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16260 llvm-svn: 257999	2016-01-16 20:30:46 +00:00


86721 Commits