Mirror of https://github.com/RPCS3/llvm-mirror.git (synced 2024-10-21 12:02:58 +02:00)
Commit Graph

87254 Commits

Author SHA1 Message Date
Chad Rosier
175af3914a Add a triple to this test.
llvm-svn: 169803
2012-12-11 00:51:36 +00:00
Chandler Carruth
ac8f03ddc1 Fix a miscompile in the DAG combiner. Previously, we would incorrectly
try to reduce the width of this load, and would end up transforming:

  (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
to
  (truncate (zextload i32 <ptr+4> as i64) to i32)

We lost the sext attached to the load while building the narrower i32
load, and replaced it with a zext because lshr always zero-extends its
results. Instead, bail out of this combine when there is a conflict
between a sextload and a zext narrowing. The rest of the DAG combiner
still optimizes the code down to the proper single instruction:

  movswl 6(...),%eax

Which is exactly what we wanted. Previously we read past the end *and*
missed the sign extension:

  movl 6(...), %eax

llvm-svn: 169802
2012-12-11 00:36:57 +00:00
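
For illustration, a minimal C++ sketch of the kind of source that can lower to
the (truncate (lshr (sextload i48), 32)) pattern described in the commit above.
This is not taken from the commit; the function name, the little-endian 6-byte
field layout, and whether a given compiler forms this exact DAG are assumptions.

  #include <cstdint>
  #include <cstring>

  // Read a little-endian, sign-extended 48-bit field starting at p and return
  // its upper 16 bits as a 32-bit integer. A correct narrowing combine must
  // preserve the sign extension; the bug above turned it into a zero extension.
  int32_t high16_of_s48(const unsigned char *p) {
    uint64_t raw = 0;
    std::memcpy(&raw, p, 6);                             // the i48 load
    int64_t v = static_cast<int64_t>(raw << 16) >> 16;   // sign-extend from bit 47
    return static_cast<int32_t>(v >> 32);                // shift right by 32, truncate
  }
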
Paul Redmond
fde20fa567 move X86-specific test
This test case uses -mcpu=corei7, so it belongs in CodeGen/X86.

Reviewed by: Nadav

llvm-svn: 169801
2012-12-11 00:36:43 +00:00
Bill Wendling
10c1be166f Fix grammar-o.
llvm-svn: 169798
2012-12-11 00:23:07 +00:00
Chad Rosier
0b2e4a1ba8 Fall back to the selection dag isel to select tail calls.
This shouldn't affect codegen for -O0 compiles as tail call markers are not
emitted in unoptimized compiles.  Testing with the external/internal nightly
test suite reveals no change in compile-time performance.  Testing with -O1,
-O2 and -O3 with fast-isel enabled did not cause any compile-time or
execution-time failures.  All tests were performed on my x86 machine.
I'll monitor our ARM testers to ensure no regressions occur there.

In an upcoming clang patch I will be marking objc_autoreleaseReturnValue
and objc_retainAutoreleaseReturnValue as tail calls unconditionally.  While
it's theoretically true that this is just an optimization, it's an
optimization that we very much want to happen even at -O0, or else ARC
applications become substantially harder to debug.

Part of rdar://12553082

llvm-svn: 169796
2012-12-11 00:18:02 +00:00
Eric Christopher
2d11b002bc Refactor out the abbreviation handling into a separate class that
controls each of the abbreviation sets (only a single one at the
moment) and computes offsets separately as well for each set
of DIEs.

No real functional change; the ordering of abbreviations for the skeleton
CU changed, but only because we're computing them in a separate order. Fixed
the testcase not to care.

llvm-svn: 169793
2012-12-10 23:34:43 +00:00
Evan Cheng
86dd733bc8 Some enhancements for memcpy / memset inline expansion.
1. Teach it to use overlapping unaligned load / store to copy / set the trailing
   bytes. e.g. On x86, use two pairs of movups / movaps for 17 - 31 byte copies
   (see the sketch after this entry).
2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g.
   x86 and ARM.
3. When memcpy from a constant string, do *not* replace the load with a constant
   if it's not possible to materialize an integer immediate with a single
   instruction (required a new target hook: TLI.isIntImmLegal()).
4. Use unaligned loads / stores more aggressively if target hooks indicate they
   are "fast".
5. Update ARM target hooks to use unaligned loads / stores. e.g. vld1.8 / vst1.8.
   Also increase the threshold to something reasonable (8 for memset, 4 pairs
   for memcpy).

This significantly improves Dhrystone, up to 50% on ARM iOS devices.

rdar://12760078

llvm-svn: 169791
2012-12-10 23:21:26 +00:00
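
To illustrate point 1 of the commit above, a stand-alone C++ sketch of the
overlapping-copy trick for trailing bytes. This is illustrative source code, not
the SelectionDAG expansion the commit changes; the 16-byte block size and the
function name are assumptions, and a backend would emit the two block copies as
unaligned vector loads / stores (e.g. movups on x86) rather than libc calls.

  #include <cstddef>
  #include <cstring>

  // Copy n bytes (17 <= n <= 31) with exactly two 16-byte block copies: one for
  // the first 16 bytes and one, overlapping the first, for the last 16 bytes.
  // This avoids a scalar byte loop for the odd-sized tail.
  void copy_17_to_31(void *dst, const void *src, std::size_t n) {
    char head[16], tail[16];
    std::memcpy(head, src, 16);                                      // first 16 bytes
    std::memcpy(tail, static_cast<const char *>(src) + n - 16, 16);  // last 16 bytes
    std::memcpy(dst, head, 16);
    std::memcpy(static_cast<char *>(dst) + n - 16, tail, 16);        // overlaps the head store
  }
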
Arnold Schwaighofer
182d1ce4b7 Optimistically analyse Phi cycles
Analyse Phis under the starting assumption that they are NoAlias. Recursively
look at their inputs.
If they MayAlias/MustAlias, there must be an input that makes them so.

Addresses bug 14351.

llvm-svn: 169788
2012-12-10 23:02:41 +00:00
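
A rough, self-contained C++ sketch of the optimistic scheme described above. This
is not the BasicAA implementation; the toy Value model, the AliasResult enum, and
the way inputs are checked against the other value are simplifying assumptions.

  #include <set>
  #include <utility>
  #include <vector>

  enum class AliasResult { NoAlias, MayAlias, MustAlias };

  // Toy value: either a base object identified by allocId, or a phi of inputs.
  struct Value {
    int allocId = -1;                      // meaningful only when phiInputs is empty
    std::vector<const Value *> phiInputs;  // non-empty for a phi
  };

  // Base case for non-phi values (grossly simplified).
  static AliasResult aliasLeaves(const Value &a, const Value &b) {
    return a.allocId == b.allocId ? AliasResult::MustAlias : AliasResult::NoAlias;
  }

  // Optimistically assume a (phi, value) query is NoAlias while the phi's inputs
  // are being examined; if some input aliases the other value, report MayAlias.
  static AliasResult alias(const Value &a, const Value &b,
                           std::set<std::pair<const Value *, const Value *>> &assumed) {
    if (a.phiInputs.empty() && b.phiInputs.empty())
      return aliasLeaves(a, b);
    const Value &phi = a.phiInputs.empty() ? b : a;
    const Value &other = a.phiInputs.empty() ? a : b;
    if (!assumed.insert({&phi, &other}).second)
      return AliasResult::NoAlias;         // already in a cycle: keep the assumption
    for (const Value *in : phi.phiInputs)
      if (alias(*in, other, assumed) != AliasResult::NoAlias)
        return AliasResult::MayAlias;      // some input makes the pair alias
    return AliasResult::NoAlias;
  }
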
Lang Hames
313bb2d202 Defer call to InitSections until after MCContext has been initialized. If
InitSections is called before the MCContext is initialized, it could cause
duplicate temporary symbols to be emitted later (after context initialization
resets the temporary label counter).

llvm-svn: 169785
2012-12-10 22:49:11 +00:00
Anshuman Dasgupta
38feea3c57 Fix PR14568: Prevent the DFA packetizer from making an invalid read
beyond array bounds.

No test case since I cannot reproduce an ICE with this bug. According
to Carlos -- the bug reporter -- a segfault occurs only when LLVM is
compiled with a specific version of GCC.

llvm-svn: 169783
2012-12-10 22:45:57 +00:00
Eric Christopher
b3b9b702cb Rearrange vars and make comments more obvious.
llvm-svn: 169780
2012-12-10 22:25:41 +00:00
Eric Christopher
5b2c77f097 Remove blank line at top of file.
llvm-svn: 169779
2012-12-10 22:25:38 +00:00
Eric Christopher
9fed81d6be Fix a coding style nit.
llvm-svn: 169776
2012-12-10 22:00:20 +00:00
Nadav Rotem
a043fa4083 Enable the loop vectorizer only at -O2 and above. (Still disabled by default.)
llvm-svn: 169774
2012-12-10 21:45:01 +00:00
Tom Stellard
3801f0fed5 LegalizeDAG: Allow type promotion of scalar loads
llvm-svn: 169773
2012-12-10 21:41:58 +00:00
Tom Stellard
c8da3bd0a1 LegalizeDAG: Allow type promotion for scalar stores
llvm-svn: 169772
2012-12-10 21:41:54 +00:00
Nadav Rotem
417eaafbc4 Split the LoopVectorizer into H and CPP.
llvm-svn: 169771
2012-12-10 21:39:02 +00:00
Bill Wendling
0af6f08453 Revert r169656.
The linker will call `lto_codegen_add_must_preserve_symbol' on all globals that
should be kept around. The linker will pretend that a dylib is being created.
<rdar://problem/12528059>

llvm-svn: 169770
2012-12-10 21:33:45 +00:00
Eli Bendersky
139a219553 Add a test for explicitly exercising the mc-relax-all flag.
llvm-svn: 169764
2012-12-10 20:36:01 +00:00
Eli Bendersky
074c9e1b36 Cleanup formatting, comments and naming.
llvm-svn: 169762
2012-12-10 20:13:43 +00:00
Akira Hatanaka
c10e48ba6a [mips] Set HWEncoding field of registers. Delete function
getMipsRegisterNumbering and use MCRegisterInfo::getEncodingValue instead.

llvm-svn: 169760
2012-12-10 20:04:40 +00:00
Eric Christopher
c67794597d Use the somewhat semantic term "split dwarf"; it better matches what's
going on and makes a lot of the terminology in comments make more sense.

llvm-svn: 169758
2012-12-10 19:51:21 +00:00
Eric Christopher
2bf7bdcd23 Delete the FissionCU.
llvm-svn: 169757
2012-12-10 19:51:18 +00:00
Eric Christopher
67243c354a Reorder fission variables.
llvm-svn: 169756
2012-12-10 19:51:13 +00:00
Bill Wendling
bb1f8f293a Don't use a red zone for code coverage if the user specified `-mno-red-zone'.
The `-mno-red-zone' flag wasn't being propagated to the functions that code
coverage generates. This allowed some of them to use the red zone when that
wasn't allowed.
<rdar://problem/12843084>

llvm-svn: 169754
2012-12-10 19:46:49 +00:00
Nadav Rotem
196fc7cc8c Add support for reverse induction variables. For example:
while (i--)
  sum += A[i];

llvm-svn: 169752
2012-12-10 19:25:06 +00:00
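
A hedged, compilable C++ version of the example above (the function name and types
are illustrative; whether the vectorizer actually transforms the loop depends on
the target and its cost model):

  // Sums A[i-1], A[i-2], ..., A[0] using a reverse induction variable, the
  // pattern this commit teaches the loop vectorizer to recognize.
  int sum_reversed(const int *A, int i) {
    int sum = 0;
    while (i--)
      sum += A[i];
    return sum;
  }
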
Jim Grosbach
14cbefa1f6 CMake: Don't run 'git svn' if there is no .git/svn directory.
If the local checkout does not have 'git svn' references set up, don't try
to use 'git svn' for version information.

llvm-svn: 169749
2012-12-10 19:03:37 +00:00
Eli Bendersky
9c8e9c6edd This patch adds statistics for other non-DWARF fragments emitted by
the assembler. This is useful in order to know how the numbers add up,
since in particular the Align fragments account for a non-trivial
portion of the emitted fragments (especially at -O0, which sets
relax-all).

llvm-svn: 169747
2012-12-10 18:59:39 +00:00
Hal Finkel
3b65689ab9 Use GetUnderlyingObjects in misched
misched used GetUnderlyingObject in order to break false load/store
dependencies, and the -enable-aa-sched-mi feature similarly relied on
GetUnderlyingObject in order to ensure it is safe to use the aliasing analysis.
Unfortunately, GetUnderlyingObject does not recurse through phi nodes, and so
(especially due to LSR) all of these mechanisms failed for
induction-variable-dependent loads and stores inside loops.

This change replaces uses of GetUnderlyingObject with GetUnderlyingObjects
(which will recurse through phi and select instructions) in misched.

Andy reviewed, tested and simplified this patch; Thanks!

llvm-svn: 169744
2012-12-10 18:49:16 +00:00
Sean Silva
ffc628ff80 Fix funky copy-pasted grammatical error.
PR14343

llvm-svn: 169742
2012-12-10 18:37:26 +00:00
Chandler Carruth
7e4aad1c1f Revert "Make '-mtune=x86_64' assume fast unaligned memory accesses."
Accidental commit... git svn betrayed me. Sorry for the noise.

llvm-svn: 169741
2012-12-10 18:23:52 +00:00
Chandler Carruth
a64587b996 Make '-mtune=x86_64' assume fast unaligned memory accesses.
Summary:
Not all chips targeted by x86_64 have this feature, but a dramatically
increasing number do. Specifying a chip-specific tuning parameter will
continue to turn the feature on or off as appropriate for that
particular chip, but the generic flag should try to achieve the best
performance on the most widely available hardware. Today, the number of
chips with fast UA access dwarfs those without in the x86-64 space.

Note that this also brings LLVM's code generation for this '-march' flag
more in line with that of modern GCCs.

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D195

llvm-svn: 169740
2012-12-10 18:22:42 +00:00
Chandler Carruth
d5a4e512ca Fix a typo in my previous commit -- bloomfield is 0x1A not 0x2A.
Thanks to the PaX folks for noticing in review! We need some tests here,
any suggestions welcome...

llvm-svn: 169739
2012-12-10 18:22:40 +00:00
Chandler Carruth
3e5f2c328d Address a FIXME and update the fast unaligned memory feature for newer
Intel chips.

The model number rules were determined by inspecting Intel's
documentation for their newer chip model numbers. My understanding is
that all of the newer Intel chips have fast unaligned memory access, but
if anyone is concerned about a particular chip, just shout.

No tests updated; it's not clear we have dedicated tests for the chips'
various features, but if anyone would like tests (or can point me at
some existing ones), I'm happy to oblige.

llvm-svn: 169730
2012-12-10 09:18:44 +00:00
Chandler Carruth
4686de879c Add a new visitor for walking the uses of a pointer value.
This visitor provides infrastructure for recursively traversing the
use-graph of a pointer-producing instruction like an alloca or a malloc.
It maintains a worklist of uses to visit, so it can handle very deep
recursions. It automatically looks through instructions which simply
translate one pointer to another (bitcasts and GEPs). It tracks the
offset relative to the original pointer as long as that offset remains
constant and exposes it during the visit as an APInt offset. Finally, it
performs conservative escape analysis.

However, currently it has some limitations that should be addressed
going forward:
1) It doesn't handle vectors of pointers.
2) It doesn't provide a cheaper visitor when the constant offset
   tracking isn't needed.
3) It doesn't support non-instruction pointer values.

The current functionality is exactly what is required to implement the
SROA pointer-use visitors in terms of this one, rather than in terms of
their own ad-hoc base visitor, which was always very poorly specified.
SROA has been converted to use this, and the code that this utility now
provides has been deleted there.

Technically speaking, using this new visitor allows SROA to handle a few
more cases than it previously did. It is now more aggressive in ignoring
chains of instructions which look like they would defeat SROA, but in
fact do not because they never result in a read or write of memory.
While this is "neat", it shouldn't be interesting for real programs as
any such chains should have been removed by other passes long before we
get to SROA. As a consequence, I've not added any tests for these
features -- it shouldn't be part of SROA's contract to perform such
heroics.

The goal is to extend the functionality of this visitor going forward,
and re-use it from passes like ASan that can benefit from doing
a detailed walk of the uses of a pointer.

Thanks to Ben Kramer for the code review rounds and lots of help
reviewing and debugging this patch.

llvm-svn: 169728
2012-12-10 08:28:39 +00:00
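
A compact, self-contained C++ sketch of the worklist-driven traversal described
above. This is not the LLVM visitor's API; the toy instruction model, the plain
int64_t offset (the real code tracks an APInt, and only while it stays constant),
and the blunt escape rule are simplifying assumptions, and deduplication of
revisited users is omitted for brevity.

  #include <cstdint>
  #include <deque>
  #include <vector>

  // Toy user of a pointer: it either forwards the pointer (Bitcast, Gep with a
  // constant byte offset), accesses it (Load, Store), or does something we
  // cannot reason about (Other), which is treated as an escape.
  enum class Kind { Bitcast, Gep, Load, Store, Other };

  struct Inst {
    Kind kind;
    int64_t gepOffset = 0;        // only meaningful for Gep
    std::vector<Inst *> users;    // users of this instruction's result
  };

  struct PtrInfo {
    bool escaped = false;         // pointer leaked to something opaque
    bool accessed = false;        // saw a load or store through the pointer
  };

  // Walk all transitive uses of root with an explicit worklist (so deep chains
  // do not overflow the native stack), tracking the constant byte offset from
  // the original pointer through bitcasts and GEPs.
  PtrInfo visitPtrUses(Inst &root) {
    PtrInfo info;
    struct Item { Inst *inst; int64_t offset; };
    std::deque<Item> worklist;
    for (Inst *u : root.users)
      worklist.push_back({u, 0});
    while (!worklist.empty()) {
      Item it = worklist.front();
      worklist.pop_front();
      switch (it.inst->kind) {
      case Kind::Bitcast:
      case Kind::Gep: {
        int64_t off =
            it.offset + (it.inst->kind == Kind::Gep ? it.inst->gepOffset : 0);
        for (Inst *u : it.inst->users)   // look through pointer-to-pointer ops
          worklist.push_back({u, off});
        break;
      }
      case Kind::Load:
      case Kind::Store:
        info.accessed = true;            // real memory access at it.offset
        break;
      case Kind::Other:
        info.escaped = true;             // conservative escape
        break;
      }
    }
    return info;
  }
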
Craig Topper
0f4945c76d Teach DAG combine to handle vector add/sub with vectors of all 0s.
llvm-svn: 169727
2012-12-10 08:12:29 +00:00
NAKAMURA Takumi
db0a1830f6 [CMake] TARGET_TRIPLE may be an internal alias of LLVM_DEFAULT_TARGET_TRIPLE.
llvm-svn: 169726
2012-12-10 07:14:29 +00:00
NAKAMURA Takumi
10a9cdfc27 [CMake] Update dependencies to intrinsics_gen corresponding to r169711.
llvm-svn: 169724
2012-12-10 05:27:15 +00:00
Bill Wendling
5d24fa9e9d Revert to old behavior until the linker can pass the export-dynamic option.
llvm-svn: 169720
2012-12-10 02:51:16 +00:00
Chandler Carruth
c9b6bd9712 Fix PR14548: SROA was crashing on a mixture of i1 and i8 loads and stores.
When SROA was evaluating a mixture of i1 and i8 loads and stores, in
just a particular case, it would tickle a latent bug where we compared
bits to bytes rather than bits to bits. As a consequence of the latent
bug, we would allow integers through which were not byte-size multiples,
a situation the later rewriting code was never intended to handle.

In release builds this could trigger all manner of oddities, but the
reported issue in PR14548 was forming invalid bitcast instructions.

The only downside of this fix is that it makes it more clear that SROA
in its current form is not capable of handling mixed i1 and i8 loads and
stores. Sometimes with the previous code this would work by luck, but
usually it would crash, so I'm not terribly worried. I'll watch the LNT
numbers just to be sure.

llvm-svn: 169719
2012-12-10 00:54:45 +00:00
Dmitri Gribenko
891cde588c Documentation: convert ReleaseNotes.html to reST.
Patch by Anthony Mykhailenko with small fixes by me.

llvm-svn: 169714
2012-12-09 23:14:26 +00:00
Michael Ilseman
2f7539fd12 Reorganize FastMathFlags to be a wrapper around unsigned, and streamline some interfaces.
llvm-svn: 169712
2012-12-09 21:12:04 +00:00
Paul Redmond
e43761293d LoopVectorize: support vectorizing intrinsic calls
- added a function to VectorTargetTransformInfo to query the cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.

Reviewed by: Nadav

llvm-svn: 169711
2012-12-09 20:42:17 +00:00
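
A hedged C++ example of the kind of loop this enables (illustrative only; which
calls are treated as trivially vectorizable intrinsics, and whether a given
target profits from vectorizing them, is decided by the cost model):

  #include <cmath>

  // Each iteration calls a math routine that can be mapped to a vector
  // intrinsic (here sin), so the whole loop becomes a vectorization candidate
  // once intrinsic calls have a cost-model entry.
  void apply_sin(float *out, const float *in, int n) {
    for (int i = 0; i < n; ++i)
      out[i] = std::sin(in[i]);
  }
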
Michael Ilseman
92f7651045 Have the bitcode reader/writer just use FPMathOperator's fast math enum directly
llvm-svn: 169710
2012-12-09 20:23:16 +00:00
Paul Redmond
b778deb83a test commit.
llvm-svn: 169709
2012-12-09 19:46:31 +00:00
Chris Lattner
ae27e4b10f So many people have touched this, it doesn't make sense to ascribe authorship anymore.
llvm-svn: 169704
2012-12-09 16:55:39 +00:00
Jakub Staszak
3375cd11b4 Use m_OneUse pattern instead of hasOneUse() method.
No functionality change.

llvm-svn: 169703
2012-12-09 16:06:44 +00:00
Sean Silva
099e0abea5 docs: Convert GarbageCollection.html to reST
Patch by Alexander Zinenko!

llvm-svn: 169702
2012-12-09 15:52:47 +00:00
Jakub Staszak
30bae3f07e Remove trailing spaces.
llvm-svn: 169701
2012-12-09 15:37:46 +00:00
Dmitri Gribenko
d029408178 Documentation: HowToReleaseLLVM.rst: remove trailing whitespace.
llvm-svn: 169700
2012-12-09 15:33:26 +00:00