llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 04:02:41 +01:00

Author	SHA1	Message	Date
Bradley Smith	40ea4329b1	[ARM64] Ensure immediates in extend operands are in a valid range Also emit a more useful diagnostic when they are not. llvm-svn: 208318	2014-05-08 14:12:12 +00:00
Bradley Smith	0bcfb4a0bc	[ARM64] Check for proper immediate in shift/extend operands llvm-svn: 208317	2014-05-08 14:11:16 +00:00
Christian Pirker	35d96c7f86	ARM big endian function argument passing llvm-svn: 208316	2014-05-08 14:06:24 +00:00
Hal Finkel	faaba5686b	Fix a spelling error llvm-svn: 208314	2014-05-08 13:42:57 +00:00
Daniel Sanders	c6c9c916df	[mips] Implement l[wd]c3, and s[wd]c3. Summary: These instructions were added in MIPS-I, and MIPS-II but were removed in MIPS-III. Interestingly, GAS continues to accept them when assembling for MIPS-III. For the moment, these instructions will follow GAS and accept them for MIPS-III and newer but this will be tightened up when the invalid-*.s tests are added. Depends on D3647 Reviewers: vmedic Reviewed By: vmedic Differential Revision: http://reviews.llvm.org/D3648 llvm-svn: 208311	2014-05-08 13:02:11 +00:00
Ed Maste	9040058e24	Add isOSFreeBSD triple test For http://reviews.llvm.org/D3448 llvm-svn: 208309	2014-05-08 13:00:15 +00:00
Dario Domizioli	df090599ce	Revert test commit. Removed blank line. llvm-svn: 208308	2014-05-08 12:54:43 +00:00
James Molloy	294269a69e	[ARM64-BE] Teach fast-isel about how to set up sub-word stack arguments for big endian calls. SelectionDAG already knows about this, but fast-isel was ignorant. llvm-svn: 208307	2014-05-08 12:53:50 +00:00
Daniel Sanders	8071a219e6	[mips] Marked up instructions added in MIPS-II and tested that IAS for -mcpu=mips1 does not accept them Summary: A small number of instructions are rejected with the wrong error message. These have been placed in a separate test for now. There seems to be some parsing quirk that triggers when these instructions are disabled. Depends on D3571 Reviewers: vmedic Reviewed By: vmedic Differential Revision: http://reviews.llvm.org/D3647 llvm-svn: 208305	2014-05-08 12:40:48 +00:00
Daniel Sanders	94fae7d980	[mips] Implement tlbp, tlbr, tlbwi, and tlbwr Reviewers: vmedic, dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3571 llvm-svn: 208301	2014-05-08 11:51:18 +00:00
Dario Domizioli	97c0aef837	Test commit. Added blank line. llvm-svn: 208298	2014-05-08 11:28:14 +00:00
Tim Northover	72838ce201	ARM64: make sure FastISel emits SSA MachineInstrs We need to use a temporary register for a 2-step operation like REM. llvm-svn: 208297	2014-05-08 10:30:56 +00:00
Evgeniy Stepanov	196ad52640	[asan] Preserve flags in asm instrumentation. Patch by Yuri Gorshenin. llvm-svn: 208296	2014-05-08 09:55:24 +00:00
Daniel Sanders	237454fcb1	Use a vector of unique_ptrs to fix a memory leak introduced in r208179. Also removed an inaccurate comment that stated that a DenseMap was used as storage for the ListInit's. It's currently using a FoldingSet. I expect there's a better way to fix this but I haven't found it yet. FoldingSet is incompatible with the Pool template and I'm not sure if FoldingSet can be safely replaced with a DenseMap of computed FoldingSetID's to ListInit's. llvm-svn: 208293	2014-05-08 09:29:28 +00:00
Hal Finkel	c52e65b830	Move late partial-unrolling thresholds into the processor definitions The old method used by X86TTI to determine partial-unrolling thresholds was messy (because it worked by testing target features), and also would not correctly identify the target CPU if certain target features were disabled. After some discussions on IRC with Chandler et al., it was decided that the processor scheduling models were the right containers for this information (because it is often tied to special uop dispatch-buffer sizes). This does represent a small functionality change: - For generic x86-64 (which uses the SB model and, thus, will get some unrolling). - For AMD cores (because they still currently use the SB scheduling model) - For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump the default threshold to 50; we're working on a test case for this). Otherwise, nothing has changed for any other targets. The logic, however, has been moved into BasicTTI, so other targets may now also opt-in to this functionality simply by setting LoopMicroOpBufferSize in their processor model definitions. llvm-svn: 208289	2014-05-08 09:14:44 +00:00
Tobias Grosser	358e9a97e7	Revert "SCEV: Use I = vector<>.erase(I) to iterate and delete at the same time" as committed in r208282. The original commit was incorrect. llvm-svn: 208286	2014-05-08 07:55:34 +00:00
Hao Liu	be513c440d	AArch64/ARM64: Port NEON post-increment load/store with 2/3/4 vectors to ARM64 backend. llvm-svn: 208284	2014-05-08 07:38:13 +00:00
Tobias Grosser	4c447db9fb	SCEV: Use I = vector<>.erase(I) to iterate and delete at the same time llvm-svn: 208282	2014-05-08 07:12:44 +00:00
Richard Smith	34ed6bf95c	[modules] Add missing #include. llvm-svn: 208276	2014-05-08 02:34:32 +00:00
Saleem Abdulrasool	fffc610ca8	test: fix silly typo Oh silly Darwin and your case insensitive file system. llvm-svn: 208274	2014-05-08 01:41:04 +00:00
Saleem Abdulrasool	84a61727f4	ARM: support FK_SecRel_2 relocations on WoA This adds FK_SecRel_2 relocation support to ARM. This enables the building of object files for armv7-windows-msvc which enables CodeView line tables for debugging as opposed to armv7-windows-itanium which currently uses DWARF. llvm-svn: 208273	2014-05-08 01:35:57 +00:00
Richard Smith	90c10bd484	Simplify and fix incorrect comment. No functionality change. llvm-svn: 208272	2014-05-08 01:08:43 +00:00
Filipe Cabecinhas	275860c4fd	Lower certain build_vectors to insertps instructions Summary: Vectors built with zeros and elements in the same order as another (source) vector are optimized to be built using a single insertps instruction. Also optimize when we move one element in a vector to a different place in that vector while zeroing out some of the other elements. Further optimizations are possible, described in TODO comments. I will be implementing at least some of them in the near future. Added some tests for different cases where this optimization triggers. Reviewers: nadav, delena, craig.topper Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3521 llvm-svn: 208271	2014-05-08 00:25:16 +00:00
Lang Hames	eb0e367f97	Back out r208257 while I investigate tester failures. llvm-svn: 208267	2014-05-07 23:35:53 +00:00
Duncan P. N. Exon Smith	f5258e4d2c	GlobalValue: Assert symbols with local linkage have default visibility The change to ExtractGV.cpp has no functionality change except to avoid the asserts. Existing testcases already cover this, so I didn't add a new one. llvm-svn: 208264	2014-05-07 23:00:22 +00:00
Duncan P. N. Exon Smith	c74b2b0974	IR: Don't allow non-default visibility on local linkage Visibilities of `hidden` and `protected` are meaningless for symbols with local linkage. - Change the assembler to reject non-default visibility on symbols with local linkage. - Change the bitcode reader to auto-upgrade `hidden` and `protected` to `default` when the linkage is local. - Update LangRef. <rdar://problem/16141113> llvm-svn: 208263	2014-05-07 22:57:20 +00:00
Duncan P. N. Exon Smith	bba2550124	LTO: Assert visibility of local linkage when merging symbols `ModuleLinker::getLinkageResult()` shouldn't create symbols with local linkage and non-default visibility -- in fact, symbols with local linkage shouldn't be merged at all. Assert to that effect. llvm-svn: 208262	2014-05-07 22:55:46 +00:00
Duncan P. N. Exon Smith	9739ca9e59	LTO: Check local linkage first Since visibility is meaningless for symbols with local linkage, check local linkage before visibility when setting symbol attributes. When linkage is `internal` and the visibility is `hidden`, the exposed attribute is now `LTO_SYMBOL_SCOPE_INTERNAL` instead of `LTO_SYMBOL_SCOPE_HIDDEN`. Although the bitfield allows both to be specified, the combination is nonsense anyway. Given changes (in progress) to drop visibility when a symbol has local linkage, this almost has no functionality change: it's mostly a cleanup to clarify the logic. The exception is when something has `appending` linkage. Before this change, such symbols would be advertised as `LTO_SYMBOL_SCOPE_INTERNAL`; now, they'll be given `LTO_SYMBOL_SCOPE_COMMON`. Unfortunately this is really awkward to test. This only changes what we advertise to linkers (before running LTO), not what the final object looks like. In theory I could add `DEBUG` output to `llvm-lto` (and test with "REQUIRES: asserts"), but follow-up commits to disallow `internal hidden` simplify this anyway. <rdar://problem/16141113> llvm-svn: 208261	2014-05-07 22:53:14 +00:00
Quentin Colombet	548eb2e304	[X86] Add a test case for r208252. Prior to r208252, the FMA 231 family was marked as isCommutable. However the memory variants of this family are not commutable. Therefore, we did not implement the findCommutedOpIndices for those variants and missed that the default implementation (more or less: commute indices 1 and 2) was firing behind our back. As a result, as demonstrated in the test case before the fix, we were transforming a = b * c + a into a = a * c + b. I.e., before r208252 we were generating for this test case: vmovaps %xmm0, %xmm1 vmovss (%rsi), %xmm0 vfmadd231ss (%rdi), %xmm1, %xmm0 Instead of: vmovss (%rsi), %xmm1 vfmadd231ss (%rdi), %xmm1, %xmm0 <rdar://problem/16800495> llvm-svn: 208260	2014-05-07 22:52:58 +00:00
Lang Hames	0dd4713eee	[RuntimeDyld] Make RuntimeDyldImpl::resolveExternalSymbols preserve the relocation entries it applies. Prior to this patch, RuntimeDyldImpl::resolveExternalSymbols discarded relocations for external symbols once they had been applied. This causes issues if the client calls MCJIT::finalizeLoadedModules more than once, and updates the location of any symbols in between (e.g. by calling MCJIT::mapSectionAddress). No test case yet: None of our in-tree memory managers support moving sections around. I'll have to hack up a dummy memory manager before I can write a unit test. Fixes <rdar://problem/16764378> llvm-svn: 208257	2014-05-07 22:34:08 +00:00
Hal Finkel	a22ec95e68	[X86TTI] Remove the unrolling branch limits The loop stream detector (LSD) on modern Intel cores, which optimizes the execution of small loops, has limits on the number of taken branches in addition to uop-count limits (modern AMD cores have similar limits). Unfortunately, at the IR level, estimating the number of branches that will be taken is difficult. For one thing, it strongly depends on later passes (block placement, etc.). The original implementation took a conservative approach and limited the maximal BB DFS depth of the loop. However, fairly-extensive benchmarking by several of us has revealed that this is the wrong approach. In fact, there are zero known cases where the branch limit prevents a detrimental unrolling (but plenty of cases where it does prevent beneficial unrolling). While we could improve the current branch counting logic by incorporating branch probabilities, this further complication seems unjustified without a motivating regression. Instead, unless and until a regression appears, the branch counting will be removed. llvm-svn: 208255	2014-05-07 22:25:18 +00:00
Justin Bogner	81ce7e44c7	llvm-cov: Fix some funny indentation (NFC) Noticed by Duncan Exon Smith. Thanks! llvm-svn: 208253	2014-05-07 21:50:43 +00:00
Quentin Colombet	9b13d839be	[X86] Selectively mark the FMA variants inside a family as isCommutable. Given a FMA family (e.g., 213, 231), not all the variants (i.e., register or memory) are commutable. E.g., for the 213 family (with the syntax src1, src2, src3): fmaXXX213 A, B, reg3/mem3 == fmaXXX213 B, A, reg3/mem3 Now consider the 231 family: fmaXXX231 A, B, reg3 == fmaXXX231 A, reg3, B But fmaXXX231 A, B, mem3 != fmaXXX231 A, mem3, B Indeed, mem3 cannot be the second argument of the memory variant of fmaXXX231. Working on a reduced test case! <rdar://problem/16800495> llvm-svn: 208252	2014-05-07 21:43:35 +00:00
Eric Christopher	91701a28d0	Reformat a couple of functions for clarity. llvm-svn: 208248	2014-05-07 21:05:47 +00:00
Nico Weber	4dc9c7ea6b	Let OnDiskHashTable call the destructor of its Items. OnDiskHashTable::insert() calls the Item constructor via placement new, but nothing called the destructor. This matters in cases when the Info template parameter has key_type or data_type typedefs that have a destructor, for example like IdentifierIndexWriterTrait in clang's GlobalModuleIndex.cpp. This fixes a 5-year old bug that's been around since the OnDiskHashTable code was added in r64192. Bug found by LSan! llvm-svn: 208243	2014-05-07 19:55:38 +00:00
Rafael Espindola	290b146df0	Replace a virtual with an override. llvm-svn: 208242	2014-05-07 19:52:32 +00:00
Jyotsna Verma	c9cece4644	[Hexagon] Add New TSFlags to be used in the upcoming patches. llvm-svn: 208239	2014-05-07 19:07:34 +00:00
Sebastian Pop	866eb1eecf	avoid segfaulting Quotient and Remainder don't have to be initialized. llvm-svn: 208238	2014-05-07 19:00:37 +00:00
Sebastian Pop	8f355b84ae	do not collect undef terms llvm-svn: 208237	2014-05-07 19:00:32 +00:00
Matt Arsenault	903ece3700	Fix using wrong result type for setcc. When reducing the bitwidth of a comparison against a constant, the original setcc's result type was used, which was incorrect. No test since I don't think any other in tree targets change the bitwidth of the setcc type depending on the bitwidth of the compared type. llvm-svn: 208236	2014-05-07 18:26:58 +00:00
Eric Christopher	423c1a5415	Debug.h already includes raw_ostream.h, no need to include it again. llvm-svn: 208235	2014-05-07 18:19:04 +00:00
Adam Nemet	9c5e483a57	[Test] Remove c-index-test from the list of substitutions All the tests are under the clang tests and none should be under llvm moving forward. The topic was discussed in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140428/214905.html llvm-svn: 208234	2014-05-07 18:16:02 +00:00
Sebastian Pop	d5cb815565	split delinearization pass in 3 steps To compute the dimensions of the array in a unique way, we split the delinearization analysis in three steps: - find parametric terms in all memory access functions - compute the array dimensions from the set of terms - compute the delinearized access functions for each dimension The first step is executed on all the memory access functions such that we gather all the patterns in which an array is accessed. The second step reduces all this information in a unique description of the sizes of the array. The third step is delinearizing each memory access function following the common description of the shape of the array computed in step 2. This rewrite of the delinearization pass also solves a problem we had with the previous implementation: because the previous algorithm was by induction on the structure of the SCEV, it would not correctly recognize the shape of the array when the memory access was not following the nesting of the loops: for example, see polly/test/ScopInfo/multidim_only_ivs_3d_reverse.ll ; void foo(long n, long m, long o, double A[n][m][o]) { ; ; for (long i = 0; i < n; i++) ; for (long j = 0; j < m; j++) ; for (long k = 0; k < o; k++) ; A[i][k][j] = 1.0; Starting with this patch we no longer delinearize access functions that do not contain parameters, for example in test/Analysis/DependenceAnalysis/GCD.ll ;; for (long int i = 0; i < 100; i++) ;; for (long int j = 0; j < 100; j++) { ;; A[2*i - 4*j] = i; ;; *B++ = A[6*i + 8*j]; these accesses will not be delinearized as the upper bound of the loops are constants, and their access functions do not contain SCEVUnknown parameters. llvm-svn: 208232	2014-05-07 18:01:20 +00:00
Chandler Carruth	17e2aa3c0a	[x86] Make the 'x86-64' cpu, what I see as and many use as the generic default architecture for reasonable modern x86 processors, actually be modern. This processor model should essentially be "tuned" for modern x86 chips as much as possible without undue penalties on any specific architecture. Previously we weren't even using the nice scheduling models. There are a few other tweaks needed here, but this change at least I have benchmarked across a decent swatch of chips (intel's clovertown, westmere, and sandybridge; amd's istanbul) and seen no significant regressions. If anyone has suggested ways to test this, just let me know. Somewhat alarmingly, no existing tests failed. llvm-svn: 208230	2014-05-07 17:37:03 +00:00
Chandler Carruth	36ad1ec2b0	Tidy up whitespace with clang-format prior to making significant changes. llvm-svn: 208229	2014-05-07 17:36:59 +00:00
Simon Atanasyan	a8319970e6	[yaml2obj] Support ELF x86 relocations. llvm-svn: 208228	2014-05-07 17:06:38 +00:00
Rafael Espindola	643c61edc3	Style update: don't duplicate the function name. llvm-svn: 208227	2014-05-07 17:04:45 +00:00
Alexey Samsonov	0b0a2744cc	[CMake] Add build rules for llvm-PerfectShuffle utility llvm-svn: 208225	2014-05-07 16:54:00 +00:00
Rafael Espindola	04d183f259	Style update: don't duplicate the function name. llvm-svn: 208224	2014-05-07 16:43:23 +00:00
Chad Rosier	da933c5aba	[ARM64][fast-isel] Disable target specific optimizations at -O0. Functionally, this patch disables the dead register elimination pass and the load/store pair optimization pass at -O0. The ILP optimizations don't require the optimization level to be checked because the call to addILPOpts is predicated with the necessary check. The AdvSIMDScalar pass is disabled by default at all optimization levels. This patch leaves that pass disabled by default. Also, move command-line options into ARM64TargetMachine.cpp and add a few additional flags to aid in debugging. This fixes an issue with the -debug-pass=Structure flag where passes were printed, but not actually run (i.e., AdvSIMDScalar pass). llvm-svn: 208223	2014-05-07 16:41:55 +00:00
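
The r208282 and r208286 entries above refer to the `I = vector<>.erase(I)` idiom for deleting elements while iterating. The sketch below shows the general idiom only; it is not the code from r208282 (which was reverted as incorrect), and the helper name `eraseNegatives` and the "negative value" predicate are hypothetical choices made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not from the LLVM sources): removes every negative
// value from V while iterating over it.
void eraseNegatives(std::vector<int> &V) {
  for (auto I = V.begin(); I != V.end();) {
    if (*I < 0)
      I = V.erase(I); // erase() returns the iterator after the removed element
    else
      ++I;            // advance only when nothing was erased
  }
  // Postcondition: no negative values remain.
  assert(std::none_of(V.begin(), V.end(), [](int X) { return X < 0; }));
}
```

The key point is that `erase` invalidates `I`, so the loop must continue from the iterator that `erase` returns and must not increment it again in that step.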

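The r208243 entry above describes objects that were created with placement new but never destroyed. A minimal sketch of that pairing follows, assuming a hypothetical `Item` type and a hypothetical `roundTrip` helper; it does not reproduce the `OnDiskHashTable` code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical item type; the std::string member gives it a non-trivial
// destructor, so skipping the destructor call could leak the string's buffer.
struct Item {
  std::string Key;
  explicit Item(const char *K) : Key(K) {}
};

// Constructs an Item in raw storage, reads its key length, then destroys the
// object explicitly before releasing the storage.
std::size_t roundTrip(const char *K) {
  assert(K != nullptr && "key must not be null");
  void *Mem = std::malloc(sizeof(Item));
  if (!Mem)
    throw std::bad_alloc();
  Item *I = new (Mem) Item(K); // placement new: construct into existing storage
  std::size_t Len = I->Key.size();
  I->~Item();                  // placement new needs a matching explicit destructor call
  std::free(Mem);              // free() releases memory only; it runs no destructors
  return Len;
}
```

Leak checkers such as LSan, which the commit credits with finding the bug, report exactly the case where the `I->~Item()` line is missing and the member owns heap memory.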

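The r208252 and r208260 entries above explain why commuting the wrong operands of an FMA 231 instruction changes the result. The arithmetic can be sketched as below, assuming a hypothetical scalar model `fma231(dst, src2, src3)` that computes `src2 * src3 + dst`; this models the operation only and is not the LLVM instruction definition.

```cpp
#include <cassert>

// Hypothetical scalar model of the 231 form: result = src2 * src3 + dst.
double fma231(double Dst, double Src2, double Src3) {
  return Src2 * Src3 + Dst;
}

// Models the faulty default commutation, which swapped operand indices 1 and 2
// (the accumulator and the first multiplicand).
double fma231SwappedDstSrc2(double Dst, double Src2, double Src3) {
  return fma231(Src2, Dst, Src3);
}

// Models the valid commutation of the two multiplicands (register form only).
double fma231SwappedSrc2Src3(double Dst, double Src2, double Src3) {
  assert(Src2 * Src3 == Src3 * Src2); // multiplication commutes for these values
  return fma231(Dst, Src3, Src2);
}
```

With a = 5, b = 2, c = 3, the intended `a = b * c + a` gives 11, while the miscommuted `a = a * c + b` gives 17, which is the transformation the commit message reports.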
103299 Commits