llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 20:43:44 +02:00

Author	SHA1	Message	Date
Richard Sandiford	9be3fcae69	[SystemZ] Optimize selects between 0 and -1 Since z has no setcc instruction as such, the choice of setBooleanContents is a bit arbitrary. Currently it's set to ZeroOrOneBooleanContent, so we produced a branch-free form when selecting between 0 and 1, but not when selecting between 0 and -1. This patch handles the latter case too. At some point I'd like to measure whether it's better to use conditional moves for constant selects on z196, but that's future work. llvm-svn: 196578	2013-12-06 09:53:09 +00:00
Kostya Serebryany	fc32a3e5d2	[asan] rewrite asan's stack frame layout Summary: Rewrite asan's stack frame layout. First, most of the stack layout logic is moved into a separate file to make it more testable and (potentially) useful for other projects. Second, make the frames more compact by using adaptive redzones (smaller for small objects, larger for large objects). Third, try to minimize gaps due to large alignments (this is hypothetical since today we don't see many stack vars aligned by more than 32). The frames indeed become more compact, but I'll still need to run more benchmarks before committing, but I am asking for review now to get early feedback. This change will be accompanied by a trivial change in compiler-rt tests to match the new frame sizes. Reviewers: samsonov, dvyukov Reviewed By: samsonov CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2324 llvm-svn: 196568	2013-12-06 09:00:17 +00:00
Juergen Ributzka	47e3315816	[Stackmap] Update stackmap unit test to use AnyRegCC. llvm-svn: 196552	2013-12-06 00:28:54 +00:00
Yi Jiang	0bd0569be5	Apply transformation on OS X 10.9+ and iOS 7.0+: pow(10, x) -> __exp10(x) llvm-svn: 196544	2013-12-05 22:42:50 +00:00
Renato Golin	7c406f2a69	Move test to X86 dir Test is platform independent, but I don't want to force vector-width, or that could spoil the pragma test. llvm-svn: 196539	2013-12-05 21:45:39 +00:00
Renato Golin	a4d4a4c44f	Add #pragma vectorize enable/disable to LLVM The intended behaviour is to force vectorization on the presence of the flag (either turn on or off), and to continue the behaviour as expected in its absence. Tests were added to make sure the all cases are covered in opt. No tests were added in other tools with the assumption that they should use the PassManagerBuilder in the same way. This patch also removes the outdated -late-vectorize flag, which was on by default and not helping much. The pragma metadata is being attached to the same place as other loop metadata, but nothing forbids one from attaching it to a function (to enable #pragma optimize) or basic blocks (to hint the basic-block vectorizers), etc. The logic should be the same all around. Patches to Clang to produce the metadata will be produced after the initial implementation is agreed upon and committed. Patches to other vectorizers (such as SLP and BB) will be added once we're happy with the pass manager changes. llvm-svn: 196537	2013-12-05 21:20:02 +00:00
Yuchen Wu	dda2c44c38	llvm-cov: Changed extension from .llcov to .gcov. llvm-svn: 196530	2013-12-05 20:45:36 +00:00
Andrew Trick	192311ab9a	MI-Sched: handle latency of in-order operations with the new machine model. The per-operand machine model allows the target to define "unbuffered" processor resources. This change is a quick, cheap way to model stalls caused by the latency of operations that use such resources. This only applies when the processor's micro-op buffer size is non-zero (Out-of-Order). We can't precisely model in-order stalls during out-of-order execution, but this is an easy and effective heuristic. It benefits cortex-a9 scheduling when using the new machine model, which is not yet on by default. MI-Sched for armv7 was evaluated on Swift (and only not enabled because of a performance bug related to predication). However, we never evaluated Cortex-A9 performance on MI-Sched in its current form. This change adds MI-Sched functionality to reach performance goals on A9. The only remaining change is to allow MI-Sched to run as a PostRA pass. I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7: -mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false For a simple saxpy loop I see a 1.7x speedup. 
Here are the llvm-testsuite results: (min run time over 2 runs, filtering tiny changes) Speedups: \| Benchmarks/BenchmarkGame/recursive \| 52.39% \| \| Benchmarks/VersaBench/beamformer \| 20.80% \| \| Benchmarks/Misc/pi \| 19.97% \| \| Benchmarks/Misc/mandel-2 \| 19.95% \| \| SPEC/CFP2000/188.ammp \| 18.72% \| \| Benchmarks/McCat/08-main/main \| 18.58% \| \| Benchmarks/Misc-C++/Large/sphereflake \| 18.46% \| \| Benchmarks/Olden/power \| 17.11% \| \| Benchmarks/Misc-C++/mandel-text \| 16.47% \| \| Benchmarks/Misc/oourafft \| 15.94% \| \| Benchmarks/Misc/flops-7 \| 14.99% \| \| Benchmarks/FreeBench/distray \| 14.26% \| \| SPEC/CFP2006/470.lbm \| 14.00% \| \| mediabench/mpeg2/mpeg2dec/mpeg2decode \| 12.28% \| \| Benchmarks/SmallPT/smallpt \| 10.36% \| \| Benchmarks/Misc-C++/Large/ray \| 8.97% \| \| Benchmarks/Misc/fp-convert \| 8.75% \| \| Benchmarks/Olden/perimeter \| 7.10% \| \| Benchmarks/Bullet/bullet \| 7.03% \| \| Benchmarks/Misc/mandel \| 6.75% \| \| Benchmarks/Olden/voronoi \| 6.26% \| \| Benchmarks/Misc/flops-8 \| 5.77% \| \| Benchmarks/Misc/matmul_f64_4x4 \| 5.19% \| \| Benchmarks/MiBench/security-rijndael \| 5.15% \| \| Benchmarks/Misc/flops-6 \| 5.10% \| \| Benchmarks/Olden/tsp \| 4.46% \| \| Benchmarks/MiBench/consumer-lame \| 4.28% \| \| Benchmarks/Misc/flops-5 \| 4.27% \| \| Benchmarks/mafft/pairlocalalign \| 4.19% \| \| Benchmarks/Misc/himenobmtxpa \| 4.07% \| \| Benchmarks/Misc/lowercase \| 4.06% \| \| SPEC/CFP2006/433.milc \| 3.99% \| \| Benchmarks/tramp3d-v4 \| 3.79% \| \| Benchmarks/FreeBench/pifft \| 3.66% \| \| Benchmarks/Ptrdist/ks \| 3.21% \| \| Benchmarks/Adobe-C++/loop_unroll \| 3.12% \| \| SPEC/CINT2000/175.vpr \| 3.12% \| \| Benchmarks/nbench \| 2.98% \| \| SPEC/CFP2000/183.equake \| 2.91% \| \| Benchmarks/Misc/perlin \| 2.85% \| \| Benchmarks/Misc/flops-1 \| 2.82% \| \| Benchmarks/Misc-C++-EH/spirit \| 2.80% \| \| Benchmarks/Misc/flops-2 \| 2.77% \| \| Benchmarks/NPB-serial/is \| 2.42% \| \| Benchmarks/ASC_Sequoia/CrystalMk \| 2.33% \| \| 
Benchmarks/BenchmarkGame/n-body \| 2.28% \| \| Benchmarks/SciMark2-C/scimark2 \| 2.27% \| \| Benchmarks/Olden/bh \| 2.03% \| \| skidmarks10/skidmarks \| 1.81% \| \| Benchmarks/Misc/flops \| 1.72% \| Slowdowns: \| Benchmarks/llubenchmark/llu \| -14.14% \| \| Benchmarks/Polybench/stencils/seidel-2d \| -5.67% \| \| Benchmarks/Adobe-C++/functionobjects \| -5.25% \| \| Benchmarks/Misc-C++/oopack_v1p8 \| -5.00% \| \| Benchmarks/Shootout/hash \| -2.35% \| \| Benchmarks/Prolangs-C++/ocean \| -2.01% \| \| Benchmarks/Polybench/medley/floyd-warshall \| -1.98% \| \| Polybench/linear-algebra/kernels/3mm \| -1.95% \| \| Benchmarks/McCat/09-vor/vor \| -1.68% \| llvm-svn: 196516	2013-12-05 17:55:58 +00:00
Arnold Schwaighofer	120880c780	SLPVectorizer: An in-tree vectorized entry cannot also be a scalar external use We were creating external uses for scalar values in MustGather entries that also had a ScalarToTreeEntry (they also are present in a vectorized tuple). This meant we would keep a value 'alive' as a scalar and vectorized causing havoc. This is not necessary because when we create a MustGather vector we explicitly create external uses entries for the insertelement instructions of the MustGather vector elements. Fixes PR18129. radar://15582184 llvm-svn: 196508	2013-12-05 15:14:40 +00:00
Kostya Serebryany	eb57b3e248	[tsan] fix PR18146: sometimes a variable written into vptr could have an integer type (after other optimizations) llvm-svn: 196507	2013-12-05 15:03:02 +00:00
Justin Holewinski	925169cb4e	[NVPTX] Fix off-by-one error when creating the VT list for an SDNode llvm-svn: 196503	2013-12-05 12:58:00 +00:00
Matheus Almeida	b651cddc0c	[mips] Small code generation improvement for conditional operator (select) in case the operands are constants and their difference is |1|. It should be possible in those cases to rematerialize the result using MIPS's slt and similar instructions. The small update to some of the tests in cmov.ll, sel1c.ll and sel2c.ll was needed otherwise the optimization implemented in this patch would have been triggered (difference between the operands was 1) and that would have changed the semantic of the tests. llvm-svn: 196498	2013-12-05 12:07:05 +00:00
Matheus Almeida	f0fc3cf095	[mips][msa] Fix issue with immediate fields of LD/ST instructions not being correctly encoded/decoded. In more detail, immediate fields of LD/ST instructions should be divided/multiplied by the size of the data format before encoding and after decoding, respectively. llvm-svn: 196494	2013-12-05 11:06:22 +00:00
Tim Northover	d04bb11dd7	ARM: fix yet another stack-folding bug We were trying to fold the stack adjustment into the wrong instruction in the situation where the entire basic-block was epilogue code. Really, it can only ever be valid to do the folding precisely where the "add sp, ..." would be placed so there's no need for a separate iterator to track that. Should fix PR18136. llvm-svn: 196493	2013-12-05 11:02:02 +00:00
Alp Toker	e845f8af67	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471	2013-12-05 05:44:44 +00:00
Rafael Espindola	b4226966a9	Hide the stub created for MO_ExternalSymbol too. given declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1) declare void @foo() define void @bar() { call void @foo() call void @llvm.memset.p0i8.i32(i8* null, i8 0, i32 188, i32 1, i1 false) ret void } We used to produce L_foo$stub: .indirect_symbol _foo .ascii "\364\364\364\364\364" _memset$stub: .indirect_symbol _memset .ascii "\364\364\364\364\364" We now produce a private stub for memset too. Stubs are not needed with recent linkers, but we still produce them for darwin8. Thanks to David Fang for confirming that gcc used to do this too. llvm-svn: 196468	2013-12-05 05:19:12 +00:00
Matt Arsenault	6f14dd54b4	R600/SI: Add comments for number of used registers. llvm-svn: 196467	2013-12-05 05:15:35 +00:00
NAKAMURA Takumi	44a125b7f6	Move llvm/test/MC/ELF/thumb-st_other.s to test/MC/ARM. llvm-svn: 196457	2013-12-05 02:21:44 +00:00
Jiangning Liu	7825595e77	For AArch64, add missing register cost calculation for big value types like v4i64 and v8i64. llvm-svn: 196456	2013-12-05 02:12:01 +00:00
Cameron McInally	00a0d8b6f3	Add FileCheck statements for r196435. llvm-svn: 196449	2013-12-05 01:20:36 +00:00
Eric Christopher	fe3790d105	Make these two tests resilient in the face of compile unit size changes. llvm-svn: 196444	2013-12-05 01:00:12 +00:00
Logan Chien	558333e1e1	[mc] Fix ELF st_other flag. ELF_Other_Weakref and ELF_Other_ThumbFunc seem to be LLVM internal ELF symbol flags. These should not be emitted to object file. This commit defines ELF_STO_Shift for the target-defined flags for st_other, and increases the value of ELF_Other_Shift to 16. llvm-svn: 196440	2013-12-05 00:34:11 +00:00
Cameron McInally	675f9245aa	Add AVX512 patterns for v16i32 broadcast and v2i64 zero extend load. Patch by Aleksey Bader. llvm-svn: 196435	2013-12-05 00:11:25 +00:00
Kevin Enderby	218f72b95b	Fix a bug in darwin's 32-bit X86 handling of evaluating fixups. Where it would use a scattered relocation entry but falls back to a normal relocation entry because the FixupOffset is more than 24-bits. The bug is in the X86MachObjectWriter::RecordScatteredRelocation() where it changes reference parameter FixedValue but then returns false to indicate it did not create a scattered relocation entry. The fix is simply to save the original value of the parameter FixedValue at the start of the method and restore it if we are returning false in that case. rdar://15526046 llvm-svn: 196432	2013-12-04 23:36:24 +00:00
David Peixotto	b6710ff7c7	Add support for parsing ARM symbol variants on ELF targets ARM symbol variants are written with parens instead of @ like this: .word __GLOBAL_I_a(target1) This commit adds support for parsing these symbol variants in expressions. We introduce a new flag to MCAsmInfo that indicates the parser should use parens to parse the symbol variant. The expression parser is modified to look for symbol variants using parens instead of @ when the corresponding MCAsmInfo flag is true. The MCAsmInfo parens flag is enabled only for ARM on ELF. By adding this flag to MCAsmInfo, we are able to get rid of redundant ARM-specific symbol variants and use the generic variants instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new UseParensForSymbolVariant attribute in MCAsmInfo to correctly print the symbol variants for arm. To achieve this we need to keep a handle to the MCAsmInfo in the MCSymbolRefExpr class that we can check when printing the symbol variant. Updated Tests: Changed case of symbol variant to match the generic kind. test/CodeGen/ARM/tls-models.ll test/CodeGen/ARM/tls1.ll test/CodeGen/ARM/tls2.ll test/CodeGen/Thumb2/tls1.ll test/CodeGen/Thumb2/tls2.ll PR18080 llvm-svn: 196424	2013-12-04 22:43:20 +00:00
David Blaikie	399690c42e	DebugInfo: Improve test to use llvm-dwarfdump llvm-svn: 196396	2013-12-04 18:40:29 +00:00
David Blaikie	1c49e697e0	Test fix for r196394 llvm-svn: 196395	2013-12-04 18:34:28 +00:00
Cameron McInally	97a9fa294d	Fix assembly syntax for AVX512 vector blend instructions. llvm-svn: 196393	2013-12-04 18:05:36 +00:00
Cameron McInally	9c9a78a238	Suppress '(x < y) ? a : 0 -> (x < y) & a' transform on X86 architectures with dedicated mask registers. Patch by Aleksey Bader. llvm-svn: 196386	2013-12-04 14:52:33 +00:00
Daniel Jasper	ca41e63412	Un-revert r196358: "llvm-cov: Added support for function checksums." And add the proper fix. llvm-svn: 196367	2013-12-04 08:57:17 +00:00
Daniel Jasper	a7dd8af910	Revert r196358: "llvm-cov: Added support for function checksums." This currently breaks clang/test/CodeGen/code-coverage.c. The root cause is that the newly introduced access to Funcs[j] is out of bounds. llvm-svn: 196365	2013-12-04 08:23:33 +00:00
Kevin Qin	f5b717aa75	[AArch64 Neon] Add ACLE intrinsic vceqz_f64. llvm-svn: 196362	2013-12-04 08:02:34 +00:00
Kevin Qin	f93a2e8673	[AArch64 NEON] Add missing compare intrinsics. llvm-svn: 196360	2013-12-04 07:53:28 +00:00
Yuchen Wu	b1a23c9951	llvm-cov: Added support for function checksums. The function checksums are hashed from the concatenation of the function name and line number. llvm-svn: 196358	2013-12-04 06:00:17 +00:00
Rafael Espindola	f394b917ee	Produce deterministic coff files. llvm-svn: 196341	2013-12-04 02:02:55 +00:00
Rafael Espindola	94d08ca0e8	Add -mcpu=core2 to all llc invocations in this test. Should fix the atom buildbot. llvm-svn: 196340	2013-12-04 01:25:24 +00:00
Juergen Ributzka	8504aa2736	[Stackmap] Specify the triple and cpu to fix the unit test. llvm-svn: 196339	2013-12-04 01:02:37 +00:00
Juergen Ributzka	f7f5626671	[Stackmap] Emit multi-byte nops for X86. llvm-svn: 196334	2013-12-04 00:39:08 +00:00
Reed Kotler	45b4f281f2	final patch for very long conditional branches for mips16 constant islands. this completes the basic port of ARM constant islands to Mips16. More testing, code review, cleanup is in order but basically everything seems to be working. A bug in gas is preventing some of the runtime testing but I hope to resolve this soon. llvm-svn: 196331	2013-12-03 23:42:51 +00:00
NAKAMURA Takumi	63cd9136f2	check-llvm: Ask llvm-config about assertion mode, instead of llc. Add --assertion-mode to llvm-config. It emits ON or OFF according to NDEBUG. llvm-svn: 196329	2013-12-03 23:22:25 +00:00
Rafael Espindola	5a7ad862c1	Use CHECK-LABEL to make this test more strict. llvm-svn: 196321	2013-12-03 21:12:36 +00:00
Rafael Espindola	167cd3e1cb	Fix mingw32 thiscall + sret. Unlike msvc, when handling a thiscall + sret gcc will * Put the sret in %ecx * Put the this pointer in (%esp) This fixes, for example, calling stringstream::str. llvm-svn: 196312	2013-12-03 20:51:23 +00:00
Yuchen Wu	ad35ed9bc2	llvm-cov: Another fix to llvm-cov test. Copy all test files to temporary directory, not just test.* files. Tests didn't fail because the missing files occurred in XFAILS. llvm-svn: 196305	2013-12-03 19:05:03 +00:00
Yunzhong Gao	05c1966c8c	Teach the internalize pass to skip dllexported symbols because they could be referenced in a way that even the linker does not see. Differential Revision: http://llvm-reviews.chandlerc.com/D2280 llvm-svn: 196300	2013-12-03 18:05:14 +00:00
Arnold Schwaighofer	1f4eee9e2f	opt: Mirror vectorization presets of clang clang enables vectorization at optimization levels > 1 and size level < 2. opt should behave similarly. Loop vectorization and SLP vectorization can be disabled with the flags -disable-(loop/slp)-vectorization. llvm-svn: 196294	2013-12-03 16:33:06 +00:00
Renato Golin	3e3ed5a197	Fix lit config for disabled MCJIT tests on ARM Separating permanent from temporary targets, added the bug that will fix the temporary (PR18057). llvm-svn: 196274	2013-12-03 13:48:28 +00:00
James Molloy	674f480ca4	Addrspacecasts are no-ops on ARM. Testcase added. llvm-svn: 196269	2013-12-03 11:23:11 +00:00
Richard Sandiford	f84d58699b	[SystemZ] Fix choice of known-zero mask in insertion optimization The backend converts 64-bit ORs into subreg moves if the upper 32 bits of one operand and the low 32 bits of the other are known to be zero. It then tries to peel away redundant ANDs from the upper 32 bits. Since AND masks are canonicalized to exclude known-zero bits, the test ORs the mask and the known-zero bits together before checking for redundancy. The problem was that it was using the wrong node when checking for known-zero bits, so could drop ANDs that were still needed. llvm-svn: 196267	2013-12-03 11:01:54 +00:00
Michael Liao	fecace10ac	Enhance the fix of PR17631 - The fix to PR17631 fixes part of the cases where 'vzeroupper' should not be issued before 'call' insn. There're other cases where helper calls will be inserted not limited to epilog. These helper calls do not follow the standard calling convention and won't clobber any YMM registers. (So far, all call conventions will clobber any or part of YMM registers.) This patch enhances the previous fix to cover more cases where 'vzeroupper' should not be inserted by checking if that function call won't clobber any YMM registers and skipping it if so. llvm-svn: 196261	2013-12-03 09:17:32 +00:00
Renato Golin	fb4bd004c0	Disable Remote MCJIT tests on ARM The communication protocol is unstable on ARM when compiled with Clang, which is disrupting the self-hosting buildbots that are going to be added this week. I'm working on a solution, but remote MCJIT is not high-priority for ARM at the moment, so it might take a while. llvm-svn: 196257	2013-12-03 08:39:15 +00:00
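
Several commits in this range (r196578 for SystemZ, r196498 for MIPS, r196386 for X86) concern branch-free forms of a select between constants. As a rough source-level illustration only, the arithmetic identities behind such forms can be sketched in C as below. This is not the code from any of those patches, which operate on LLVM's SelectionDAG; the function names are hypothetical and chosen just for this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: (x < y) ? -1 : 0.
   The comparison yields 0 or 1; negating it yields 0 or -1. */
static int32_t select_zero_or_minus_one(int32_t x, int32_t y) {
    return -(int32_t)(x < y);
}

/* Hypothetical helper: (x < y) ? c + 1 : c.
   When the two constants differ by 1, the 0/1 comparison result
   can simply be added to the smaller constant. */
static int32_t select_adjacent(int32_t x, int32_t y, int32_t c) {
    return c + (int32_t)(x < y);
}

/* Hypothetical helper: (x < y) ? a : 0.
   An all-ones or all-zero mask ANDed with a gives a or 0. */
static int32_t select_or_zero(int32_t x, int32_t y, int32_t a) {
    int32_t mask = -(int32_t)(x < y);
    return mask & a;
}
```

For example, select_adjacent(1, 2, 7) evaluates to 8 and select_or_zero(2, 1, 0x1234) evaluates to 0. Whether a given target prefers such a form over a conditional move or a branch is a per-target decision, as the commit messages themselves note.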

21891 Commits