llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 19:52:54 +01:00

Author	SHA1	Message	Date
Krzysztof Parzyszek	55f2053e29	[Hexagon] Remove incorrect intrinsic definition and invalid testcase The intrinsic int_hexagon_S2_asr_i_vh was mapped to S2_asr_r_vh, which is wrong. The testcase vasrh.select.ll was using an invalid immediate for that intrinsic. This is not a proper testcase, since at the MIR level such use of this intrinsic should never appear. Together with 824b25fc02, this completes the fix for llvm.org/PR44090.	2019-11-21 09:18:15 -06:00
Miloš Stojanović	914eac4e5e	[mips] Add a 'generic' Mips CPU Having a generic CPU removes a warning when creating a subtarget without the CPU being explicitly specified. Differential Revision: https://reviews.llvm.org/D70490	2019-11-21 15:17:21 +01:00
Clement Courbet	b2fbb18bcf	[DAGCombiner] Use the right thumbv7meb triple for ARM big-endian test.	2019-11-21 15:07:35 +01:00
Sjoerd Meijer	34c7cd0723	[LV] PreferPredicateOverEpilog respecting option Follow-up of cb47b8783: don't query TTI->preferPredicateOverEpilogue when option -prefer-predicate-over-epilog is set to false, i.e. when we prefer not to predicate the loop. Differential Revision: https://reviews.llvm.org/D70382	2019-11-21 14:06:10 +00:00
Alexey Lapshin	f3d2e77d90	[Debuginfo][NFC] removes redundant semicolon.	2019-11-21 16:16:24 +03:00
Dmitri Gribenko	3f3210569f	Make coding standards document more inclusive Summary: Patch by Doug Gregor, Tres Popp, and Dmitri Gribenko. Reviewers: chandlerc Subscribers: hfinkel, bmcreusillet, arsenm, doug.gregor, mgrang, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69354	2019-11-21 13:37:17 +01:00
Clement Courbet	1db10b92d1	[DAGCombiner] Add tests for thumb load-combine.	2019-11-21 13:12:12 +01:00
Simon Pilgrim	d53b84ad2b	Statistic - Fix MSVC shadow warning against global PrintOnExit static variable. NFC.	2019-11-21 12:08:01 +00:00
Ilya Biryukov	a1512ae17b	Reland 9f3fdb0d7fab: [Driver] Use VFS to check if sanitizer blacklists exist With updates to various LLVM tools that use SpecialCastList. It was tempting to use RealFileSystem as the default, but that makes it too easy to accidentally forget passing VFS in clang code.	2019-11-21 11:56:09 +01:00
Pavel Labath	290efde9ec	dwarfdump --statistics: Use new location list api Summary: This patch removes manual location list handling in the statistics code and replaces it with the new DWARFDie api, which provides access to a "cooked" location list. This has the following effects: - the code now properly handles split-dwarf location lists - it will automatically support dwarf5 location lists once support for those is added - it properly handles location lists with base address selection entries - it fixes a bug where the location list code was using the first DW_AT_ranges range as a "base address" of the compile unit (it should have used DW_AT_low_pc instead. The effect of this was that the computation of the start address of a variable in its scope was broken for these kinds of compile units. This only manifested itself on linked files, since in object files the first DW_AT_ranges range normally starts at 0. Since pretty much every kind of location list was broken in some way, it's hard to verify that the new implementation is correct -- the output will be different in all non-trivial cases, and mostly with good reason. Most of the existing statistics tests continue to pass though, and a visual inspection of the statistics for non-trivial inputs shows that the data is more "reasonable" now. I have updated the "dwo statistics" test to include the new numbers, as the previous ones were completely bogus, and I have added a targeted test for the "base address" bug. Reviewers: dblaikie, cmtice, vsk Subscribers: aprantl, SouraVX, JDevlieghere, djtodoro, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70444	2019-11-21 11:55:21 +01:00
Simon Atanasyan	22c21b278b	[mips] Rename test case. NFC	2019-11-21 13:50:15 +03:00
Simon Atanasyan	99e3c4bcdd	[mips] Remove unused `IsPCRelativeLoad` MIPS instructions attribute. NFC This attribute is always set to zero.	2019-11-21 13:50:14 +03:00
Simon Atanasyan	3fe3405f88	[mips] Remove addresses from the test case. NFC It reduces "diff" after addition more tests in the future.	2019-11-21 13:50:14 +03:00
Benjamin Kramer	f571493eae	Revert "[DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151 " Summary: Revert "[DependenceAnalysis] Dependecies for loads marked with "ivnariant.load" should not be shared with general accesses. Fix for https://bugs.llvm.org/show_bug.cgi?id=42151" This reverts commit 5f026b6d9e882941fde9b7e5dc0a2d807f7f24f5. We're (tensorflow.org/xla team) seeing some misscompiles with the new change, only at -O3, with fast math disabled. I'm still trying to come up with a useful/small/external example, but for now, the following IR: ``` ; ModuleID = '__compute_module' source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" @0 = private unnamed_addr constant [4 x i8] c"\DB\0F\C9@" @1 = private unnamed_addr constant [4 x i8] c"\00\00\00?" ; Function Attrs: uwtable define void @jit_wrapped_fun.31(i8* %retval, i8* noalias %run_options, i8 noalias %params, i8 noalias %buffer_table, i64* noalias %prof_counters) #0 { entry: %fusion.invar_address.dim.2 = alloca i64 %fusion.invar_address.dim.1 = alloca i64 %fusion.invar_address.dim.0 = alloca i64 %fusion.1.invar_address.dim.2 = alloca i64 %fusion.1.invar_address.dim.1 = alloca i64 %fusion.1.invar_address.dim.0 = alloca i64 %0 = getelementptr inbounds i8, i8* %buffer_table, i64 1 %1 = load i8, i8* %0, !invariant.load !0, !dereferenceable !1, !align !2 %parameter.3 = bitcast i8* %1 to [2 x [1 x [4 x float]]]* %2 = getelementptr inbounds i8, i8* %buffer_table, i64 5 %3 = load i8, i8* %2, !invariant.load !0, !dereferenceable !1, !align !2 %fusion.1 = bitcast i8* %3 to [2 x [1 x [4 x float]]]* store i64 0, i64* %fusion.1.invar_address.dim.0 br label %fusion.1.loop_header.dim.0 fusion.1.loop_header.dim.0: ; preds = %fusion.1.loop_exit.dim.1, %entry %fusion.1.indvar.dim.0 = load i64, i64* %fusion.1.invar_address.dim.0 %4 = icmp uge i64 %fusion.1.indvar.dim.0, 2 br i1 %4, label %fusion.1.loop_exit.dim.0, label %fusion.1.loop_body.dim.0 fusion.1.loop_body.dim.0: ; preds = %fusion.1.loop_header.dim.0 store i64 0, i64* %fusion.1.invar_address.dim.1 br label %fusion.1.loop_header.dim.1 fusion.1.loop_header.dim.1: ; preds = %fusion.1.loop_exit.dim.2, %fusion.1.loop_body.dim.0 %fusion.1.indvar.dim.1 = load i64, i64* %fusion.1.invar_address.dim.1 %5 = icmp uge i64 %fusion.1.indvar.dim.1, 1 br i1 %5, label %fusion.1.loop_exit.dim.1, label %fusion.1.loop_body.dim.1 fusion.1.loop_body.dim.1: ; preds = %fusion.1.loop_header.dim.1 store i64 0, i64* %fusion.1.invar_address.dim.2 br label %fusion.1.loop_header.dim.2 fusion.1.loop_header.dim.2: ; preds = %fusion.1.loop_body.dim.2, %fusion.1.loop_body.dim.1 %fusion.1.indvar.dim.2 = load i64, i64* %fusion.1.invar_address.dim.2 %6 = icmp uge i64 %fusion.1.indvar.dim.2, 4 br i1 %6, label %fusion.1.loop_exit.dim.2, label %fusion.1.loop_body.dim.2 fusion.1.loop_body.dim.2: ; preds = %fusion.1.loop_header.dim.2 %7 = load float, float* bitcast ([4 x i8]* @0 to float) %8 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]] %parameter.3, i64 0, i64 %fusion.1.indvar.dim.0, i64 0, i64 %fusion.1.indvar.dim.2 %9 = load float, float* %8, !invariant.load !0, !noalias !3 %10 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %fusion.1.indvar.dim.0, i64 0, i64 %fusion.1.indvar.dim.2 %11 = load float, float* %10, !invariant.load !0, !noalias !3 %12 = fmul float %9, %11 %13 = fmul float %7, %12 %14 = call float @llvm.log.f32(float %13) %15 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %fusion.1, i64 0, i64 %fusion.1.indvar.dim.0, i64 0, i64 %fusion.1.indvar.dim.2 store float %14, float* %15, !alias.scope !7, !noalias !8 %invar.inc2 = add nuw nsw i64 %fusion.1.indvar.dim.2, 1 store i64 %invar.inc2, i64* %fusion.1.invar_address.dim.2 br label %fusion.1.loop_header.dim.2 fusion.1.loop_exit.dim.2: ; preds = %fusion.1.loop_header.dim.2 %invar.inc1 = add nuw nsw i64 %fusion.1.indvar.dim.1, 1 store i64 %invar.inc1, i64* %fusion.1.invar_address.dim.1 br label %fusion.1.loop_header.dim.1 fusion.1.loop_exit.dim.1: ; preds = %fusion.1.loop_header.dim.1 %invar.inc = add nuw nsw i64 %fusion.1.indvar.dim.0, 1 store i64 %invar.inc, i64* %fusion.1.invar_address.dim.0 br label %fusion.1.loop_header.dim.0 fusion.1.loop_exit.dim.0: ; preds = %fusion.1.loop_header.dim.0 %16 = getelementptr inbounds i8, i8* %buffer_table, i64 4 %17 = load i8, i8* %16, !invariant.load !0, !dereferenceable !9, !align !2 %parameter.1 = bitcast i8* %17 to float* %18 = getelementptr inbounds i8, i8* %buffer_table, i64 2 %19 = load i8, i8* %18, !invariant.load !0, !dereferenceable !10, !align !2 %parameter.2 = bitcast i8* %19 to [3 x [1 x float]]* %20 = getelementptr inbounds i8, i8* %buffer_table, i64 0 %21 = load i8, i8* %20, !invariant.load !0, !dereferenceable !11, !align !2 %fusion = bitcast i8* %21 to [2 x [3 x [4 x float]]]* store i64 0, i64* %fusion.invar_address.dim.0 br label %fusion.loop_header.dim.0 fusion.loop_header.dim.0: ; preds = %fusion.loop_exit.dim.1, %fusion.1.loop_exit.dim.0 %fusion.indvar.dim.0 = load i64, i64* %fusion.invar_address.dim.0 %22 = icmp uge i64 %fusion.indvar.dim.0, 2 br i1 %22, label %fusion.loop_exit.dim.0, label %fusion.loop_body.dim.0 fusion.loop_body.dim.0: ; preds = %fusion.loop_header.dim.0 store i64 0, i64* %fusion.invar_address.dim.1 br label %fusion.loop_header.dim.1 fusion.loop_header.dim.1: ; preds = %fusion.loop_exit.dim.2, %fusion.loop_body.dim.0 %fusion.indvar.dim.1 = load i64, i64* %fusion.invar_address.dim.1 %23 = icmp uge i64 %fusion.indvar.dim.1, 3 br i1 %23, label %fusion.loop_exit.dim.1, label %fusion.loop_body.dim.1 fusion.loop_body.dim.1: ; preds = %fusion.loop_header.dim.1 store i64 0, i64* %fusion.invar_address.dim.2 br label %fusion.loop_header.dim.2 fusion.loop_header.dim.2: ; preds = %fusion.loop_body.dim.2, %fusion.loop_body.dim.1 %fusion.indvar.dim.2 = load i64, i64* %fusion.invar_address.dim.2 %24 = icmp uge i64 %fusion.indvar.dim.2, 4 br i1 %24, label %fusion.loop_exit.dim.2, label %fusion.loop_body.dim.2 fusion.loop_body.dim.2: ; preds = %fusion.loop_header.dim.2 %25 = mul nuw nsw i64 %fusion.indvar.dim.2, 1 %26 = add nuw nsw i64 0, %25 %27 = udiv i64 %26, 4 %28 = mul nuw nsw i64 %fusion.indvar.dim.0, 1 %29 = add nuw nsw i64 0, %28 %30 = udiv i64 %29, 2 %31 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %fusion.1, i64 0, i64 %29, i64 0, i64 %26 %32 = load float, float* %31, !alias.scope !7, !noalias !8 %33 = mul nuw nsw i64 %fusion.indvar.dim.1, 1 %34 = add nuw nsw i64 0, %33 %35 = udiv i64 %34, 3 %36 = load float, float* %parameter.1, !invariant.load !0, !noalias !3 %37 = getelementptr inbounds [3 x [1 x float]], [3 x [1 x float]]* %parameter.2, i64 0, i64 %34, i64 0 %38 = load float, float* %37, !invariant.load !0, !noalias !3 %39 = fsub float %36, %38 %40 = fmul float %39, %39 %41 = mul nuw nsw i64 %fusion.indvar.dim.2, 1 %42 = add nuw nsw i64 0, %41 %43 = udiv i64 %42, 4 %44 = mul nuw nsw i64 %fusion.indvar.dim.0, 1 %45 = add nuw nsw i64 0, %44 %46 = udiv i64 %45, 2 %47 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %45, i64 0, i64 %42 %48 = load float, float* %47, !invariant.load !0, !noalias !3 %49 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %parameter.3, i64 0, i64 %45, i64 0, i64 %42 %50 = load float, float* %49, !invariant.load !0, !noalias !3 %51 = fmul float %48, %50 %52 = fdiv float %40, %51 %53 = fadd float %32, %52 %54 = fneg float %53 %55 = load float, float* bitcast ([4 x i8]* @1 to float) %56 = fmul float %54, %55 %57 = getelementptr inbounds [2 x [3 x [4 x float]]], [2 x [3 x [4 x float]]] %fusion, i64 0, i64 %fusion.indvar.dim.0, i64 %fusion.indvar.dim.1, i64 %fusion.indvar.dim.2 store float %56, float* %57, !alias.scope !8, !noalias !12 %invar.inc5 = add nuw nsw i64 %fusion.indvar.dim.2, 1 store i64 %invar.inc5, i64* %fusion.invar_address.dim.2 br label %fusion.loop_header.dim.2 fusion.loop_exit.dim.2: ; preds = %fusion.loop_header.dim.2 %invar.inc4 = add nuw nsw i64 %fusion.indvar.dim.1, 1 store i64 %invar.inc4, i64* %fusion.invar_address.dim.1 br label %fusion.loop_header.dim.1 fusion.loop_exit.dim.1: ; preds = %fusion.loop_header.dim.1 %invar.inc3 = add nuw nsw i64 %fusion.indvar.dim.0, 1 store i64 %invar.inc3, i64* %fusion.invar_address.dim.0 br label %fusion.loop_header.dim.0 fusion.loop_exit.dim.0: ; preds = %fusion.loop_header.dim.0 %58 = getelementptr inbounds i8, i8* %buffer_table, i64 3 %59 = load i8, i8* %58, !invariant.load !0, !dereferenceable !2, !align !2 %tuple.30 = bitcast i8* %59 to [1 x i8] %60 = bitcast [2 x [3 x [4 x float]]]* %fusion to i8* %61 = getelementptr inbounds [1 x i8], [1 x i8]* %tuple.30, i64 0, i64 0 store i8* %60, i8** %61, !alias.scope !14, !noalias !8 ret void } ; Function Attrs: nounwind readnone speculatable willreturn declare float @llvm.log.f32(float) #1 attributes #0 = { uwtable "no-frame-pointer-elim"="false" } attributes #1 = { nounwind readnone speculatable willreturn } !0 = !{} !1 = !{i64 32} !2 = !{i64 8} !3 = !{!4, !6} !4 = !{!"buffer: {index:0, offset:0, size:96}", !5} !5 = !{!"XLA global AA domain"} !6 = !{!"buffer: {index:5, offset:0, size:32}", !5} !7 = !{!6} !8 = !{!4} !9 = !{i64 4} !10 = !{i64 12} !11 = !{i64 96} !12 = !{!13, !6} !13 = !{!"buffer: {index:3, offset:0, size:8}", !5} !14 = !{!13} ``` gets (correctly) optimized to the one below without the change: ``` ; ModuleID = '__compute_module' source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" ; Function Attrs: nofree nounwind uwtable define void @jit_wrapped_fun.31(i8* nocapture readnone %retval, i8* noalias nocapture readnone %run_options, i8 noalias nocapture readnone %params, i8 noalias nocapture readonly %buffer_table, i64* noalias nocapture readnone %prof_counters) local_unnamed_addr #0 { entry: %0 = getelementptr inbounds i8, i8* %buffer_table, i64 1 %1 = bitcast i8 %0 to [2 x [1 x [4 x float]]] %2 = load [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %1, align 8, !invariant.load !0, !dereferenceable !1, !align !2 %3 = getelementptr inbounds i8, i8* %buffer_table, i64 5 %4 = bitcast i8 %3 to [2 x [1 x [4 x float]]] %5 = load [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %4, align 8, !invariant.load !0, !dereferenceable !1, !align !2 %6 = bitcast [2 x [1 x [4 x float]]]* %2 to <4 x float>* %7 = load <4 x float>, <4 x float>* %6, align 8, !invariant.load !0, !noalias !3 %8 = fmul <4 x float> %7, %7 %9 = fmul <4 x float> %8, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000> %10 = call <4 x float> @llvm.log.v4f32(<4 x float> %9) %11 = bitcast [2 x [1 x [4 x float]]]* %5 to <4 x float>* store <4 x float> %10, <4 x float>* %11, align 8, !alias.scope !7, !noalias !8 %12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0 %13 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0 %14 = bitcast float* %12 to <4 x float>* %15 = load <4 x float>, <4 x float>* %14, align 8, !invariant.load !0, !noalias !3 %16 = fmul <4 x float> %15, %15 %17 = fmul <4 x float> %16, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000> %18 = call <4 x float> @llvm.log.v4f32(<4 x float> %17) %19 = bitcast float* %13 to <4 x float>* store <4 x float> %18, <4 x float>* %19, align 8, !alias.scope !7, !noalias !8 %20 = getelementptr inbounds i8, i8* %buffer_table, i64 4 %21 = bitcast i8 %20 to float %22 = load float, float* %21, align 8, !invariant.load !0, !dereferenceable !9, !align !2 %23 = getelementptr inbounds i8, i8* %buffer_table, i64 2 %24 = bitcast i8 %23 to [3 x [1 x float]] %25 = load [3 x [1 x float]], [3 x [1 x float]]* %24, align 8, !invariant.load !0, !dereferenceable !10, !align !2 %26 = load i8, i8* %buffer_table, align 8, !invariant.load !0, !dereferenceable !11, !align !2 %27 = load float, float* %22, align 8, !invariant.load !0, !noalias !3 %.phi.trans.insert28 = getelementptr inbounds [3 x [1 x float]], [3 x [1 x float]]* %25, i64 0, i64 2, i64 0 %.pre29 = load float, float* %.phi.trans.insert28, align 8, !invariant.load !0, !noalias !3 %28 = bitcast [3 x [1 x float]]* %25 to <2 x float>* %29 = load <2 x float>, <2 x float>* %28, align 8, !invariant.load !0, !noalias !3 %30 = insertelement <2 x float> undef, float %27, i32 0 %31 = shufflevector <2 x float> %30, <2 x float> undef, <2 x i32> zeroinitializer %32 = fsub <2 x float> %31, %29 %33 = fmul <2 x float> %32, %32 %shuffle30 = shufflevector <2 x float> %33, <2 x float> undef, <8 x i32> <i32 0, i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 1> %34 = fsub float %27, %.pre29 %35 = fmul float %34, %34 %36 = insertelement <4 x float> undef, float %35, i32 0 %37 = shufflevector <4 x float> %36, <4 x float> undef, <4 x i32> zeroinitializer %shuffle = shufflevector <4 x float> %10, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %38 = fmul <4 x float> %7, %7 %shuffle31 = shufflevector <4 x float> %38, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %39 = fdiv <8 x float> %shuffle30, %shuffle31 %40 = fadd <8 x float> %shuffle, %39 %41 = fmul <8 x float> %40, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %42 = bitcast i8* %26 to <8 x float>* store <8 x float> %41, <8 x float>* %42, align 8, !alias.scope !8, !noalias !12 %43 = getelementptr inbounds i8, i8* %26, i64 32 %44 = fdiv <4 x float> %37, %38 %45 = fadd <4 x float> %10, %44 %46 = fmul <4 x float> %45, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %47 = bitcast i8* %43 to <4 x float>* store <4 x float> %46, <4 x float>* %47, align 8, !alias.scope !8, !noalias !12 %.phi.trans.insert = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0 %.phi.trans.insert12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0 %48 = bitcast float* %.phi.trans.insert to <4 x float>* %49 = load <4 x float>, <4 x float>* %48, align 8, !alias.scope !7, !noalias !8 %50 = bitcast float* %.phi.trans.insert12 to <4 x float>* %51 = load <4 x float>, <4 x float>* %50, align 8, !invariant.load !0, !noalias !3 %shuffle.1 = shufflevector <4 x float> %49, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %52 = getelementptr inbounds i8, i8* %26, i64 48 %53 = fmul <4 x float> %51, %51 %shuffle31.1 = shufflevector <4 x float> %53, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %54 = fdiv <8 x float> %shuffle30, %shuffle31.1 %55 = fadd <8 x float> %shuffle.1, %54 %56 = fmul <8 x float> %55, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %57 = bitcast i8* %52 to <8 x float>* store <8 x float> %56, <8 x float>* %57, align 8, !alias.scope !8, !noalias !12 %58 = getelementptr inbounds i8, i8* %26, i64 80 %59 = fdiv <4 x float> %37, %53 %60 = fadd <4 x float> %49, %59 %61 = fmul <4 x float> %60, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %62 = bitcast i8* %58 to <4 x float>* store <4 x float> %61, <4 x float>* %62, align 8, !alias.scope !8, !noalias !12 %63 = getelementptr inbounds i8, i8* %buffer_table, i64 3 %64 = bitcast i8** %63 to [1 x i8]* %65 = load [1 x i8], [1 x i8]* %64, align 8, !invariant.load !0, !dereferenceable !2, !align !2 %66 = getelementptr inbounds [1 x i8], [1 x i8]* %65, i64 0, i64 0 store i8* %26, i8** %66, align 8, !alias.scope !14, !noalias !8 ret void } ; Function Attrs: nounwind readnone speculatable willreturn declare <4 x float> @llvm.log.v4f32(<4 x float>) #1 attributes #0 = { nofree nounwind uwtable "no-frame-pointer-elim"="false" } attributes #1 = { nounwind readnone speculatable willreturn } !0 = !{} !1 = !{i64 32} !2 = !{i64 8} !3 = !{!4, !6} !4 = !{!"buffer: {index:0, offset:0, size:96}", !5} !5 = !{!"XLA global AA domain"} !6 = !{!"buffer: {index:5, offset:0, size:32}", !5} !7 = !{!6} !8 = !{!4} !9 = !{i64 4} !10 = !{i64 12} !11 = !{i64 96} !12 = !{!13, !6} !13 = !{!"buffer: {index:3, offset:0, size:8}", !5} !14 = !{!13} ``` and (incorrectly) optimized to the one below with the change: ``` ; ModuleID = '__compute_module' source_filename = "__compute_module" target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-grtev4-linux-gnu" ; Function Attrs: nofree nounwind uwtable define void @jit_wrapped_fun.31(i8* nocapture readnone %retval, i8* noalias nocapture readnone %run_options, i8 noalias nocapture readnone %params, i8 noalias nocapture readonly %buffer_table, i64* noalias nocapture readnone %prof_counters) local_unnamed_addr #0 { entry: %0 = getelementptr inbounds i8, i8* %buffer_table, i64 1 %1 = bitcast i8 %0 to [2 x [1 x [4 x float]]] %2 = load [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %1, align 8, !invariant.load !0, !dereferenceable !1, !align !2 %3 = getelementptr inbounds i8, i8* %buffer_table, i64 5 %4 = bitcast i8 %3 to [2 x [1 x [4 x float]]] %5 = load [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %4, align 8, !invariant.load !0, !dereferenceable !1, !align !2 %6 = bitcast [2 x [1 x [4 x float]]]* %2 to <4 x float>* %7 = load <4 x float>, <4 x float>* %6, align 8, !invariant.load !0, !noalias !3 %8 = fmul <4 x float> %7, %7 %9 = fmul <4 x float> %8, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000> %10 = call <4 x float> @llvm.log.v4f32(<4 x float> %9) %11 = bitcast [2 x [1 x [4 x float]]]* %5 to <4 x float>* store <4 x float> %10, <4 x float>* %11, align 8, !alias.scope !7, !noalias !8 %12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0 %13 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0 %14 = bitcast float* %12 to <4 x float>* %15 = load <4 x float>, <4 x float>* %14, align 8, !invariant.load !0, !noalias !3 %16 = fmul <4 x float> %15, %15 %17 = fmul <4 x float> %16, <float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000, float 0x401921FB60000000> %18 = call <4 x float> @llvm.log.v4f32(<4 x float> %17) %19 = bitcast float* %13 to <4 x float>* store <4 x float> %18, <4 x float>* %19, align 8, !alias.scope !7, !noalias !8 %20 = getelementptr inbounds i8, i8* %buffer_table, i64 4 %21 = bitcast i8 %20 to float %22 = load float, float* %21, align 8, !invariant.load !0, !dereferenceable !9, !align !2 %23 = getelementptr inbounds i8, i8* %buffer_table, i64 2 %24 = bitcast i8 %23 to [3 x [1 x float]] %25 = load [3 x [1 x float]], [3 x [1 x float]]* %24, align 8, !invariant.load !0, !dereferenceable !10, !align !2 %26 = load i8, i8* %buffer_table, align 8, !invariant.load !0, !dereferenceable !11, !align !2 %27 = load float, float* %22, align 8, !invariant.load !0, !noalias !3 %.phi.trans.insert28 = getelementptr inbounds [3 x [1 x float]], [3 x [1 x float]]* %25, i64 0, i64 2, i64 0 %.pre29 = load float, float* %.phi.trans.insert28, align 8, !invariant.load !0, !noalias !3 %28 = bitcast [3 x [1 x float]]* %25 to <2 x float>* %29 = load <2 x float>, <2 x float>* %28, align 8, !invariant.load !0, !noalias !3 %30 = insertelement <2 x float> undef, float %27, i32 0 %31 = shufflevector <2 x float> %30, <2 x float> undef, <2 x i32> zeroinitializer %32 = fsub <2 x float> %31, %29 %33 = fmul <2 x float> %32, %32 %shuffle32 = shufflevector <2 x float> %33, <2 x float> undef, <8 x i32> <i32 0, i32 0, i32 0, i32 0, i32 1, i32 1, i32 1, i32 1> %34 = fsub float %27, %.pre29 %35 = fmul float %34, %34 %36 = insertelement <4 x float> undef, float %35, i32 0 %37 = shufflevector <4 x float> %36, <4 x float> undef, <4 x i32> zeroinitializer %shuffle = shufflevector <4 x float> %10, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %38 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 0, i64 0, i64 3 %39 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 0, i64 0, i64 3 %40 = fmul <4 x float> %7, %7 %41 = shufflevector <4 x float> %40, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 undef, i32 undef, i32 undef, i32 undef> %42 = fdiv <8 x float> %shuffle32, %41 %43 = fadd <8 x float> %shuffle, %42 %44 = fmul <8 x float> %43, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %45 = bitcast i8* %26 to <8 x float>* store <8 x float> %44, <8 x float>* %45, align 8, !alias.scope !8, !noalias !12 %46 = extractelement <4 x float> %10, i32 0 %47 = getelementptr inbounds i8, i8* %26, i64 32 %48 = extractelement <4 x float> %10, i32 1 %49 = extractelement <4 x float> %10, i32 2 %50 = load float, float* %38, align 4, !alias.scope !7, !noalias !8 %51 = load float, float* %39, align 4, !invariant.load !0, !noalias !3 %52 = fmul float %51, %51 %53 = insertelement <4 x float> undef, float %52, i32 3 %54 = fdiv <4 x float> %37, %53 %55 = insertelement <4 x float> undef, float %46, i32 0 %56 = insertelement <4 x float> %55, float %48, i32 1 %57 = insertelement <4 x float> %56, float %49, i32 2 %58 = insertelement <4 x float> %57, float %50, i32 3 %59 = fadd <4 x float> %58, %54 %60 = fmul <4 x float> %59, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %61 = bitcast i8* %47 to <4 x float>* store <4 x float> %60, <4 x float>* %61, align 8, !alias.scope !8, !noalias !12 %.phi.trans.insert = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 0 %.phi.trans.insert12 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 0 %62 = bitcast float* %.phi.trans.insert to <4 x float>* %63 = load <4 x float>, <4 x float>* %62, align 8, !alias.scope !7, !noalias !8 %64 = bitcast float* %.phi.trans.insert12 to <4 x float>* %65 = load <4 x float>, <4 x float>* %64, align 8, !invariant.load !0, !noalias !3 %shuffle.1 = shufflevector <4 x float> %63, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %66 = getelementptr inbounds i8, i8* %26, i64 48 %67 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %5, i64 0, i64 1, i64 0, i64 3 %68 = getelementptr inbounds [2 x [1 x [4 x float]]], [2 x [1 x [4 x float]]]* %2, i64 0, i64 1, i64 0, i64 3 %69 = fmul <4 x float> %65, %65 %70 = shufflevector <4 x float> %69, <4 x float> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %71 = fdiv <8 x float> %shuffle32, %70 %72 = fadd <8 x float> %shuffle.1, %71 %73 = fmul <8 x float> %72, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %74 = bitcast i8* %66 to <8 x float>* store <8 x float> %73, <8 x float>* %74, align 8, !alias.scope !8, !noalias !12 %75 = extractelement <4 x float> %69, i32 0 %76 = extractelement <4 x float> %63, i32 0 %77 = getelementptr inbounds i8, i8* %26, i64 80 %78 = extractelement <4 x float> %69, i32 1 %79 = extractelement <4 x float> %63, i32 1 %80 = extractelement <4 x float> %69, i32 2 %81 = extractelement <4 x float> %63, i32 2 %82 = load float, float* %67, align 4, !alias.scope !7, !noalias !8 %83 = load float, float* %68, align 4, !invariant.load !0, !noalias !3 %84 = fmul float %83, %83 %85 = insertelement <4 x float> undef, float %75, i32 0 %86 = insertelement <4 x float> %85, float %78, i32 1 %87 = insertelement <4 x float> %86, float %80, i32 2 %88 = insertelement <4 x float> %87, float %84, i32 3 %89 = fdiv <4 x float> %37, %88 %90 = insertelement <4 x float> undef, float %76, i32 0 %91 = insertelement <4 x float> %90, float %79, i32 1 %92 = insertelement <4 x float> %91, float %81, i32 2 %93 = insertelement <4 x float> %92, float %82, i32 3 %94 = fadd <4 x float> %93, %89 %95 = fmul <4 x float> %94, <float -5.000000e-01, float -5.000000e-01, float -5.000000e-01, float -5.000000e-01> %96 = bitcast i8* %77 to <4 x float>* store <4 x float> %95, <4 x float>* %96, align 8, !alias.scope !8, !noalias !12 %97 = getelementptr inbounds i8, i8* %buffer_table, i64 3 %98 = bitcast i8** %97 to [1 x i8]* %99 = load [1 x i8], [1 x i8]* %98, align 8, !invariant.load !0, !dereferenceable !2, !align !2 %100 = getelementptr inbounds [1 x i8], [1 x i8]* %99, i64 0, i64 0 store i8* %26, i8** %100, align 8, !alias.scope !14, !noalias !8 ret void } ; Function Attrs: nounwind readnone speculatable willreturn declare <4 x float> @llvm.log.v4f32(<4 x float>) #1 attributes #0 = { nofree nounwind uwtable "no-frame-pointer-elim"="false" } attributes #1 = { nounwind readnone speculatable willreturn } !0 = !{} !1 = !{i64 32} !2 = !{i64 8} !3 = !{!4, !6} !4 = !{!"buffer: {index:0, offset:0, size:96}", !5} !5 = !{!"XLA global AA domain"} !6 = !{!"buffer: {index:5, offset:0, size:32}", !5} !7 = !{!6} !8 = !{!4} !9 = !{i64 4} !10 = !{i64 12} !11 = !{i64 96} !12 = !{!13, !6} !13 = !{!"buffer: {index:3, offset:0, size:8}", !5} !14 = !{!13} ``` This results in bad numerical answers when used through XLA. Again, it's not that easy to give a small fully-reproducible example, but the misscompare is: ``` Expected literal: ( f32[2,3,4] { { { nan, -inf, -3181.35, -inf }, { nan, -inf, -28.2577019, -inf }, { nan, -inf, -28.2577019, -inf } }, { { -inf, -inf, -inf, -inf }, { -6.60753046e+28, -1.47314833e+23, -inf, -inf }, { -2.43504347e+30, -5.42892693e+24, -inf, -inf } } } ) Actual literal: ( f32[2,3,4] { { { nan, -inf, -3181.35, -inf }, { nan, -inf, -inf, -inf }, { inf, -inf, -28.2577019, -inf } }, { { -inf, -inf, -inf, -inf }, { -6.60753046e+28, -1.47314833e+23, -inf, -inf }, { -2.43504347e+30, -5.42892693e+24, -inf, -inf } } } ) ``` Reviewers: sanjoy.google, sanjoy, ebrevnov, jdoerfert, reames, chandlerc Subscribers: hiraditya, Charusso, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70516	2019-11-21 11:40:15 +01:00
Ilya Biryukov	bea2bb8fa3	Revert "[Driver] Use VFS to check if sanitizer blacklists exist" This reverts commit ba6f906854263375cff3257d22d241a8a259cf77. Commit caused compilation errors on llvm tests. Will fix and re-land.	2019-11-21 11:31:14 +01:00
Martin Storsjö	6cbd2973fc	[COFF] Widen PE32Header fields to fit 64 bit versions The PE32Header struct is only used by COFFYAML, for intermediate storage. The struct doesn't match the on-disk struct layout as it uses native integers instead of e.g. support::ulittle32_t, so just widen the fields to fit values for object::pe32plus_header, in addition to object::pe32_header. This avoids truncating the 64 bit ImageBase for 64 bit executables. Differential Revision: https://reviews.llvm.org/D70464	2019-11-21 12:05:00 +02:00
Ilya Biryukov	bd70bf9cb9	[Driver] Use VFS to check if sanitizer blacklists exist Summary: This is a follow-up to 590f279c456bbde632eca8ef89a85c478f15a249, which moved some of the callers to use VFS. It turned out more code in Driver calls into real filesystem APIs and also needs an update. Reviewers: gribozavr2, kadircet Reviewed By: kadircet Subscribers: ormris, mgorny, hiraditya, llvm-commits, jkorous, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D70440	2019-11-21 11:00:30 +01:00
Tim Corringham	511fbac991	[AMDGPU] Add attribute for target loop unroll threshold default Summary: Add a function attribute to allow the target specific default loop unroll threshold to be specified on a per-function basis. This allows a front-end to give guidance where it has insight that is not available to the back-end, while still allowing the target specific heuristics to also have an effect. Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68873	2019-11-21 09:47:28 +00:00
David Stenberg	cb47dbbe79	[DebugInfo] Refactor DIExpression [SZ]Ext creation into function [NFC] Summary: Also, replace the SmallVector with a normal C array. Reviewers: vsk Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70498	2019-11-21 10:44:04 +01:00
Florian Hahn	ee8bfe440e	[Support] Don't check XCR0 when detecting avx512 on Darwin. Darwin lazily saves the AVX512 context on first use [1]: instead of checking that it already does to figure out if the OS supports AVX512, trust that the kernel will do the right thing and always assume the context save support is available. [1] https://github.com/apple/darwin-xnu/blob/xnu-4903.221.2/osfmk/i386/fpu.c#L174 Reviewers: ab, RKSimon, craig.topper Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D70453	2019-11-21 09:18:00 +00:00
Clement Courbet	ba35e653b7	[DAGCombine][NFC] Use ArrayRef and correctly size SmallVectors. In preparation for D70487.	2019-11-21 08:53:37 +01:00
James Y Knight	abfa68cadb	D'oh. Fix assert after a84922916e6eddf701b39fbd7fe0222cb0fee1d6. (Which was attempting to fix unused variable warning in NDEBUG mode after 8ba56f322abf848cec78ff7f814f3ad84cd778be)	2019-11-20 22:22:51 -05:00
James Y Knight	1fc283ee90	Fix unused variable warning in NDEBUG mode after 8ba56f322abf848cec78ff7f814f3ad84cd778be	2019-11-20 22:05:05 -05:00
River Riddle	b76b006b31	Tablegen: Remove the error for duplicate include files. This error was originally added a while(7 years) ago when including multiple files was basically always an error. Tablegen now has preprocessor support, which allows for building nice c/c++ style include guards. With the current error being reported, we unfortunately need to double guard when including files: * In user of MyFile.td #ifndef MYFILE_TD include MyFile.td #endif * In MyFile.td #ifndef MYFILE_TD #define MYFILE_TD ... #endif Differential Revision: https://reviews.llvm.org/D70410	2019-11-20 18:24:10 -08:00
Lang Hames	2aa858aa23	[Error] Remove a broken code fragment accidentally included in 76bcbaafab2.	2019-11-20 17:50:22 -08:00
Lang Hames	d8fb36540a	[Orc][Modules] Fix Modules build fallout from a34680a33eb. In a34680a33eb OrcError.h and Orc/RPC/*.h were split out from the rest of ExecutionEngine in order to eliminate false dependencies for remote JIT targets (see https://reviews.llvm.org/D68732), however this broke modules builds (see https://reviews.llvm.org/D69817). This patch splits these headers out into a separate module, LLVM_OrcSupport, in order to fix the modules build. Fixes <rdar://56377508>.	2019-11-20 17:34:34 -08:00
Alina Sbirlea	21c5787ed7	[MemorySSA] Moving at the end often means before terminator. Moving accesses in MemorySSA at InsertionPlace::End, when an instruction is moved into a block, almost always means insert at the end of the block, but before the block terminator. This matters when the block terminator is a MemoryAccess itself (an invoke), and the insertion must be done before the terminator for the update to be correct. Insert an additional position: InsertionPlace:BeforeTerminator and update current usages where this applies. Resolves PR44027.	2019-11-20 17:11:00 -08:00
Craig Topper	aedb09ba63	[X86] Fix i16->f128 sitofp to promote the i16 to i32 before trying to form a libcall. Previously one of the test cases added here gave an error.	2019-11-20 17:09:32 -08:00
Craig Topper	d77164bb2a	[X86] Fix f128->i16 fptosi to promote the i16 to i32 before trying to form a libcall. Previously one of the test cases added here gave an error.	2019-11-20 17:09:31 -08:00
Adrian Prantl	d2a56513d0	Fix an offset underflow bug in DwarfExpression when describing small values with subregisters DwarfExpression::addMachineReg() knows how to build a larger register that isn't expressible in DWARF by combining multiple subregisters. However, if the entire value fits into just one subregister, it would still emit the other subregisters, leading to all sorts of inconsistencies down the line. This patch fixes that by moving an already existing(!) check whether the subregister's offset is before the end of the value to the right place. rdar://problem/57294211 Differential Revision: https://reviews.llvm.org/D70508	2019-11-20 17:07:54 -08:00
Philip Reames	f12fe720d1	Precommit tests for forthcoming widenable.condition transforms	2019-11-20 17:02:04 -08:00
Josh Kunz	e084a10545	[docs] Tiny rewording in the portability FAQ entry The entry reads better with these two words swapped.	2019-11-20 16:40:30 -08:00
Alina Sbirlea	27de8339d1	[MemorySSA] Update analysis when the terminator is a memory instruction. Update MemorySSA when moving the terminator instruction, as that may be a memory touching instruction. Resolves PR44029.	2019-11-20 16:36:52 -08:00
Reid Kleckner	ce03a2121f	[ADT] Move to_vector from STLExtras.h to SmallVector.h Nothing breaks, so this probably has zero impact, but it seems nice, since to_vector is more like makeArrayRef, and should really live in SmallVector.h.	2019-11-20 16:35:19 -08:00
Peter Collingbourne	f0d31eaac6	gn build: check-clang depends on llvm-cxxfilt.	2019-11-20 16:25:54 -08:00
Eric Christopher	3d68f1e222	Revert "[AArch64] Add the pipeline model for Exynos M5" as it's causing test failures in llvm-mca. This reverts commit 9bdfee2a3bd13d405ce1592930182f23849d2897.	2019-11-20 16:04:52 -08:00
Eric Christopher	7b534b3c55	Temporarily Revert "[SLP] allow forming 2-way reduction patterns" and update testcases. After speaking with Sanjay - seeing a number of miscompiles and working on tracking down a testcase. None of the follow on patches seem to have helped so far. This reverts commit 8a0aa5310bccbb42d16d11db090419fcefdd1376.	2019-11-20 16:00:53 -08:00
Eric Christopher	6d982d0bf7	Temporarily Revert "Temporarily Revert "[SLP] allow forming 2-way reduction patterns"" as there were testcase changes after that need to also be reverted. This reverts commit cd8748a15f2d18861b3548eb26ed2b52e5ee50b4.	2019-11-20 15:39:47 -08:00
Yonghong Song	1efbe554a0	[BPF] Fix a bug in peephole optimization One of current peephole optimiations is to remove SLL/SRL if the sub register has been zero extended. This phase has two bugs and one limitations. First, for the physical subregister used in pseudo insn COPY like below, it permits incorrect optimization. %0:gpr32 = COPY $w0 ... %4:gpr = MOV_32_64 %0:gpr32 %5:gpr = SLL_ri %4:gpr(tied-def 0), 32 %6:gpr = SRA_ri %5:gpr(tied-def 0), 32 The $w0 could be from the return value of a previous function call and its upper 32-bit value might contain some non-zero values. The same applies to function arguments. Second, the current code may permits removing SLL/SRA like below: %0:gpr32 = COPY $w0 %1:gpr32 = COPY %0:gpr32 ... %4:gpr = MOV_32_64 %1:gpr32 %5:gpr = SLL_ri %4:gpr(tied-def 0), 32 %6:gpr = SRA_ri %5:gpr(tied-def 0), 32 The reason is that it did not follow def-use chain to skip all intermediate 32bit-to-32bit COPY instructions. The current implementation is also very conservative for PHI instructions. If any PHI insn component is another PHI or COPY insn, it will just permit SLL/SRA. This patch fixed the issue as follows: - During def/use chain traversal, if any physical register is read, SLL/SRA will be preserved as these physical registers are mostly from function return values or current function arguments. - Recursively visit all COPY and PHI instructions.	2019-11-20 15:19:59 -08:00
Eric Christopher	fd20022682	Temporarily Revert "[SLP] allow forming 2-way reduction patterns" After speaking with Sanjay - seeing a number of miscompiles and working on tracking down a testcase. None of the follow on patches seem to have helped so far. This reverts commit 7ff57705ba196ce649d6034614b3b9df57e1f84f.	2019-11-20 15:19:31 -08:00
Lang Hames	4783cb036c	[Support][Error] Unfriend FileError. It is not special. FileError doesn't need direct access to Error's internals as it can access the payload via handleErrors. (ErrorList remains special: it is not possible to write a handler for it, due to the special auto-unpacking treatment that it receives from handleErrors.)	2019-11-20 15:08:37 -08:00
Evandro Menezes	4ab9ed6b8f	[AArch64] Add the pipeline model for Exynos M5 Add the scheduling and cost models for Exynos M5.	2019-11-20 16:56:07 -06:00
Evgenii Stepanov	60a4b60e92	Cherry-pick gtest fix for asan tests. Summary: `681454dae4` Clone+exec death test allocates a single page of stack to run chdir + exec on. This is not enough when gtest is built with ASan and run on particular hardware. With ASan on x86_64, ExecDeathTestChildMain has frame size of 1728 bytes. Call to chdir() in ExecDeathTestChildMain ends up in _dl_runtime_resolve_xsavec, which attempts to save register state on the stack; according to cpuid(0xd) XSAVE register save area size is 2568 on my machine. This results in something like this in all death tests: Result: died but not with expected error. ... [ DEATH ] AddressSanitizer:DEADLYSIGNAL [ DEATH ] ================================================================= [ DEATH ] ==178637==ERROR: AddressSanitizer: stack-overflow on address ... PiperOrigin-RevId: 278709790 Reviewers: pcc Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70332	2019-11-20 14:12:51 -08:00
Piotr Sobczak	1af506a300	[AMDGPU][SILoadStoreOptimizer] Merge TBUFFER loads/stores Summary: Extend SILoadStoreOptimizer to merge tbuffer loads and stores. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69794	2019-11-20 22:59:30 +01:00
Krzysztof Parzyszek	e8804e3800	[Hexagon] Fix two testcase errors This fixes issues discovered in https://reviews.llvm.org/D63973.	2019-11-20 15:06:35 -06:00
Craig Topper	079091fb71	[X86] Mark vector STRICT_FP_ROUND as Legal instead of Custom. The Custom handler doesn't do anything for these nodes anyway. SelectionDAGISel won't mutate them if they are Legal or Custom. X86 has custom code for mutating them due to missing isel patterns. When the isel patterns are added Legal will be the right answer. So go ahead a change it now since that's where we'll end up.	2019-11-20 13:03:51 -08:00
Philip Reames	75e10dcf71	Move widenable branch formation into makeGuardControlFlowExplicit helper This is mostly NFC, but I removed the setting of the guard's calling convention onto the WC call. Why? Because it was untested, and was producing an ill defined output as the declaration's convention wasn't been changed leaving a mismatch which is UB.	2019-11-20 12:54:05 -08:00
Stanislav Mekhanoshin	2fa5c427a3	[AMDGPU] Fixed mfma test check. NFC.	2019-11-20 12:33:12 -08:00
Michael Liao	d8d1683ed2	[AMDGPU] Keep consistent check of legal addressing mode. Summary: - Add test cases for GFX10, which has narrower offset range compared to GFX9. Reviewers: rampitec, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70473	2019-11-20 15:08:17 -05:00
Jonas Devlieghere	c1440088ae	[FileCollector] Ignore empty paths. Don't insert empty strings into the StringSet<> because that triggers an assert in its implementation.	2019-11-20 10:57:44 -08:00

1 2 3 4 5 ...

188088 Commits