llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00

Author	SHA1	Message	Date
Logan Chien	e6b3d37d36	[arm] Implement ARM .arch directive. llvm-svn: 197052	2013-12-11 17:16:25 +00:00
Tim Northover	d66403a4e1	ARM: constrain register-class in fast-isel The tests were no longer using fast-isel at all (MachO needs an "ios" rather than "darwin" triple at the moment and Linux needs ARM mode). Once that was corrected, the verifier complained about a t2ADDri created for the alloca. llvm-svn: 197046	2013-12-11 16:04:57 +00:00
NAKAMURA Takumi	54fa39136d	Prune redundant dependencies in LLVMBuild.txt. llvm-svn: 196988	2013-12-11 00:30:57 +00:00
Tim Northover	54d769f4eb	Make Triple's isOSBinFormatXXX functions partition triple-space. Most users would be surprised if "isCOFF" and "isMachO" were simultaneously true, unless they'd put the compiler in a box with a gun attached to a photon detector. This makes sure precisely one of the three formats is true for any triple and simplifies some target logic based on that. llvm-svn: 196934	2013-12-10 16:57:43 +00:00
NAKAMURA Takumi	3fda77b6c7	Add proper dependencies to LLVMBuild.txt in llvm/lib. I'll prune redundant deps in LLVMBuild.txt, later. llvm-svn: 196881	2013-12-10 05:39:34 +00:00
Rafael Espindola	3a54357741	Add comments documenting the ARM datalayout string. llvm-svn: 196850	2013-12-10 00:37:37 +00:00
Rafael Espindola	b9446f75e7	Simplify further. Thanks to Jim Grosbach for noticing it. llvm-svn: 196846	2013-12-10 00:15:35 +00:00
Rafael Espindola	eb7e18aa45	Refactor the construction of the DataLayout string on ARM. llvm-svn: 196843	2013-12-09 23:56:41 +00:00
Tim Northover	2ef97554d8	ARM: fix folding of stack-adjustment (yet again). When trying to eliminate an "sub sp, sp, #N" instruction by folding it into an existing push/pop using dummy registers, we need to account for the fact that this might affect precisely how "fp" gets set in the prologue. We were attempting this, but assuming that whenever we performed a fold it would make a difference. This is false, for example, in: push {r4, r7, lr} add fp, sp, #4 vpush {d8} sub sp, sp, #8 we can fold the "sub" into the "vpush", forming "vpush {d7, d8}". However, in that case the "add fp" instruction mustn't change, which we were getting wrong before. Should fix PR18160. llvm-svn: 196725	2013-12-08 15:56:50 +00:00
Ana Pazos	c66716b626	Added support for mcpu krait - krait processor currently modeled with the same features as A9. - Krait processor additionally has VFP4 (fused multiply add/sub) and hardware division features enabled. - krait has currently the same Schedule model as A9 - krait cpu flag is not recognized by the GNU assembler yet, it is replaced with march=armv7-a to avoid a lower march from being used. llvm-svn: 196619	2013-12-06 22:48:17 +00:00
Weiming Zhao	f90fe366af	Bug 18149: [AArch32] VSel instructions has no ARMCC field The current peephole optimizing for compare inst assumes an instr that uses CPSR has an MO for ARM Cond code.However, for VSEL instructions (vseqeq, vselgt, vselgt, vselvs), there is no such operand nor do they support the modification of Cond Code. llvm-svn: 196588	2013-12-06 17:56:48 +00:00
Andrew Trick	192311ab9a	MI-Sched: handle latency of in-order operations with the new machine model. The per-operand machine model allows the target to define "unbuffered" processor resources. This change is a quick, cheap way to model stalls caused by the latency of operations that use such resources. This only applies when the processor's micro-op buffer size is non-zero (Out-of-Order). We can't precisely model in-order stalls during out-of-order execution, but this is an easy and effective heuristic. It benefits cortex-a9 scheduling when using the new machine model, which is not yet on by default. MI-Sched for armv7 was evaluated on Swift (and only not enabled because of a performance bug related to predication). However, we never evaluated Cortex-A9 performance on MI-Sched in its current form. This change adds MI-Sched functionality to reach performance goals on A9. The only remaining change is to allow MI-Sched to run as a PostRA pass. I evaluated performance using a set of options to estimate the performance impact once MI sched is default on armv7: -mcpu=cortex-a9 -disable-post-ra -misched-bench -scheditins=false For a simple saxpy loop I see a 1.7x speedup. Here are the llvm-testsuite results: (min run time over 2 runs, filtering tiny changes) Speedups: \| Benchmarks/BenchmarkGame/recursive \| 52.39% \| \| Benchmarks/VersaBench/beamformer \| 20.80% \| \| Benchmarks/Misc/pi \| 19.97% \| \| Benchmarks/Misc/mandel-2 \| 19.95% \| \| SPEC/CFP2000/188.ammp \| 18.72% \| \| Benchmarks/McCat/08-main/main \| 18.58% \| \| Benchmarks/Misc-C++/Large/sphereflake \| 18.46% \| \| Benchmarks/Olden/power \| 17.11% \| \| Benchmarks/Misc-C++/mandel-text \| 16.47% \| \| Benchmarks/Misc/oourafft \| 15.94% \| \| Benchmarks/Misc/flops-7 \| 14.99% \| \| Benchmarks/FreeBench/distray \| 14.26% \| \| SPEC/CFP2006/470.lbm \| 14.00% \| \| mediabench/mpeg2/mpeg2dec/mpeg2decode \| 12.28% \| \| Benchmarks/SmallPT/smallpt \| 10.36% \| \| Benchmarks/Misc-C++/Large/ray \| 8.97% \| \| Benchmarks/Misc/fp-convert \| 8.75% \| \| Benchmarks/Olden/perimeter \| 7.10% \| \| Benchmarks/Bullet/bullet \| 7.03% \| \| Benchmarks/Misc/mandel \| 6.75% \| \| Benchmarks/Olden/voronoi \| 6.26% \| \| Benchmarks/Misc/flops-8 \| 5.77% \| \| Benchmarks/Misc/matmul_f64_4x4 \| 5.19% \| \| Benchmarks/MiBench/security-rijndael \| 5.15% \| \| Benchmarks/Misc/flops-6 \| 5.10% \| \| Benchmarks/Olden/tsp \| 4.46% \| \| Benchmarks/MiBench/consumer-lame \| 4.28% \| \| Benchmarks/Misc/flops-5 \| 4.27% \| \| Benchmarks/mafft/pairlocalalign \| 4.19% \| \| Benchmarks/Misc/himenobmtxpa \| 4.07% \| \| Benchmarks/Misc/lowercase \| 4.06% \| \| SPEC/CFP2006/433.milc \| 3.99% \| \| Benchmarks/tramp3d-v4 \| 3.79% \| \| Benchmarks/FreeBench/pifft \| 3.66% \| \| Benchmarks/Ptrdist/ks \| 3.21% \| \| Benchmarks/Adobe-C++/loop_unroll \| 3.12% \| \| SPEC/CINT2000/175.vpr \| 3.12% \| \| Benchmarks/nbench \| 2.98% \| \| SPEC/CFP2000/183.equake \| 2.91% \| \| Benchmarks/Misc/perlin \| 2.85% \| \| Benchmarks/Misc/flops-1 \| 2.82% \| \| Benchmarks/Misc-C++-EH/spirit \| 2.80% \| \| Benchmarks/Misc/flops-2 \| 2.77% \| \| Benchmarks/NPB-serial/is \| 2.42% \| \| Benchmarks/ASC_Sequoia/CrystalMk \| 2.33% \| \| Benchmarks/BenchmarkGame/n-body \| 2.28% \| \| Benchmarks/SciMark2-C/scimark2 \| 2.27% \| \| Benchmarks/Olden/bh \| 2.03% \| \| skidmarks10/skidmarks \| 1.81% \| \| Benchmarks/Misc/flops \| 1.72% \| Slowdowns: \| Benchmarks/llubenchmark/llu \| -14.14% \| \| Benchmarks/Polybench/stencils/seidel-2d \| -5.67% \| \| Benchmarks/Adobe-C++/functionobjects \| -5.25% \| \| Benchmarks/Misc-C++/oopack_v1p8 \| -5.00% \| \| Benchmarks/Shootout/hash \| -2.35% \| \| Benchmarks/Prolangs-C++/ocean \| -2.01% \| \| Benchmarks/Polybench/medley/floyd-warshall \| -1.98% \| \| Polybench/linear-algebra/kernels/3mm \| -1.95% \| \| Benchmarks/McCat/09-vor/vor \| -1.68% \| llvm-svn: 196516	2013-12-05 17:55:58 +00:00
Andrew Trick	9518dde658	Fix the A9 machine model. VTRN writes two registers. llvm-svn: 196514	2013-12-05 17:55:49 +00:00
Tim Northover	d04bb11dd7	ARM: fix yet another stack-folding bug We were trying to fold the stack adjustment into the wrong instruction in the situation where the entire basic-block was epilogue code. Really, it can only ever be valid to do the folding precisely where the "add sp, ..." would be placed so there's no need for a separate iterator to track that. Should fix PR18136. llvm-svn: 196493	2013-12-05 11:02:02 +00:00
Alp Toker	e845f8af67	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. llvm-svn: 196471	2013-12-05 05:44:44 +00:00
David Peixotto	b6710ff7c7	Add support for parsing ARM symbol variants on ELF targets ARM symbol variants are written with parens instead of @ like this: .word __GLOBAL_I_a(target1) This commit adds support for parsing these symbol variants in expressions. We introduce a new flag to MCAsmInfo that indicates the parser should use parens to parse the symbol variant. The expression parser is modified to look for symbol variants using parens instead of @ when the corresponding MCAsmInfo flag is true. The MCAsmInfo parens flag is enabled only for ARM on ELF. By adding this flag to MCAsmInfo, we are able to get rid of redundant ARM-specific symbol variants and use the generic variants instead (e.g. VK_GOT instead of VK_ARM_GOT). We use the new UseParensForSymbolVariant attribute in MCAsmInfo to correctly print the symbol variants for arm. To achive this we need to keep a handle to the MCAsmInfo in the MCSymbolRefExpr class that we can check when printing the symbol variant. Updated Tests: Changed case of symbol variant to match the generic kind. test/CodeGen/ARM/tls-models.ll test/CodeGen/ARM/tls1.ll test/CodeGen/ARM/tls2.ll test/CodeGen/Thumb2/tls1.ll test/CodeGen/Thumb2/tls2.ll PR18080 llvm-svn: 196424	2013-12-04 22:43:20 +00:00
Chad Rosier	59cd9a3090	Update the UseFusedMAC definition to directly specify its dependence on having VFP4. Patch by Daniel Stewart! llvm-svn: 196390	2013-12-04 17:16:36 +00:00
James Molloy	674f480ca4	Addrspacecasts are no-ops on ARM. Testcase added. llvm-svn: 196269	2013-12-03 11:23:11 +00:00
Rafael Espindola	5f33387c40	Refactor the setting of PrivateGlobalPrefix. No functionality change. llvm-svn: 196170	2013-12-02 23:39:26 +00:00
Rafael Espindola	c36d63a948	Move getSymbolWithGlobalValueBase to TargetLoweringObjectFile. This allows it to be used in TargetLoweringObjectFileImpl.cpp. llvm-svn: 196117	2013-12-02 16:25:47 +00:00
Rafael Espindola	cee11d6f43	Remove dead code. MO_JumpTableIndex and MO_ExternalSymbol don't show up on inline asm. Keeping parts of the old asm printer just to print inline asm to a string that we then parse back looks like a hack. llvm-svn: 196111	2013-12-02 15:36:37 +00:00
Tim Northover	c144b1204e	ARM: decide whether to use movw/movt based on "minsize" attribute. llvm-svn: 196102	2013-12-02 14:46:26 +00:00
Tim Northover	46df9f449d	ARM: add pseudo-instructions for lit-pool global materialisation These are used by MachO only at the moment, and (much like the existing MOVW/MOVT set) work around the fact that the labels used in the actual instructions often contain PC-dependent components, which means that repeatedly materialising the same global can't be CSEed. With small modifications, it could be adapted to how ELF finds the address of _GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there. llvm-svn: 196090	2013-12-02 10:35:41 +00:00
Rafael Espindola	427ca8d886	Change the default of AsmWriterClassName and isMCAsmWriter. llvm-svn: 196065	2013-12-02 04:55:42 +00:00
Tim Northover	bcd72d7348	ARM: fix bug in -Oz stack adjustment folding Previously, we clobbered callee-saved registers when folding an "add sp, #N" into a "pop {rD, ...}" instruction. This change checks whether a register we're going to add to the "pop" could actually be live outside the function before doing so and should fix the issue. This should fix PR18081. llvm-svn: 196046	2013-12-01 14:16:24 +00:00
NAKAMURA Takumi	9b851e876f	[CMake] Let add_public_tablegen_target() provide intrinsics_gen, too. I think, in principle, intrinsics_gen may be added explicitly. That said, it can be added incidentally, since each target already has dependencies to llvm-tblgen. Almost all source files depend on both CommonTaleGen and intrinsics_gen. Explicit add_dependencies() have been pruned under lib/Target. llvm-svn: 195929	2013-11-28 17:04:31 +00:00
NAKAMURA Takumi	99f544b37e	[CMake] Let add_public_tablegen_target responsible to provide dependency to CommonTableGen. add_public_tablegen_target adds *CommonTableGen to LLVM_COMMON_DEPENDS. LLVM_COMMON_DEPENDS affects add_llvm_library (and other add_target stuff) within its scope. llvm-svn: 195927	2013-11-28 17:04:04 +00:00
NAKAMURA Takumi	5dbd3bcf3d	[CMake] Prune include_directories() in llvm/lib/Target. add_llvm_target() sets them. llvm-svn: 195921	2013-11-28 14:53:30 +00:00
Tim Northover	f0a2ff9091	Darwin-ARM: use movw/movt for static relocations llvm-svn: 195759	2013-11-26 12:45:05 +00:00
Tim Northover	86362cbc00	Fix indentation typo llvm-svn: 195660	2013-11-25 17:04:35 +00:00
Tim Northover	35cd30aae4	ARM: remove special cases for Darwin dynamic-no-pic mode. These are handled almost identically to static mode (and ELF's global address materialisation), except that a symbol may have "$non_lazy_ptr" appended. This can be handled by passing appropriate flags along with the instruction instead of using entirely separate pseudo-instructions. llvm-svn: 195655	2013-11-25 16:24:52 +00:00
Tim Northover	9a6b1a0022	ARM: remove unused patterns. There is no sane way for an LEApcrel (= single ADR) instruction to generate a global address on any ARM target I know of. Fortunately, no-one was trying to any more, but there were vestigial patterns. llvm-svn: 195644	2013-11-25 14:40:57 +00:00
Amara Emerson	368f3c89e8	[ARM] Enable FeatureMP for Cortex-A5 by default. Patch by Oliver Stannard. llvm-svn: 195640	2013-11-25 13:17:15 +00:00
Richard Barton	98aacc3f23	Add support for Cortex-A12. Patch by Oliver Stannard! llvm-svn: 195448	2013-11-22 11:53:16 +00:00
Lang Hames	433095d3fe	Fix a typo where we were creating <def,kill> operands instead of <def,dead> ones. Add an assertion to make sure we catch this in the future. Fixes <rdar://problem/15464559>. llvm-svn: 195401	2013-11-22 00:46:32 +00:00
Artyom Skrobov	3d8780e502	[ARM] add basic Cortex-A7 support to LLVM backend llvm-svn: 195358	2013-11-21 14:03:21 +00:00
Juergen Ributzka	5357a6d64b	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. The memory leaks in this version have been fixed. Thanks Alexey for pointing them out. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 195064	2013-11-19 00:57:56 +00:00
Alexey Samsonov	3bfef6bdb6	Revert r194865 and r194874. This change is incorrect. If you delete virtual destructor of both a base class and a subclass, then the following code: Base *foo = new Child(); delete foo; will not cause the destructor for members of Child class. As a result, I observe plently of memory leaks. Notable examples I investigated are: ObjectBuffer and ObjectBufferStream, AttributeImpl and StringSAttributeImpl. llvm-svn: 194997	2013-11-18 09:31:53 +00:00
Juergen Ributzka	ee3af15269	[weak vtables] Remove a bunch of weak vtables This patch removes most of the trivial cases of weak vtables by pinning them to a single object file. Differential Revision: http://llvm-reviews.chandlerc.com/D2068 Reviewed by Andy llvm-svn: 194865	2013-11-15 22:34:48 +00:00
Bob Wilson	d433cf7463	Avoid illegal integer promotion in fastisel Stop folding constant adds into GEP when the type size doesn't match. Otherwise, the adds' operands are effectively being promoted, changing the conditions of an overflow. Results are different when: sext(a) + sext(b) != sext(a + b) Problem originally found on x86-64, but also fixed issues with ARM and PPC, which used similar code. <rdar://problem/15292280> Patch by Duncan Exon Smith! llvm-svn: 194840	2013-11-15 19:09:27 +00:00
Tim Northover	f4b55b1a75	ARM: produce friendly error for invalid inline asm We used to perform an invalid operation on an MVT and crash, which wasn't much fun. Patch by Oliver Stannard. llvm-svn: 194714	2013-11-14 17:15:39 +00:00
Weiming Zhao	1594297410	Enable generating legacy IT block for AArch32 By default, the behavior of IT block generation will be determinated dynamically base on the arch (armv8 vs armv7). This patch adds backend options: -arm-restrict-it and -arm-no-restrict-it. The former one restricts the generation of IT blocks (the same behavior as thumbv8) for both arches. The later one allows the generation of legacy IT block (the same behavior as ARMv7 Thumb2) for both arches. Clang will support -mrestrict-it and -mno-restrict-it, which is compatible with GCC. llvm-svn: 194592	2013-11-13 18:29:49 +00:00
Tim Northover	872e6a81fc	ARM: diagnose invalid system LDM/STM The system LDM and STM instructions can't usually writeback to the base register. The one exception is when an LDM is actually an exception-return (i.e. contains PC in the register list). (There's already a test that "ldm sp!, {r0-r3, pc}^" works, which is why there is no positive test). rdar://problem/15223374 llvm-svn: 194512	2013-11-12 21:32:41 +00:00
Bradley Smith	d0276d3c63	[ARM] Add support for FP_HP_extension build attribute llvm-svn: 194470	2013-11-12 10:38:05 +00:00
Artyom Skrobov	7871752687	[ARM] Add support for MVFR2 which is new in ARMv8 llvm-svn: 194416	2013-11-11 19:56:13 +00:00
Benjamin Kramer	1a2fc10168	Remove some unnecessary temporary strings. llvm-svn: 194335	2013-11-09 22:48:13 +00:00
Logan Chien	1eab705170	[arm] Refine ARMBuildAttrs.h. This commit cleans up some comments in ARMBuildAttrs.h. Besides, this commit fixes an error related to AllowWMMXv1 and AllowWMMXv2 (although they are not used currently.) llvm-svn: 194327	2013-11-09 14:16:52 +00:00
Tim Northover	e68673eeb6	ARM: fold prologue/epilogue sp updates into push/pop for code size ARM prologues usually look like: push {r7, lr} sub sp, sp, #4 If code size is extremely important, this can be optimised to the single instruction: push {r6, r7, lr} where we don't actually care about the contents of r6, but pushing it subtracts 4 from sp as a side effect. This should implement such a conversion, predicated on the "minsize" function attribute (-Oz) since I've yet to find any code it actually makes faster. llvm-svn: 194264	2013-11-08 17:18:07 +00:00
Artyom Skrobov	1890ff3a6d	[ARM] Handling for coprocessor instructions that are undefined starting from ARMv8 (Thumb encodings) llvm-svn: 194263	2013-11-08 16:25:50 +00:00
Artyom Skrobov	ef73b63766	[ARM] Handling for coprocessor instructions that are undefined starting from ARMv8 (ARM encodings) llvm-svn: 194261	2013-11-08 16:16:30 +00:00
Artyom Skrobov	6b3d6a326b	[ARM] In ARMAsmParser, MatchCoprocessorOperandName() permitted p10 and p11 as operands for coprocessor instructions, resulting in encodings that clash with FP/NEON instruction encodings llvm-svn: 194253	2013-11-08 09:16:31 +00:00
Tim Northover	d5ccdcd01b	ARM: permit bare dmb/dsb/isb aliases on Cortex-M0 Cortex-M0 supports these 32-bit instructions despite being Thumb1 only (mostly). We knew about that but not that the aliases without the default "sy" operand were also permitted. llvm-svn: 194094	2013-11-05 21:36:02 +00:00
Tim Northover	96044852e0	ARM: remove unnecessary state-tracking during frame lowering. ResolveFrameIndex had what appeared to be a very nasty hack for when the frame-index referred to a callee-saved register. In this case it "adjusted" the offset so that the address was correct if (and only if) the MachineInstr immediately followed the respective push. This "worked" for all forms of GPR & DPR but was only ever used to set the frame pointer itself, and once this was put in a more sensible location the entire state-tracking machinery it relied on became redundant. So I stripped it. The only wrinkle is that "add r7, sp, #0" might theoretically be slower (need an actual ALU slot) compared to "mov r7, sp" so I added a micro-optimisation that also makes emitARMRegUpdate and emitT2RegUpdate also work when NumBytes == 0. No test changes since there shouldn't be any functionality change. llvm-svn: 194025	2013-11-04 23:04:15 +00:00
Bob Wilson	f7bb300deb	Enable optimization of sin / cos pair into call to __sincos_stret for iOS7+. rdar://12856873 Patch by Evan Cheng, with a fix for rdar://13209539 by Tilmann Scheller llvm-svn: 193942	2013-11-03 06:14:38 +00:00
Bradley Smith	687de5605a	[ARM] Add Virtualization subtarget feature and more build attributes in this area Add a Virtualization ARM subtarget feature along with adding proper build attribute emission for Tag_Virtualization_use (encodes Virtualization and TrustZone) and Tag_MPextension_use. Also rework test/CodeGen/ARM/2010-10-19-mc-elf-objheader.ll testcase to something that is more maintainable. This changes the focus of this testcase away from testing CPU defaults (which is tested elsewhere), onto specifically testing that attributes are encoded correctly. llvm-svn: 193859	2013-11-01 13:27:35 +00:00
Bradley Smith	c999953064	[ARM] Fix Tag_ABI_HardFP_use build attribute Fix Tag_ABI_HardFP_use build attribute to handle single precision FP, replace deprecated Tag_ABI_HardFP_use value of 3 with 0 and also add some tests for Tag_ABI_VFP_args. llvm-svn: 193856	2013-11-01 11:21:16 +00:00
Jim Grosbach	4b7ad25546	Legalize: Improve legalization of long vector extends. When an extend more than doubles the size of the elements (e.g., a zext from v16i8 to v16i32), the normal legalization method of splitting the vectors will run into problems as by the time the destination vector is legal, the source vector is illegal. The end result is the operation often becoming scalarized, with the typical horrible performance. For example, on x86_64, the simple input of: define void @bar(<16 x i8> %a, <16 x i32>* %p) nounwind { %tmp = zext <16 x i8> %a to <16 x i32> store <16 x i32> %tmp, <16 x i32>*%p ret void } Generates: .section __TEXT,__text,regular,pure_instructions .section __TEXT,__const .align 5 LCPI0_0: .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .long 255 ## 0xff .section __TEXT,__text,regular,pure_instructions .globl _bar .align 4, 0x90 _bar: vpunpckhbw %xmm0, %xmm0, %xmm1 vpunpckhwd %xmm0, %xmm1, %xmm2 vpmovzxwd %xmm1, %xmm1 vinsertf128 $1, %xmm2, %ymm1, %ymm1 vmovaps LCPI0_0(%rip), %ymm2 vandps %ymm2, %ymm1, %ymm1 vpmovzxbw %xmm0, %xmm3 vpunpckhwd %xmm0, %xmm3, %xmm3 vpmovzxbd %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vandps %ymm2, %ymm0, %ymm0 vmovaps %ymm0, (%rdi) vmovaps %ymm1, 32(%rdi) vzeroupper ret So instead we can check if there are legal types that enable us to split more cleverly when the input vector is already legal such that we don't turn it into an illegal type. If the extend is such that it's more than doubling the size of the input we check if - the number of vector elements is even, - the source type is legal, - the type of a split source is illegal, - the type of an extended (by doubling element size) source is legal, and - the type of that extended source when split is legal. If the conditions are met, instead of just splitting both the destination and the source types, we create an extend that only goes up one "step" (doubling the element width), and the continue legalizing the rest of the operation normally. The result is that this operates as a new, more effecient, termination condition for the loop of "split the operation until the destination type is legal." With this change, the above example now compiles to: _bar: vpxor %xmm1, %xmm1, %xmm1 vpunpcklbw %xmm1, %xmm0, %xmm2 vpunpckhwd %xmm1, %xmm2, %xmm3 vpunpcklwd %xmm1, %xmm2, %xmm2 vinsertf128 $1, %xmm3, %ymm2, %ymm2 vpunpckhbw %xmm1, %xmm0, %xmm0 vpunpckhwd %xmm1, %xmm0, %xmm3 vpunpcklwd %xmm1, %xmm0, %xmm0 vinsertf128 $1, %xmm3, %ymm0, %ymm0 vmovaps %ymm0, 32(%rdi) vmovaps %ymm2, (%rdi) vzeroupper ret This generalizes a custom lowering that was added a while back to the ARM backend. That lowering is no longer necessary, and is removed. The testcases for it, however, provide excellent ARM tests for this change and so remain. rdar://14735100 llvm-svn: 193727	2013-10-31 00:20:48 +00:00
Artyom Skrobov	de6ac0d99e	[ARM] NEON instructions were erroneously decoded from certain invalid encodings llvm-svn: 193705	2013-10-30 18:10:09 +00:00
Manman Ren	92012bba37	Struct byval cleanup: add helper functions to reduce code duplication. Helper functions are added: emitPostLd: emit a post-increment load operation with given size. emitPostSt: emit a post-increment store operation with given size. No functionality change. llvm-svn: 193656	2013-10-29 22:27:32 +00:00
Rafael Espindola	3e081640d5	Move getSymbol to TargetLoweringObjectFile. This allows constructing a Mangler with just a TargetMachine. llvm-svn: 193630	2013-10-29 17:28:26 +00:00
Rafael Espindola	68ddc56344	Add a helper getSymbol to AsmPrinter. llvm-svn: 193627	2013-10-29 17:07:16 +00:00
Amara Emerson	010fd3dfe5	[ARM] Make sure HasCRC is initialized to false in Subtarget. llvm-svn: 193624	2013-10-29 16:54:52 +00:00
Bernard Ogden	2ff9cbbf41	ARM: Add subtarget feature for CRC Adds a subtarget feature for the CRC instructions (optional in v8-A) to the ARM (32-bit) backend. Differential Revision: http://llvm-reviews.chandlerc.com/D2036 llvm-svn: 193599	2013-10-29 09:47:35 +00:00
Arnold Schwaighofer	4c1da89bfc	ARM cost model: Unaligned vectorized double stores are expensive Updated a test case that assumed that <2 x double> would vectorize to use <4 x float>. radar://15338229 llvm-svn: 193574	2013-10-29 01:33:57 +00:00
Arnold Schwaighofer	fe80e563da	ARM cost model: Account for zero cost scalar SROA instructions By vectorizing a series of srl, or, ... instructions we have obfuscated the intention so much that the backend does not know how to fold this code away. radar://15336950 llvm-svn: 193573	2013-10-29 01:33:53 +00:00
Lang Hames	ae80875f08	Return early from getUnconditionalBranchTargetOpValue if the branch target is an MCExpr, in order to avoid writing an encoded zero value in the immediate field. When getUnconditionalBranchTargetOpValue is called with an MCExpr target, we don't know what the final immediate field value should be. We shouldn't explicitly set the immediate field to an encoded zero value as zero is encoded with a non-zero bit pattern. This leads to bits being set that pollute the final immediate value. The nature of the encoding is such that the polluted bits only affect very large immediate values, explaining why this hasn't caused problems earlier. Fixes <rdar://problem/15155975>. llvm-svn: 193535	2013-10-28 20:51:11 +00:00
Logan Chien	043426ee56	[arm] Implement eabi_attribute, cpu, and fpu directives. This commit allows the ARM integrated assembler to parse and assemble the code with .eabi_attribute, .cpu, and .fpu directives. To implement the feature, this commit moves the code from AttrEmitter to ARMTargetStreamers, and several new test cases related to cortex-m4, cortex-r5, and cortex-a15 are added. Besides, this commit also change the Subtarget->isFPOnlySP() to Subtarget->hasD16() to match the usage of .fpu directive. This commit changes the test cases: * Several .eabi_attribute directives in 2010-09-29-mc-asm-header-test.ll are removed because the .fpu directive already cover the functionality. * In the Cortex-A15 test case, the value for Tag_Advanced_SIMD_arch has be changed from 1 to 2, which is more precise. llvm-svn: 193524	2013-10-28 17:51:12 +00:00
Tim Northover	130230544c	ARM: allow .thumb_func to be separated from symbol definition When assembling, a .thumb_func directive is supposed to be applicable to the next symbol definition, even if there are intervening directives. We were racing ahead to try and find it, and this commit should fix the issue. Patch by Gabor Ballabas llvm-svn: 193403	2013-10-25 12:49:50 +00:00
Tim Northover	ec8c784fa3	ARM: don't expand atomicrmw inline on Cortex-M0 There's a barrier instruction so that should still be used, but most actual atomic operations are going to need a platform decision on the correct behaviour (either nop if single-threaded or OS-support otherwise). rdar://problem/15287210 llvm-svn: 193399	2013-10-25 09:30:24 +00:00
Jim Grosbach	1b0b1ab2d5	ARM: Tweak usage of '*vfp' compiler_rt functions. Only use them if the subtarget has ARM mode, as these routines are implemented as ARM code. rdar://15302004 llvm-svn: 193381	2013-10-24 23:07:11 +00:00
David Peixotto	b587e6848c	Remove class abstraction from ARM struct byval lowering This commit changes the struct byval lowering for arm to use inline checks for the subtarget instead of a class abstraction to represent the differences. The class abstraction was judged to be too much code for this task. No intended functionality change. llvm-svn: 193357	2013-10-24 16:39:36 +00:00
Tim Northover	27cc10f26d	ARM: Mark double-precision instructions as such This prevents us from silently accepting invalid instructions on (for example) Cortex-M4 with just single-precision VFP support. No tests for the extra Pat Requires because they're essentially assertions: the affected code should have been lowered to libcalls before ISel. rdar://problem/15302004 llvm-svn: 193354	2013-10-24 15:49:39 +00:00
Tim Northover	7ebbaf0161	ARM: add a couple more NEON predicates. The fused multiply instructions were added in VFPv4 but are still NEON instructions, in particular they shouldn't be available on a Cortex-M4 not matter how floaty it is. llvm-svn: 193342	2013-10-24 12:48:05 +00:00
Tim Northover	0b20ff06da	ARM: mark various aliases with their architecture requirements. If an alias inherits directly from InstAlias then it doesn't get any default "Requires" values, so llvm-mc will allow it even on architectures that don't support the underlying instruction. This tidies up the obvious VFP and NEON cases I found. llvm-svn: 193340	2013-10-24 12:22:58 +00:00
Tim Northover	adc6407bf1	ARM: Use non-VFP softcalls on embedded Darwinish targets The compiler-rt functions __adddf3vfp and so on exist purely to allow Thumb1 code to make use of VFP instructions by switching back to ARM mode, they make no sense for M-class processors which don't even have an ARM mode. Given that justification, in practice this is a platform ABI decision so the actual check is based on that rather than CPU features. rdar://problem/15302004 llvm-svn: 193327	2013-10-24 10:37:09 +00:00
Tim Northover	420889afe2	ARM: fix assert on unpredictable POP instruction. POP instructions are aliased to the ARM LDM variants but have different syntax. This caused two problems: we tried to access a non-existent operand to annotate the '!', and the error message didn't make much sense. With some vigorous hand-waving in the error message both problems can be fixed. llvm-svn: 193322	2013-10-24 09:37:18 +00:00
Artyom Skrobov	2ef42e5c31	Make ARM hint ranges consistent, and add tests for these ranges llvm-svn: 193238	2013-10-23 10:14:40 +00:00
Tim Northover	e4acb9a5a2	ARM: provide diagnostics on more writeback LDM/STM instructions The set of circumstances where the writeback register is allowed to be in the list of registers is rather baroque, but I think this implements them all on the assembly parsing side. For disassembly, we still warn about an ARM-mode LDM even if the architecture revision is < v7 (the required architecture information isn't available). It's a silly instruction anyway, so hopefully no-one will mind. rdar://problem/15223374 llvm-svn: 193185	2013-10-22 19:00:39 +00:00
Jim Grosbach	4ccaf97952	ARM: Thumb2 copy for GPRPair needs to use thumb instructions. Use tMOVr instead of plain MOVr. rdar://15193017 llvm-svn: 193139	2013-10-22 02:29:37 +00:00
Jim Grosbach	fb572e0475	ARM: Clean up copyPhysReg() a bit. No functional change, just cleaning things up for readability. llvm-svn: 193138	2013-10-22 02:29:35 +00:00
Richard Barton	4f1967c83f	Pure refactoring change. Patch by Artyom Skrobov. llvm-svn: 192977	2013-10-18 14:41:50 +00:00
Richard Barton	cb6c32ac32	Add hint disassembly syntax for 16-bit Thumb hint instructions. Patch by Artyom Skrobov llvm-svn: 192972	2013-10-18 14:09:49 +00:00
Silviu Baranga	0e6bbf1a98	Add hardware division as a default feature on Cortex-A15. Also add test cases to check this, and change diagnostics for the hwdiv-arm feature to something useful. llvm-svn: 192963	2013-10-18 10:18:40 +00:00
David Peixotto	839f0f98a5	17309 ARM backend incorrectly lowers COPY_STRUCT_BYVAL_I32 for thumb1 targets This commit implements the correct lowering of the COPY_STRUCT_BYVAL_I32 pseudo-instruction for thumb1 targets. Previously, the lowering of COPY_STRUCT_BYVAL_I32 generated the post-increment forms of ldr/ldrh/ldrb instructions. Thumb1 does not have the post-increment form of these instructions so the generated assembly contained invalid instructions. Passing the generated assembly to gcc caused it to complain with an error like this: Error: cannot honor width suffix -- `ldrb r3,[r0],#1' and the integrated assembler would generate an object file with an invalid instruction encoding. This commit contains a small test case that demonstrates the problem with thumb1 targets as well as an expanded test case that more throughly tests the lowering of byval struct passing for arm, thumb1, and thumb2 targets. llvm-svn: 192916	2013-10-17 19:52:05 +00:00
David Peixotto	22f270719b	Refactor lowering for COPY_STRUCT_BYVAL_I32 This commit refactors the lowering of the COPY_STRUCT_BYVAL_I32 pseudo-instruction in the ARM backend. We introduce a new helper class that encapsulates all of the operations needed during the lowering. The operations are implemented for each subtarget in different subclasses. Currently only arm and thumb2 subtargets are supported. This refactoring was done to easily implement support for thumb1 subtargets. This initial patch does not add support for thumb1, but is only a refactoring. A follow on patch will implement the support for thumb1 subtargets. No intended functionality change. llvm-svn: 192915	2013-10-17 19:49:22 +00:00
Rafael Espindola	90d8b36e1e	Add a MCAsmInfoELF class and factor some code into it. We had a MCAsmInfoCOFF, but no common class for all the ELF MCAsmInfos before. llvm-svn: 192760	2013-10-16 01:34:32 +00:00
Manman Ren	39d1a84681	Struct byval: fix a copy-paste error for thumb2. PR17309 llvm-svn: 192730	2013-10-15 19:42:32 +00:00
Bernard Ogden	f482ee15d7	Add Cortex-A57 support llvm-svn: 192591	2013-10-14 13:17:07 +00:00
Bernard Ogden	ec0167a2ce	Add subtarget feature support for Cortex-A53 Some previous implicit defaults have changed, for example FP and NEON are now on by default. llvm-svn: 192590	2013-10-14 13:16:57 +00:00
Amara Emerson	bf6dcda63c	[ARM] Fix FP ABI attributes with no VFP enabled. llvm-svn: 192458	2013-10-11 16:03:43 +00:00
Benjamin Kramer	4cb23ab23b	ARM: Put isV8EligibleForIT into the llvm namespace. While there make it inline. llvm-svn: 192350	2013-10-10 14:35:45 +00:00
Tim Northover	50b95fa75d	ARM: correct liveness flags during ARMLoadStoreOpt When we had a sequence like: s1 = VLDRS [r0, 1], Q0<imp-def> s3 = VLDRS [r0, 2], Q0<imp-use,kill>, Q0<imp-def> s0 = VLDRS [r0, 0], Q0<imp-use,kill>, Q0<imp-def> s2 = VLDRS [r0, 4], Q0<imp-use,kill>, Q0<imp-def> we were gathering the {s0, s1} loads below the s3 load. This is fine, but confused the verifier since now the s3 load had Q0<imp-use> with no definition above it. This should mark such uses <undef> as well. The liveness structure at the beginning and end of the block is unaffected, and the true sN definitions should prevent any dodgy reorderings being introduced elsewhere. rdar://problem/15124449 llvm-svn: 192344	2013-10-10 09:28:20 +00:00
Benjamin Kramer	2f83eef8d9	Flip the ownership of MCStreamer and MCTargetStreamer. MCStreamer now owns the target streamer. This prevents leaking the target streamer. llvm-svn: 192303	2013-10-09 17:23:41 +00:00
Rafael Espindola	6267c79fdb	Add a MCTargetStreamer interface. This patch fixes an old FIXME by creating a MCTargetStreamer interface and moving the target specific functions for ARM, Mips and PPC to it. The ARM streamer is still declared in a common place because it is used from lib/CodeGen/ARMException.cpp, but the Mips and PPC are completely hidden in the corresponding Target directories. I will send an email to llvmdev with instructions on how to use this. llvm-svn: 192181	2013-10-08 13:08:17 +00:00
Manman Ren	b284db0070	Struct byval: use the correct alignment for loads generated to load from struct byval to registers. We used to pass 0 which means the alignment of PtrVT. Even when the alignment of the struct is smaller than 4, the LOADs would have alignment of 4, and further optimizations could combine the LOADs into a ldm, which would cause crash. The fix is to pass the alignment of the struct byval. rdar://problem/15144402 llvm-svn: 192126	2013-10-07 19:47:53 +00:00
Amara Emerson	688cdc2151	[ARM] Improve build attributes emission. llvm-svn: 192111	2013-10-07 16:55:23 +00:00
Rafael Espindola	e60c3625e3	Remove getEHExceptionRegister and getEHHandlerRegister. They haven't been used for a long time. Patch by MathOnNapkins. llvm-svn: 192099	2013-10-07 13:39:22 +00:00
Tim Northover	1979375a30	ARM: allow cortex-m0 to use hint instructions The hint instructions ("nop", "yield", etc) are mostly Thumb2-only, but have been ported across to the v6M architecture. Fortunately, v6M seems to sit nicely between v6 (thumb-1 only) and v6T2, so we can add a feature for it fairly easily. rdar://problem/15144406 llvm-svn: 192097	2013-10-07 11:10:47 +00:00
Rafael Espindola	a1a1d34e51	Remove some really nasty uses of hasRawTextSupport. When MC was first added, targets could use hasRawTextSupport to keep features working before they were added to the MC interface. The design goal of MC is to provide an uniform api for printing assembly and object files. Short of relaxations and other corner cases, a object file is just another representation of the assembly. It was never the intention that targets would keep doing things like if (hasRawTextSupport()) Set flags in one way. else Set flags in another way. When they do that they create two code paths and the object file is no longer just another representation of the assembly. This also then requires testing with llc -filetype=obj, which is extremelly brittle. This patch removes some of these hacks by replacing them with smaller ones. The ARM flag setting is trivial, so I just moved it to the constructor. For Mips, the patch adds two temporary hack directives that allow the assembly to represent the same things as the object file was already able to. The hope is that the mips developers will replace the hack directives with the same ones that gas uses and drop the -print-hack-directives flag. I will also try to implement a target streamer interface, so that we can move this out of the common code. In summary, for any new work, two rules of the thumb are * Don't use "llc -filetype=obj" in tests. * Don't add calls to hasRawTextSupport. llvm-svn: 192035	2013-10-05 16:42:21 +00:00
Matthias Braun	f7ddf86363	ARM: optimizeSelect has to consider the previous register class optimizeSelect folds (predicated) copy instructions, it must not ignore the original register class of the operand when replacing the register with the copies dest register. llvm-svn: 191963	2013-10-04 16:52:56 +00:00

1 2 3 4 5 ...

7131 Commits