llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

Author	SHA1	Message	Date
Joey Gouly	355a09f268	[ARMv8] Add CodeGen support for VSEL. This uses the ARMcmov pattern that Tim cleaned up in r188995. Thanks to Simon Tatham for his floating point help! llvm-svn: 189024	2013-08-22 15:29:11 +00:00
Joey Gouly	67edac2b5e	[ARM] Constrain some register classes in EmitAtomicBinary64 so that we pass these tests with -verify-machineinstrs. llvm-svn: 189006	2013-08-22 12:19:24 +00:00
Logan Chien	2891b0c61c	Fix ARM FastISel PIC function call. The function call to external function should come with PLT relocation type if the PIC relocation model is used. llvm-svn: 189002	2013-08-22 12:08:04 +00:00
Tim Northover	eb7a86ed88	ARM: use TableGen patterns to select CMOV operations. Back in the mists of time (2008), it seems TableGen couldn't handle the patterns necessary to match ARM's CMOV node that we convert select operations to, so we wrote a lot of fairly hairy C++ to do it for us. TableGen can deal with it now: there were a few minor differences to CodeGen (see tests), but nothing obviously worse that I could see, so we should probably address anything that does come up in a localised manner. llvm-svn: 188995	2013-08-22 09:57:11 +00:00
Tim Northover	7e34f41ef8	ARM: respect tied 64-bit inlineasm operands when printing The code for 'Q' and 'R' operand modifiers needs to look through tied operands to discover the register class. llvm-svn: 188990	2013-08-22 06:51:04 +00:00
Michael Gottesman	e5ccfcac27	[stackprotector] When finding the split point to splice off the end of a parentmbb into a successmbb, include any DBG_VALUE MI. Fix for PR16954. llvm-svn: 188987	2013-08-22 05:40:50 +00:00
Jim Grosbach	d6ff6507fa	ARM: R9 is not safe to use for tcGPR. Indirect tail-calls shouldn't use R9 for the branch destination, as it's not reliably a call-clobbered register. rdar://14793425 llvm-svn: 188967	2013-08-22 00:14:24 +00:00
Tom Stellard	721e3acccd	SelectionDAG: Make sure stores are always added to the LegalizedNodes list When truncated vector stores were being custom lowered in VectorLegalizer::LegalizeOp(), the old (illegal) and new (legal) node pair was not being added to LegalizedNodes list. Instead of the legalized result being passed to VectorLegalizer::TranslateLegalizeResult(), the result was being passed back into VectorLegalizer::LegalizeOp(), which ended up adding a (new, new) pair to the list instead. This was causing an assertion failure when a custom lowered truncated vector store was the last instruction in a basic block and the VectorLegalizer was unable to find it in the LegalizedNodes list when updating the DAG root. llvm-svn: 188953	2013-08-21 22:42:58 +00:00
Manman Ren	1b7973fc19	TBAA: remove !tbaa from testing cases when they are not needed. This will make it easier to turn on struct-path aware TBAA since the metadata format will change. llvm-svn: 188944	2013-08-21 22:20:53 +00:00
Juergen Ributzka	d12ce0859f	Teach BaseIndexOffset::match to identify base pointers in loops. The small utility function that pattern matches Base + Index + Offset patterns for loads and stores fails to recognize the base pointer for loads/stores from/into an array at offset 0 inside a loop. As a result DAGCombiner::MergeConsecutiveStores was not able to merge all stores. This commit fixes the issue by adding an additional pattern match and also a test case. Reviewer: Nadav llvm-svn: 188936	2013-08-21 21:53:38 +00:00
Hao Liu	7962606ca8	A minor change for an obvious problem caused by r188451: def imm0_63 : Operand<i32>, ImmLeaf<i32, [{ return Imm >= 0 && Imm < 63;}]>{ As it seems Imm <63 should be Imm <= 63. ImmLeaf is used in pattern match, but there is already a function that checks the shift amount range, so just remove ImmLeaf. Also add a test to check 63. llvm-svn: 188911	2013-08-21 17:47:53 +00:00
Joey Gouly	ec3b9aa53e	Add -mcpu to two X86 tests. These tests are failing on Haswell CPUs due to different instruction selection. llvm-svn: 188908	2013-08-21 17:14:31 +00:00
Elena Demikhovsky	44bbb2b413	AVX-512: Added SHIFT instructions. llvm-svn: 188899	2013-08-21 09:36:02 +00:00
Richard Sandiford	1dc05c13d2	[SystemZ] Define remaining *MUL_LOHI patterns The initial port used MLG(R) for i64 UMUL_LOHI but left the other three combinations as not-legal-or-custom. Although 32x32->{32,32} multiplications exist, they're not as quick as doing a normal 64-bit multiplication, so it didn't seem like i32 SMUL_LOHI and UMUL_LOHI would be useful. There's also no direct instruction for i64 SMUL_LOHI, so it needs to be implemented in terms of UMUL_LOHI. However, not defining these patterns means that we don't convert division by a constant into multiplication, so this patch fills in the other cases. The new i64 SMUL_LOHI sequence is simpler than the one that we used previously for 64x64->128 multiplication, so int-mul-08.ll now tests the full sequence. llvm-svn: 188898	2013-08-21 09:34:56 +00:00
Richard Sandiford	e6e07910e3	[SystemZ] Use FI[EDX]BRA for codegen llvm-svn: 188895	2013-08-21 09:04:20 +00:00
Akira Hatanaka	a80bdd3ab0	[mips] Add support for mfhc1 and mthc1. llvm-svn: 188848	2013-08-20 23:47:25 +00:00
Reed Kotler	3e323b240e	Add an option which permits the user to specify using a bitmask, that various functions be compiled as mips32, without having to add attributes. This is useful in certain situations where you don't want to have to edit the function attributes in the source. For now it's only an option used for the compiler developers when debugging the mips16 port. llvm-svn: 188826	2013-08-20 20:53:09 +00:00
Jim Grosbach	343f1fbc39	ARM: Fix fast-isel copy/paste-o. Update testcase to be more careful about checking register values. While regexes are general goodness for these sorts of testcases, in this example, the registers are constrained by the calling convention, so we can and should check their explicit values. rdar://14779513 llvm-svn: 188819	2013-08-20 19:12:42 +00:00
Elena Demikhovsky	f09dad5d90	AVX-512: Added more patterns for VMOVSS, VMOVSD, VMOVD, VMOVQ llvm-svn: 188786	2013-08-20 11:00:29 +00:00
Daniel Sanders	30561c36b8	[mips][msa] Removed fcge, fcgt, fsge, fsgt These instructions were present in a draft spec but were removed before publication. llvm-svn: 188782	2013-08-20 09:41:47 +00:00
Richard Sandiford	add1a68f21	[SystemZ] Use SRST to optimize memchr SystemZTargetLowering::emitStringWrapper() previously loaded the character into R0 before the loop and made R0 live on entry. I'd forgotten that allocatable registers weren't allowed to be live across blocks at this stage, and it confused LiveVariables enough to cause a miscompilation of f3 in memchr-02.ll. This patch instead loads R0 in the loop and leaves LICM to hoist it after RA. This is actually what I'd tried originally, but I went for the manual optimisation after noticing that R0 often wasn't being hoisted. This bug forced me to go back and look at why, now fixed as r188774. We should also try to optimize null checks so that they test the CC result of the SRST directly. The select between null and the SRST GPR result could then usually be deleted as dead. llvm-svn: 188779	2013-08-20 09:38:48 +00:00
Daniel Sanders	91c40d80de	[mips][msa] Added insve llvm-svn: 188777	2013-08-20 09:22:54 +00:00
Richard Sandiford	6a0b1638b4	Fix test typo and add usual "br %r14" test llvm-svn: 188775	2013-08-20 09:14:46 +00:00
Richard Sandiford	fcd54a3b89	Fix overly pessimistic shortcut in post-RA MachineLICM Post-RA LICM keeps three sets of registers: PhysRegDefs, PhysRegClobbers and TermRegs. When it sees a definition of R it adds all aliases of R to the corresponding set, so that when it needs to test for membership it only needs to test a single register, rather than worrying about aliases there too. E.g. the final candidate loop just has: unsigned Def = Candidates[i].Def; if (!PhysRegClobbers.test(Def) && ...) { to test whether register Def is multiply defined. However, there was also a shortcut in ProcessMI to make sure we didn't add candidates if we already knew that they would fail the final test. This shortcut was more pessimistic than the final one because it checked whether _any alias_ of the defined register was multiply defined. This is too conservative for targets that define register pairs. E.g. on z, R0 and R1 are sometimes used as a pair, so there is a 128-bit register that aliases both R0 and R1. If a loop used R0 and R1 independently, and the definition of R0 came first, we would be able to hoist the R0 assignment (because that used the final test quoted above) but not the R1 assignment (because that meant we had two definitions of the paired R0/R1 register and would fail the shortcut in ProcessMI). This patch just uses the same check for the ProcessMI shortcut as we use in the final candidate loop. llvm-svn: 188774	2013-08-20 09:11:13 +00:00
Tim Northover	cec1079024	ARM: implement some simple f64 materializations. Previously we used a const-pool load for virtually all 64-bit floating values. Actually, we can get quite a few common values (including 0.0, 1.0) via "vmov" instructions of one stripe or another. llvm-svn: 188773	2013-08-20 08:57:11 +00:00
Daniel Sanders	15341e9a12	[mips][msa] Added and.v, bmnz.v, bmz.v, bsel.v, nor.v, or.v, xor.v llvm-svn: 188767	2013-08-20 08:38:21 +00:00
Hal Finkel	4bb40e7c8d	Don't form PPC CTR-based loops around a copysignl call copysign/copysignf never become function calls (because the SDAG expansion code does not lower to the corresponding function call, but rather directly implements the associated logic), but copysignl almost always is lowered into a call to the requested libm function (and, thus, might clobber CTR). llvm-svn: 188727	2013-08-19 23:35:24 +00:00
Paul Redmond	404ef5af36	Improve the widening of integral binary vector operations - split WidenVecRes_Binary into WidenVecRes_Binary and WidenVecRes_BinaryCanTrap - WidenVecRes_BinaryCanTrap preserves the original behaviour for operations that can trap - WidenVecRes_Binary simply widens the operation and improves codegen for 3-element vectors by allowing widening and promotion on x86 (matches the behaviour of unary and ternary operation widening) - use WidenVecRes_Binary for operations on integers. Reviewed by: nrotem llvm-svn: 188699	2013-08-19 20:01:35 +00:00
Elena Demikhovsky	f1afd2e4db	AVX-512: added arithmetic and logical operations. ADD, SUB, MUL integer and FP types. OR, AND, XOR. Added embedded broadcast form for these instructions. llvm-svn: 188673	2013-08-19 13:26:14 +00:00
Richard Sandiford	5e32ef0acd	[SystemZ] Add negative integer absolute (load negative) For now this matches the equivalent of (neg (abs ...)), which did hit a few times in projects/test-suite. We should probably also match cases where absolute-like selects are used with reversed arguments. llvm-svn: 188671	2013-08-19 12:56:58 +00:00
Richard Sandiford	aee0958460	[SystemZ] Add integer absolute (load positive) llvm-svn: 188670	2013-08-19 12:48:54 +00:00
Richard Sandiford	841d24aa5a	[SystemZ] Add support for sibling calls This first cut is pretty conservative. The final argument register (R6) is call-saved, so we would need to make sure that the R6 argument to a sibling call is the same as the R6 argument to the calling function, which seems worth keeping as a separate patch. Saying that integer truncations are free means that we no longer use the extending instructions LGF and LLGF for spills in int-conv-09.ll and int-conv-10.ll. Instead we treat the registers as 64 bits wide and truncate them to 32-bits where necessary. I think it's unlikely we'd use LGF and LLGF for spills in other situations for the same reason, so I'm removing the tests rather than replacing them. The associated code is generic and applies to many more instructions than just LGF and LLGF, so there is no corresponding code removal. llvm-svn: 188669	2013-08-19 12:42:31 +00:00
Hal Finkel	23714b2e52	Add ExpandFloatOp_FCOPYSIGN to handle ppcf128-related expansions We had previously been asserting when faced with a FCOPYSIGN f64, ppcf128 node because there was no way to expand the FCOPYSIGN node. Because ppcf128 is the sum of two doubles, and the first double must have the larger magnitude, we can take the sign from the first double. As a result, in addition to fixing the crash, this is also an optimization. llvm-svn: 188655	2013-08-19 06:55:37 +00:00
Hal Finkel	9591220c33	Add the PPC fcpsgn instruction Modern PPC cores support a floating-point copysign instruction, and we can use this to lower the FCOPYSIGN node (which is created from calls to the libm copysign function). A couple of extra patterns are necessary because the operand types of FCOPYSIGN need not agree. llvm-svn: 188653	2013-08-19 05:01:02 +00:00
Tim Northover	057a4d7c26	ARM: make sure we keep inline asm operands tied. When patching inlineasm nodes to use GPRPair for 64-bit values, we were dropping the information that two operands were tied, which effectively broke the live-interval of vregs affected. llvm-svn: 188643	2013-08-18 18:06:03 +00:00
Elena Demikhovsky	406cf0ea6d	AVX-512: Added VMOVD, VMOVQ, VMOVSS, VMOVSD instructions. llvm-svn: 188637	2013-08-18 13:08:57 +00:00
Tom Stellard	ad43e88afa	R600: Expand vector FRINT ops Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 188598	2013-08-16 23:51:33 +00:00
Tom Stellard	e42573d2cc	R600: Expand vector FFLOOR ops Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 188597	2013-08-16 23:51:29 +00:00
Tom Stellard	0721bae8ba	R600: Expand vector float operations for both SI and R600 Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> llvm-svn: 188596	2013-08-16 23:51:24 +00:00
Jim Grosbach	58bafdb9f1	ARM: Properly constrain comparison fastisel register classes. Ongoing 'make the verifier happy' improvements to ARM fast-isel. rdar://12594152 llvm-svn: 188595	2013-08-16 23:37:40 +00:00
Jim Grosbach	4829862a24	ARM: Fast-isel register class constrain for extends. Properly constrain the operand register class for instructions used in [sz]ext expansion. Update more tests to use the verifier now that we're getting the register classes correct. rdar://12594152 llvm-svn: 188594	2013-08-16 23:37:36 +00:00
Jim Grosbach	de05043e78	ARM: Fix more fast-isel verifier failures. Teach the generic instruction selection helper functions to constrain the register classes of their input operands. For non-physical register references, the generic code needs to be careful not to mess that up when replacing references to result registers. As the comment indicates for MachineRegisterInfo::replaceRegWith(), it's important to call constrainRegClass() first. rdar://12594152 llvm-svn: 188593	2013-08-16 23:37:31 +00:00
Jim Grosbach	7f992f45da	ARM: Clean up fast-isel machine verifier errors. Lots of machine verifier errors result from using a plain GPR regclass for incoming argument copies. A more restrictive rGPR class is more appropriate since it more accurately represents what's happening, plus it lines up better with isel later on so the verifier is happier. Reduces the number of ARM fast-isel tests not running with the verifier enabled by over half. rdar://12594152 llvm-svn: 188592	2013-08-16 23:37:23 +00:00
Reed Kotler	4a69818916	Fix a subtle difference between running clang vs llc for mips16. This regards how mips16 is viewed. It's not really a target type but there has always been a target for it in the td files. It's more properly -mcpu=mips32 -mattr=+mips16 . This is how clang treats it but we have always had the -mcpu=mips16 which I probably should delete now but it will require updating all the .ll test cases for mips16. In this case it changed how we decide if we have a count bits instruction and whether instruction lowering should then expand ctlz. Now that we have dual mode compilation, -mattr=+mips16 really just indicates the initial processor mode that we are compiling for. (It is also possible to have -mcpu=64 -mattr=+mips16 but as far as I know, nobody has even built such a processor, though there is an architecture manual for this). llvm-svn: 188586	2013-08-16 23:05:18 +00:00
Daniel Dunbar	e491451730	[tests] Another attempt to workaround broken misched-copy.s test on some buildbots. llvm-svn: 188567	2013-08-16 18:01:18 +00:00
Michel Danzer	65d5ad5728	R600/SI: Add pattern for xor of i1 Fixes two recent piglit regressions with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188559	2013-08-16 16:19:31 +00:00
Michel Danzer	acc130ec54	R600/SI: Fix broken encoding of DS_WRITE_B32 The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused it to corrupt the encoding of that by clobbering the first operand with the second one. Undo that damage and only apply the SMRD logic to that. Fixes some derivatives related piglit regressions with radeonsi. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> llvm-svn: 188558	2013-08-16 16:19:24 +00:00
Benjamin Kramer	700f0ccf14	When initializing the PIC global base register on ARM/ELF add pc to fix the address. This unbreaks PIC with fast isel on ELF targets (PR16717). The output matches what GCC and SDag do for PIC but may not cover all of the many flavors of PIC that exist. llvm-svn: 188551	2013-08-16 12:52:08 +00:00
Richard Sandiford	06a13f49c8	[SystemZ] Use SRST to implement strlen and strnlen It would also make sense to use it for memchr; I'm working on that now. llvm-svn: 188547	2013-08-16 11:41:43 +00:00
Richard Sandiford	93a75a2a56	[SystemZ] Use MVST to implement strcpy and stpcpy llvm-svn: 188546	2013-08-16 11:29:37 +00:00
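
Note on the "[SystemZ] Define remaining *MUL_LOHI patterns" entry (1dc05c13d2): the message says i64 SMUL_LOHI has to be implemented in terms of UMUL_LOHI but does not show the sequence. The sketch below illustrates the standard identity that makes this possible, not the code from the patch. It uses 32-bit operands instead of 64-bit ones so it stays portable, and the helper names umul_lohi and smul_lohi are hypothetical.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical stand-in for UMUL_LOHI: unsigned 32x32 -> {low half, high half}.
std::pair<uint32_t, uint32_t> umul_lohi(uint32_t a, uint32_t b) {
    uint64_t p = static_cast<uint64_t>(a) * b;
    return {static_cast<uint32_t>(p), static_cast<uint32_t>(p >> 32)};
}

// Hypothetical stand-in for SMUL_LOHI built on the unsigned helper.
// The low half is identical for signed and unsigned multiplication.
// A negative operand x has the unsigned value x + 2^32, so the unsigned
// high half is too large by the other operand for each negative input.
std::pair<uint32_t, int32_t> smul_lohi(int32_t a, int32_t b) {
    uint32_t ua = static_cast<uint32_t>(a);
    uint32_t ub = static_cast<uint32_t>(b);
    std::pair<uint32_t, uint32_t> r = umul_lohi(ua, ub);
    uint32_t hi = r.second;
    if (a < 0) hi -= ub;  // correction for a's sign, wraps modulo 2^32
    if (b < 0) hi -= ua;  // correction for b's sign, wraps modulo 2^32
    // Assumes two's complement conversion, which C++20 guarantees.
    return {r.first, static_cast<int32_t>(hi)};
}
```

The same correction applies unchanged at 64x64->128, where the unsigned product would come from an instruction such as the MLG(R) mentioned in the commit message.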

8000 Commits