llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00

Author	SHA1	Message	Date
Andy Ayers	abf4852bd1	Support for emitting inline stack probes For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages between the current stack limit and the desired new stack pointer location. This implements support for the inline expansion on x64. For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications that arise when introducing new machine basic blocks during prolog and epilog creation. Added a new test case, modified an existing one to exclude non-x64 coreclr (for now). Add test case Fix tests llvm-svn: 252578	2015-11-10 01:50:49 +00:00
Reid Kleckner	da90738ed9	[WinEH] Remove isBarrier from instructions that do not return Fixes machine verification failures with David's latest EH change. llvm-svn: 252541	2015-11-09 23:34:42 +00:00
Sanjay Patel	a712e76470	add a SelectionDAG method to check if no common bits are set in two nodes; NFCI This was suggested in: http://reviews.llvm.org/D13956 and is a follow-on to: http://reviews.llvm.org/rL252515 http://reviews.llvm.org/rL252519 This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG. A corresponding function for IR instructions already exists in ValueTracking. llvm-svn: 252539	2015-11-09 23:31:38 +00:00
David Majnemer	624c46d055	[WinEH] Don't emit CATCHRET from visitCatchPad Instead, emit a CATCHPAD node which will get selected to a target specific sequence. llvm-svn: 252528	2015-11-09 23:07:48 +00:00
Sanjay Patel	4f4bd24ec4	[x86] try harder to match bitwise 'or' into an LEA The motivation for this patch starts with the epic fail example in PR18007: https://llvm.org/bugs/show_bug.cgi?id=18007 ...unfortunately, this patch makes no difference for that case, but it solves some simpler cases. We'll get there some day. :) The current 'or' matching code was using computeKnownBits() via isBaseWithConstantOffset() -> MaskedValueIsZero(), but that's an unnecessarily limited use. We can do more by copying the logic in ValueTracking's haveNoCommonBitsSet(), so we can treat the 'or' as if it was an 'add'. There's a TODO comment here because we should lift the bit-checking logic into a helper function, so it's not duplicated in DAGCombiner. An example of the better LEA matching: leal (%rdi,%rdi), %eax andl $1, %esi orl %esi, %eax Becomes: andl $1, %esi leal (%rsi,%rdi,2), %eax Differential Revision: http://reviews.llvm.org/D13956 llvm-svn: 252515	2015-11-09 21:16:49 +00:00
Reid Kleckner	f7f2cdef8b	[WinEH] Tweak funclet prologue/epilogue insertion to pass verifier For some reason we'd never run MachineVerifier on WinEH code, and you explicitly have to ask for it with llc. I added it to a few test cases to get some coverage. Fixes PR25461. llvm-svn: 252512	2015-11-09 21:04:00 +00:00
David Majnemer	b682989c7b	[WinEH] Update PHIs of CATCHRET successors The TailDuplication machine pass ran across a malformed CFG: a PHI node referred to its predecessor's predecessor instead of its predecessor. This occurred because we split the edge in X86ISelLowering when we processed the CATCHRET but forgot to do something about the PHI nodes. This fixes PR25444. llvm-svn: 252413	2015-11-08 02:36:00 +00:00
Joseph Tremoulet	6a3a04f00f	[WinEH] Update exception pointer registers Summary: The CLR's personality routine passes these in rdx/edx, not rax/eax. Make getExceptionPointerRegister a virtual method parameterized by personality function to allow making this distinction. Similarly make getExceptionSelectorRegister a virtual method parameterized by personality function, for symmetry. Reviewers: pgavlin, majnemer, rnk Subscribers: jyknight, dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D14344 llvm-svn: 252383	2015-11-07 01:11:31 +00:00
Ahmed Bougacha	f32f56a1a1	[X86] Fold (trunc (i32 (zextload i16))) into vbroadcast. When matching non-LSB-extracting truncating broadcasts, we now insert the necessary SRL. If the scalar resulted from a load, the SRL will be folded into it, creating a narrower, offset, load. However, i16 loads aren't Desirable, so we get i16->i32 zextloads. We already catch i16 aextloads; catch these as well. llvm-svn: 252363	2015-11-06 23:16:48 +00:00
Ahmed Bougacha	fe77e7174c	[X86] SRL non-LSB extracts when folding to truncating broadcasts. Now that we recognize this, we can support it instead of bailing out. That is, we can fold: (v8i16 (shufflevector (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))), <1,1,...,1>)) into: (v8i16 (vbroadcast (i16 (trunc (srl Y, 16))))) llvm-svn: 252362	2015-11-06 23:16:43 +00:00
Ahmed Bougacha	92bc790d0a	[X86] Don't fold non-LSB extracts into truncating broadcasts. We used to incorrectly assume that the offset we're extracting from was a multiple of the element size. So, we'd fold: (v8i16 (shufflevector (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))), <1,1,...,1>)) into: (v8i16 (vbroadcast (i16 (trunc Y)))) whereas we should have extracted the higher bits from X. Instead, bail out if the assumption doesn't hold. llvm-svn: 252361	2015-11-06 23:16:38 +00:00
Andrew Kaylor	3cc19fd26f	Improved the operands commute transformation for X86-FMA3 instructions. All 3 operands of FMA3 instructions are commutable now. Patch by Slava Klochkov Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab). Differential Revision: http://reviews.llvm.org/D13269 llvm-svn: 252335	2015-11-06 19:47:25 +00:00
Reid Kleckner	09e241eb0f	[WinEH] Mark funclet entries and exits as clobbering all registers Summary: In this implementation, LiveIntervalAnalysis invents a few register masks on basic block boundaries that preserve no registers. The nice thing about this is that it prevents the prologue inserter from thinking it needs to spill all XMM CSRs, because it doesn't see any explicit physreg defs in the MI. Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer Subscribers: MatzeB, llvm-commits Differential Revision: http://reviews.llvm.org/D14407 llvm-svn: 252318	2015-11-06 17:06:38 +00:00
Reid Kleckner	a707d8e0c6	[WinEH] Split EH_RESTORE out of CATCHRET for 32-bit EH This adds the EH_RESTORE x86 pseudo instr, which is responsible for restoring the stack pointers: EBP and ESP, and ESI if stack realignment is involved. We only need this on 32-bit x86, because on x64 the runtime restores CSRs for us. Previously we had to keep the CATCHRET instruction around during SEH so that we could convince X86FrameLowering to restore our frame pointers. Now we can split these instructions earlier. This was confusing, because we had a return instruction which wasn't really a return and was ultimately going to be removed by X86FrameLowering. This change also simplifies X86FrameLowering, which really shouldn't be building new MBBs. No observable functional change currently, but with the new register mask stuff in D14407, CATCHRET will become a register allocator barrier, and our existing tests rely on us having reasonable register allocation around SEH. llvm-svn: 252266	2015-11-06 01:49:05 +00:00
Tim Northover	227737c0c9	Remove windows line endings introduced by r252177. NFC. llvm-svn: 252217	2015-11-05 21:54:58 +00:00
Reid Kleckner	c9f0155a98	[WinEH] Fix funclet prologues with stack realignment We already had a test for this for 32-bit SEH catchpads, but those don't actually create funclets. We had a bug that only appeared in funclet prologues, where we would establish EBP and ESI as our FP and BP, and then downstream prologue code would overwrite them. While I was at it, I fixed Win64+funclets+stackrealign. This issue doesn't come up as often there due to the ABI requiring 16 byte stack alignment, but now we can rest easy that AVX and WinEH will work well together =P. llvm-svn: 252210	2015-11-05 21:09:49 +00:00
Oleg Ranevskyy	839befd4e4	[DebugInfo] Fix ARM/AArch64 prologue_end position. Related to D11268. Summary: This review is related to another review request http://reviews.llvm.org/D11268, does the same and merely fixes a couple of issues with it. D11268 is quite old and has merge conflicts against the current trunk. This request - rebases D11268 onto the new trunk; - resolves the merge conflicts; - fixes the prologue_end tests, which do not pass due to the subprogram definitions not marked as distinct. Reviewers: echristo, rengolin, kubabrecka Subscribers: aemerson, rengolin, jyknight, dsanders, llvm-commits, asl Differential Revision: http://reviews.llvm.org/D14338 llvm-svn: 252177	2015-11-05 17:50:17 +00:00
Petar Jovanovic	73dbdc3960	Add cfi instr for CFA calculation when movpc is expanded to call and pop This fixes the issue of wrong CFA calculation in the following case: 0x08048400 <+0>: push %ebx 0x08048401 <+1>: sub $0x8,%esp 0x08048404 <+4>: call 0x8048409 <test+9> 0x08048409 <+9>: pop %eax 0x0804840a <+10>: add $0x1bf7,%eax 0x08048410 <+16>: mov %eax,%ebx 0x08048412 <+18>: call 0x80483f0 <bar> 0x08048417 <+23>: add $0x8,%esp 0x0804841a <+26>: pop %ebx 0x0804841b <+27>: ret The call at <+4> and the pop at <+9> are the product of the movpc instruction. The call instruction changes the stack pointer, and the pop instruction restores its value. However, the rule for computing CFA is not updated and is wrong on the pop instruction. So, e.g. backtrace in gdb does not work when on the pop instruction. This adds cfi instructions for both call and pop instructions. The cfi_adjust_cfa_offset instruction is used with the appropriate offset for setting the rules to calculate CFA correctly. Patch by Violeta Vukobrat. Differential Revision: http://reviews.llvm.org/D14021 llvm-svn: 252176	2015-11-05 17:19:59 +00:00
Asaf Badouh	f3f551dd7e	revert rev. 252153 due to build failure on ubuntu [X86][AVX512] add comi with Sae llvm-svn: 252154	2015-11-05 08:55:54 +00:00
Asaf Badouh	c9c8bfa4c4	[X86][AVX512] add comi with Sae add builtin_ia32_vcomisd and builtin_ia32_vcomisd Differential Revision: http://reviews.llvm.org/D14331 llvm-svn: 252153	2015-11-05 08:45:06 +00:00
Asaf Badouh	e9eadcdf13	[X86][AVX512] small bugfix in VPBROADCASTM VPBROADCASTMW2D and VPBROADCASTMB2Q Differential Revision: http://reviews.llvm.org/D14335 llvm-svn: 252151	2015-11-05 08:08:21 +00:00
Joseph Tremoulet	44ef00af57	[WinEH] Fix establisher param reg in CLR funclets Summary: The CLR's personality routine passes the pointer to the establisher frame in RCX, not RDX. Reviewers: pgavlin, majnemer, rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14343 llvm-svn: 252135	2015-11-05 02:20:07 +00:00
Quentin Colombet	dacc177d59	[x86] Teach the shrink-wrapping hooks to do the proper thing with Win64. Win64 has some strict requirements for the epilogue. As a result, we disable shrink-wrapping for Win64 unless the block that gets the epilogue is already an exit block. Fixes PR24193. llvm-svn: 252088	2015-11-04 22:37:28 +00:00
Simon Pilgrim	ea116cf21c	Warning fix. llvm-svn: 252078	2015-11-04 21:27:22 +00:00
Simon Pilgrim	d1f5a2789e	[X86][SSE] Add general memory folding for (V)INSERTPS instruction This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction. The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening. This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done. It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted. Unlike the previous implementation we now set the insertion source index to zero. Although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address - I've added a test for this at insertps-unfold-load-bug.ll Differential Revision: http://reviews.llvm.org/D13988 llvm-svn: 252074	2015-11-04 20:48:09 +00:00
Sanjoy Das	442b82fb2f	[IR] Add bounds checking to paramHasAttr Summary: This is intended to make a later change simpler. Note: adding this bounds checking required fixing `X86FastISel`. As far as I can tell I've preserved original behavior but a careful review will be appreciated. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14304 llvm-svn: 252073	2015-11-04 20:33:45 +00:00
Andrew Kaylor	e42c061561	Created new X86 FMA3 opcodes (FMA_Int) that are used now for lowering of scalar FMA intrinsics. Patch by Slava Klochkov The key difference between FMA and FMA_Int opcodes is that FMA_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result. Reviewers: Quentin Colombet, Elena Demikhovsky Differential Revision: http://reviews.llvm.org/D13710 llvm-svn: 252060	2015-11-04 18:10:41 +00:00
Michael Kuperstein	7726e4d796	[ELF] elfiamcu triple should imply e_machine == EM_IAMCU Differential Revision: http://reviews.llvm.org/D14109 llvm-svn: 252043	2015-11-04 11:21:50 +00:00
Michael Kuperstein	715d358c66	[X86] DAGCombine should not introduce FILD in soft-float mode The x86 "sitofp i64 to double" dag combine, in 32-bit mode, lowers sitofp directly to X86ISD::FILD (or FILD_FLAG). This should not be done in soft-float mode. llvm-svn: 252042	2015-11-04 11:17:53 +00:00
Simon Pilgrim	ac4c196247	[X86][XOP] Add support for the matching of the VPCMOV bit select instruction XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) ) This patch adds tablegen pattern matching for this instruction. Differential Revision: http://reviews.llvm.org/D8841 llvm-svn: 251975	2015-11-03 20:27:01 +00:00
Michael Kuperstein	5991145dd5	[X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments When push instructions are being used to pass function arguments on the stack, and either EH or debugging are enabled, we need to generate .cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is enough for the CFA offset to be correct at every call site, while for debugging we want to be correct after every push. Darwin does not support this well, so don't use pushes whenever it would be required. Differential Revision: http://reviews.llvm.org/D13767 llvm-svn: 251904	2015-11-03 08:17:25 +00:00
Igor Breger	207c14b67f	AVX512: add encoding tests for vmovq/d instructions. llvm-svn: 251903	2015-11-03 07:30:17 +00:00
Igor Breger	dd070c17bb	AVX512: Implemented encoding and intrinsics for VBROADCASTI32x2 and VBROADCASTF32x2 instructions. Differential Revision: http://reviews.llvm.org/D14216 llvm-svn: 251781	2015-11-02 07:39:36 +00:00
Craig Topper	0354af2e80	[X86] Remove assertions that check for valid scale values on scatter/gather intrinsics. Nothing upstream prevented illegal values from getting here. llvm-svn: 251780	2015-11-02 07:24:40 +00:00
Craig Topper	150b1555c7	[X86] Fold 'if' followed by just an llvm_unreachable into an assert. llvm-svn: 251778	2015-11-02 07:24:34 +00:00
Craig Topper	df102b96ca	[X86] Use isa instead of dyn_cast in a bool context. NFC llvm-svn: 251777	2015-11-02 07:24:32 +00:00
Craig Topper	cbb29b67b1	[X86] Remove some llvm_unreachables after switches that already have an unreachable in their default case. llvm-svn: 251776	2015-11-02 07:24:30 +00:00
Craig Topper	055cdb4c83	[X86] Remove a 'break' after an llvm_unreachable. llvm-svn: 251775	2015-11-02 07:24:27 +00:00
Craig Topper	490cb1be76	[X86] Use cast instead of dyn_cast and a null check marked unreachable. llvm-svn: 251774	2015-11-02 07:24:25 +00:00
Craig Topper	edd05abeaa	[X86] Use MVT instead of EVT when the type is known to be simple. NFC llvm-svn: 251772	2015-11-02 05:24:22 +00:00
Elena Demikhovsky	f42814b247	AVX-512: Optimized SIMD truncate operations for AVX512F set. Optimized <8 x i32> to <8 x i16>, <4 x i64> to <4 x i32>, and <16 x i16> to <16 x i8>. All these operations now use the AVX512F set (KNL). Before this change they were implemented with the AVX2 set. Differential Revision: http://reviews.llvm.org/D14108 llvm-svn: 251764	2015-11-01 11:45:47 +00:00
Craig Topper	9f7fa4107c	[X86] Replace getScalarType with getVectorElementType when the type is already known to be a vector. This should result in slightly less code. NFC llvm-svn: 251751	2015-10-31 21:44:52 +00:00
Craig Topper	bfd831d3f2	[X86] Convert to MVT instead of calling EVT functions since we already know the type is simple. NFC llvm-svn: 251745	2015-10-31 18:14:17 +00:00
Craig Topper	057ba18026	[X86] Call getScalarSizeInBits() instead of getScalarType().getScalarSizeInBits(). NFC llvm-svn: 251744	2015-10-31 18:14:15 +00:00
Craig Topper	403a7454ac	[X86] Remove two const references to the return value of a constructor and just use normal object creation syntax. NFC llvm-svn: 251743	2015-10-31 17:28:02 +00:00
Craig Topper	70916f79d8	[X86] Replace EVT with MVT in some more places. NFC llvm-svn: 251742	2015-10-31 17:27:59 +00:00
Craig Topper	c1776ebb91	[X86] Fix indentation of case statements in switch. NFC llvm-svn: 251741	2015-10-31 17:27:56 +00:00
Craig Topper	e9bbc5ba7a	[X86] Reduce math for index calculation for inserting and extracting subvectors and elements by exploiting the fact that all supported vector types have a power 2 number of elements. llvm-svn: 251740	2015-10-31 17:27:52 +00:00
Craig Topper	8ef684fd95	[X86] Use is128BitVector/is256BitVector/is512BitVector in place of getSizeInBits == in some places. NFC llvm-svn: 251687	2015-10-30 04:31:18 +00:00
Craig Topper	da6d448666	[X86] Minor formatting fixes. NFC. llvm-svn: 251686	2015-10-30 04:31:14 +00:00
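
The LEA matching described in r252515 and r252539 above rests on one fact: when two values have no set bits in common, a bitwise 'or' of them gives the same result as an 'add'. The following is a minimal sketch of that identity on plain integers, not LLVM code. The names noCommonBitsSet, viaOr, and viaAdd are hypothetical helpers introduced here for illustration; the real checks in ValueTracking and SelectionDAG reason about known bits of IR values and DAG nodes rather than concrete integers.

```cpp
#include <cstdint>

// Hypothetical helper (not LLVM's API): true when no bit position
// is set in both a and b.
constexpr bool noCommonBitsSet(std::uint32_t a, std::uint32_t b) {
  return (a & b) == 0;
}

// Shape of the "before" sequence in r252515: (x + x) | (y & 1).
constexpr std::uint32_t viaOr(std::uint32_t x, std::uint32_t y) {
  return (x + x) | (y & 1u);
}

// Shape of the "after" sequence: (y & 1) + x * 2, which fits the
// base + index * scale form of a single LEA.
constexpr std::uint32_t viaAdd(std::uint32_t x, std::uint32_t y) {
  return (y & 1u) + x * 2u;
}

// x + x always has a clear low bit and y & 1 can only set the low bit,
// so the two operands never overlap and 'or' behaves like 'add'.
static_assert(noCommonBitsSet(5u + 5u, 3u & 1u), "operands do not overlap");
static_assert(viaOr(5u, 3u) == viaAdd(5u, 3u), "or and add agree");
```

Because unsigned arithmetic wraps, the two forms agree for every 32-bit input, including values of x whose doubling overflows.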

12318 Commits