llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-23 13:02:52 +02:00

Author	SHA1	Message	Date
Duncan P. N. Exon Smith	e1ff992bb4	DebugInfo: Overload get() in DIDescriptor subclasses Continue to simplify the `DIDescriptor` subclasses, so that they behave more like raw pointers. Remove `getRaw()`, replace it with an overloaded `get()`, and overload the arrow and cast operators. Two testcases started to crash on the arrow operators with this change because of `scope:` references that weren't real scopes. I fixed them. Soon I'll add verifier checks for them too. This also adds explicit dereference operators. Previously, the builtin dereference against `operator MDNode *()` would have worked, but now the builtins are ambiguous. llvm-svn: 233030	2015-03-23 21:54:07 +00:00
Ahmed Bougacha	dda2ff1737	[AArch64, ARM] Enable GlobalMerge with -O3 rather than -O1. The pass used to be enabled by default with CodeGenOpt::Less (-O1). This is too aggressive, considering the pass indiscriminately merges all globals together. Currently, performance doesn't always improve, and, on code that uses few globals (e.g., the odd file- or function- static), more often than not is degraded by the optimization. Lengthy discussion can be found on llvmdev (AArch64-focused; ARM has similar problems): http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-February/082800.html Also, it makes tooling and debuggers less useful when dealing with globals and data sections. GlobalMerge needs to better identify those cases that benefit, and this will be done separately. In the meantime, move the pass to run with -O3 rather than -O1, on both ARM and AArch64. llvm-svn: 233024	2015-03-23 21:17:36 +00:00
Chad Rosier	99ad3fa4ff	[AArch64] Add FileCheck that was missing from test in r232967. llvm-svn: 233013	2015-03-23 20:25:15 +00:00
Matt Arsenault	c20cbf651a	R600/SI: Allow commuting compares This enables very common cases to switch to the smaller encoding. All of the standard LLVM canonicalizations of comparisons are the opposite of what we want. Compares with constants are moved to the RHS, but the first operand can be an inline immediate, literal constant, or SGPR using the 32-bit VOPC encoding. There are additional bad canonicalizations that should also be fixed, such as canonicalizing ge x, k to gt x, (k + 1) if this makes k no longer an inline immediate value. llvm-svn: 232988	2015-03-23 18:45:30 +00:00
Chad Rosier	c67eff5c3b	[AArch64] Enable rematerialization of float 0 values. Patch by Geoff Berry <gberry@codeaurora.org>. llvm-svn: 232967	2015-03-23 17:19:34 +00:00
Bradley Smith	f7359fa871	Revert "[ARM] Add more pattern matching for f16 <-> f64 conversions" This change is incorrect since it converts double rounding into single rounding, which can produce different results. Instead this optimization will be done by modifying Clang's codegen to not produce double rounding in the first place. This reverts commit r232954. llvm-svn: 232962	2015-03-23 16:52:52 +00:00
Tom Stellard	1f3265f20e	R600/SI: Fix crash in SIInstrInfo::areLoadsFromSameBasePtr() This function assumed that SMRD instructions always have immediate offsets, which is not always the case. llvm-svn: 232957	2015-03-23 16:06:01 +00:00
Bradley Smith	cb2fa3d357	[ARM] Add more pattern matching for f16 <-> f64 conversions Specifically when the conversion is done in two steps, f16 -> f32 -> f64. For example: %1 = tail call float @llvm.convert.from.fp16.f32(i16 %0) %conv = fpext float %1 to double to: vcvtb.f64.f16 llvm-svn: 232954	2015-03-23 15:59:54 +00:00
Petar Jovanovic	8e9b052c46	Fix sign extension for MIPS64 in makeLibCall function Fixing sign extension in makeLibCall for MIPS64. In MIPS64 architecture all 32 bit arguments (int, unsigned int, float 32 (soft float)) must be sign extended. This fixes test "MultiSource/Applications/oggenc/". Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D7791 llvm-svn: 232943	2015-03-23 12:28:13 +00:00
Hal Finkel	70188cc8ca	[SDAG] Don't widen VSETCC during type legalization for split operands Because the operands of a vector SETCC node can be of a different type from the result (and often are), it can happen that even if we'd prefer to widen the result type of the SETCC, the operands have been split instead. In this case, the SETCC result also must be split. This mirrors what is done in WidenVecRes_SELECT, and should be NFC elsewhere because if the operands are not widened the following calls to GetWidenedVector will assert (which is what was happening in the test case). llvm-svn: 232935	2015-03-23 08:22:43 +00:00
Matt Arsenault	4a540d8fd4	R600: Cleanup test with multiple check prefixes llvm-svn: 232901	2015-03-21 19:15:46 +00:00
Simon Pilgrim	c93710534f	Tidied up vec_zero_cse.ll test. NFCI. Added target triple and refactored the CHECKs to be per function. llvm-svn: 232894	2015-03-21 14:05:12 +00:00
Tim Northover	489835620f	AArch64: simplify test case llvm-svn: 232886	2015-03-21 04:37:08 +00:00
Eric Christopher	e8e68a5117	Remove the bare getSubtargetImpl call from the AArch64 port. As part of this add a test that shows we can generate code for functions that specifically enable a subtarget feature. llvm-svn: 232884	2015-03-21 04:04:50 +00:00
Eric Christopher	496632fcd2	Remove the bare getSubtargetImpl call from the PPC port. As part of this add a test that shows we can generate code for functions that differ by subtarget feature. llvm-svn: 232882	2015-03-21 03:36:02 +00:00
Eric Christopher	3d3373d3e2	Cache the Function dependent subtarget on the MachineFunction. As preparation for removing the getSubtargetImpl() call from TargetMachine go ahead and flip the switch on caching the function dependent subtarget and remove the bare getSubtargetImpl call from the X86 port. As part of this add a few tests that show we can generate code and assemble on X86 based on features/cpu on the Function. llvm-svn: 232879	2015-03-21 03:13:10 +00:00
Ahmed Bougacha	3337019a5f	[CodeGen][IfCvt] Don't re-ifcvt blocks with unanalyzable terminators. If we couldn't analyze its terminator (i.e., it's an indirectbr, or some other weirdness), we can't safely re-if-convert a predicated block, because we can't tell whether the predicated terminator can fallthrough (it does). Currently, we would completely ignore the fallthrough successor. In the added testcase, this means we used to generate: ... @ %entry: cmp r5, #21 ittt ne @ %cc1f: cmpne r7, #42 @ %cc2t: strne.w r5, [r8] movne pc, r10 @ %cc1t: ... Whereas the successor of %cc1f was originally %bb1. With the fix, we get the correct: ... @ %entry: cmp r5, #21 itt eq @ %cc1t: streq.w r5, [r11] moveq pc, r0 @ %cc1f: cmp r7, #42 itt ne @ %cc2t: strne.w r5, [r8] movne pc, r10 @ %bb1: ... rdar://20192768 Differential Revision: http://reviews.llvm.org/D8509 llvm-svn: 232872	2015-03-21 01:23:15 +00:00
Ahmed Bougacha	6bc0aa2395	[AArch64] Prefer UZP for concat_vector of illegal truncs. Follow-up to r232459: prefer a UZP shuffle to the intermediate truncs. llvm-svn: 232871	2015-03-21 01:08:39 +00:00
Andrew Kaylor	7b78ee54b5	Fixing a bug with WinEH PHI handling llvm-svn: 232851	2015-03-20 21:42:54 +00:00
Sanjay Patel	34ad366455	[X86] Prefer blendps over insertps codegen for one special case With this patch, for this one exact case, we'll generate: blendps %xmm0, %xmm1, $1 instead of: insertps %xmm0, %xmm1, $0 If there's a memory operand available for load folding and we're optimizing for size, we'll still generate the insertps. The detailed performance data motivation for this may be found in D7866; in summary, blendps has 2-3x throughput vs. insertps on widely used chips. Differential Revision: http://reviews.llvm.org/D8332 llvm-svn: 232850	2015-03-20 21:19:52 +00:00
Rafael Espindola	06353319f0	Don't declare all text sections at the start of the .s The code this patch removes was there to make sure the text sections went before the dwarf sections. That is necessary because MachO uses offsets relative to the start of the file, so adding a section can change relaxations. The dwarf sections were being printed at the start just to produce symbols pointing at the start of those sections. The underlying issue was fixed in r231898. The dwarf sections are now printed when they are about to be used, which is after we printed the text sections. To make sure we don't regress, the patch makes the MachO streamer assert if CodeGen puts anything unexpected after the DWARF sections. llvm-svn: 232842	2015-03-20 20:00:01 +00:00
John Brawn	2e601255af	[ARM] Fix handling of thumb1 out-of-range frame offsets LocalStackSlotPass assumes that isFrameOffsetLegal doesn't change its answer when the base register changes. Unfortunately this isn't true in thumb1, where SP-based loads allow a larger offset than non-SP-based loads, and this causes the base register reuse code to generate instructions that are unencodable, causing an assertion failure. Solve this by adding a BaseReg parameter to isFrameOffsetLegal, which ARMBaseRegisterInfo can then make use of to give the correct answer. Differential Revision: http://reviews.llvm.org/D8419 llvm-svn: 232825	2015-03-20 17:20:07 +00:00
Daniel Jasper	86b4584e4b	[MBP] Don't outline short optional branches With the option -outline-optional-branches, LLVM will place optional branches out of line (more details in r231230). With this patch, this is not done for short optional branches. A short optional branch is a branch containing a single block with an instruction count below a certain threshold (defaulting to 3). Everything is still guarded by -outline-optional-branches. Outlining a short branch can't significantly improve code locality. It can however decrease performance because of the additional jmp and in cases where the optional branch is hot. This fixes a compile time regression I have observed in a benchmark. Review: http://reviews.llvm.org/D8108 llvm-svn: 232802	2015-03-20 10:00:37 +00:00
Tom Stellard	1b70c0dfce	R600/SI: Add missing CHECK-LABEL lines to a test llvm-svn: 232797	2015-03-20 03:12:42 +00:00
Owen Anderson	480ad2b319	Fix a nasty bug in DAGCombine of STORE nodes. This is very related to the bug fixed in r174431. The problem is that SelectionDAG does not include alignment in the uniquing of loads and stores. When an otherwise no-op DAGCombine would increase the alignment of a load or store, the original node would be returned (with the alignment increased), which would cause the node not to be processed by any further DAGCombines. I don't have a direct testcase for this that manifests on an in-tree target, but I did see some noise in the tests for other targets and have updated them for it. llvm-svn: 232780	2015-03-19 22:48:57 +00:00
Reid Kleckner	11bd622395	WinEH: Make llvm.eh.actions emission match the EH docs This switches the sense of the i32 values and updates the test cases. We can also use CHECK-SAME to clean up some tests, and reduce the visual noise from bitcasts. llvm-svn: 232774	2015-03-19 22:31:02 +00:00
Sanjay Patel	7bfdf498b8	[X86, AVX] use blends instead of insert128 with index 0 Another case of x86-specific shuffle strength reduction: avoid generating insert*128 instructions with index 0 because they are slower than their non-lane-changing blend equivalents. Shuffle lowering already catches most of these cases, but the zero vector case and some other paths such as in the modified test in vector-shuffle-256-v32.ll were getting through. Differential Revision: http://reviews.llvm.org/D8366 llvm-svn: 232773	2015-03-19 22:29:40 +00:00
Krzysztof Parzyszek	6bde8ba970	Unxfail test/CodeGen/Generic/vector.ll now passing on Hexagon llvm-svn: 232758	2015-03-19 20:22:17 +00:00
Artem Belevich	a6dfb5604e	Add support for __nvvm_reflect changes in libdevice in CUDA-7.0 Summary: CUDA 7.0's libdevice uses slightly different IR to call __nvvm_reflect and that triggers an assertion in nvvm_reflect optimization pass. This change allows nvvm_reflect pass to deal with both old and new ways to pass an argument to __nvvm_reflect. Test Plan: ninja check-all Reviewers: eliben, echristo Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D8399 llvm-svn: 232732	2015-03-19 17:05:35 +00:00
Krzysztof Parzyszek	9cc7bfdeec	[Hexagon] Add support for vector instructions llvm-svn: 232728	2015-03-19 16:33:08 +00:00
Rafael Espindola	db1aacd9f9	Note that we don't support COFF on PPC. Should bring back the windows bots. llvm-svn: 232701	2015-03-19 02:40:56 +00:00
Simon Pilgrim	3e49d9aa2b	Fixed failing test due to missing target triple causing different results on different buildbots. llvm-svn: 232685	2015-03-18 22:51:45 +00:00
Rafael Espindola	3b1d10d125	Teach getDefaultFormat that we only support ELF on some architectures. This should bring the windows bots back. It is a bit ugly, but it is better than what we had before: The triple would say that the object format was COFF, but llc/llvm-mc would produce an ELF. llvm-svn: 232683	2015-03-18 22:19:16 +00:00
Simon Pilgrim	6f98dca24d	[X86][SSE] Avoid scalarization of v2i64 vector shifts (REAPPLIED) Fixed broken tests. Differential Revision: http://reviews.llvm.org/D8416 llvm-svn: 232682	2015-03-18 22:18:51 +00:00
Eric Christopher	60fdac43a1	Revert "[X86][SSE] Avoid scalarization of v2i64 vector shifts" as it appears to have broken tests/bots. This reverts commit r232660. llvm-svn: 232670	2015-03-18 21:01:00 +00:00
Reid Kleckner	ce90ac0104	Use WinEHPrepare to outline SEH finally blocks No outlining is necessary for SEH catch blocks. Use the blockaddr of the handler in place of the usual outlined function. Reviewers: majnemer, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D8370 llvm-svn: 232664	2015-03-18 20:26:53 +00:00
Simon Pilgrim	97919c9f36	[X86][SSE] Avoid scalarization of v2i64 vector shifts Currently v2i64 vector shifts (non-equal shift amounts) are scalarized, costing 4 x extract, 2 x x86-shifts and 2 x insert instructions - and it gets even more awkward on 32-bit targets. This patch separately shifts the vector by both shift amounts and then shuffles the partial results back together, costing 2 x shuffles and 2 x sse-shifts instructions (+ 2 movs on pre-AVX hardware). Note - this patch only improves the SHL / LSHR logical shifts as only these are supported in SSE hardware. Differential Revision: http://reviews.llvm.org/D8416 llvm-svn: 232660	2015-03-18 19:35:31 +00:00
Matthias Braun	77986c7d5d	TableGen: Fix register class lane masks being too conservative. When calculating the lanemask of a register class we have to include the masks of subregisters supported by any of the class members, not just the ones supported by all class members. This fixes problems when coalescing towards a subclass with additional subregisters available. The attached testcase works fine as is, but does crash if you enable subregister liveness on x86 without this change applied. llvm-svn: 232652	2015-03-18 17:56:09 +00:00
Sanjay Patel	fa74d9a602	Use utils/update_llc_test_checks.py to update all CHECKs The checks here were so vague that we could nuke intrinsics from existence and still pass the test because we'd match the function name. llvm-svn: 232647	2015-03-18 16:38:44 +00:00
Krzysztof Parzyszek	74e58441b5	[Hexagon] Intrinsics for circular and bit-reversed loads and stores llvm-svn: 232645	2015-03-18 16:23:44 +00:00
Sanjay Patel	e60e76fab6	fixed to test features, not CPU model The 'vmovntdq' was only passing due to a fluke in SandyBridge codegen that splits 32-byte stores in half, but that meant that the test was not correctly checking for the 32-byte store that we thought we were generating. The lax checking in this file will be addressed in another commit. There are bigger problems here. llvm-svn: 232644	2015-03-18 16:07:10 +00:00
Krzysztof Parzyszek	7c0a6d7439	[Hexagon] Handle ENDLOOP0 in InsertBranch and RemoveBranch llvm-svn: 232643	2015-03-18 15:56:43 +00:00
Daniel Jasper	3b0ddfa292	Change test to accept an additional critical edge split. The two hot blocks are right next to each other and I verified that there is no performance regression by compressing/uncompressing some files with a minigzip built with the different options. llvm-svn: 232629	2015-03-18 12:45:45 +00:00
John Brawn	e0a10a9be6	[ARM] Align stack objects passed to memory intrinsics Memcpy, and other memory intrinsics, typically try to use LDM/STM if the source and target addresses are 4-byte aligned. In CodeGenPrepare look for calls to memory intrinsics and, if the object is on the stack, 4-byte align it if it's large enough that we expect that memcpy would want to use LDM/STM to copy it. Differential Revision: http://reviews.llvm.org/D7908 llvm-svn: 232627	2015-03-18 12:01:59 +00:00
John Brawn	e32213ecbc	Add missing newline to end of test file. llvm-svn: 232626	2015-03-18 10:45:12 +00:00
Josh Magee	9342392187	Add testcases for BEXTR. These BEXTR cases are a check for the 64-bit load form and two negative cases where the bitrange is non-contiguous. From a private patch equivalent to r189742/PR17028. llvm-svn: 232580	2015-03-18 01:34:06 +00:00
Krzysztof Parzyszek	f36358576e	Missed testcase for r232577 llvm-svn: 232578	2015-03-18 00:44:46 +00:00
David Majnemer	ec30fe4691	DAGCombiner: fold (xor (shl 1, x), -1) -> (rotl ~1, x) Targets which provide a rotate make it possible to replace a sequence of (XOR (SHL 1, x), -1) with (ROTL ~1, x). This saves an instruction on architectures like X86 and POWER(64). Differential Revision: http://reviews.llvm.org/D8350 llvm-svn: 232572	2015-03-18 00:03:36 +00:00
David Majnemer	de51ea1b14	COFF: Let globals with private linkage reside in their own section COFF COMDATs (for selection kinds other than 'select any') require at least one non-section symbol in the symbol table. Satisfy this by morally enhancing the linkage from private to internal. Differential Revision: http://reviews.llvm.org/D8394 llvm-svn: 232570	2015-03-17 23:54:51 +00:00
Pirama Arumuga Nainar	26178b30ce	Fix bug while building FP16 constant vectors for AArch64 Summary: Building FP16 constant vectors caused the FP16 data to be bitcast to i64. This patch creates a BITCAST node with the correct value, and adds a test to verify correct handling. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: mcrosier, jmolloy, ab, srhines, llvm-commits, rengolin, aemerson Differential Revision: http://reviews.llvm.org/D8369 llvm-svn: 232562	2015-03-17 23:10:29 +00:00
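
The DAGCombiner entry above (r232572) folds (xor (shl 1, x), -1) into (rotl ~1, x). As a rough illustration of why the two forms are equivalent, the sketch below checks the identity on 32-bit values for every in-range shift amount. It is not LLVM code: the function names (`clearBitMaskViaXor`, `rotl32`, `clearBitMaskViaRotl`, `foldHoldsForAllShifts`) are hypothetical, the 32-bit width is an assumption chosen for the example, and C++14 or later is assumed for the `constexpr` loop.

```cpp
#include <cstdint>

// Original pattern: (xor (shl 1, x), -1), i.e. all ones except bit x.
// Assumes 0 <= x < 32; larger shift amounts are undefined in C++.
constexpr uint32_t clearBitMaskViaXor(uint32_t x) {
  return (uint32_t{1} << x) ^ ~uint32_t{0};
}

// Rotate left by s (0 <= s < 32). The "& 31" avoids a shift by 32 when s == 0.
constexpr uint32_t rotl32(uint32_t v, uint32_t s) {
  return (v << s) | (v >> ((32 - s) & 31));
}

// Folded pattern: (rotl ~1, x). ~1 has only bit 0 clear, and rotating it
// left by x moves that single clear bit to position x.
constexpr uint32_t clearBitMaskViaRotl(uint32_t x) {
  return rotl32(~uint32_t{1}, x);
}

// Compares the two forms for every in-range shift amount.
constexpr bool foldHoldsForAllShifts() {
  for (uint32_t x = 0; x < 32; ++x)
    if (clearBitMaskViaXor(x) != clearBitMaskViaRotl(x))
      return false;
  return true;
}

// Evaluated at compile time, so a mismatch would fail the build.
static_assert(foldHoldsForAllShifts(), "xor/shl and rotl forms must agree");
```

The xor form needs a shift followed by an xor, while the rotate form is a single operation on a constant, which is consistent with the commit's note that the fold saves an instruction on targets that provide a rotate.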

12280 Commits