llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-24 11:42:57 +01:00

Author	SHA1	Message	Date
Bruno Cardoso Lopes	3584c02d83	- Use a more appropriate name for Owen's ARM Parser isMCR hack since the same operands can be present in cdp/cdp2 instructions. Also increase the hack with cdp/cdp2 instructions. - Fix the encoding of cdp/cdp2 instructions for ARM (no thumb and thumb2 yet) and add testcases for them. llvm-svn: 123927	2011-01-20 18:06:58 +00:00
Bruno Cardoso Lopes	75712e8a7a	Add mcr2 and mrc2 support to thumb2 targets llvm-svn: 123919	2011-01-20 16:58:48 +00:00
Bruno Cardoso Lopes	f377d1721e	Add mcr* and mr*c support to thumb targets llvm-svn: 123917	2011-01-20 16:35:57 +00:00
Michael J. Spencer	d74f931baa	Disable this test until I can figure out why it's broken. Not xfailed because it uses 100% CPU and times out, so it's annoying to run it. llvm-svn: 123915	2011-01-20 16:24:07 +00:00
Kalle Raiskila	070fb5e54d	Allow sign-extending of i8 and i16 to i128 on SPU. llvm-svn: 123912	2011-01-20 15:49:06 +00:00
Duncan Sands	1faa8712c9	At -O123 the early-cse pass is run before instcombine has run. According to my auto-simplifier the transform most missed by early-cse is (zext X) != 0 -> X != 0. This patch adds this transform and some related logic to InstructionSimplify and removes some of the logic from instcombine (unfortunately not all because there are several situations in which instcombine can improve things by making new instructions, whereas instsimplify is not allowed to do this). At -O2 this often results in more than 15% more simplifications by early-cse, and results in hundreds of lines of bitcode being eliminated from the testsuite. I did see some small negative effects in the testsuite, for example a few additional instructions in three programs. One program, 483.xalancbmk, got an additional 35 instructions, which seems to be due to a function getting an additional instruction and then being inlined all over the place. llvm-svn: 123911	2011-01-20 13:21:55 +00:00
Eric Christopher	f7579ff174	Expand invalid return values for umulo and smulo. Handle these similarly to add/sub by doing the normal operation and then checking for overflow afterwards. This generally relies on the DAG handling the later invalid operations as well. Fixes the 64-bit part of rdar://8622122 and rdar://8774702. llvm-svn: 123908	2011-01-20 08:54:28 +00:00
Evan Cheng	5c5e42a878	Add test. llvm-svn: 123906	2011-01-20 08:38:21 +00:00
Evan Cheng	6dc21c7358	Sorry, several patches in one. TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevented CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look past some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allows machine LICM to hoist the set of instructions out of the loop and makes it possible to CSE them. It's a bit hacky, but it significantly improves code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperforms the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. llvm-svn: 123905	2011-01-20 08:34:58 +00:00
Michael J. Spencer	c3f075e648	Object: Add some tests! llvm-svn: 123899	2011-01-20 06:39:15 +00:00
Venkatraman Govindaraju	5280b2876f	Sparc backend: Implement a delay slot filler that attempts to fill delay slots with useful instructions. llvm-svn: 123884	2011-01-20 05:08:26 +00:00
Eric Christopher	1b0e5debb4	If we can, lower the multiply part of a umulo/smulo call to a libcall with an invalid type then split the result and perform the overflow check normally. Fixes the 32-bit parts of rdar://8622122 and rdar://8774702. llvm-svn: 123864	2011-01-20 00:29:24 +00:00
Devang Patel	729c5e59af	Fix debug info for merged global. llvm-svn: 123862	2011-01-20 00:02:16 +00:00
Nick Lewycky	51c13384f5	Similarly, analyze truncate through multiply. llvm-svn: 123842	2011-01-19 18:56:00 +00:00
Nick Lewycky	9867e58096	Add a missed SCEV fold that is required to continue analyzing the IR produced by indvars through the scev expander. trunc(add x, y) --> add(trunc x, y). Currently SCEV largely folds the other way which is probably wrong, but preserved to minimize churn. Instcombine doesn't do this fold either, demonstrating a missed optz'n opportunity on code doing add+trunc+add. llvm-svn: 123838	2011-01-19 16:59:46 +00:00
Bruno Cardoso Lopes	0f7a30b1cb	Fix the encoding of mrrc and mcrr family of instructions. Also add testcases for mcr and mrc llvm-svn: 123837	2011-01-19 16:56:52 +00:00
Rafael Espindola	ce499efe1d	Add unnamed_addr when we can show that address of a global is not used. llvm-svn: 123834	2011-01-19 16:32:21 +00:00
Nick Lewycky	5a538b62ca	Add a missing SCEV simplification sext(zext x) --> zext x. llvm-svn: 123832	2011-01-19 15:56:12 +00:00
Owen Anderson	ed4acd59cb	When matching asm operands, always try to match the most restricted type first. Unfortunately, while this is the "right" thing to do, it breaks some ARM asm parsing tests because MemMode5 and ThumbMemModeReg are ambiguous. This is tricky to resolve since neither is a subset of the other. XFAIL the test for now. The old way was broken in other ways, just ways we didn't happen to be testing, and our ARM asm parsing is going to require significant revisiting at a later point anyways. llvm-svn: 123786	2011-01-18 23:01:21 +00:00
Bruno Cardoso Lopes	e0f8fee637	Create two new generic classes to represent the following VMRS/VMSR variations: vmrs reg, fpexc vmrs reg, fpsid vmsr fpexc, reg vmsr fpsid, reg llvm-svn: 123783	2011-01-18 21:58:20 +00:00
Bruno Cardoso Lopes	82c6fe3dfe	Fix MRS encoding for arm and thumb. llvm-svn: 123778	2011-01-18 21:31:35 +00:00
Bruno Cardoso Lopes	6e4c5af01e	Fix the encoding of t2ISB by using the right class and also parse it correctly llvm-svn: 123776	2011-01-18 21:17:09 +00:00
Dan Gohman	df668227fb	Teach BasicAA to return PartialAlias in cases where both pointers are pointing to the same object, one pointer is accessing the entire object, and the other access has a non-zero size. This prevents TBAA from kicking in and saying NoAlias in such cases. llvm-svn: 123775	2011-01-18 21:16:06 +00:00
Bruno Cardoso Lopes	c1e21b06b9	Follow the current hack set and enable the correct parsing of bkpt while in thumb mode. llvm-svn: 123772	2011-01-18 20:55:11 +00:00
Chris Lattner	4832a9d32c	fix rdar://8878965, a regression I introduced with the recent llvm.objectsize changes. llvm-svn: 123771	2011-01-18 20:53:04 +00:00
Bruno Cardoso Lopes	94247155c4	Add support for parsing and encoding ARM's official syntax for the BFI instruction llvm-svn: 123770	2011-01-18 20:45:56 +00:00
Bruno Cardoso Lopes	6c5db0236a	Add support for mips32 madd and msub instructions. Patch by Akira Hatanaka llvm-svn: 123760	2011-01-18 19:29:17 +00:00
Duncan Sands	732cb58b61	For completeness, generalize the (X + Y) - Y -> X transform and add X - (X + 1) -> -1. These were not recommended by my auto-simplifier since they don't fire often enough. However they do fire from time to time, for example they remove one subtraction from the final bitcode for 483.xalancbmk. llvm-svn: 123755	2011-01-18 11:50:19 +00:00
Duncan Sands	2abe6f500f	Simplify (X<<1)-X into X. According to my auto-simplifier this is the most common missed simplification in fully optimized code. It occurs sporadically in the testsuite, and many times in 403.gcc: the final bitcode has 131 fewer subtractions after this change. The reason that the multiplies are not eliminated is the same reason that instcombine did not catch this: they are used by other instructions (instcombine catches this with a more general transform which in general is only profitable if the operands have only one use). llvm-svn: 123754	2011-01-18 09:24:58 +00:00
Daniel Dunbar	ba39b2fdc1	McARM: Start marking T2 address operands as such, for the benefit of the parser. llvm-svn: 123722	2011-01-18 03:06:03 +00:00
Benjamin Kramer	869dc645f1	Fix an off-by-one error in ctpop combining. llvm-svn: 123664	2011-01-17 18:00:28 +00:00
Devang Patel	ec7c842bfa	Update tests to accommodate unnamed_addr introduction. llvm-svn: 123663	2011-01-17 17:54:17 +00:00
Benjamin Kramer	e9488ed8eb	Add a DAGCombine to turn (ctpop x) u< 2 into (x & x-1) == 0. This shaves off 4 popcounts from the hacked 186.crafty source. This is enabled even when a native popcount instruction is available. The combined code is one operation longer but it should be faster nevertheless. llvm-svn: 123621	2011-01-17 12:04:57 +00:00
Kalle Raiskila	8eaf0e83d5	Don't crash SPU BE with memory accesses with big alignment. llvm-svn: 123620	2011-01-17 11:59:20 +00:00
Evan Cheng	53ec6fc591	Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g. movw r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4)) movt r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4)) LPC0_0: add r0, pc, r0 It's not yet enabled by default as some tests are failing. I suspect bugs in down stream tools. llvm-svn: 123619	2011-01-17 08:03:18 +00:00
Nick Lewycky	8f0b243661	Test for lazy value info's ability to prove the absence of NULLs in pointers. llvm-svn: 123601	2011-01-16 21:57:20 +00:00
Michael J. Spencer	3ea4ed2e6b	Make everyone happy this time. llvm-svn: 123599	2011-01-16 21:34:34 +00:00
Anders Carlsson	c9781e5764	Teach DAE to look for functions whose arguments are unused, and change all callers to pass in an undefvalue instead. llvm-svn: 123596	2011-01-16 21:25:33 +00:00
Michael J. Spencer	b0510b04d5	Try and fix this test. For some reason llvm-ar thinks that the file exists when it shouldn't, but I have no way to verify that it doesn't actually exist on the buildbot. llvm-svn: 123594	2011-01-16 20:52:58 +00:00
Rafael Espindola	9afb7af08a	Update tests. llvm-svn: 123591	2011-01-16 18:02:57 +00:00
Rafael Espindola	41852873f7	Don't merge two constants if we care about the address of both. This fixes the original testcase in PR8927. It also causes a clang binary built with a patched clang to increase in size by 0.21%. We can probably get some of the size back by writing a pass that detects that a global never has its pointer compared and adds unnamed_addr to it (maybe extend global opt). It is also possible that there are some other cases clang could add unnamed_addr to. I will investigate extending globalopt next. llvm-svn: 123584	2011-01-16 17:05:09 +00:00
Owen Anderson	b86db71ad0	Reduce and merge testcases. llvm-svn: 123579	2011-01-16 09:13:31 +00:00
Chris Lattner	dde85de90f	fix PR8514, a bug where the "heroic" transformation of shift/and into and/shift would cause nodes to move around and a dangling pointer to happen. The code tried to avoid this with a HandleSDNode, but got the details wrong. llvm-svn: 123578	2011-01-16 08:48:11 +00:00
Chris Lattner	a4454efc85	fix PR8932, a case where arg promotion could infinitely promote. llvm-svn: 123574	2011-01-16 08:09:24 +00:00
Chris Lattner	29f339f87c	if an alloca is only ever accessed as a unit, and is accessed with load/store instructions, then don't try to decimate it into its individual pieces. This will just make a mess of the IR and is pointless if none of the elements are individually accessed. This was generating really terrible code for std::bitset (PR8980) because it happens to be lowered by clang as an {[8 x i8]} structure instead of {i64}. The testcase now is optimized to: define i64 @test2(i64 %X) { br label %L2 L2: ; preds = %0 ret i64 %X } before we generated: define i64 @test2(i64 %X) { %sroa.store.elt = lshr i64 %X, 56 %1 = trunc i64 %sroa.store.elt to i8 %sroa.store.elt8 = lshr i64 %X, 48 %2 = trunc i64 %sroa.store.elt8 to i8 %sroa.store.elt9 = lshr i64 %X, 40 %3 = trunc i64 %sroa.store.elt9 to i8 %sroa.store.elt10 = lshr i64 %X, 32 %4 = trunc i64 %sroa.store.elt10 to i8 %sroa.store.elt11 = lshr i64 %X, 24 %5 = trunc i64 %sroa.store.elt11 to i8 %sroa.store.elt12 = lshr i64 %X, 16 %6 = trunc i64 %sroa.store.elt12 to i8 %sroa.store.elt13 = lshr i64 %X, 8 %7 = trunc i64 %sroa.store.elt13 to i8 %8 = trunc i64 %X to i8 br label %L2 L2: ; preds = %0 %9 = zext i8 %1 to i64 %10 = shl i64 %9, 56 %11 = zext i8 %2 to i64 %12 = shl i64 %11, 48 %13 = or i64 %12, %10 %14 = zext i8 %3 to i64 %15 = shl i64 %14, 40 %16 = or i64 %15, %13 %17 = zext i8 %4 to i64 %18 = shl i64 %17, 32 %19 = or i64 %18, %16 %20 = zext i8 %5 to i64 %21 = shl i64 %20, 24 %22 = or i64 %21, %19 %23 = zext i8 %6 to i64 %24 = shl i64 %23, 16 %25 = or i64 %24, %22 %26 = zext i8 %7 to i64 %27 = shl i64 %26, 8 %28 = or i64 %27, %25 %29 = zext i8 %8 to i64 %30 = or i64 %29, %28 ret i64 %30 } In this case, instcombine was able to eliminate the nonsense, but in PR8980 enough PHIs are in play that instcombine backs off. It's better to not generate this stuff in the first place. llvm-svn: 123571	2011-01-16 06:18:28 +00:00
Chris Lattner	2067fb2a93	enhance FoldOpIntoPhi in instcombine to try harder when a phi has multiple uses. In some cases, all the uses are the same operation, so instcombine can go ahead and promote the phi. In the testcase this pushes an add out of the loop. llvm-svn: 123568	2011-01-16 05:28:59 +00:00
Evan Cheng	144b435a15	Spill R4 if it's going to be used to restore SP from FP. llvm-svn: 123567	2011-01-16 05:14:33 +00:00
Owen Anderson	6e0fa67f91	Improve the safety of my globalopt enhancement by ensuring that the bitcast of the stored value to the new store type is always. Also, add a testcase. llvm-svn: 123563	2011-01-16 04:33:33 +00:00
Chris Lattner	aba06ce448	fix PR8983, a broken assertion. llvm-svn: 123562	2011-01-16 03:43:53 +00:00
Venkatraman Govindaraju	fe346f6cba	Implement AnalyzeBranch in Sparc Backend. llvm-svn: 123561	2011-01-16 03:15:11 +00:00
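
Several of the commits listed above describe integer simplifications: (X<<1)-X -> X, (X + Y) - Y -> X, X - (X + 1) -> -1, (zext X) != 0 -> X != 0, (ctpop x) u< 2 -> (x & x-1) == 0, sext(zext x) -> zext x, and trunc(add x, y) -> add(trunc x, y). The following is a minimal sketch that checks these identities by brute force on small fixed-width integers. It is not LLVM code: the helper names (mask, zext, sext, trunc, ctpop, check_identities) are hypothetical and only model wrap-around arithmetic, and the trunc fold is read here as truncating both operands of the add, which is an assumption about the shorthand used in the commit message.

```python
from itertools import product


def mask(bits):
    """All-ones value for an integer type that is `bits` wide."""
    return (1 << bits) - 1


def zext(x, from_bits):
    """Zero-extend: the numeric value of the low `from_bits` bits is kept."""
    return x & mask(from_bits)


def sext(x, from_bits, to_bits):
    """Sign-extend the low `from_bits` bits of x to a `to_bits` wide value."""
    x &= mask(from_bits)
    sign = 1 << (from_bits - 1)
    return ((x ^ sign) - sign) & mask(to_bits)


def trunc(x, to_bits):
    """Truncate x to its low `to_bits` bits."""
    return x & mask(to_bits)


def ctpop(x):
    """Population count of a non-negative integer."""
    return bin(x).count("1")


def check_identities(bits=8, narrow=4):
    """Exhaustively check the folds on `bits`-wide values (hypothetical helper)."""
    m = mask(bits)
    for x in range(1 << bits):
        # (X << 1) - X == X, with wrap-around at `bits` bits
        assert (((x << 1) & m) - x) & m == x
        # X - (X + 1) == -1, i.e. the all-ones value
        assert (x - ((x + 1) & m)) & m == m
        # (zext X) != 0 is equivalent to X != 0
        assert (zext(x, bits) != 0) == (x != 0)
        # (ctpop x) u< 2 is equivalent to (x & (x - 1)) == 0
        assert (ctpop(x) < 2) == ((x & ((x - 1) & m)) == 0)
        # sext(zext x) == zext x when the zext strictly widens the value
        assert sext(zext(x, bits), 2 * bits, 4 * bits) == zext(x, bits)
    for x, y in product(range(1 << bits), repeat=2):
        # (X + Y) - Y == X
        assert (((x + y) & m) - y) & m == x
        # trunc(add x, y) == add(trunc x, trunc y)
        lhs = trunc((x + y) & m, narrow)
        rhs = (trunc(x, narrow) + trunc(y, narrow)) & mask(narrow)
        assert lhs == rhs
    return True


if __name__ == "__main__":
    print(check_identities())
```

Exhaustive checking at 8 bits only shows that the identities hold under modular arithmetic at that width; it says nothing about when the passes named in the commits actually apply them, for example the use-count restrictions mentioned for instcombine.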

12057 Commits