1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-27 22:12:47 +01:00
Commit Graph

4025 Commits

Author SHA1 Message Date
Evan Cheng
56b78e409e Fix an obvious typo which caused an isel assertion. rdar://8964854.
llvm-svn: 125023
2011-02-07 18:50:47 +00:00
Devang Patel
46db608b81 Reduce test case, smaller is better.
llvm-svn: 125019
2011-02-07 18:24:18 +00:00
Bob Wilson
65f4a70b82 Add codegen support for using post-increment NEON load/store instructions.
The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using
post-increment versions, but all the rest of the NEON load/store instructions
should be handled now.

llvm-svn: 125014
2011-02-07 17:43:21 +00:00
Jason W Kim
b0d4492aa1 Rework some .ARM.attribute work for improved gcc compatibility.
Unified EmitTextAttribute for both Asm and Obj emission (.cpu only)
Added necessary cortex-A8 related attrs for codegen compat tests.

llvm-svn: 124995
2011-02-07 00:49:53 +00:00
NAKAMURA Takumi
07a84f5950 Target/X86: Tweak allocating shadow area (aka home) on Win64. It must be enough for caller to allocate one.
llvm-svn: 124949
2011-02-05 15:11:32 +00:00
Bob Wilson
d46874eb5d Move a test that ended up in the wrong place.
llvm-svn: 124933
2011-02-05 04:15:50 +00:00
Devang Patel
930b4b16a1 Merge .debug_loc entries whenever possible to reduce debug_loc size.
llvm-svn: 124904
2011-02-04 22:57:18 +00:00
Nick Lewycky
a4f2b5a934 Mark that the return is using EAX so that we don't use it for some other
purpose. Fixes PR9080!

llvm-svn: 124903
2011-02-04 22:44:08 +00:00
Devang Patel
a586bb8ecd DebugLoc associated with a machine instruction is used to emit location entries. DebugLoc associated with a DBG_VALUE is used to identify lexical scope of the variable. After register allocation, while inserting DBG_VALUE remember original debug location for the first instruction and reuse it, otherwise dwarf writer may be mislead in identifying the variable's scope.
llvm-svn: 124845
2011-02-04 01:43:25 +00:00
Richard Osborne
5c655f451e Add XCore intrinsics for resource instructions.
llvm-svn: 124794
2011-02-03 13:14:25 +00:00
Rafael Espindola
b0a802c8bf Add -march to fix the bots.
llvm-svn: 124774
2011-02-03 04:21:01 +00:00
Rafael Espindola
5bfba89832 Fix PR9127 by reversing the operands even if they have more then one use.
Reversing the operands allows us to fold, but doesn't force us to. Also, at
this point the DAG is still being optimized, so the check for hasOneUse is not
very precise.

llvm-svn: 124773
2011-02-03 03:58:05 +00:00
Richard Osborne
5ee859cb22 Add support for trampolines on the XCore.
llvm-svn: 124722
2011-02-02 14:57:41 +00:00
Evan Cheng
c7ce7e2ac3 Given a pair of floating point load and store, if there are no other uses of
the load, then it may be legal to transform the load and store to integer
load and store of the same width.

This is done if the target specified the transformation as profitable. e.g.
On arm, this can transform:
vldr.32 s0, []
vstr.32 s0, []

to

ldr r12, []
str r12, []

rdar://8944252

llvm-svn: 124708
2011-02-02 01:06:55 +00:00
Devang Patel
97c467ee47 Keep track of incoming argument's location while emitting LiveIns.
llvm-svn: 124611
2011-01-31 21:38:14 +00:00
Richard Osborne
11cdda2346 Fix bug where ReduceLoadWidth was creating illegal ZEXTLOAD instructions.
llvm-svn: 124587
2011-01-31 17:41:44 +00:00
Benjamin Kramer
6b3c3de09a Teach DAGCombine to fold fold (sra (trunc (sr x, c1)), c2) -> (trunc (sra x, c1+c2) when c1 equals the amount of bits that are truncated off.
This happens all the time when a smul is promoted to a larger type.

On x86-64 we now compile "int test(int x) { return x/10; }" into
  movslq  %edi, %rax
  imulq $1717986919, %rax, %rax
  movq  %rax, %rcx
  shrq  $63, %rcx
  sarq  $34, %rax <- used to be "shrq $32, %rax; sarl $2, %eax"
  addl  %ecx, %eax

This fires 96 times in gcc.c on x86-64.

llvm-svn: 124559
2011-01-30 16:38:43 +00:00
Evan Cheng
4af5487b74 Re-apply r124518 with fix. Watch out for invalidated iterator.
llvm-svn: 124526
2011-01-29 04:46:23 +00:00
Evan Cheng
1f943b9b13 Revert r124518. It broke Linux self-host.
llvm-svn: 124522
2011-01-29 02:43:04 +00:00
Evan Cheng
a1e4cb5f09 Re-commit r124462 with fixes. Tail recursion elim will now dup ret into unconditional predecessor to enable TCE on demand.
llvm-svn: 124518
2011-01-29 01:29:26 +00:00
Evan Cheng
5b6c72e549 Revert r124462. There are a few big regressions that I need to fix first.
llvm-svn: 124478
2011-01-28 07:12:38 +00:00
Rafael Espindola
d93551f227 Add a triple.
llvm-svn: 124471
2011-01-28 03:57:55 +00:00
Rafael Espindola
9bc19ee478 Print the visibility of declarations.
llvm-svn: 124468
2011-01-28 03:20:10 +00:00
Evan Cheng
7031f450b3 - Stop simplifycfg from duplicating "ret" instructions into unconditional
branches. PR8575, rdar://5134905, rdar://8911460.
- Allow codegen tail duplication to dup small return blocks after register
  allocation is done.

llvm-svn: 124462
2011-01-28 02:19:21 +00:00
Eric Christopher
3906b33289 Add a testcase for my last checkin.
llvm-svn: 124358
2011-01-27 06:01:17 +00:00
NAKAMURA Takumi
8ace7260cc Target/X86: Tweak win64's tailcall.
llvm-svn: 124272
2011-01-26 02:04:09 +00:00
NAKAMURA Takumi
066378440a Fix whitespace.
llvm-svn: 124270
2011-01-26 02:03:37 +00:00
Devang Patel
fce915414e Resolve DanglingDbgValue of PHI nodes where the use follows dbg.value intrinisic.
llvm-svn: 124203
2011-01-25 18:09:58 +00:00
Evan Cheng
8e47a6e196 Don't merge restore with tail call instruction.
llvm-svn: 124167
2011-01-25 01:28:33 +00:00
Devang Patel
431a9b9c2f Speculatively revert r124138.
llvm-svn: 124142
2011-01-24 20:04:37 +00:00
Devang Patel
5ccc4e884c Resolve DanglingDbgValue of PHI nodes where the use follows dbg.value intrinisic.
llvm-svn: 124138
2011-01-24 19:24:37 +00:00
Chris Lattner
9ba0a83f2b fix a missing shuffle pattern, PR9009. Patch by Artiom Myaskouvskey!
llvm-svn: 124102
2011-01-24 03:42:46 +00:00
Venkatraman Govindaraju
66369057ae Pass sret arguments through the stack instead of through registers in Sparc backend. It makes the code generated more compliant with the sparc32 ABI.
llvm-svn: 124030
2011-01-22 13:05:16 +00:00
Venkatraman Govindaraju
3c37c9914f Added ICC, FCC as uses of movcc instruction to generate correct code when -mattr=v9 is used.
llvm-svn: 124027
2011-01-22 11:36:24 +00:00
Venkatraman Govindaraju
c20beed917 Sparc backend:
Rename FLUSH to FLUSHW.
 Output "ta 3" instead of a "flushw" instruction if v8 instruction set is used.

llvm-svn: 123997
2011-01-21 22:00:00 +00:00
Evan Cheng
0dfe28a9b5 Last round of fixes for movw + movt global address codegen.
1. Fixed ARM pc adjustment.
2. Fixed dynamic-no-pic codegen
3. CSE of pc-relative load of global addresses.

It's now enabled by default for Darwin.

llvm-svn: 123991
2011-01-21 18:55:51 +00:00
Venkatraman Govindaraju
6a083c355b Implement support for byval arguments in Sparc backend.
llvm-svn: 123974
2011-01-21 14:00:01 +00:00
Andrew Trick
e0bccb5f87 Enable support for precise scheduling of the instruction selection
DAG. Disable using "-disable-sched-cycles".

For ARM, this enables a framework for modeling the cpu pipeline and
counting stalls. It also activates several heuristics to drive
scheduling based on the model. Scheduling is inherently imprecise at
this stage, and until spilling is improved it may defeat attempts to
schedule. However, this framework provides greater control over
tuning codegen.

Although the flag is not target-specific, it should have very little
affect on the default scheduler used by x86. The only two changes that
affect x86 are:
- scheduling a high-latency operation bumps the current cycle so independent
  operations can have their latency covered. i.e. two independent 4
  cycle operations can produce results in 4 cycles, not 8 cycles.
- Two operations with equal register pressure impact and no
  latency-based stalls on their uses will be prioritized by depth before height
  (height is irrelevant if no stalls occur in the schedule below this point).

llvm-svn: 123971
2011-01-21 06:19:05 +00:00
Andrew Trick
7155e98904 Convert -enable-sched-cycles and -enable-sched-hazard to -disable
flags. They are still not enable in this revision.

Added TargetInstrInfo::isZeroCost() to fix a fundamental problem with
the scheduler's model of operand latency in the selection DAG.

Generalized unit tests to work with sched-cycles.

llvm-svn: 123969
2011-01-21 05:51:33 +00:00
Evan Cheng
52fe62c996 Don't be overly aggressive with CSE of "ldr constantpool". If it's a pc-relative
value, the "add pc" must be CSE'ed at the same time. We could follow the same
approach as T2 by adding pseudo instructions that combine the ldr + "add pc".
But the better approach is to use movw + movt (which I will enable soon), so
I'll leave this as a TODO.

llvm-svn: 123949
2011-01-20 23:55:07 +00:00
Kalle Raiskila
070fb5e54d Allow sign-extending of i8 and i16 to i128 on SPU.
llvm-svn: 123912
2011-01-20 15:49:06 +00:00
Eric Christopher
f7579ff174 Expand invalid return values for umulo and smulo. Handle these similarly
to add/sub by doing the normal operation and then checking for overflow
afterwards. This generally relies on the DAG handling the later invalid
operations as well.

Fixes the 64-bit part of rdar://8622122 and rdar://8774702.

llvm-svn: 123908
2011-01-20 08:54:28 +00:00
Evan Cheng
5c5e42a878 Add test.
llvm-svn: 123906
2011-01-20 08:38:21 +00:00
Evan Cheng
6dc21c7358 Sorry, several patches in one.
TargetInstrInfo:
Change produceSameValue() to take MachineRegisterInfo as an optional argument.
When in SSA form, targets can use it to make more aggressive equality analysis.

Machine LICM:
1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead.
2. Fix a bug which prevent CSE of instructions which are not re-materializable.
3. Use improved form of produceSameValue.

ARM:
1. Teach ARM produceSameValue to look pass some PIC labels.
2. Look for operands from different loads of different constant pool entries
   which have same values.
3. Re-implement PIC GA materialization using movw + movt. Combine the pair with
   a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible
   to re-materialize the instruction, allow machine LICM to hoist the set of
   instructions out of the loop and make it possible to CSE them. It's a bit
   hacky, but it significantly improve code quality.
4. Some minor bug fixes as well.

With the fixes, using movw + movt to materialize GAs significantly outperform the
load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap
and 176.gcc ~10%.

llvm-svn: 123905
2011-01-20 08:34:58 +00:00
Venkatraman Govindaraju
5280b2876f Sparc backend: Implements a delay slot filler that attempt to fill delay slots
with useful instructions.

llvm-svn: 123884
2011-01-20 05:08:26 +00:00
Eric Christopher
1b0e5debb4 If we can, lower the multiply part of a umulo/smulo call to a libcall
with an invalid type then split the result and perform the overflow check
normally.

Fixes the 32-bit parts of rdar://8622122 and rdar://8774702.

llvm-svn: 123864
2011-01-20 00:29:24 +00:00
Devang Patel
729c5e59af Fix debug info for merged global.
llvm-svn: 123862
2011-01-20 00:02:16 +00:00
Chris Lattner
4832a9d32c fix rdar://8878965, a regression I introduced with the recent
llvm.objectsize changes.

llvm-svn: 123771
2011-01-18 20:53:04 +00:00
Bruno Cardoso Lopes
6c5db0236a Add support for mips32 madd and msub instructions. Patch by Akira Hatanaka
llvm-svn: 123760
2011-01-18 19:29:17 +00:00
Benjamin Kramer
869dc645f1 Fix an off-by-one error in ctpop combining.
llvm-svn: 123664
2011-01-17 18:00:28 +00:00