1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

71671 Commits

Author SHA1 Message Date
Che-Liang Chiou
a1aa7de4a9 ptx: add integer div and rem instruction
Patched by Dan Bailey

llvm-svn: 129848
2011-04-20 09:28:55 +00:00
Che-Liang Chiou
1792164516 ptx: add floating-point comparison to setp
Patched by Dan Bailey

llvm-svn: 129847
2011-04-20 09:28:20 +00:00
Che-Liang Chiou
25911afe57 ptx: fix parameter ordering
Patched by Dan Bailey

llvm-svn: 129846
2011-04-20 09:27:19 +00:00
Nick Lewycky
8304ea55f2 This should always be signed chars, so use int8_t. This fixes a miscompile when
llvm is built with unsigned chars where an immediate such as 0xff would be zero
extended to 64-bits, turning "cmp $0xff,%eax" into
"cmp $0xffffffffffffffff,%eax".

llvm-svn: 129845
2011-04-20 03:19:42 +00:00
Rafael Espindola
09e9797728 Remove unused arguments.
llvm-svn: 129844
2011-04-20 03:08:09 +00:00
Eric Christopher
4c3c7c8211 Rewrite the expander for umulo/smulo to remember to sign extend the input
manually and pass all (now) 4 arguments to the mul libcall. Add a new
ExpandLibCall for just this (copied gratuitously from type legalization).

Fixes rdar://9292577

llvm-svn: 129842
2011-04-20 01:19:45 +00:00
Daniel Dunbar
187b2237d2 llc: Fix a refacto, .loc support didn't work before 10.6.
llvm-svn: 129841
2011-04-20 00:47:19 +00:00
Sean Callanan
c550ce0d48 Made the MC disassembler check before accessing
MCInst operands for ARM.  This allows it to be
more tolerant of malformed MCInsts or incorrect
instruction metadata.

llvm-svn: 129840
2011-04-20 00:43:34 +00:00
Daniel Dunbar
82a4062a4e ADT/Triple: Renambe isOSX... methods to isMacOSX for consistency with the OS
triple component.

llvm-svn: 129838
2011-04-20 00:14:25 +00:00
Johnny Chen
7ceb5e40b6 Fix typo in the comment.
llvm-svn: 129837
2011-04-19 23:58:52 +00:00
Daniel Dunbar
3f19424547 ADT/Triple: Drop support for -osx style triples, we are going with -macosx
instead.

llvm-svn: 129836
2011-04-19 23:55:20 +00:00
Daniel Dunbar
6af8a09825 ADT/Triple: Add support for Triple::MacOSX per feedback from Chris, will remove
Triple::OSX once Clang has moved.

llvm-svn: 129833
2011-04-19 23:34:12 +00:00
Daniel Dunbar
1c0a0fde5c ADT/Triple: Move a variety of clients to using isOSDarwin() and isOSWindows()
predicates.

llvm-svn: 129816
2011-04-19 21:14:45 +00:00
Daniel Dunbar
9934e8436d ADT/Triple: Add isOSDarwin() and isOSWindows() helper functions.
llvm-svn: 129815
2011-04-19 21:12:05 +00:00
Daniel Dunbar
0fa8b6f591 ADT/Triple: Fix Triple::getArchNameForAssembler to support OSX and iOS
enumeration values.

llvm-svn: 129814
2011-04-19 21:07:03 +00:00
Daniel Dunbar
a90cba45c7 Target/X86: Eliminate uses of getDarwinVers().
llvm-svn: 129813
2011-04-19 21:04:12 +00:00
Daniel Dunbar
deb1713b7f Target/X86: Add getTargetTriple() accessor.
llvm-svn: 129812
2011-04-19 21:01:47 +00:00
Daniel Dunbar
c325e8720a Target/PPC: Kill off DarwinVers, which is now dead.
llvm-svn: 129811
2011-04-19 20:59:24 +00:00
Daniel Dunbar
685eb363c9 Target/PPC: Eliminate a use of getDarwinVers().
llvm-svn: 129810
2011-04-19 20:57:03 +00:00
Daniel Dunbar
df2c6f8ce7 Target/PPC: Add a TargetTriple field.
llvm-svn: 129809
2011-04-19 20:54:28 +00:00
Chris Lattner
649aa9aa1e add a helper method.
llvm-svn: 129806
2011-04-19 20:47:57 +00:00
Daniel Dunbar
d11dec1469 llc: Eliminate a use of getDarwinMajorNumber().
- As before, there is a minor semantic change here (evidenced by the test
   change) for Darwin triples that have no version component. I debated changing
   the default behavior of isOSVersionLT, but decided it made more sense for
   triples to be explicit.

llvm-svn: 129805
2011-04-19 20:46:13 +00:00
Daniel Dunbar
6a52508cb9 Target: Eliminate a use of getDarwinMajorNumber().
llvm-svn: 129803
2011-04-19 20:44:08 +00:00
Daniel Dunbar
140e365c49 CodeGen: Eliminate a use of getDarwinMajorNumber().
- There is a minor semantic change here (evidenced by the test change) for
   Darwin triples that have no version component. I debated changing the default
   behavior of isOSVersionLT, but decided it made more sense for triples to be
   explicit.

llvm-svn: 129802
2011-04-19 20:32:39 +00:00
Daniel Dunbar
1e1f05faeb ADT/Triple: Add helper function for OS X version checks.
llvm-svn: 129801
2011-04-19 20:30:10 +00:00
Daniel Dunbar
6f283d346f ADT/Triple: Add isOSVersionLT helper function.
llvm-svn: 129800
2011-04-19 20:30:07 +00:00
Daniel Dunbar
e24d3d1bc5 ADT/Triple: Generalize and simplify getDarwinNumber to just be getOSVersion.
llvm-svn: 129799
2011-04-19 20:24:34 +00:00
Daniel Dunbar
ce4e39d010 ADT/Triple: Add support for more explicit "osx" and "ios" OS names.
llvm-svn: 129798
2011-04-19 20:19:27 +00:00
Stuart Hastings
89cb281cf8 Delete unnecessary variable. <rdar://problem/7662569>
llvm-svn: 129796
2011-04-19 20:09:38 +00:00
Eric Christopher
21ad2325df Remove some duplicate op action entries and reorganize.
llvm-svn: 129781
2011-04-19 18:49:19 +00:00
Bob Wilson
3daeb462cb This patch combines several changes from Evan Cheng for rdar://8659675.
Making use of VFP / NEON floating point multiply-accumulate / subtraction is
difficult on current ARM implementations for a few reasons.
1. Even though a single vmla has latency that is one cycle shorter than a pair
   of vmul + vadd, a RAW hazard during the first (4? on Cortex-a8) can cause
   additional pipeline stall. So it's frequently better to single codegen
   vmul + vadd.
2. A vmla folowed by a vmul, vmadd, or vsub causes the second fp instruction to
   stall for 4 cycles. We need to schedule them apart.
3. A vmla followed vmla is a special case. Obvious issuing back to back RAW
   vmla + vmla is very bad. But this isn't ideal either:
     vmul
     vadd
     vmla
   Instead, we want to expand the second vmla:
     vmla
     vmul
     vadd
   Even with the 4 cycle vmul stall, the second sequence is still 2 cycles
   faster.

Up to now, isel simply avoid codegen'ing fp vmla / vmls. This works well enough
but it isn't the optimial solution. This patch attempts to make it possible to
use vmla / vmls in cases where it is profitable.

A. Add missing isel predicates which cause vmla to be codegen'ed.
B. Make sure the fmul in (fadd (fmul)) has a single use. We don't want to
   compute a fmul and a fmla.
C. Add additional isel checks for vmla, avoid cases where vmla is feeding into
   fp instructions (except for the #3 exceptional case).
D. Add ARM hazard recognizer to model the vmla / vmls hazards.
E. Add a special pre-regalloc case to expand vmla / vmls when it's likely the
   vmla / vmls will trigger one of the special hazards.

Enable these fp vmlx codegen changes for Cortex-A9.

llvm-svn: 129775
2011-04-19 18:11:57 +00:00
Bob Wilson
56f64ab701 Add -mcpu=cortex-a9-mp. It's cortex-a9 with MP extension. rdar://8648637.
llvm-svn: 129774
2011-04-19 18:11:52 +00:00
Bob Wilson
0cbbc50f26 Avoid some 's' 16-bit instruction which partially update CPSR
(and add false dependency) when it isn't dependent on last CPSR defining
instruction. rdar://8928208

llvm-svn: 129773
2011-04-19 18:11:49 +00:00
Bob Wilson
886994b683 Avoid write-after-write issue hazards for Cortex-A9.
Add a avoidWriteAfterWrite() target hook to identify register classes that
suffer from write-after-write hazards. For those register classes, try to avoid
writing the same register in two consecutive instructions.

This is currently disabled by default.  We should not spill to avoid hazards!
The command line flag -avoid-waw-hazard can be used to enable waw avoidance.

llvm-svn: 129772
2011-04-19 18:11:45 +00:00
Bob Wilson
48d5451029 Some single-precision VFP instructions can execute in either the VPF or Neon
pipelines, at least on Cortex-A9.

llvm-svn: 129771
2011-04-19 18:11:38 +00:00
Bob Wilson
d83b95fd68 Improvements for the Cortex-A9 scheduling itineraries.
llvm-svn: 129770
2011-04-19 18:11:36 +00:00
Eli Friedman
01f94bd648 Add support for FastISel'ing varargs calls.
llvm-svn: 129765
2011-04-19 17:22:22 +00:00
Jakob Stoklund Olesen
dceb96c62d Force the greedy register allocator to be linked alongside linear scan.
This means that the new register allocator can be used with 'clang -mllvm -regalloc=greedy'.

llvm-svn: 129764
2011-04-19 17:17:58 +00:00
Eli Friedman
bbf7d2ac38 SelectBasicBlock is rather slow even when it doesn't do anything; skip the
unnecessary work where possible.

llvm-svn: 129763
2011-04-19 17:01:08 +00:00
Stuart Hastings
f838ea4959 Support nested CALLSEQ_BEGIN/END; necessary for ARM byval support. <rdar://problem/7662569>
llvm-svn: 129761
2011-04-19 16:16:58 +00:00
Jay Foad
0dcd432074 Trivial simplification.
llvm-svn: 129759
2011-04-19 15:23:29 +00:00
Jakob Stoklund Olesen
c84b16717b Tighten test case a bit.
Ideally, we would match an S-register to its containing D-register, but that
requires arithmetic (divide by 2).

llvm-svn: 129756
2011-04-19 06:14:45 +00:00
Chris Lattner
f15db6c86f Implement support for x86 fastisel of small fixed-sized memcpys, which are generated
en-mass for C++ PODs.  On my c++ test file, this cuts the fast isel rejects by 10x 
and shrinks the generated .s file by 5%

llvm-svn: 129755
2011-04-19 05:52:03 +00:00
Chris Lattner
ec5a480dca tidy up
llvm-svn: 129753
2011-04-19 05:15:59 +00:00
Chris Lattner
7d07af0bf2 Implement support for fast isel of calls of i1 arguments, even though they are illegal,
when they are a truncate from something else.  This eliminates fully half of all the 
fastisel rejections on a test c++ file I'm working with, which should make a substantial
improvement for -O0 compile of c++ code.

This fixed rdar://9297003 - fast isel bails out on all functions taking bools

llvm-svn: 129752
2011-04-19 05:09:50 +00:00
Chris Lattner
3c4af7bfee Handle i1/i8/i16 constant integer arguments to calls by prepromoting them.
Before we would bail out on i1 arguments all together, now we just bail on
non-constant ones.  Also, we used to emit extraneous code.  e.g. test12 was:

	movb	$0, %al
	movzbl	%al, %edi
	callq	_test12

and test13 was:
	movb	$0, %al
	xorl	%edi, %edi
	movb	%al, 7(%rsp)
	callq	_test13f

Now we get:

	movl	$0, %edi
	callq	_test12
and:
	movl	$0, %edi
	callq	_test13f

llvm-svn: 129751
2011-04-19 04:42:38 +00:00
Chris Lattner
87b2a0ab2a be layout aware, to produce:
testb	$1, %al
	je	LBB0_2
## BB#1:                                ## %if.then
	movb	$0, %al

instead of:

	testb	$1, %al
	jne	LBB0_1
	jmp	LBB0_2
LBB0_1:                                 ## %if.then
	movb	$0, %al

how 'bout that.

llvm-svn: 129749
2011-04-19 04:26:32 +00:00
Chris Lattner
d259570b73 fix rdar://9297006 - fast isel bails out on trunc to i1 -> bools cry,
a common cause of fast isel rejects on c++ code.

llvm-svn: 129748
2011-04-19 04:22:17 +00:00
Evan Cheng
e232ee2466 Change A9 scheduling itineraries VLD* / VST* entries default to "aligned". That
is, it assumes addresses are 64-bit aligned (which should be the more common
case). If the alignment is found not to be aligned, then getOperandLatency()
would adjust the operand latency computation by one to compensate for it.
rdar://9294833

llvm-svn: 129742
2011-04-19 01:21:49 +00:00
Jakob Stoklund Olesen
c9861cc9f6 Make tests register allocation independent again.
llvm-svn: 129739
2011-04-19 00:14:43 +00:00