mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
80 Commits

Author SHA1 Message Date
Jim Grosbach
f2f14a2d43 X86: Resolve a long standing FIXME and properly isel pextr[bw].
Generalize the AArch64 .td nodes for AssertZext and AssertSext. Use
them to match the relevant pextr store instructions.

The test widen_load-2.ll requires a slight change because with the
stores gone, the remaining instructions are scheduled in a different
order.

Add test cases for SSE4 and AVX variants.

Resolves rdar://13414672.

Patch by Adam Nemet <anemet@apple.com>.

llvm-svn: 200957
2014-02-07 00:16:33 +00:00
Benjamin Kramer
c63386d01a X86: Turn fp selects into mask operations.
double test(double a, double b, double c, double d) { return a<b ? c : d; }

before:
_test:
	ucomisd	%xmm0, %xmm1
	ja	LBB0_2
	movaps	%xmm3, %xmm2
LBB0_2:
	movaps	%xmm2, %xmm0

after:
_test:
	cmpltsd	%xmm1, %xmm0
	andpd	%xmm0, %xmm2
	andnpd	%xmm3, %xmm0
	orpd	%xmm2, %xmm0

Small speedup on Benchmarks/SmallPT

llvm-svn: 187706
2013-08-04 12:05:16 +00:00
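
The select-to-mask rewrite described in c63386d01a can be sketched in portable C. This is only an illustration of the idea, not the code LLVM emits: the names `bits_of`, `double_of`, and `select_lt` are hypothetical, and the sketch assumes `double` is a 64-bit IEEE type. The comparison yields an all-ones or all-zeros mask (the role of `cmpltsd`), and the result is assembled with AND, AND-NOT, and OR instead of a branch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its 64-bit pattern (memcpy avoids aliasing problems). */
static uint64_t bits_of(double x) {
    uint64_t u;
    assert(sizeof(double) == sizeof(uint64_t)); /* assumption of this sketch */
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Reinterpret a 64-bit pattern as a double. */
static double double_of(uint64_t u) {
    double x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* Branch-free equivalent of: a < b ? c : d */
double select_lt(double a, double b, double c, double d) {
    /* All ones when a < b, all zeros otherwise (including when either is NaN). */
    uint64_t mask = (uint64_t)0 - (uint64_t)(a < b);
    /* (c AND mask) OR (d AND NOT mask), mirroring andpd / andnpd / orpd. */
    return double_of((bits_of(c) & mask) | (bits_of(d) & ~mask));
}
```

Because the selected value is copied bit for bit, `select_lt(1.0, 2.0, 3.0, 4.0)` returns exactly the third argument and `select_lt(2.0, 1.0, 3.0, 4.0)` exactly the fourth.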
Benjamin Kramer
4ee689feed X86: Add a note.
llvm-svn: 175408
2013-02-17 23:34:14 +00:00
Chris Lattner
4a8f2bcb32 some peepholes that should match horizontal add/sub operations.
llvm-svn: 163103
2012-09-03 02:58:21 +00:00
Benjamin Kramer
080ccc13a6 Add a note for -ffast-math optimization of vector norm.
llvm-svn: 153031
2012-03-19 00:43:34 +00:00
Benjamin Kramer
54f11215e8 This is now implemented.
llvm-svn: 146258
2011-12-09 15:45:57 +00:00
Lang Hames
be4997db2f Add a natural stack alignment field to TargetData, and prevent InstCombine from
promoting allocas to preferred alignments that exceed the natural
alignment. This avoids some potentially expensive dynamic stack realignments.

The natural stack alignment is set in target data strings via the "S<size>"
option. Size is in bits and must be a multiple of 8. The natural stack alignment
defaults to "unspecified" (represented by a zero value), and the "unspecified"
value does not prevent any alignment promotions. Target maintainers that care
about avoiding promotions should explicitly add the "S<size>" option to their
target data strings.

llvm-svn: 141599
2011-10-10 23:42:08 +00:00
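
As an illustration of the "S<size>" rules stated in be4997db2f (size in bits, a multiple of 8, zero meaning "unspecified"), here is a minimal C sketch of reading that option out of a target data string. The function `natural_stack_align_bits` is hypothetical and is not LLVM's parser; it assumes the components of the string are separated by `-` and that the option letter is an uppercase `S`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of LLVM.
 * Returns the natural stack alignment in bits,
 * 0 if no "S<size>" component is present (unspecified),
 * or -1 if the size is malformed or not a multiple of 8.
 */
long natural_stack_align_bits(const char *layout) {
    assert(layout != NULL);
    const char *p = layout;
    while (*p) {
        size_t len = strcspn(p, "-");      /* length of this component */
        if (len > 1 && p[0] == 'S') {
            char *end;
            long bits = strtol(p + 1, &end, 10);
            /* The digits must fill the whole component. */
            if (end != p + len || bits < 0 || bits % 8 != 0)
                return -1;
            return bits;
        }
        p += len;
        if (*p == '-')
            p++;                           /* skip the separator */
    }
    return 0;                              /* unspecified */
}
```

A caller that wants to avoid dynamic stack realignment would then refuse to promote an alloca beyond this value whenever the result is nonzero; a result of zero places no limit on promotion, matching the default described in the commit message.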
Benjamin Kramer
19bcaa5d51 Add a note about SSE4.1 roundss/roundsd.
llvm-svn: 125438
2011-02-12 17:58:16 +00:00
Chris Lattner
908d8e9de2 update this.
llvm-svn: 113116
2010-09-05 20:22:09 +00:00
Chris Lattner
eb4c7e43cc we should pattern match the SSE complex arithmetic ops.
llvm-svn: 112109
2010-08-25 23:31:42 +00:00
Chris Lattner
f4dfc7aaab random improvement for variable shift codegen.
llvm-svn: 111813
2010-08-23 17:30:29 +00:00
Jakob Stoklund Olesen
eeabe43059 Remove obsolete README_SSE note.
We are generating movaps for all XMM register copies, including scalar
floating point values. This is known to be at least as good as movss and movsd
for all known architectures up to and including Nehalem because it avoids a
partial register stall.

The SSEDomainFix pass will switch movaps to movdqa when appropriate (i.e., when
operands come from the integer unit). We don't know that switching movaps to
movapd has any benefit.

The same applies to andps -> pand.

llvm-svn: 108096
2010-07-11 17:13:42 +00:00
Chris Lattner
6a9b6e3253 some notes about suboptimal insertps's
llvm-svn: 107613
2010-07-05 05:48:41 +00:00
Eli Friedman
37ee2be8cc Remove some already-fixed README entries.
llvm-svn: 105377
2010-06-03 01:47:31 +00:00
Eli Friedman
6a9bddea09 Remove README entry which no longer compiles to something sane.
llvm-svn: 105376
2010-06-03 01:16:51 +00:00
Dan Gohman
37bf232609 Floating-point add, sub, and mul are now spelled fadd, fsub, and fmul,
respectively.

llvm-svn: 97531
2010-03-02 01:11:08 +00:00
Dan Gohman
92b6122204 Fix "the the" and similar typos.
llvm-svn: 95781
2010-02-10 16:03:48 +00:00
Chris Lattner
7c64c9ca21 add a note from PR6194
llvm-svn: 95649
2010-02-09 05:45:29 +00:00
Chris Lattner
a635219775 move the PR6214 microoptzn to this file.
llvm-svn: 95299
2010-02-04 07:32:01 +00:00
Chris Lattner
72a06499b3 this is an SSE-specific issue.
llvm-svn: 93373
2010-01-13 23:29:11 +00:00
Chris Lattner
feb6b56242 Bill implemented this.
llvm-svn: 63752
2009-02-04 19:09:07 +00:00
Chris Lattner
eae4653469 add a note, this is why we're faster at SciMark-MonteCarlo with
SSE disabled.

llvm-svn: 63751
2009-02-04 19:08:01 +00:00
Evan Cheng
2a965124b7 The memory alignment requirement on some of the mov{h|l}p{d|s} patterns is 16 bytes. That is overly strict. These instructions read / write f64 memory locations without any alignment requirement.
llvm-svn: 63195
2009-01-28 08:35:02 +00:00
Chris Lattner
c018045520 add a note
llvm-svn: 56391
2008-09-20 19:17:53 +00:00
Chris Lattner
61e771be29 add a note
llvm-svn: 54964
2008-08-19 00:41:02 +00:00
Evan Cheng
71fbfe73c1 - Fix an x86 vector isel bug: illegal transformation of a vector_shuffle into a
shift.
- Add a readme entry for a missing vector_shuffle optimization that results in
  awful codegen.

llvm-svn: 52740
2008-06-25 20:52:59 +00:00
Evan Cheng
d312ced1cf This is done.
llvm-svn: 51526
2008-05-24 00:10:13 +00:00
Evan Cheng
4f660778f0 Use movlps / movhps to modify low / high half of 16-byte memory location.
llvm-svn: 51501
2008-05-23 21:23:16 +00:00
Dan Gohman
e8422fc112 Elaborate on the entry on integer vector multiplication by constants.
llvm-svn: 51491
2008-05-23 18:05:39 +00:00
Evan Cheng
e7ec4690e1 New entry.
llvm-svn: 51487
2008-05-23 17:28:11 +00:00
Chris Lattner
4c1ffef5af we compile multiply-by-constant into horrible code. Doesn't sse4 have some
instruction for doing this?

llvm-svn: 51473
2008-05-23 04:29:53 +00:00
Chris Lattner
a11adf725d add a note
llvm-svn: 51062
2008-05-13 19:56:20 +00:00
Chris Lattner
c9eb6a7d64 add a note
llvm-svn: 51060
2008-05-13 18:48:54 +00:00
Evan Cheng
9e15622879 Instead of doing a vector load, a shuffle, and then an element extract, load the element directly from the address with an offset.
        pshufd $1, (%rdi), %xmm0
        movd %xmm0, %eax
=>
        movl 4(%rdi), %eax

llvm-svn: 51026
2008-05-13 08:35:03 +00:00
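
The equivalence behind 9e15622879 can be shown in C. This is a sketch under assumptions, not the DAG combine itself: the function names are hypothetical, and the vector is taken to be four 32-bit lanes stored contiguously in memory. Extracting lane `idx` from the loaded vector gives the same value as a single scalar load at byte offset `4 * idx`, which is why `pshufd $1` plus `movd` collapses to `movl 4(%rdi), %eax`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Load all 16 bytes, then pick one 32-bit lane (the "before" form). */
uint32_t extract_via_vector(const void *base, unsigned idx) {
    uint32_t lanes[4];
    assert(idx < 4);
    memcpy(lanes, base, sizeof lanes);
    return lanes[idx];
}

/* Load only the wanted lane from base + 4 * idx (the "after" form). */
uint32_t extract_via_scalar(const void *base, unsigned idx) {
    uint32_t lane;
    assert(idx < 4);
    memcpy(&lane, (const unsigned char *)base + 4u * idx, sizeof lane);
    return lane;
}
```

Both functions return the same value for every valid lane index; the scalar form touches 4 bytes instead of 16 and needs no shuffle.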
Evan Cheng
e4ee4c2870 On x86, it's safe to treat i32 load anyext as a normal i32 load. Ditto for i8 anyext load to i16.
llvm-svn: 51019
2008-05-13 00:54:02 +00:00
Evan Cheng
fcbdc8bd6e Xform bitconvert(build_pair(load a, load b)) to a single load if the load locations are at the right offset from each other.
llvm-svn: 51008
2008-05-12 23:04:07 +00:00
Anton Korobeynikov
ad83aeb489 Add note
llvm-svn: 50959
2008-05-11 14:33:15 +00:00
Chris Lattner
9f994482f5 add a note, this is actually not too bad to implement.
llvm-svn: 49466
2008-04-10 05:54:50 +00:00
Chris Lattner
869325c4c4 move the x86-32 part of PR2108 here.
llvm-svn: 49465
2008-04-10 05:37:47 +00:00
Chris Lattner
b628208161 Finish implementing a readme entry: when inserting an i64 variable
into a vector of zeros or undef, and when the top part is obviously
zero, we can just use movd + shuffle.  This allows us to compile
vec_set-B.ll into:

_test3:
	movl	$1234567, %eax
	andl	4(%esp), %eax
	movd	%eax, %xmm0
	ret

instead of:

_test3:
	subl	$28, %esp
	movl	$1234567, %eax
	andl	32(%esp), %eax
	movl	%eax, (%esp)
	movl	$0, 4(%esp)
	movq	(%esp), %xmm0
	addl	$28, %esp
	ret

llvm-svn: 48090
2008-03-09 05:42:06 +00:00
Chris Lattner
b741ebba29 add a note
llvm-svn: 48064
2008-03-09 01:08:22 +00:00
Chris Lattner
17f68a3075 Implement a readme entry, compiling
#include <xmmintrin.h>
__m128i doload64(short x) {return _mm_set_epi16(0,0,0,0,0,0,0,1);}

into:
	movl	$1, %eax
	movd	%eax, %xmm0
	ret

instead of a constant pool load.

llvm-svn: 48063
2008-03-09 01:05:04 +00:00
Chris Lattner
ff9dc0af80 This one looks easy, add a note.
llvm-svn: 48055
2008-03-08 22:32:39 +00:00
Chris Lattner
b12697f8bb move these to the appropriate file
llvm-svn: 48054
2008-03-08 22:28:45 +00:00
Chris Lattner
83e0b885f8 evan implemented this.
llvm-svn: 47948
2008-03-05 17:11:51 +00:00
Chris Lattner
7571a88209 add a note
llvm-svn: 47939
2008-03-05 07:22:39 +00:00
Chris Lattner
299977b5ca Evan implemented these.
llvm-svn: 47828
2008-03-02 18:05:14 +00:00
Chris Lattner
b714906acf upgrade some entries, remove stuff that is done.
llvm-svn: 47109
2008-02-14 06:19:02 +00:00
Nate Begeman
5f18794295 readme updates
llvm-svn: 47051
2008-02-13 07:06:12 +00:00
Nate Begeman
5a4e290b70 Enable SSE4 codegen and pattern matching.
Add some notes to the README.

llvm-svn: 46949
2008-02-11 04:19:36 +00:00