1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

1115 Commits

Author SHA1 Message Date
NAKAMURA Takumi
d21517b7e6 Revert r174343, "When the target-independent DAGCombiner inferred a higher alignment for a load,"
It caused hangups in compiling clang/lib/Parse/ParseDecl.cpp and clang/lib/Driver/Tools.cpp in stage2 on some hosts.

llvm-svn: 174374
2013-02-05 14:44:16 +00:00
Owen Anderson
0d5236250e When the target-independent DAGCombiner inferred a higher alignment for a load,
it would replace the load with one with the higher alignment.  However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.

This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.

llvm-svn: 174343
2013-02-05 06:25:30 +00:00
Shuxin Yang
af2e8dd42f rdar://13126763
Fix a bug in DAGCombine. The symptom is mistakenly optimizing expression
"x + x*x" into "x * 3.0".

llvm-svn: 174239
2013-02-02 00:22:03 +00:00
Nadav Rotem
94213533f7 Revert 172708.
The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends.
This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical.
Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume
that there is only one SEXT node. The AVX mask optimizations is one example. Additionally this optimization does not update the cost model.

llvm-svn: 172968
2013-01-20 08:35:56 +00:00
Elena Demikhovsky
461c2bd18c Optimization for the following SIGN_EXTEND pairs:
v8i8  -> v8i64, 
v8i8  -> v8i32, 
v4i8  -> v4i64, 
v4i16 -> v4i64 
for AVX and AVX2.

Bug 14865.

llvm-svn: 172708
2013-01-17 09:59:53 +00:00
Bill Schmidt
ae8a966ad7 This patch addresses an incorrect transformation in the DAG combiner.
The included test case is derived from one of the GCC compatibility tests.
The problem arises after the selection DAG has been converted to type-legalized
form.  The combiner first sees a 64-bit load that can be converted into a
pre-increment form.  The original load feeds into a SRL that isolates the
upper 32 bits of the loaded doubleword.  This looks like an opportunity for
DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load.

However, this transformation is not valid, as the replacement load is not
a pre-increment load.  The pre-increment load produces an extra result,
which feeds a subsequent add instruction.  The replacement load only has
one result value, and this value is propagated to all uses of the pre-
increment load, including the add.  Because the add is looking for the
second result value as its operand, it ends up attempting to add a constant
to a token chain, resulting in a crash.

So the patch simply disables this transformation for any load with more than
two result values.

llvm-svn: 172480
2013-01-14 22:04:38 +00:00
Evan Cheng
63710d3c5e Fix a DAG combine bug visitBRCOND() is transforming br(xor(x, y)) to br(x != y).
It cahced XOR's operands before calling visitXOR() but failed to update the
operands when visitXOR changed the XOR node.

rdar://12968664

llvm-svn: 171999
2013-01-09 20:56:40 +00:00
Chandler Carruth
b8c9b84572 Sink AddrMode back into TargetLowering, removing one of the most
peculiar headers under include/llvm.

This struct still doesn't make a lot of sense, but it makes more sense
down in TargetLowering than it did before.

llvm-svn: 171739
2013-01-07 15:14:13 +00:00
Tom Stellard
ccd192dbc4 DAGCombiner: Avoid generating illegal vector INT_TO_FP nodes
DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two
mistakes:

1. It was checking the legality of scalar INT_TO_FP nodes and then generating
vector nodes.

2. It was passing the result value type to
TargetLoweringInfo::getOperationAction() when it should have been
passing the value type of the first operand.

llvm-svn: 171420
2013-01-02 22:13:01 +00:00
Chandler Carruth
4c1f3c24db Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Bill Wendling
e0920e4122 Remove the Function::getFnAttributes method in favor of using the AttributeSet
directly.

This is in preparation for removing the use of the 'Attribute' class as a
collection of attributes. That will shift to the AttributeSet class instead.

llvm-svn: 171253
2012-12-30 10:32:01 +00:00
Nadav Rotem
55c5987673 Refactor DAGCombinerInfo. Change the different booleans that indicate if we are before or after different runs of DAGCo, with the CombineLevel enum.
Also, added a new API for checking if we are running before or after the LegalizeVectorOps phase. 

llvm-svn: 171142
2012-12-27 06:47:41 +00:00
Bob Wilson
23814cf0b6 Do not introduce vector operations in functions marked with noimplicitfloat.
<rdar://problem/12879313>

llvm-svn: 170630
2012-12-20 01:36:20 +00:00
Patrik Hagglund
c81d1cc216 Change TargetLowering::isCondCodeLegal to take an MVT, instead of EVT.
llvm-svn: 170524
2012-12-19 10:19:55 +00:00
Elena Demikhovsky
12a5e01a52 Optimized load + SIGN_EXTEND patterns in the X86 backend.
llvm-svn: 170506
2012-12-19 07:50:20 +00:00
Evan Cheng
a1d639a0ea Fix a bug in DAGCombiner::MatchBSwapHWord. Make sure the node has operands before referencing them. rdar://12868039
llvm-svn: 170078
2012-12-13 01:34:32 +00:00
Manman Ren
d29a3f8737 DAGCombine: clamp hi bit in APInt::getBitsSet to avoid assertion
rdar://12838504

llvm-svn: 169951
2012-12-12 01:13:50 +00:00
Patrik Hagglund
caaedc6ade Revert EVT->MVT changes, r169836-169851, due to buildbot failures.
llvm-svn: 169854
2012-12-11 11:14:33 +00:00
Patrik Hagglund
7692ba3a13 Change TargetLowering::isCondCodeLegal to take an MVT, instead of EVT.
llvm-svn: 169843
2012-12-11 09:51:27 +00:00
Chandler Carruth
ac8f03ddc1 Fix a miscompile in the DAG combiner. Previously, we would incorrectly
try to reduce the width of this load, and would end up transforming:

  (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32)
to
  (truncate (zextload i32 <ptr+4> as i64) to i32)

We lost the sext attached to the load while building the narrower i32
load, and replaced it with a zext because lshr always zext's the
results. Instead, bail out of this combine when there is a conflict
between a sextload and a zext narrowing. The rest of the DAG combiner
still optimize the code down to the proper single instruction:

  movswl 6(...),%eax

Which is exactly what we wanted. Previously we read past the end *and*
missed the sign extension:

  movl 6(...), %eax

llvm-svn: 169802
2012-12-11 00:36:57 +00:00
Craig Topper
0f4945c76d Teach DAG combine to handle vector add/sub with vectors of all 0s.
llvm-svn: 169727
2012-12-10 08:12:29 +00:00
Craig Topper
c03f63d739 Remove extra blank line.
llvm-svn: 169692
2012-12-09 08:20:52 +00:00
Craig Topper
a6f44fb06b Teach DAG combine to handle vector logical operations with vectors of all 1s or all 0s. These cases can show up when vectors are split for legalizing. Fix some tests that were dependent on these cases not being combined.
llvm-svn: 169684
2012-12-08 22:49:19 +00:00
Nadav Rotem
5b43aa0b29 Fix a bug in the code that merges consecutive stores. Previously we did not
check if loads that happen in between stores alias with the first store in the
chain, only with the second store onwards.

llvm-svn: 169516
2012-12-06 17:34:13 +00:00
Chandler Carruth
a490793037 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Nadav Rotem
201348aa5e Allow merging multiple store sequences on the same chain.
llvm-svn: 169111
2012-12-02 17:14:09 +00:00
Nadav Rotem
3f31f2a3aa When combining consecutive stores allow loads in between the stores, if the loads do not alias.
llvm-svn: 168832
2012-11-29 00:00:08 +00:00
Rafael Espindola
1fb628bc96 Handle DAG CSE adding new uses during ReplaceAllUsesWith. Fixes PR14333.
llvm-svn: 167912
2012-11-14 05:08:56 +00:00
Owen Anderson
807cea5760 Be careful not to optimize a SELECT_CC into a SETCC post-legalization if the SETCC node would be illegal.
llvm-svn: 167344
2012-11-03 00:17:26 +00:00
Owen Anderson
8f66d7107c Add a few more simple fast-math constant propagations and cancellations.
llvm-svn: 167200
2012-11-01 02:00:53 +00:00
Ulrich Weigand
445bd73056 In various places throughout the code generator, there were special
checks to avoid performing compile-time arithmetic on PPCDoubleDouble.

Now that APFloat supports arithmetic on PPCDoubleDouble, those checks
are no longer needed, and we can treat the type like any other.

llvm-svn: 166958
2012-10-29 18:35:49 +00:00
Michael Liao
18e40965aa Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x))
- If more than 1 elemennts are defined and target supports the vectorized
  conversion, use the vectorized one instead to reduce the strength on
  conversion operation.

llvm-svn: 166546
2012-10-24 04:14:18 +00:00
Jakub Staszak
66155dae31 Keep coding standard. Don't evaluate getNumOperands() every time.
llvm-svn: 166531
2012-10-24 00:38:25 +00:00
Michael Liao
24ccd71c4e Clean up code and put transformation on (build_vec (ext x)) into a helper func
llvm-svn: 166519
2012-10-23 23:06:52 +00:00
Michael Liao
cdcbe73b38 Simplify condition checking as CONCAT assume all inputs of the same type.
llvm-svn: 166260
2012-10-19 03:17:00 +00:00
Nadav Rotem
12105d6078 In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other.
rdar://12513091

llvm-svn: 166196
2012-10-18 18:06:48 +00:00
Michael Liao
f58d16a933 Revert part of r166049 back and enable test case in r166125.
- Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid
  when '...' are all 'undef's.
- r166125 relies on this transformation.

llvm-svn: 166155
2012-10-17 23:45:54 +00:00
Michael Liao
3accf514d6 Revert r166049
- In general, it's unsafe for this transformation.

llvm-svn: 166135
2012-10-17 22:41:15 +00:00
Michael Liao
b168cd6995 Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i
- If the extracted vector has the same type of all vectored being concatenated
  together, it should be simplified directly into v_i, where i is the index of
  the element being extracted.

llvm-svn: 166125
2012-10-17 20:48:33 +00:00
Michael Liao
ee2ce36cda Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x)
llvm-svn: 166049
2012-10-16 19:38:35 +00:00
Nadav Rotem
c94270cb4d Refactor the AddrMode class out of TLI to its own header file.
This class is used by LSR and a number of places in the codegen.
This is the first step in de-coupling LSR from TLI, and creating
a new interface in between them.

llvm-svn: 165455
2012-10-08 23:06:34 +00:00
Micah Villmow
bb1a25cd67 Move TargetData to DataLayout.
llvm-svn: 165402
2012-10-08 16:38:25 +00:00
Benjamin Kramer
7a9b49ff0c Remove unused but set variable flagged by GCC.
llvm-svn: 165331
2012-10-05 20:08:45 +00:00
Benjamin Kramer
38480c67a4 Simplify code, don't or a bool with an uint64_t.
No functionality change.

llvm-svn: 165321
2012-10-05 18:19:44 +00:00
Nadav Rotem
2f513f5a29 When merging connsecutive stores, use vectors to store the constant zero.
llvm-svn: 165267
2012-10-04 22:35:15 +00:00
Nadav Rotem
3ffafc2d33 Fix a cycle in the DAG. In this code we replace multiple loads with a single load and
multiple stores with a single load. We create the wide loads and stores (and their chains)
before we remove the scalar loads and stores and fix the DAG chain. We attempted to merge
loads with a different chain. When that happened, the assumption that it is safe to RAUW
broke and a cycle was introduced.

llvm-svn: 165148
2012-10-03 19:30:31 +00:00
Nadav Rotem
b537146d7f A DAGCombine optimization for mergeing consecutive stores to memory. The optimization
is not profitable in many cases because modern processors perform multiple stores
in parallel and merging stores prior to merging requires extra work. We handle two main cases:

1. Store of multiple consecutive constants:
  q->a = 3;
  q->4 = 5;
In this case we store a single legal wide integer.

2. Store of multiple consecutive loads:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;
In this case we load/store either ilegal vector registers or legal wide integer registers.

llvm-svn: 165125
2012-10-03 16:11:15 +00:00
Nadav Rotem
972fdd96a1 Revert r164910 because it causes failures to several phase2 builds.
llvm-svn: 164911
2012-09-30 07:17:56 +00:00
Nadav Rotem
f7ea233f5c A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases
because moden processos can store multiple values in parallel, and preparing the consecutive store requires
some work.  We only handle these cases:

1. Consecutive stores where the values and consecutive loads. For example:
 int a = p->a;
 int b = p->b;
 q->a = a;
 q->b = b;

2. Consecutive stores where the values are constants. Foe example:
 q->a = 4;
 q->b = 5;

llvm-svn: 164910
2012-09-30 06:24:14 +00:00
Duncan Sands
a29a3a24a7 Speculatively revert commit 164885 (nadav) in the hope of ressurecting a pile of
buildbots.  Original commit message:

A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases
because moden processos can store multiple values in parallel, and preparing the consecutive store requires
some work.  We only handle these cases:

1. Consecutive stores where the values and consecutive loads. For example:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;

2. Consecutive stores where the values are constants. Foe example:
  q->a = 4;
  q->b = 5;

llvm-svn: 164890
2012-09-29 10:25:35 +00:00
Craig Topper
59e472177a Tidy up to match coding standards. Remove 'else' after 'return' and moving operators to end of preceding line. No functional change intended.
llvm-svn: 164887
2012-09-29 07:18:53 +00:00
Craig Topper
36c4380c5c Replace a couple if/elses around similar calls with conditional operators on the varying arguments. No functional change.
llvm-svn: 164886
2012-09-29 06:54:22 +00:00
Nadav Rotem
37dcc09044 A DAGCombine optimization for merging consecutive stores. This optimization is not profitable in many cases
because moden processos can store multiple values in parallel, and preparing the consecutive store requires
some work.  We only handle these cases:

1. Consecutive stores where the values and consecutive loads. For example:
  int a = p->a;
  int b = p->b;
  q->a = a;
  q->b = b;

2. Consecutive stores where the values are constants. Foe example:
  q->a = 4;
  q->b = 5;

llvm-svn: 164885
2012-09-29 06:33:25 +00:00
Sylvestre Ledru
b77340e506 Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767
llvm-svn: 164768
2012-09-27 10:14:43 +00:00
Sylvestre Ledru
1c5e7904de Fix a typo 'iff' => 'if'
llvm-svn: 164767
2012-09-27 09:59:43 +00:00
Nadav Rotem
d5f455d01e Fix 80-col violations.
llvm-svn: 164297
2012-09-20 08:53:31 +00:00
Nadav Rotem
490052b3d2 Fix a dagcombine optimization. The optimization attempts to optimize a bitcast of fneg to integers
by xoring the high-bit. This fails if the source operand is a vector because we need to negate
each of the elements in the vector.

Fix rdar://12281066 PR13813.

llvm-svn: 163802
2012-09-13 14:54:28 +00:00
Craig Topper
a00400675a Teach DAG combiner to constant fold FABS of a BUILD_VECTOR of ConstantFPs. Factor similar code out of FNEG DAG combiner.
llvm-svn: 163587
2012-09-11 01:45:21 +00:00
James Molloy
fe38f1d2b0 Fix an assertion failure when optimising a shufflevector incorrectly into concat_vectors, and a followup bug with SelectionDAG::getNode() creating nodes with invalid types.
llvm-svn: 163511
2012-09-10 14:01:21 +00:00
Craig Topper
fb97f05d3c Teach DAG combiner to constant fold fneg of a BUILD_VECTOR of constants.
llvm-svn: 163483
2012-09-09 22:58:45 +00:00
Roman Divacky
a6678a5602 Constify this properly. Found by gcc48 -Wcast-qual.
llvm-svn: 163256
2012-09-05 22:15:49 +00:00
Silviu Baranga
6f46bb1705 Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value.
llvm-svn: 163203
2012-09-05 08:57:21 +00:00
Owen Anderson
27ba45c764 Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode.
llvm-svn: 163051
2012-09-01 06:04:27 +00:00
Michael Liao
2b9d290749 Fix typo
llvm-svn: 163049
2012-09-01 04:09:16 +00:00
Owen Anderson
d21ffd91bd Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants.
llvm-svn: 162956
2012-08-30 23:35:16 +00:00
Stepan Dyatkovskiy
56ead97c8d Rejected 169195. As Duncan commented, bitcasting to proper type is wrong approach. We need to insert some valid TRANCATE node here.
llvm-svn: 162354
2012-08-22 09:33:55 +00:00
Stepan Dyatkovskiy
d39f5417bb Fixed DAGCombiner bug (found and localized by James Malloy):
The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it
consists purely of get_vector_elts from one or two source vectors. If
so, it either makes a concat_vectors node or a shufflevector node.

However, it doesn't check the element type width of the underlying
vector, so if you have this sequence:

Node0: v4i16 = ...
Node1: i32 = extract_vector_elt Node0
Node2: i32 = extract_vector_elt Node0
Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...

It will attempt to:

Node0:    v4i16 = ...
NewNode1: v16i8 = concat_vectors Node0, ...

Where this is actually invalid because the element width is completely
different. This causes an assertion failure on DAG legalization stage.

Fix:
If output item type of BUILD_VECTOR differs from input item type.
Make concat_vectors based on input element type and then bitcast it to the output vector type. So the case described above will transformed to:
Node0:    v4i16 = ...
NewNode1: v8i16 = concat_vectors Node0, ...
NewNode2: v16i8 = bitcast NewNode1

llvm-svn: 162195
2012-08-20 07:57:06 +00:00
Owen Anderson
c5b77c0317 Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
llvm-svn: 161807
2012-08-13 23:32:49 +00:00
Elena Demikhovsky
0fec7026d9 Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Nadav Rotem
180a9e3758 Fixed DAGCombine optimizations which generate select_cc for targets
that do not support it (X86 does not lower select_cc).

PR: 13428

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160619
2012-07-23 07:59:50 +00:00
Bill Wendling
343eebdbe4 Remove tabs.
llvm-svn: 160475
2012-07-19 00:04:14 +00:00
Evan Cheng
5e82ad04d5 Back out r160101 and instead implement a dag combine to recover from instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
Nadav Rotem
ff9fadbd0a Refactor the code that checks that all operands of a node are UNDEFs.
Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160229
2012-07-15 08:38:23 +00:00
Nadav Rotem
a6dfd89cd0 Add a dagcombine optimization to convert concat_vectors of undefs into a single undef.
The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node.

llvm-svn: 160221
2012-07-14 21:30:27 +00:00
Owen Anderson
ebcd7c43c9 Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an MVT::i1, i.e. before type legalization.
This is a speculative fix for a problem on Mips reported by Akira Hatanaka.

llvm-svn: 160036
2012-07-11 06:38:55 +00:00
Nadav Rotem
5f6e9d5ffe Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.

llvm-svn: 159991
2012-07-10 13:25:08 +00:00
Owen Anderson
e0c4f94d45 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.

llvm-svn: 159957
2012-07-09 20:31:12 +00:00
Evan Cheng
652c7c94d4 Make sure type is not extended or untyped before create a constant of the type. No test case. Found by inspection.
llvm-svn: 159179
2012-06-26 01:19:33 +00:00
Lang Hames
68cf87e3ef Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't effect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.

llvm-svn: 158956
2012-06-22 01:09:09 +00:00
Pete Cooper
24914c84a0 Fix potential crash if DAGCombine on stores sees a half type
llvm-svn: 158927
2012-06-21 18:00:39 +00:00
Pete Cooper
b80e2722fa Add users of a MERGE_VALUE node to the worklist to process again when the node is removed. Sorry, no test case. Foudn it by inspection of the code
llvm-svn: 158839
2012-06-20 19:35:43 +00:00
Hal Finkel
f5b3193100 Fix DAGCombine to deal with ext-conversion of pre/post_inc loads.
The test case for this will come with the PPC indexed preinc loads commit.

llvm-svn: 158822
2012-06-20 15:42:48 +00:00
Lang Hames
f0b9601a6d Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.

llvm-svn: 158757
2012-06-19 22:51:23 +00:00
Lang Hames
a4512782f7 Make comment slightly more helpful.
llvm-svn: 158467
2012-06-14 20:37:15 +00:00
Owen Anderson
38e510d831 Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention.
llvm-svn: 157708
2012-05-30 18:54:50 +00:00
Owen Anderson
49ddb93c30 Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node.
llvm-svn: 157707
2012-05-30 18:50:39 +00:00
Jim Grosbach
15662155b6 DAGCombiner should not change the type of an extract_vector index.
When a combine twiddles an extract_vector, care should be take to preserve
the type of the index operand. No luck extracting a reasonable testcase,
unfortunately.

rdar://11391009

llvm-svn: 156419
2012-05-08 20:56:07 +00:00
Owen Anderson
8adb0322ce Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled.
llvm-svn: 156324
2012-05-07 20:51:25 +00:00
Owen Anderson
e3d41b44cc Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs.
llvm-svn: 156029
2012-05-02 22:17:40 +00:00
Owen Anderson
3cfe269707 Teach DAG combine that multiplication by 1.0 can always be constant folded.
llvm-svn: 156023
2012-05-02 21:32:35 +00:00
Elena Demikhovsky
35721fc4f8 ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Jakob Stoklund Olesen
1947930692 Register DAGUpdateListeners with SelectionDAG.
Instead of passing listener pointers to RAUW, let SelectionDAG itself
keep a linked list of interested listeners.

This makes it possible to have multiple listeners active at once, like
RAUWUpdateListener was already doing. It also makes it possible to
register listeners up the call stack without controlling all RAUW calls
below.

DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG
list of active listeners.

llvm-svn: 155248
2012-04-20 22:08:46 +00:00
Hal Finkel
457fbe481c Remove dead SD nodes after the combining pass. Fixes PR12201.
llvm-svn: 154786
2012-04-16 03:33:22 +00:00
Nadav Rotem
b05ea8c9af Reapply 154397. Original message:
Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is ture for SSE/AVX/NEON but not
for all targets.

llvm-svn: 154490
2012-04-11 08:26:11 +00:00
Duncan Sands
6d360055c5 Add a comment noting that the fdiv -> fmul conversion won't generate
multiplication by a denormal, and some tests checking that.

llvm-svn: 154431
2012-04-10 20:35:27 +00:00
Owen Anderson
540d48ddb5 Revert r154397, which was causing make check failures on the buildbots.
llvm-svn: 154414
2012-04-10 18:02:12 +00:00
Nadav Rotem
e5008bb774 Fix a dagcombine optimization which assumes that the vsetcc result type is always
of the same size as the compared values. This is ture for SSE/AVX/NEON but not
for all targets.

llvm-svn: 154397
2012-04-10 14:58:31 +00:00
Anton Korobeynikov
0fc5fe0430 Transform div to mul with reciprocal only when fp imm is legal.
This fixes PR12516 and uncovers one weird problem in legalize (workarounded)

llvm-svn: 154394
2012-04-10 13:22:49 +00:00
Rafael Espindola
9febd1fbf7 Don't try to zExt just to check if an integer constant is zero, it might
not fit in a i64.

llvm-svn: 154364
2012-04-10 00:16:22 +00:00
Rafael Espindola
6b7bf4d0aa Pattern match a setcc of boolean value with 0 as a truncate.
llvm-svn: 154322
2012-04-09 16:06:03 +00:00
Craig Topper
b06257c64d Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out.
llvm-svn: 154309
2012-04-09 07:19:09 +00:00
Craig Topper
a248e92058 Remove unnecessary 'else' on an 'if' that always returns
llvm-svn: 154308
2012-04-09 05:59:53 +00:00
Craig Topper
ee38217fe4 Optimize code slightly. No functionality change.
llvm-svn: 154307
2012-04-09 05:55:33 +00:00
Craig Topper
24c4646a77 Replace some explicit checks with asserts for conditions that should never happen.
llvm-svn: 154305
2012-04-09 05:16:56 +00:00
Benjamin Kramer
e99c184047 Silence sign-compare warning.
llvm-svn: 154297
2012-04-08 19:04:45 +00:00
Duncan Sands
28b9aa998e Only have codegen turn fdiv by a constant into fmul by the reciprocal
when -ffast-math, i.e. don't just always do it if the reciprocal can
be formed exactly.  There is already an IR level transform that does
that, and it does it more carefully.

llvm-svn: 154296
2012-04-08 18:08:12 +00:00
Nadav Rotem
37734277f0 1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new
shuffle node because it could introduce new shuffle nodes that were not
   supported efficiently by the target.

2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
   second shuffle reverses the transformation of the first shuffle.

llvm-svn: 154266
2012-04-07 21:19:08 +00:00
Duncan Sands
cd52f3d447 Convert floating point division by a constant into multiplication by the
reciprocal if converting to the reciprocal is exact.  Do it even if inexact
if -ffast-math.  This substantially speeds up ac.f90 from the polyhedron
benchmarks.

llvm-svn: 154265
2012-04-07 20:04:00 +00:00
Rafael Espindola
88a1aeb123 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Owen Anderson
157487e7c5 Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do.
llvm-svn: 153901
2012-04-02 22:10:29 +00:00
Nadav Rotem
a9ec0e024f Optimizing swizzles of complex shuffles may generate additional complex shuffles.
Do not try to optimize swizzles of shuffles if the source shuffle has more than
a single user, except when the source shuffle is also a swizzle.

llvm-svn: 153864
2012-04-02 07:11:12 +00:00
Nadav Rotem
2729f54295 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret

llvm-svn: 153848
2012-04-01 19:31:22 +00:00
Chris Lattner
bab5825de7 fix what looks like a real logic bug, found by PVS-Studio (part of PR12357)
llvm-svn: 153513
2012-03-27 16:27:21 +00:00
Craig Topper
6bea69c868 When combining (vextract shuffle (load ), <1,u,u,u>), 0) -> (load ), add users of the final load to the worklist too. Needed by changes I'm preparing to make to X86 backend.
llvm-svn: 153078
2012-03-20 05:28:39 +00:00
Duncan Sands
725ed6d921 Fix DAG combine which creates illegal vector shuffles. Patch by Heikki Kultala.
llvm-svn: 153035
2012-03-19 15:35:44 +00:00
Nadav Rotem
8cf9105f96 When optimizing certain BUILD_VECTOR nodes into other BUILD_VECTOR nodes, add the new node into the work list because there is a potential for further optimizations.
llvm-svn: 152784
2012-03-15 08:49:06 +00:00
Bill Wendling
c42492d6fa Add a xform to the DAG combiner.
Transform:

        (fsub x, (fadd x, y)) -> (fneg y) and
        (fsub x, (fadd y, x)) -> (fneg y)

if 'unsafe math' is specified.
<rdar://problem/7540295>

llvm-svn: 152777
2012-03-15 05:12:00 +00:00
Evan Cheng
ebe82c24b5 Fortify r152675 a bit. Although I'm not able to come up with a test case that would trigger the truncation case.
llvm-svn: 152678
2012-03-13 22:16:11 +00:00
Evan Cheng
155a7230b7 DAG combine incorrectly optimize (i32 vextract (v4i16 load $addr), c) to
(i16 load $addr+c*sizeof(i16)) and replace uses of (i32 vextract) with the
i16 load. It should issue an extload instead: (i32 extload $addr+c*sizeof(i16)).

rdar://11035895

llvm-svn: 152675
2012-03-13 22:00:52 +00:00
Benjamin Kramer
d2881e4edc Give dagcombiner's worklist some inline capacity.
llvm-svn: 152454
2012-03-10 00:23:58 +00:00
Evan Cheng
f04f2e7a52 Extend r148086 to check for [r +/- reg] address mode. This fixes queens performance regression (due to increased register pressure from overly aggressive pre-inc formation).
llvm-svn: 152162
2012-03-06 23:33:32 +00:00
Owen Anderson
23d0deb35a Make it possible for a target to mark FSUB as Expand. This requires providing a default expansion (FADD+FNEG), and teaching DAGCombine not to form FSUBs post-legalize if they are not legal.
llvm-svn: 152079
2012-03-06 00:29:31 +00:00
James Molloy
9963b8be92 Teach the DAGCombiner that certain loadext nodes followed by ANDs can be converted to zeroexts.
llvm-svn: 150957
2012-02-20 12:02:38 +00:00
James Molloy
29c431b327 Remove extraneous #include and spelling mistake introduced in r150669.
llvm-svn: 150670
2012-02-16 09:48:07 +00:00
James Molloy
e1a6a76cda Modify the algorithm when traversing the DAGCombiner's worklist to be O(log N) for all operations. This fixes a horrible worst case with lots of nodes where 99% of the time was being spent in std::remove.
llvm-svn: 150669
2012-02-16 09:17:04 +00:00
Nadav Rotem
2141a8413e Fix a bug in DAGCombine for the optimization of BUILD_VECTOR. We cant generate a shuffle node from two vectors of different types.
llvm-svn: 150383
2012-02-13 12:42:26 +00:00
Nadav Rotem
ea4aecb3e5 This patch addresses the problem of poor code generation for the zext
v8i8 -> v8i32 on AVX machines. The codegen often scalarizes ANY_EXTEND nodes.
The DAGCombiner has two optimizations that can mitigate the problem. First,
if all of the operands of a BUILD_VECTOR node are extracted from an ZEXT/ANYEXT
nodes, then it is possible to create a new simplified BUILD_VECTOR which uses
UNDEFS/ZERO values to eliminate the scalar ZEXT/ANYEXT nodes.
Second, another dag combine optimization lowers BUILD_VECTOR into a shuffle
vector instruction.

In the case of zext v8i8->v8i32 on AVX, a value in an XMM register is to be
shuffled into a wide YMM register.

This patch modifes the second optimization and allows the creation of
shuffle vectors even when the newly generated vector and the original vector
from which we extract the values are of different types.

llvm-svn: 150340
2012-02-12 15:05:31 +00:00
Nadav Rotem
738ed9b0a5 Add additional documentation to the extract-and-trunc dagcombine optimization.
llvm-svn: 149823
2012-02-05 11:39:23 +00:00
Nadav Rotem
5c5681cf27 The type-legalizer often scalarizes code. One of the common patterns is extract-and-truncate.
In this patch we optimize this pattern and convert the sequence into extract op of a narrow type.
This allows the BUILD_VECTOR dag optimizations to construct efficient shuffle operations in many cases.

llvm-svn: 149692
2012-02-03 13:18:25 +00:00
Nadav Rotem
c23e698b5c Transform: (EXTRACT_VECTOR_ELT( VECTOR_SHUFFLE )) -> EXTRACT_VECTOR_ELT.
llvm-svn: 148337
2012-01-17 21:44:01 +00:00
Craig Topper
cdd2adc29c Teach DAG combiner to turn a BUILD_VECTOR of UNDEFs into an UNDEF of vector type.
llvm-svn: 148297
2012-01-17 09:09:48 +00:00
Benjamin Kramer
3db5296619 DAGCombiner: Deduplicate code.
llvm-svn: 148217
2012-01-15 11:50:43 +00:00
Evan Cheng
e31c929c2d DAGCombine's logic for forming pre- and post- indexed loads / stores were being
overly conservative. It was concerned about cases where it would prohibit
folding simple [r, c] addressing modes. e.g.
  ldr r0, [r2]
  ldr r1, [r2, #4]
=>
  ldr r0, [r2], #4
  ldr r1, [r2]
Change the logic to look for such cases which allows it to form indexed memory
ops more aggressively.

rdar://10674430

llvm-svn: 148086
2012-01-13 01:37:24 +00:00
Chandler Carruth
b3371fa250 Teach the X86 instruction selection to do some heroic transforms to
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

  unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

  *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.

llvm-svn: 147936
2012-01-11 08:41:08 +00:00
Craig Topper
66a349f55d Replace some uses of hasNUsesOfValue(0, X) with !hasAnyUseOfValue(X)
llvm-svn: 147733
2012-01-07 18:31:09 +00:00
Craig Topper
8648954580 Add some DAG combines for SUBC/SUBE. If nothing uses the carry/borrow out of subc, turn it into a sub. Turn (subc x, x) into 0 with no borrow. Turn (subc x, 0) into x with no borrow. Turn (subc -1, x) into (xor x, -1) with no borrow. Turn sube with no borrow in into subc.
llvm-svn: 147728
2012-01-07 09:06:39 +00:00
Chandler Carruth
90f124f420 Prevent a DAGCombine from firing where there are two uses of
a combined-away node and the result of the combine isn't substantially
smaller than the input, it's just canonicalized. This is the first part
of a significant (7%) performance gain for Snappy's hot decompression
loop.

llvm-svn: 147604
2012-01-05 11:05:55 +00:00
Craig Topper
16ee6a6196 Implement VECTOR_SHUFFLE canonicalizations during DAG combine.
llvm-svn: 147525
2012-01-04 08:07:43 +00:00
Eli Friedman
064187912e Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2.
llvm-svn: 147283
2011-12-26 22:49:32 +00:00
Chandler Carruth
e0484f6b37 Initial CodeGen support for CTTZ/CTLZ where a zero input produces an
undefined result. This adds new ISD nodes for the new semantics,
selecting them when the LLVM intrinsic indicates that the undef behavior
is desired. The new nodes expand trivially to the old nodes, so targets
don't actually need to do anything to support these new nodes besides
indicating that they should be expanded. I've done this for all the
operand types that I could figure out for all the targets. Owners of
various targets, please review and let me know if any of these are
incorrect.

Note that the expand behavior is *conservatively correct*, and exactly
matches LLVM's current behavior with these operations. Ideally this
patch will not change behavior in any way. For example the regtest suite
finds the exact same instruction sequences coming out of the code
generator. That's why there are no new tests here -- all of this is
being exercised by the existing test suite.

Thanks to Duncan Sands for reviewing the various bits of this patch and
helping me get the wrinkles ironed out with expanding for each target.
Also thanks to Chris for clarifying through all the discussions that
this is indeed the approach he was looking for. That said, there are
likely still rough spots. Further review much appreciated.

llvm-svn: 146466
2011-12-13 01:56:10 +00:00
Eli Friedman
e74e55c372 Zap unnecessary isIntDivCheap() check. PR11485. No testcase because this doesn't affect any in-tree target.
llvm-svn: 146015
2011-12-07 03:55:52 +00:00
Eli Friedman
5545db0906 Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494.
llvm-svn: 145996
2011-12-07 00:11:56 +00:00
Nick Lewycky
7d0d3c2d58 Move global variables in TargetMachine into new TargetOptions class. As an API
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.

One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.

llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Evan Cheng
1435aa5fdc Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead.
Conservatively returns zero when the GV does not specify an alignment nor is it
initialized. Previously it returns ABI alignment for type of the GV. However, if
the type is a "packed" type, then the under-specified alignments is attached to
the load / store instructions. In that case, the alignment of the type cannot be
trusted.
rdar://10464621

llvm-svn: 145300
2011-11-28 22:37:34 +00:00
Evan Cheng
567aa3dfb3 DAG combine should not increase alignment of loads / stores with alignment less
than ABI alignment. These are loads / stores from / to "packed" data structures.
Their alignments are intentionally under-specified.

rdar://10301431

llvm-svn: 145273
2011-11-28 20:42:56 +00:00
Eli Friedman
51adc2ea5a Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
llvm-svn: 144863
2011-11-16 23:50:22 +00:00
Jay Foad
ce22ec7def Remove some unnecessary includes of PseudoSourceValue.h.
llvm-svn: 144634
2011-11-15 07:50:46 +00:00
Eli Friedman
8563e57e38 Don't try to form pre/post-indexed loads/stores until after LegalizeDAG runs. Fixes PR11029.
llvm-svn: 144438
2011-11-12 00:35:34 +00:00
Lang Hames
ee7de1cff0 Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported.
Add support for trimming constants to GetDemandedBits. This fixes some funky
constant generation that occurs when stores are expanded for targets that don't
support unaligned stores natively.

llvm-svn: 144102
2011-11-08 18:56:23 +00:00
Pete Cooper
224434deec Added invariant field to the DAG.getLoad method and changed all calls.
When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses

llvm-svn: 144100
2011-11-08 18:42:53 +00:00
Richard Osborne
87ed868306 Don't introduce custom nodes after legalization in TargetLowering::BuildSDIV()
and TargetLowering::BuildUDIV(). Fixes PR11283

llvm-svn: 143964
2011-11-07 17:09:05 +00:00
Nadav Rotem
7bfd1f069d Cleanup. Document. Make sure that this build_vector optimization only runs before the op legalizer and that the used type is legal.
llvm-svn: 143358
2011-10-31 20:08:25 +00:00
Benjamin Kramer
eb62811647 Silence compiler warning.
llvm-svn: 143308
2011-10-30 08:39:55 +00:00
Nadav Rotem
6c79131e39 Add a new DAGCombine optimization for BUILD_VECTOR.
If all of the inputs are zero/any_extended, create a new simple BV
which can be further optimized by other BV optimizations.

llvm-svn: 143297
2011-10-29 21:23:04 +00:00
Eli Friedman
76e3969f05 Don't crash on 128-bit sdiv by constant. Found by inspection.
llvm-svn: 143095
2011-10-27 02:06:39 +00:00
Eli Friedman
b36470a3ea Remove a couple redundant checks.
llvm-svn: 142959
2011-10-25 20:34:22 +00:00
Bob Wilson
0273c767c8 Fix a DAG combiner assertion failure when constant folding BUILD_VECTORS.
svn r139159 caused SelectionDAG::getConstant() to promote BUILD_VECTOR operands
with illegal types, even before type legalization.  For this testcase, that led
to one BUILD_VECTOR with i16 operands and another with promoted i32 operands,
which triggered the assertion.

llvm-svn: 142370
2011-10-18 17:34:47 +00:00
Dan Gohman
d63418e497 Fix SimplifySelectCC to add newly created nodes to the DAGCombiner
worklist, as it may be possible to perform further optimization on them.

llvm-svn: 140349
2011-09-22 23:01:29 +00:00
Bruno Cardoso Lopes
1ffbef8ad1 Add a DAGCombine for subvector extracts to remove useless chains of
subvector inserts and extracts. Initial patch by Rackover, Zvi with
some tweak done by me.

llvm-svn: 140204
2011-09-20 23:19:33 +00:00
Eli Friedman
4bae1c4f70 Make the SelectionDAG verify that all the operands of BUILD_VECTOR have the same type. Teach DAGCombiner::visitINSERT_VECTOR_ELT not to make invalid BUILD_VECTORs. Fixes PR10897.
llvm-svn: 139407
2011-09-09 21:04:06 +00:00
Duncan Sands
d1311488fe Add codegen support for vector select (in the IR this means a select
with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons.  Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all").  Patch mostly by
Nadav Rotem.

llvm-svn: 139159
2011-09-06 19:07:46 +00:00
Benjamin Kramer
cca4305d51 Roll back the rest of r126557. It's a hack that will break in some obscure cases.
llvm-svn: 138130
2011-08-19 22:39:31 +00:00
Nadav Rotem
5703db96f2 Revert r137310 because it does not optimize any code on ToT
llvm-svn: 137466
2011-08-12 17:15:04 +00:00
Nadav Rotem
b738be4555 [AVX] When joining two XMM registers into a YMM register, make sure that the
lower XMM register gets in first. This will allow the SUBREG pattern to
elliminate the first vector insertion. 

llvm-svn: 137310
2011-08-11 16:49:36 +00:00
Eli Friedman
99fd6d41b5 Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476.
llvm-svn: 135993
2011-07-25 22:25:42 +00:00
Chris Lattner
e1fe7061ce land David Blaikie's patch to de-constify Type, with a few tweaks.
llvm-svn: 135375
2011-07-18 04:54:35 +00:00
Eric Christopher
e84fe67d4f Add a dag combine pattern for folding C2-(A+C1) -> (C2-C1)-A
Fixes rdar://9761830

llvm-svn: 135123
2011-07-14 01:12:15 +00:00
Lang Hames
9e52663aa4 Add functions 'hasPredecessor' and 'hasPredecessorHelper' to SDNode. The
hasPredecessorHelper function allows predecessors to be cached to speed up
repeated invocations. This fixes PR10186.

X.isPredecessorOf(Y) now just calls Y.hasPredecessor(X)

Y.hasPredecessor(X) calls Y.hasPredecessorHelper(X, Visited, Worklist) with
empty Visited and Worklist sets (i.e. no caching over invocations).

Y.hasPredecessorHelper(X, Visited, Worklist) caches search state in Visited
and Worklist to speed up repeated calls. The Visited set is searched for X
before going to the worklist to further search the DAG if necessary.

llvm-svn: 134592
2011-07-07 04:31:51 +00:00
Benjamin Kramer
cc91642a94 Revert a part of r126557 which could create unschedulable DAGs.
llvm-svn: 134067
2011-06-29 13:47:25 +00:00
Jay Foad
888cd746b8 Replace the existing forms of ConstantArray::get() with a single form
that takes an ArrayRef.

llvm-svn: 133615
2011-06-22 09:24:39 +00:00
Evan Cheng
40adfc21f6 Teach dag combine to match halfword byteswap patterns.
1. (((x) & 0xFF00) >> 8) | (((x) & 0x00FF) << 8)
   => (bswap x) >> 16
2. ((x&0xff)<<8)|((x&0xff00)>>8)|((x&0xff000000)>>8)|((x&0x00ff0000)<<8))
   => (rotl (bswap x) 16)

This allows us to eliminate most of the def : Pat patterns for ARM rev16
revsh instructions. It catches many more cases for ARM and x86.

rdar://9609108

llvm-svn: 133503
2011-06-21 06:01:08 +00:00
Nick Lewycky
f4886c7374 Add a DAGCombine for (ext (binop (load x), cst)).
llvm-svn: 133124
2011-06-16 01:15:49 +00:00
Nadav Rotem
fbf446f6cd Enable the simplification of truncating-store after fixing the usage of
GetDemandBits (which must operate on the vector element type).

Fix the a usage of getZeroExtendInReg which must also be done on scalar types.

llvm-svn: 133052
2011-06-15 11:19:12 +00:00
Chad Rosier
30333c668f When pattern matching during instruction selection make sure shl x,1 is not
converted to add x,x if x is a undef.  add undef, undef does not guarantee
that the resulting low order bit is zero.
Fixes <rdar://problem/9453156> and <rdar://problem/9487392>.

llvm-svn: 133022
2011-06-14 22:29:10 +00:00
Nadav Rotem
db789f2d76 Disable trunc-store simplification on vectors.
llvm-svn: 132984
2011-06-14 07:18:26 +00:00
Eli Friedman
14c6ce9041 Change this DAGCombine to build AND of SHR instead of SHR of AND; this matches the ordering we prefer in instcombine. Part of rdar://9562809.
The potential DAGCombine which enforces this more generally messes up some other very fragile patterns, so I'm leaving that alone, at least for now.

llvm-svn: 132809
2011-06-09 22:14:44 +00:00
Devang Patel
e829168e05 Revert 121907 (it causes llc crash) and apply original patch from PR9817.
llvm-svn: 131926
2011-05-23 22:04:42 +00:00
Benjamin Kramer
df3070e83d Implement mulo x, 2 -> addo x, x in DAGCombiner.
llvm-svn: 131800
2011-05-21 18:31:55 +00:00
Dan Gohman
5d23a937f7 Misc. code cleanups.
llvm-svn: 131495
2011-05-17 22:20:36 +00:00
Nadav Rotem
57dd315a3b Fixes a bug in the DAGCombiner. LoadSDNodes have two values (data, chain).
If there is a store after the load node, then there is a chain, which means
that there is another user. Thus, asking hasOneUser would fail. Instead we
ask hasNUsesOfValue on the 'data' value.

llvm-svn: 131183
2011-05-11 14:40:50 +00:00
Duncan Sands
afe16d8de1 Indent properly, no functionality change.
llvm-svn: 131082
2011-05-09 08:03:33 +00:00
Eli Friedman
2798137293 PR9055: extend the fix to PR4050 (r70179) to apply to zext and anyext.
Returning a new node makes the code try to replace the old node, which
in the included testcase is killed by CSE.

llvm-svn: 129650
2011-04-16 23:25:34 +00:00
Owen Anderson
0ce6c0f86e Fix another instance of the DAG combiner not using the correct type for the RHS of a shift.
llvm-svn: 129522
2011-04-14 17:30:49 +00:00
Chris Lattner
badb8ca63c have dag combine zap "store undef", which can be formed during call lowering
with undef arguments.

llvm-svn: 129185
2011-04-09 02:32:02 +00:00
Cameron Zwarich
2748634089 Add a RemoveFromWorklist method to DCI. This is needed to do some complicated
transformations in target-specific DAG combines without causing DAGCombiner to
delete the same node twice. If you know of a better way to avoid this (see my
next patch for an example), please let me know.

llvm-svn: 128758
2011-04-02 02:40:26 +00:00
Evan Cheng
d5d2d4a158 Avoid replacing the value of a directly stored load with the stored value if the load is indexed. rdar://9117613.
llvm-svn: 127440
2011-03-11 00:48:56 +00:00
Stuart Hastings
a40e563e4e Can't introduce floating-point immediate constants after legalization.
Radar 9056407.

llvm-svn: 126864
2011-03-02 19:36:30 +00:00
Nadav Rotem
961627db07 Fix typos in the comments.
llvm-svn: 126565
2011-02-27 07:40:43 +00:00
Benjamin Kramer
412ffed4f0 Add some DAGCombines for (adde 0, 0, glue), which are useful to optimize legalized code for large integer arithmetic.
1. Inform users of ADDEs with two 0 operands that it never sets carry
2. Fold other ADDs or ADDCs into the ADDE if possible

It would be neat if we could do the same thing for SETCC+ADD eventually, but we can't do that in target independent code.

llvm-svn: 126557
2011-02-26 22:48:07 +00:00
Owen Anderson
bd26993873 Allow targets to specify a the type of the RHS of a shift parameterized on the type of the LHS.
llvm-svn: 126518
2011-02-25 21:41:48 +00:00
Nadav Rotem
ab7cf630f4 Enable support for vector sext and trunc:
Limit the folding of any_ext and sext  into the load operation to scalars.
Limit the active-bits trunc optimization to scalars.
Document vector trunc and vector sext in LangRef.

Similar to commit 126080 (for enabling zext).

llvm-svn: 126424
2011-02-24 21:01:34 +00:00
Nadav Rotem
1660c0bc25 Fix 9267; Add vector zext support.
The DAGCombiner folds the zext into complex load instructions. This patch
prevents this optimization on vectors since none of the supported targets
knows how to perform load+vector_zext in one instruction.

llvm-svn: 126080
2011-02-20 12:37:50 +00:00
Stuart Hastings
47e45a32a8 Swap VT and DebugLoc operands of getExtLoad() for consistency with
other getNode() methods.  Radar 9002173.

llvm-svn: 125665
2011-02-16 16:23:55 +00:00
Eric Christopher
25956e0c47 Refactor zero folding slightly. Clean up todo.
llvm-svn: 125651
2011-02-16 04:50:12 +00:00
Eric Christopher
8965ba39fb The change for PR9190 wasn't quite right. We need to avoid making the
transformation if we can't legally create a build vector of the correct
type. Check that we can make the transformation first, and add a TODO to
refactor this code with similar cases.

Fixes: PR9223 and rdar://9000350
llvm-svn: 125631
2011-02-16 01:10:03 +00:00
Chris Lattner
c9c0de6faf Revisit my fix for PR9028: the issue is that DAGCombine was
generating i8 shift amounts for things like i1024 types.  Add
an assert in getNode to prevent this from occuring in the future,
fix the buggy transformation, revert my previous patch, and
document this gotcha in ISDOpcodes.h

llvm-svn: 125465
2011-02-13 19:09:16 +00:00
Nadav Rotem
367d73ccb6 A fix for 9165.
The DAGCombiner created illegal BUILD_VECTOR operations.
The patch added a check that either illegal operations are
allowed or that the created operation is legal.

llvm-svn: 125435
2011-02-12 14:40:33 +00:00
Nadav Rotem
19fb107f1e SimplifySelectOps can only handle selects with a scalar condition. Add a check
that the condition is not a vector.

llvm-svn: 125398
2011-02-11 19:57:47 +00:00
Nadav Rotem
7cd5180574 Fix #9190
The bug happens when the DAGCombiner attempts to optimize one of the patterns
of the SUB opcode. It tries to create a zero of type v2i64. This type is legal
on 32bit machines, but the initializer of this vector (i64) is target dependent.
Currently, the initializer attempts to create an i64 zero constant, which fails.
Added a flag to tell the DAGCombiner to create a legal zero, if we require that
the pass would generate legal types.

llvm-svn: 125391
2011-02-11 19:20:37 +00:00
Evan Cheng
c7ce7e2ac3 Given a pair of floating point load and store, if there are no other uses of
the load, then it may be legal to transform the load and store to integer
load and store of the same width.

This is done if the target specified the transformation as profitable. e.g.
On arm, this can transform:
vldr.32 s0, []
vstr.32 s0, []

to

ldr r12, []
str r12, []

rdar://8944252

llvm-svn: 124708
2011-02-02 01:06:55 +00:00