mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 03:53:04 +02:00
Commit Graph

51317 Commits

Author SHA1 Message Date
Craig Topper
0deee76383 Remove unused parameters from the AVX maskmov classes.
llvm-svn: 144985
2011-11-19 04:49:22 +00:00
Andrew Trick
fe5f7fc3b8 Fix a corner case in updating LoopInfo after fully unrolling an outer loop.
The loop tree's inclusive block lists are painful and expensive to
update. (I have no idea why they're inclusive). The design was
supposed to handle this case but the implementation missed it and my
unit tests weren't thorough enough.

Fixes PR11335: loop unroll update.

llvm-svn: 144970
2011-11-18 03:42:41 +00:00
Nadav Rotem
08f8a75c2c Add AVX2 vpbroadcast support
llvm-svn: 144967
2011-11-18 02:49:55 +00:00
Kostya Serebryany
3a83736893 [asan] workaround for reg alloc bug 11395: don't instrument functions with large chunks of inline assembler
llvm-svn: 144962
2011-11-18 01:41:06 +00:00
Chad Rosier
70dab03f8e Guard call to getRegForValue with isTypeLegal check to avoid unnecessary work/dead code.
llvm-svn: 144959
2011-11-18 01:17:34 +00:00
Devang Patel
a0973b0c53 DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or add DISignedSubrange.
llvm-svn: 144937
2011-11-17 23:43:15 +00:00
Kostya Serebryany
6081213d59 quick fix: remove GlobalVariable::GlobalVariable mistakenly committed at r144933. For some reason this compiles on linux
llvm-svn: 144936
2011-11-17 23:37:53 +00:00
Andrew Trick
7dc21d8c0e Fix an overly general check in SimplifyIndvar to handle useless phi cycles.
The right way to check for a binary operation is
cast<BinaryOperator>. The original check: cast<Instruction> &&
numOperands() == 2 would match phi "instructions", leading to an
infinite loop in an extreme corner case: a useless phi with operands
[self, constant] that prior optimization passes failed to remove,
being used in the loop by another useless phi, in turn being used by an
lshr or udiv.

Fixes PR11350: runaway iteration assertion.

llvm-svn: 144935
2011-11-17 23:36:35 +00:00
Kostya Serebryany
3b8d362511 fall back to explicit list of allowed linkages when instrumenting globals in asan; add a test check that asan does not touch linkonce_odr
llvm-svn: 144933
2011-11-17 23:14:59 +00:00
Chad Rosier
7d2af13ccb Add TODO comment.
llvm-svn: 144920
2011-11-17 21:46:13 +00:00
Craig Topper
7297509c73 Fix SSE/AVX integer comparison patterns to understand that all integer vector loads are promoted to i64 vector loads so patterns need a bitconvert. Also slightly simplify the AVX2 variable shift patterns by using the predefined bitconvert pattern fragments.
llvm-svn: 144896
2011-11-17 07:49:38 +00:00
Chad Rosier
47928b03f3 Dead code.
llvm-svn: 144888
2011-11-17 07:24:49 +00:00
Chad Rosier
2673f8862f When fast iseling a GEP, accumulate the offset rather than emitting a series of
ADDs.  MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD.  Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, too low of a threshold we'll miss opportunities to 
coalesce ADDs.
rdar://10412592

llvm-svn: 144886
2011-11-17 07:15:58 +00:00
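The accumulate-then-flush idea described in this commit message can be sketched outside of LLVM. This is a minimal model, not the fast-isel code: `countAddsForOffsets` and its parameters are hypothetical names, and the flush policy (emit one ADD and reset once the running total reaches `MaxOffs`) is an assumption based on the message's description of `MaxOffs` as a threshold.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: counts how many ADDs would be emitted when constant
// GEP offsets are folded into a running total instead of one ADD per index.
unsigned countAddsForOffsets(const std::vector<int64_t> &Offsets,
                             int64_t MaxOffs) {
  assert(MaxOffs > 0 && "threshold must be positive");
  unsigned NumAdds = 0;
  int64_t TotalOffs = 0;
  for (int64_t Off : Offsets) {
    TotalOffs += Off;
    // Assumed policy: once the folded offset reaches the threshold,
    // emit a single ADD for everything accumulated so far and start over.
    if (TotalOffs >= MaxOffs) {
      ++NumAdds;
      TotalOffs = 0;
    }
  }
  // Whatever is left over still needs one final ADD.
  if (TotalOffs != 0)
    ++NumAdds;
  return NumAdds;
}
```

With a threshold of 2048, the offsets 4, 8 and 12 collapse into a single ADD, whereas emitting one ADD per index would have produced three.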
Craig Topper
4d39196041 Remove seemingly unnecessary duplicate VROUND definitions.
llvm-svn: 144885
2011-11-17 07:04:00 +00:00
Eli Friedman
d02d82d355 Add support for custom names for library functions in TargetLibraryInfo. Add a custom name for fwrite and fputs on x86-32 OSX. Make SimplifyLibCalls honor the custom
names for fwrite and fputs.

Fixes <rdar://problem/9815881>.

llvm-svn: 144876
2011-11-17 01:27:36 +00:00
Chad Rosier
c9ed1d9072 Don't unconditionally set the kill flag.
rdar://10456186

llvm-svn: 144872
2011-11-17 01:16:53 +00:00
Eli Friedman
1b0e97d7ab Turn on vzeroupper insertion on call boundaries for AVX; it works as far as I know, and I'd like to see wider testing.
llvm-svn: 144867
2011-11-17 00:21:52 +00:00
Eli Friedman
51adc2ea5a Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
llvm-svn: 144863
2011-11-16 23:50:22 +00:00
Michael J. Spencer
346238dcfc Object/COFF: Support common symbols.
llvm-svn: 144861
2011-11-16 23:36:12 +00:00
Jim Grosbach
fe5f0cfa29 Generalize the fixup info for ARM mode.
We don't (yet) have the granularity in the fixups to be specific about which
bitranges are affected. That's a future cleanup, but we're not there yet.

llvm-svn: 144852
2011-11-16 22:48:37 +00:00
Akira Hatanaka
4068c25521 Lower 64-bit constant pool node.
llvm-svn: 144849
2011-11-16 22:44:38 +00:00
Akira Hatanaka
5f347432ca Lower 64-bit block address.
llvm-svn: 144847
2011-11-16 22:42:10 +00:00
Jim Grosbach
a14a3d22b1 Fix encoding of NOP used for padding in ARM mode .align.
llvm-svn: 144842
2011-11-16 22:40:25 +00:00
Akira Hatanaka
ca79236173 Add patterns for 64-bit tglobaladdr, tblockaddress, tjumptable and tconstpool
nodes.

llvm-svn: 144841
2011-11-16 22:39:56 +00:00
Akira Hatanaka
bb56ec5caf 64-bit jump register instruction.
llvm-svn: 144840
2011-11-16 22:36:01 +00:00
Evan Cheng
5bae2333cb Another missing X86ISD::MOVLPD pattern. rdar://10450317
llvm-svn: 144839
2011-11-16 22:24:44 +00:00
Jim Grosbach
8c8d091ecc ARM assembly parsing for shifted register operands for MOV instruction.
llvm-svn: 144837
2011-11-16 21:50:05 +00:00
Jim Grosbach
6007a95d57 Clean up debug printing of ARM shifted operands.
llvm-svn: 144836
2011-11-16 21:46:50 +00:00
Chad Rosier
36cc01dbd3 Add fast-isel stats to determine who's doing all the work, the
target-independent selector or the target-specific selector.

llvm-svn: 144833
2011-11-16 21:05:28 +00:00
Chad Rosier
bf857d6eaa Fix the stats collection for fast-isel. The failed count was only accounting
for a single miss and not all predecessor instructions that get selected by
the selection DAG instruction selector.  This is still not exact (e.g., it
overstates misses when folded/dead instructions are present), but it is a step in
the right direction.

llvm-svn: 144832
2011-11-16 21:02:08 +00:00
Jim Grosbach
e9b1f2aead ARM assembly two operand forms for LSR, ASR, LSL, ROR register.
llvm-svn: 144814
2011-11-16 19:12:24 +00:00
Jim Grosbach
18844fca8d ARM assembly parsing for RRX mnemonic.
rdar://9704684

llvm-svn: 144812
2011-11-16 19:05:59 +00:00
Pete Cooper
4f4a9794b2 Added missing comment about new custom lowering of DEC64
llvm-svn: 144811
2011-11-16 19:03:23 +00:00
Evan Cheng
5242b6aaa1 Disable expensive two-address optimizations at -O0. rdar://10453055
llvm-svn: 144806
2011-11-16 18:44:48 +00:00
Chad Rosier
69f9c432a6 Check to make sure we can select the instruction before trying to put the
operands into a register.  Otherwise, we may materialize dead code.

llvm-svn: 144805
2011-11-16 18:39:44 +00:00
Evan Cheng
cbfacfdb63 Disable the assertion again. Looks like fastisel is still generating bad kill markers.
llvm-svn: 144804
2011-11-16 18:32:14 +00:00
Jim Grosbach
acb7c2d555 ARM mode aliases for bitwise instructions w/ register operands.
rdar://9704684

llvm-svn: 144803
2011-11-16 18:31:45 +00:00
Bob Wilson
70092e2b0a Fix tablegen warning: hasSideEffects is inferred for eh_sjlj_dispatchsetup.
llvm-svn: 144798
2011-11-16 17:09:59 +00:00
NAKAMURA Takumi
9a0c941fff lib/Target/ARM/CMakeLists.txt: Disable optimization in ARMISelLowering.cpp also on MSC15(aka VS9). Seems miscompiled.
llvm-svn: 144794
2011-11-16 09:18:28 +00:00
Evan Cheng
2b239cbcf6 Sink codegen optimization level into MCCodeGenInfo along side relocation model
and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.

llvm-svn: 144788
2011-11-16 08:38:26 +00:00
Bob Wilson
dbab14b8ea Record landing pads with a SmallSetVector to avoid multiple entries.
There may be many invokes that share one landing pad, and the previous code
would record the landing pad once for each invoke.  Besides the wasted
effort, a pair of volatile loads gets inserted every time the landing pad is
processed.  The rest of the code can get optimized away when a landing pad
is processed repeatedly, but the volatile loads remain, resulting in code like:

LBB35_18:
Ltmp483:
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r4, [r7, #-72]
        ldr     r2, [r7, #-68]

llvm-svn: 144787
2011-11-16 07:57:21 +00:00
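The deduplication described in this commit message relies on a container that keeps insertion order while rejecting repeats. The sketch below imitates that behaviour with standard containers only. It is not LLVM's `SmallSetVector`; `OrderedSet`, `collectLandingPads`, and the use of strings as block names are hypothetical stand-ins for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical insertion-ordered set: a vector for order, a hash set for
// membership, so each element is recorded at most once.
class OrderedSet {
  std::vector<std::string> Order;
  std::unordered_set<std::string> Seen;

public:
  // Returns true if the value was newly inserted.
  bool insert(const std::string &V) {
    if (!Seen.insert(V).second)
      return false;
    Order.push_back(V);
    assert(Order.size() == Seen.size() && "containers out of sync");
    return true;
  }
  const std::vector<std::string> &items() const { return Order; }
};

// Each invoke names its landing pad; many invokes may share one pad.
std::vector<std::string>
collectLandingPads(const std::vector<std::string> &InvokeUnwindDests) {
  OrderedSet Pads;
  for (const std::string &Dest : InvokeUnwindDests)
    Pads.insert(Dest);
  return Pads.items();
}
```

A landing pad shared by several invokes then appears once in the result, so any per-pad work (such as the volatile loads mentioned above) is done once per pad rather than once per invoke.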
Craig Topper
7a4d482aaa Fix the execution domain on a bunch of SSE/AVX instructions.
llvm-svn: 144784
2011-11-16 07:30:46 +00:00
Bob Wilson
45ab17a709 Update the SP in the SjLj jmpbuf whenever it changes. <rdar://problem/10444602>
This same basic code was in the older version of the SjLj exception handling,
but it was removed in the recent revisions to that code.  It needs to be there.

llvm-svn: 144782
2011-11-16 07:12:00 +00:00
Bob Wilson
0e88464871 Fix ARM SjLj-EH dispatch setup code. <rdar://problem/10444602>
The EmitBasePointerRecalculation function has 2 problems, one minor and one
fatal.  The minor problem is that it inserts the code at the setjmp
instead of in the dispatch block.  The fatal problem is that at the point
where this code runs, we don't know whether there will be a base pointer,
so the entire function is a no-op.  The base pointer recalculation needs to
be handled as it was before, by inserting a pseudo instruction that gets
expanded late.

Most of the support for the old approach is still here, but it no longer
has any connection to the eh_sjlj_dispatchsetup intrinsic.  Clean up the
parts related to the intrinsic and just generate the pseudo instruction
directly.

llvm-svn: 144781
2011-11-16 07:11:57 +00:00
Craig Topper
d842079270 Remove code to enable execution dependency fix pass on VR256. VR128 is sufficient after r144636.
llvm-svn: 144777
2011-11-16 05:02:04 +00:00
Evan Cheng
ab9e2ad9c4 Revert r144568 now that r144730 has fixed the fast-isel kill marker bug.
llvm-svn: 144776
2011-11-16 04:55:01 +00:00
Nick Lewycky
ff690249a9 Merge isObjectPointerWithTrustworthySize with getPointerSize. Use it when
looking at the size of the pointee. Fixes PR11390!

llvm-svn: 144773
2011-11-16 03:49:48 +00:00
Evan Cheng
65d5df1165 If the 2addr instruction has other kills, don't move it below any other uses since we don't want to extend other live ranges.
llvm-svn: 144772
2011-11-16 03:47:42 +00:00
Evan Cheng
27c17a65d1 RescheduleKillAboveMI() must backtrack to before the rescheduled DBG_VALUE instructions. rdar://10451185
llvm-svn: 144771
2011-11-16 03:33:08 +00:00
Evan Cheng
d756c3ec65 Process all uses first before defs to accurately capture register liveness. rdar://10449480
llvm-svn: 144770
2011-11-16 03:05:12 +00:00
Eli Friedman
1f3d774ba4 CONCAT_VECTORS can have more than two operands. PR11389.
llvm-svn: 144768
2011-11-16 02:52:39 +00:00
Eli Friedman
ce0cea66b9 Add a couple asserts so it will be easier to debug if we accidentally pass indexed loads/stores to the legalizer.
llvm-svn: 144767
2011-11-16 02:43:15 +00:00
Kostya Serebryany
4105068ea9 AddressSanitizer, first commit (compiler module only)
llvm-svn: 144758
2011-11-16 01:35:23 +00:00
Kostya Serebryany
dedd750c82 test commit to verify that commit access works (added blank line)
llvm-svn: 144748
2011-11-16 01:14:38 +00:00
Owen Anderson
48a129b50e Rename MVT::untyped to MVT::Untyped to match similar nomenclature.
llvm-svn: 144747
2011-11-16 01:02:57 +00:00
Andrew Trick
fe618116fc Fix SCEV overly optimistic back edge taken count for multi-exit loops.
Fixes PR11375: Different results for 'clang++ huh.cpp'...

llvm-svn: 144746
2011-11-16 00:52:40 +00:00
Chad Rosier
17577c9394 Add FIXME comment.
llvm-svn: 144743
2011-11-16 00:32:20 +00:00
Jakob Stoklund Olesen
5ea426cfa8 Enable -widen-vmovs by default.
This will widen 32-bit register vmov instructions to 64-bit when
possible.  The 64-bit vmovd instructions can then be translated to NEON
vorr instructions by the execution dependency fix pass.

The copies are only widened if they are marked as clobbering the whole
D-register.

llvm-svn: 144734
2011-11-15 23:53:18 +00:00
Eric Christopher
c9b63af4bb Stabilize the output of the dwarf accelerator tables. Fixes a comparison
failure during bootstrap with it turned on.

llvm-svn: 144731
2011-11-15 23:37:17 +00:00
Chad Rosier
71f1bbe1e7 GEPs with all zero indices are trivially coalesced by fast-isel. For example,
%arrayidx135 = getelementptr inbounds [4 x [4 x [4 x [4 x i32]]]]* %M0, i32 0, i64 0
%arrayidx136 = getelementptr inbounds [4 x [4 x [4 x i32]]]* %arrayidx135, i32 0, i64 %idxprom134

Prior to this commit, the GEP instruction that defines %arrayidx136 thought that 
%arrayidx135 was a trivial kill.  The GEP that defines %arrayidx135 doesn't 
generate any code and thus %M0 gets folded into the second GEP.  Thus, we need
to look through GEPs with all zero indices.
rdar://10443319

llvm-svn: 144730
2011-11-15 23:34:05 +00:00
Jim Grosbach
044acb8bee ARM assembly parsing for register range syntax for VLD/VST register lists.
For example,
vld1.f64 {d2-d5}, [r2,:128]!

Should be equivalent to:
vld1.f64 {d2,d3,d4,d5}, [r2,:128]!

It's not documented syntax in the ARM ARM, but it is consistent with what's
accepted for VLDM/VSTM and is unambiguous in meaning, so it's a good thing to
support.

rdar://10451128

llvm-svn: 144727
2011-11-15 23:19:15 +00:00
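The equivalence between the range form and the explicit list can be illustrated with a small standalone expander. This is only a sketch of the idea, not the ARM assembly parser: `parseDReg` and `expandDRegRange` are hypothetical helpers, and the sketch assumes D registers numbered d0 through d31.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: parses "dN" with N in [0, 31].
static bool parseDReg(const std::string &S, unsigned &Num) {
  if (S.size() < 2 || S.size() > 3 || S[0] != 'd')
    return false;
  Num = 0;
  for (std::size_t I = 1; I < S.size(); ++I) {
    if (S[I] < '0' || S[I] > '9')
      return false;
    Num = Num * 10 + static_cast<unsigned>(S[I] - '0');
  }
  return Num < 32;
}

// Expands "d2-d5" into {"d2", "d3", "d4", "d5"}.
// Returns an empty list for malformed or descending ranges.
std::vector<std::string> expandDRegRange(const std::string &Range) {
  std::vector<std::string> Regs;
  std::size_t Dash = Range.find('-');
  if (Dash == std::string::npos)
    return Regs;
  unsigned First = 0, Last = 0;
  if (!parseDReg(Range.substr(0, Dash), First) ||
      !parseDReg(Range.substr(Dash + 1), Last) || First > Last)
    return Regs;
  for (unsigned R = First; R <= Last; ++R)
    Regs.push_back("d" + std::to_string(R));
  assert(Regs.size() == Last - First + 1);
  return Regs;
}
```

Once a range is expanded this way, the rest of a parser can treat `{d2-d5}` exactly like `{d2,d3,d4,d5}`.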
Jim Grosbach
778bed02bb ARM assembly parsing for data type suffixes on NEON VMOV aliases.
llvm-svn: 144722
2011-11-15 22:54:42 +00:00
Nadav Rotem
d8497a8354 Fix MSVC warnings by adding a cast.
llvm-svn: 144721
2011-11-15 22:54:21 +00:00
Nadav Rotem
63be4a26a9 AVX: Add support for vbroadcast from BUILD_VECTOR and refactor some of the vbroadcast code.
llvm-svn: 144720
2011-11-15 22:50:37 +00:00
Jim Grosbach
b8ebc386df ARM assembly parsing two operand forms for shift instructions.
llvm-svn: 144713
2011-11-15 22:27:54 +00:00
Jim Grosbach
d573473cb8 ARM VFP assembly parsing for VADD and VSUB two-operand forms.
llvm-svn: 144710
2011-11-15 22:15:10 +00:00
Jim Grosbach
e991d79b50 ARM accept an immediate offset in memory operands w/o the '#'.
llvm-svn: 144709
2011-11-15 22:14:41 +00:00
Pete Cooper
8441c08e0b Added custom lowering for load->dec->store sequence in x86 when the EFLAGS register is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Jim Grosbach
8f360855cf ARM enclosing curly braces optional on one-register VLD/VST instruction lists.
'vld1.f32 d4, [r7]' should be parsed as equivalent to 'vld1.f32 {d4}, [r7]'

rdar://10450488.

llvm-svn: 144701
2011-11-15 21:45:55 +00:00
Jim Grosbach
3c205132ff ARM size suffix on VFP single-precision 'vmov' is optional.
rdar://10435114

llvm-svn: 144698
2011-11-15 21:18:35 +00:00
Devang Patel
11c550b1e5 Insert modified DBG_VALUE into LiveDbgValueMap.
llvm-svn: 144696
2011-11-15 21:03:58 +00:00
Jim Grosbach
f68b81adb7 Fix typo.
llvm-svn: 144695
2011-11-15 21:01:30 +00:00
Jim Grosbach
4d0ad5a4e0 ARM alternate size suffixes for VTRN instructions.
rdar://10435076

llvm-svn: 144694
2011-11-15 20:49:46 +00:00
Owen Anderson
f71db061ba Fix a misplaced paren bug.
llvm-svn: 144692
2011-11-15 20:30:41 +00:00
Jim Grosbach
8987b277cb ARM assembly parsing for optional datatype suffix on VFP VMOV GPR<->VFP insns.
Yet more of rdar://10435076.

llvm-svn: 144691
2011-11-15 20:29:42 +00:00
Jim Grosbach
f0690cd90c ARM assembly parsing for two-operand form of 'mul' instruction.
rdar://10449856.

llvm-svn: 144689
2011-11-15 20:14:51 +00:00
Jim Grosbach
8b1d4c989c ARM assembly parsing for two-operand form of 'mul' instruction.
Ongoing rdar://10435114.

llvm-svn: 144688
2011-11-15 20:02:06 +00:00
Jim Grosbach
6dbfffcbf7 Thumb2 two-operand 'mul' instruction wide encoding parsing.
rdar://10449724

llvm-svn: 144684
2011-11-15 19:55:16 +00:00
Owen Anderson
35f049f1fb Fix an ambiguous decoding where we failed to properly decode VMOVv2f32 and VMOVv4f32.
llvm-svn: 144683
2011-11-15 19:55:00 +00:00
Jim Grosbach
df951fa128 Thumb2 assembly parsing for mul.w in IT block fix.
When the 3rd operand is not a low-register, and the first two operands are
the same low register, the parser was incorrectly trying to use the 16-bit
instruction encoding.

rdar://10449281

llvm-svn: 144679
2011-11-15 19:29:45 +00:00
Benjamin Kramer
4a8534a158 StringRefize and simplify.
llvm-svn: 144675
2011-11-15 19:12:09 +00:00
Rafael Espindola
95f4e0c409 We currently use a callback to handle an IL pass deleting a BB that still
has a reference to it. Unfortunately, that doesn't work for codegen passes
since we don't get notified of MBB's being deleted (the original BB stays).

Use that fact to our advantage and after printing a function, check if
any of the IL BBs corresponds to a symbol that was not printed. This fixes
pr11202.

llvm-svn: 144674
2011-11-15 19:08:46 +00:00
Akira Hatanaka
040e92857f Fix functions in MipsFrameLowering.cpp and MipsRegisterInfo.cpp. Use 64-bit
registers and instructions when ABI is N64.

llvm-svn: 144666
2011-11-15 18:53:55 +00:00
Akira Hatanaka
a6c12123f1 Set nomacro before emitting the sequence of instructions that set global pointer
register.

llvm-svn: 144665
2011-11-15 18:44:44 +00:00
Akira Hatanaka
666a872159 Simplify function PassByValArg64.
llvm-svn: 144664
2011-11-15 18:42:25 +00:00
Akira Hatanaka
60491e224c Remove function printMipsSymbolRef.
llvm-svn: 144663
2011-11-15 18:38:35 +00:00
Benjamin Kramer
ea43fbc528 Remove Value::getNameStr. It has been deprecated for a while and provides no additional value over getName().
llvm-svn: 144657
2011-11-15 18:30:12 +00:00
Benjamin Kramer
de467fc4df Missed some users of Value::getNameStr.
llvm-svn: 144656
2011-11-15 18:30:06 +00:00
Akira Hatanaka
4cbb3e4ca4 Delete files.
llvm-svn: 144655
2011-11-15 18:22:48 +00:00
Akira Hatanaka
ef7310d4a2 Remove MipsMCSymbolRefExpr.
llvm-svn: 144654
2011-11-15 18:20:08 +00:00
Jim Grosbach
61c3f1b35b ARM parsing datatype suffix variants for register-writeback VLD1/VST1 instructions.
rdar://10435076

llvm-svn: 144650
2011-11-15 17:49:59 +00:00
Jim Grosbach
176025e803 Tidy up. 80 columns.
llvm-svn: 144649
2011-11-15 16:46:22 +00:00
Benjamin Kramer
a2f57dee6d Remove all remaining uses of Value::getNameStr().
llvm-svn: 144648
2011-11-15 16:27:03 +00:00
Benjamin Kramer
3eeef2e739 Twinify GraphWriter a little bit.
llvm-svn: 144647
2011-11-15 16:26:38 +00:00
Jakob Stoklund Olesen
7b29a27e02 Check all overlaps when looking for used registers.
A function using any RC alias is enough to enable the ExeDepsFix pass.

llvm-svn: 144636
2011-11-15 08:20:43 +00:00
Jay Foad
e81b476d52 Make use of MachinePointerInfo::getFixedStack.
llvm-svn: 144635
2011-11-15 07:51:13 +00:00
Jay Foad
ce22ec7def Remove some unnecessary includes of PseudoSourceValue.h.
llvm-svn: 144634
2011-11-15 07:50:46 +00:00
Jay Foad
ebf2ec40e0 Fix typo in comment.
llvm-svn: 144633
2011-11-15 07:50:05 +00:00
Jay Foad
b4bb79247a Make use of MachinePointerInfo::getFixedStack. This removes all mention
of PseudoSourceValue from lib/Target/.

llvm-svn: 144632
2011-11-15 07:34:52 +00:00
Jay Foad
7d05fea7f8 Remove some unnecessary includes of PseudoSourceValue.h.
llvm-svn: 144631
2011-11-15 07:24:32 +00:00
Craig Topper
320d3ef477 Fix PR11370 for real. Prevents converting 256-bit FP instruction to AVX2 256-bit integer instructions when AVX2 isn't enabled.
llvm-svn: 144629
2011-11-15 06:39:01 +00:00
Evan Cheng
40f68c968e Set SeenStore to true to prevent loads from being moved; also eliminates a non-deterministic behavior.
llvm-svn: 144628
2011-11-15 06:26:51 +00:00
Chandler Carruth
fdcba17bec Rather than trying to use the loop block sequence *or* the function
block sequence when recovering from unanalyzable control flow
constructs, *always* use the function sequence. I'm not sure why I ever
went down the path of trying to use the loop sequence, it is
fundamentally not the correct sequence to use. We're trying to preserve
the incoming layout in the cases of unreasonable control flow, and that
is only encoded at the function level. We already have a filter to
select *exactly* the sub-set of blocks within the function that we're
trying to form into a chain.

The resulting code layout is also significantly better because of this.
In several places we were ending up with completely unreasonable control
flow constructs due to the ordering chosen by the loop structure for its
internal storage. This change removes a completely wasteful vector of
basic blocks, saving memory allocation in the common case even though it
costs us CPU in the fairly rare case of unnatural loops. Finally, it
fixes the latest crasher reduced out of GCC's single source. Thanks
again to Benjamin Kramer for the reduction, my bugpoint skills failed at
it.

llvm-svn: 144627
2011-11-15 06:26:43 +00:00
Craig Topper
a584521daa Properly qualify AVX2 specific parts of execution dependency table. Also enable converting between 256-bit PS/PD operations when AVX1 is enabled. Fixes PR11370.
llvm-svn: 144622
2011-11-15 05:55:35 +00:00
Evan Cheng
47d8f8af84 Add vmov.f32 to materialize f32 immediate splats which cannot be handled by
integer variants. rdar://10437054

llvm-svn: 144608
2011-11-15 02:12:34 +00:00
Jim Grosbach
2ac98a24aa ARM parsing datatype suffix variants for fixed-writeback VLD1/VST1 instructions.
rdar://10435076

llvm-svn: 144606
2011-11-15 01:46:57 +00:00
Nick Lewycky
3858fb95b7 Move WEAK marking to the declaration.
llvm-svn: 144603
2011-11-15 01:23:22 +00:00
Jakob Stoklund Olesen
2709f65821 Break false dependencies before partial register updates.
Two new TargetInstrInfo hooks let the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.

The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previous N instructions.

The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.

llvm-svn: 144602
2011-11-15 01:15:30 +00:00
Jakob Stoklund Olesen
0c310642e2 Track register ages more accurately.
Keep track of the last instruction to define each register individually
instead of per DomainValue.  This lets us track more accurately when a
register was last written.

Also track register ages across basic blocks.  When entering a new
basic block, use the least stale predecessor def as a worst case
estimate for register age.

The register age is used to arbitrate between conflicting domains. The
most recently defined register wins.

llvm-svn: 144601
2011-11-15 01:15:25 +00:00
Nick Lewycky
a68ade6ccc Fix linking for some users who already have tsan enabled code and are trying to
link it against llvm code, by making our definitions weak. "Some users."

llvm-svn: 144596
2011-11-15 00:14:04 +00:00
Jim Grosbach
6846540505 ARM parsing datatype suffix variants for non-writeback VST1 instructions.
rdar://10435076

llvm-svn: 144593
2011-11-14 23:43:46 +00:00
Jim Grosbach
a1a28df278 ARM parsing datatype suffix variants for non-writeback VLD1 instructions.
rdar://10435076

llvm-svn: 144592
2011-11-14 23:32:59 +00:00
Jim Grosbach
8ec84fbe99 Add explanatory comment.
llvm-svn: 144589
2011-11-14 23:21:09 +00:00
Jim Grosbach
1d07736422 Split out the plain '.{8|16|32|64}' suffix handling.
Make it easier to deal with aliases for instructions that do require a suffix
but accept more specific variants of the same size.

llvm-svn: 144588
2011-11-14 23:20:14 +00:00
Jim Grosbach
00283a5c8e ARM parsing optional datatype suffix for VAND/VEOR/VORR instructions.
rdar://10435076

llvm-svn: 144587
2011-11-14 23:11:19 +00:00
Chad Rosier
4e05d7f12c Supporting inline memmove isn't going to be worthwhile. The only way to avoid
violating a dependency is to emit all loads prior to stores.  This would likely
cause a great deal of spillage offsetting any potential gains.

llvm-svn: 144585
2011-11-14 23:04:09 +00:00
Jim Grosbach
4a2f107b04 ARM VLDR/VSTR instructions don't need a size suffix.
Canonicalize on the non-suffixed form, but continue to accept assembly that
has any correctly sized type suffix.

llvm-svn: 144583
2011-11-14 23:03:21 +00:00
Nick Lewycky
a0b2f7ca1d Refactor capture tracking (which already had a couple flags for whether returns
and stores capture) to permit the caller to see each capture point and decide
whether to continue looking.

Use this inside memdep to do an analysis that basicaa won't do. This lets us
solve another devirtualization case, fixing PR8908!

llvm-svn: 144580
2011-11-14 22:49:42 +00:00
Chad Rosier
48b92815e0 Add support for inlining small memcpys.
rdar://10412592

llvm-svn: 144578
2011-11-14 22:46:17 +00:00
Chad Rosier
8aa8f14940 Fix a performance regression from r144565. Positive offsets were being lowered
into registers, rather than encoded directly in the load/store.

llvm-svn: 144576
2011-11-14 22:34:48 +00:00
Jim Grosbach
009733c9e4 ARM assembly parsing type suffix options for VLDR/VSTR.
rdar://10435076

llvm-svn: 144575
2011-11-14 22:28:39 +00:00
Evan Cheng
95e735afa7 Avoid dereferencing off the beginning of lists.
llvm-svn: 144569
2011-11-14 21:11:15 +00:00
Evan Cheng
ba11ee300a At -O0, multiple uses of a virtual register in the same BB are being marked
"kill". This looks like a bug upstream. Since that's going to take some time
to understand, loosen the assertion and disable the optimization when
multiple kills are seen.

llvm-svn: 144568
2011-11-14 21:02:09 +00:00
Nick Lewycky
53185e9016 Add support for tsan annotations (thread sanitizer, a valgrind-based tool).
These annotations are disabled entirely when either ENABLE_THREADS is off, or
building a release build. When enabled, they add calls to functions with no
statements to ManagedStatic's getters.

Use these annotations to inform tsan that the race used inside ManagedStatic
initialization is actually benign. Thanks to Kostya Serebryany for helping
write this patch!

llvm-svn: 144567
2011-11-14 20:50:16 +00:00
Evan Cheng
2034ff3b0b Add a missing pattern for X86ISD::MOVLPD. rdar://10436044
llvm-svn: 144566
2011-11-14 20:35:52 +00:00
Chad Rosier
65395ac4d0 Add support for Thumb load/stores with negative offsets.
rdar://10412592

llvm-svn: 144565
2011-11-14 20:22:27 +00:00
Benjamin Kramer
8450065be6 Unbreak Release builds.
llvm-svn: 144560
2011-11-14 19:51:48 +00:00
Evan Cheng
f19d257488 Teach two-address pass to re-schedule two-address instructions (or the kill
instructions of the two-address operands) in order to avoid inserting copies.
This fixes the few regressions introduced when the two-address hack was
disabled (without regressing the improvements).
rdar://10422688

llvm-svn: 144559
2011-11-14 19:48:55 +00:00
Pete Cooper
c9d6834f38 Changed SSE4/AVX <2 x i64> extract and insert ops to be Custom lowered
Constant idx case is still done in tablegen but other cases are then expanded

Fixes <rdar://problem/10435460>

llvm-svn: 144557
2011-11-14 19:38:42 +00:00
Benjamin Kramer
720b712c77 Fold ConstantVector::isAllOnesValue into Constant::isAllOnesValue and simplify it.
llvm-svn: 144555
2011-11-14 19:12:20 +00:00
Akira Hatanaka
9dd2dd0320 32-to-64-bit extended load.
llvm-svn: 144554
2011-11-14 19:06:14 +00:00
Akira Hatanaka
282b708fcb AnalyzeCallOperands function for N32/64.
N32/64 places all variable arguments in integer registers (or on stack),
regardless of their types, but follows calling convention of non-vaarg function
when it handles fixed arguments.

llvm-svn: 144553
2011-11-14 19:02:54 +00:00
Akira Hatanaka
b147de0ada Modify LowerFormalArguments to correctly handle vaarg arguments for Mips64.
llvm-svn: 144552
2011-11-14 19:01:09 +00:00
Justin Holewinski
4e7a1c571b PTX: Let LLVM use loads/stores for all mem* intrinsics, instead of relying on custom implementations.
llvm-svn: 144551
2011-11-14 18:58:20 +00:00
Akira Hatanaka
db610fb423 Remove variable that keeps the size of area used to save byval or variable
argument registers on the callee's stack frame, along with functions that set
and get it.
    
It is not necessary to add the size of this area when computing stack size in
emitPrologue, since it has already been accounted for in
PEI::calculateFrameObjectOffsets.

llvm-svn: 144549
2011-11-14 18:56:20 +00:00
Jakob Stoklund Olesen
6035535c96 Fix early-clobber handling in shrinkToUses.
I broke this in r144515, it affected most ARM testers.

<rdar://problem/10441389>

llvm-svn: 144547
2011-11-14 18:45:38 +00:00
Bob Wilson
f76590b3c5 Disable generation of compact unwind encodings. <rdar://problem/10441578>
This still seems to be causing some failures.  It needs more testing before
it gets enabled again.

llvm-svn: 144543
2011-11-14 18:21:07 +00:00
Jim Grosbach
d791cf718c Tidy up. 80 column.
llvm-svn: 144538
2011-11-14 17:52:47 +00:00
Benjamin Kramer
6d85bbc486 Make headers standalone, move a virtual method out of line.
llvm-svn: 144536
2011-11-14 17:22:45 +00:00
Chandler Carruth
9e6d173b9e It helps to deallocate memory as well as allocate it. =] This actually
cleans up all the chains allocated during the processing of each
function so that for very large inputs we don't just grow memory usage
without bound.

llvm-svn: 144533
2011-11-14 10:57:23 +00:00
Chandler Carruth
06afac4924 Remove an over-eager assert that was firing on one of the ARM regression
tests when I forcibly enabled block placement.

It is apparently possible for an unanalyzable block to fallthrough to
a non-loop block. I don't actually believe this is correct, I believe
that 'canFallThrough' is returning true needlessly for the code
construct, and I've left a bit of a FIXME on the verification code to
try to track down why this is coming up.

Anyways, removing the assert doesn't degrade the correctness of the algorithm.

llvm-svn: 144532
2011-11-14 10:55:53 +00:00
Chandler Carruth
a1475d9b6b Begin chipping away at one of the biggest quadratic-ish behaviors in
this pass. We're leaving already merged blocks on the worklist, and
scanning them again and again only to determine each time through that
indeed they aren't viable. We can instead remove them once we're going
to have to scan the worklist. This is the easy way to implement removing
them. If this remains on the profile (as I somewhat suspect it will), we
can get a lot more clever here, as the worklist's order is essentially
irrelevant. We can use swapping and fold the two loops to reduce
overhead even when there are many blocks on the worklist but only a few
of them are removed.

llvm-svn: 144531
2011-11-14 09:46:33 +00:00
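The "easy way" of dropping already merged blocks from the worklist in one pass can be sketched with the standard erase-remove idiom. This is an illustration only: the `Block` struct, its `Merged` flag, and `pruneMergedBlocks` are hypothetical and do not reflect the pass's real data structures.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a basic block on the placement worklist.
struct Block {
  int Id;
  bool Merged; // true once the block has been merged into a chain
};

// Removes every merged block in a single linear pass, so later scans of
// the worklist no longer revisit entries that can never be viable.
void pruneMergedBlocks(std::vector<Block> &Worklist) {
  Worklist.erase(std::remove_if(Worklist.begin(), Worklist.end(),
                                [](const Block &B) { return B.Merged; }),
                 Worklist.end());
  assert(std::none_of(Worklist.begin(), Worklist.end(),
                      [](const Block &B) { return B.Merged; }));
}
```

`std::remove_if` keeps the relative order of the surviving blocks. Since the message notes that the worklist's order is essentially irrelevant, a swap-with-last scheme would also work and would avoid shifting elements.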
Chandler Carruth
f89087744e Under the hood, MBPI is doing a linear scan of every successor every
time it is queried to compute the probability of a single successor.
This makes computing the probability of every successor of a block in
sequence... really really slow. ;] This switches to a linear walk of the
successors rather than a quadratic one. One of several quadratic
behaviors slowing this pass down.

I'm not really thrilled with moving the sum code into the public
interface of MBPI, but I don't (at the moment) have ideas for a better
interface. The direction I'm thinking in for a better interface is to
have MBPI actually retain much more state and make *all* of these
queries cheap. That's a lot of work, and would require invasive changes.
Until then, this seems like the least bad (ie, least quadratic)
solution. Suggestions welcome.

llvm-svn: 144530
2011-11-14 09:12:57 +00:00
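The difference between the quadratic and the linear walk comes down to where the weight sum is computed. The sketch below sums the successor weights once and reuses that sum for every successor. It is a simplified model with hypothetical names (`Prob`, `allSuccessorProbabilities`), not the MBPI interface, and it represents a probability as a plain numerator/denominator pair.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical probability representation: (edge weight, sum of weights).
using Prob = std::pair<uint32_t, uint64_t>;

// Computes the probability of every successor with one pass to sum the
// weights and one pass to build the results: O(n) instead of O(n^2).
std::vector<Prob>
allSuccessorProbabilities(const std::vector<uint32_t> &Weights) {
  uint64_t Sum = 0;
  for (uint32_t W : Weights)
    Sum += W; // summed once, not once per queried successor

  std::vector<Prob> Result;
  Result.reserve(Weights.size());
  for (uint32_t W : Weights)
    Result.emplace_back(W, Sum);
  assert(Result.size() == Weights.size());
  return Result;
}
```

Re-summing inside each per-successor query would instead touch every weight once per successor, which is the quadratic behaviour the message describes.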
Chandler Carruth
09418993f8 Reuse the logic in getEdgeProbability within getHotSucc in order to
correctly handle blocks whose successor weights sum to more than
UINT32_MAX. This is slightly less efficient, but the entire thing is
already linear on the number of successors. Calling it within any hot
routine is a mistake, and indeed no one is calling it. It also
simplifies the code.

llvm-svn: 144527
2011-11-14 08:55:59 +00:00
Chandler Carruth
462bb16130 Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on
the sum of the edge weights not overflowing uint32, and crashed when
they did. This is generally safe as BranchProbabilityInfo tries to
provide this guarantee. However, the CFG can get modified during codegen
in a way that grows the *sum* of the edge weights. This doesn't seem
unreasonable (imagine just adding more blocks all with the default
weight of 16), but it is hard to come up with a case that actually
triggers 32-bit overflow. Fortunately, the single-source GCC build is
good at this. The solution isn't very pretty, but it's no worse than the
previous code. We're already summing all of the edge weights on each
query, we can sum them, check for an overflow, compute a scale, and sum
them again.

I've included a *greatly* reduced test case out of the GCC source that
triggers it. It's a pretty lame test, as it clearly is just barely
triggering the overflow. I'd like to have something that is much more
definitive, but I don't understand the fundamental pattern that triggers
an explosion in the edge weight sums.

The buggy code is duplicated within this file. I'll collapse them into
a single implementation in a subsequent commit.

llvm-svn: 144526
2011-11-14 08:50:16 +00:00
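The "sum, check for overflow, compute a scale, sum again" sequence from this message can be sketched as follows. This is a standalone approximation under stated assumptions, not the code from MachineBranchProbabilityInfo: `sumScaledWeights` is a hypothetical name, and the choice of `Sum / UINT32_MAX + 1` as the divisor is an assumption that is simply sufficient to bring the scaled sum back under the 32-bit limit.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Returns the sum of the edge weights after scaling, and reports the
// divisor used through Scale (1 when no scaling was needed).
uint32_t sumScaledWeights(const std::vector<uint32_t> &Weights,
                          uint32_t &Scale) {
  const uint64_t Max = std::numeric_limits<uint32_t>::max();

  // First pass: sum in 64 bits so the sum itself cannot wrap.
  uint64_t Sum = 0;
  for (uint32_t W : Weights)
    Sum += W;

  Scale = 1;
  if (Sum <= Max)
    return static_cast<uint32_t>(Sum);

  // Assumed scale: the smallest divisor of this form that guarantees
  // the rescaled sum fits in 32 bits.
  Scale = static_cast<uint32_t>(Sum / Max) + 1;

  // Second pass: sum the scaled-down weights.
  uint64_t Scaled = 0;
  for (uint32_t W : Weights)
    Scaled += W / Scale;
  assert(Scaled <= Max && "scaled sum must fit in 32 bits");
  return static_cast<uint32_t>(Scaled);
}
```

Callers would divide each individual weight by the same `Scale` before forming a probability, so numerators and the denominator stay consistent.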
Craig Topper
ee3a3cf35c Add AVX2 version of instructions to load folding tables. Also add a bunch of missing SSE/AVX instructions.
llvm-svn: 144525
2011-11-14 08:07:55 +00:00
Craig Topper
e0b34012db Add neverHasSideEffects, mayLoad, and mayStore to many patternless SSE/AVX instructions. Remove MMX check from LowerVECTOR_SHUFFLE since MMX vector types won't go through it anyway.
llvm-svn: 144522
2011-11-14 06:46:21 +00:00
Chad Rosier
0e5094ca87 Add support for ARM halfword load/stores and signed byte loads with negative
offsets.
rdar://10412592

llvm-svn: 144518
2011-11-14 04:09:28 +00:00
Jakob Stoklund Olesen
25e009690c Use getVNInfoBefore() when it makes sense.
llvm-svn: 144517
2011-11-14 01:39:36 +00:00
Chandler Carruth
b7f21af176 Teach machine block placement to cope with unnatural loops. These don't
get loop info structures associated with them, and so we need some way
to make forward progress selecting and placing basic blocks. The
technique used here is pretty brutal -- it just scans the list of blocks
looking for the first unplaced candidate. It keeps placing blocks like
this until the CFG becomes tractable.

The cost is somewhat unfortunate, it requires allocating a vector of all
basic block pointers eagerly. I have some ideas about how to simplify
and optimize this, but I'm trying to get the logic correct first.

Thanks to Benjamin Kramer for the reduced test case out of GCC. Sadly
there are other bugs that GCC is tickling that I'm reducing and working
on now.

llvm-svn: 144516
2011-11-14 00:00:35 +00:00