1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-11-25 20:23:11 +01:00
Commit Graph

16921 Commits

Author SHA1 Message Date
Jim Grosbach
fb64c491ba Enable MCJIT tests on Darwin.
llvm-svn: 163278
2012-09-06 00:59:06 +00:00
Jack Carter
2a8cbd60d3 Mips specific llvm assembler support for branch and jump instructions.
Test case included.

Contributer: Vladimir Medic
llvm-svn: 163277
2012-09-06 00:43:26 +00:00
Jakob Stoklund Olesen
0324528c8c Use predication instead of pseudo-opcodes when folding into MOVCC.
Now that it is possible to dynamically tie MachineInstr operands,
predicated instructions are possible in SSA form:

  %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, %opt:%noreg
  %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, %pred:12, pred:%CPSR

Becomes a predicated SUBri with a tied imp-use:

  SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0>

This means that any instruction that is safe to move can be folded into
a MOVCC, and the *CC pseudo-instructions are no longer needed.

The test case changes reflect that Thumb2SizeReduce recognizes the
predicated instructions. It didn't understand the pseudos.

llvm-svn: 163274
2012-09-05 23:58:02 +00:00
Nick Lewycky
6e3f6cb8d7 Add missing file for test.
llvm-svn: 163272
2012-09-05 23:52:20 +00:00
Nick Lewycky
6e821e0321 Teach libObject about some more ELF relocations. llvm-objdump -r now knows
every relocation in C++ hello world built with debug info.

llvm-svn: 163271
2012-09-05 23:48:54 +00:00
Manman Ren
7459dd03a2 JumpThreading: when default destination is the destination of some cases in a
switch, make sure we include the value for the cases when calculating edge
value from switch to the default destination.

rdar://12241132

llvm-svn: 163270
2012-09-05 23:45:58 +00:00
Jack Carter
f7221de872 Mips specific llvm assembler support for ALU instructions. This includes
register support. Test case included.

Contributer: Vladimir Medic
llvm-svn: 163268
2012-09-05 23:34:03 +00:00
Tim Northover
4e03b89c79 Strip old MachineInstrs *after* we know we can put them back.
Previous patch accidentally decided it couldn't convert a VFP to a
NEON instruction after it had already destroyed the old one. Not a
good move.

llvm-svn: 163230
2012-09-05 18:37:53 +00:00
Pranav Bhandarkar
876ff208b6 LLVM Bug Fix 13709: Remove needless lsr(Rp, #32) instruction access the
subreg_hireg of register pair Rp.

	* lib/Target/Hexagon/HexagonPeephole.cpp(PeepholeDoubleRegsMap): New
	 DenseMap similar to PeepholeMap that additionally records subreg info
	 too.
        (runOnMachineFunction): Record information in PeepholeDoubleRegsMap
        and copy propagate the high sub-reg of Rp0 in Rp1 = lsr(Rp0, #32) to
	the instruction Rx = COPY Rp1:logreg_subreg.
	* test/CodeGen/Hexagon/remove_lsr.ll: New test.
	

llvm-svn: 163214
2012-09-05 16:01:40 +00:00
Silviu Baranga
6f46bb1705 Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value.
llvm-svn: 163203
2012-09-05 08:57:21 +00:00
Logan Chien
a15abb3d65 Fix UseInitArray option for MIPS target.
llvm-svn: 163193
2012-09-05 06:17:17 +00:00
Dan Gohman
e90f78d5cd Make provenance checking conservative in cases when
pointers-to-strong-pointers may be in play. These can lead to retains and
releases happening in unstructured ways, foiling the optimizer. This fixes
rdar://12150909.

llvm-svn: 163180
2012-09-04 23:16:20 +00:00
Jakob Stoklund Olesen
ef5dcf47b8 Move tie checks into MachineVerifier::visitMachineOperand.
llvm-svn: 163152
2012-09-04 18:38:28 +00:00
Preston Gurd
c80dc7d214 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when the quotient and remainder of a divide are used a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presents of the optimization when calculating
either the quotient, remainder,  or both.

Patch by Tyler Nowicki!

llvm-svn: 163150
2012-09-04 18:22:17 +00:00
Sergei Larin
905bc1964f Porting Hexagon MI Scheduler to the new API.
Change current Hexagon MI scheduler to use new converging
scheduler. Integrates DFA resource model into it.

llvm-svn: 163137
2012-09-04 14:49:56 +00:00
Arnold Schwaighofer
d606c6fcdf Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!

llvm-svn: 163136
2012-09-04 14:37:49 +00:00
Elena Demikhovsky
61924c155d This patch optimizes shuffle instruction - generates 2 instructions instead of 4.
Since this specific shuffle is widely used in many workloads we have ~10% performance on them.

shufflevector <8 x float> %A, <8 x float> %B, <8 x i32> <i32 0, i32 8, i32 2, i32 10, i32 4, i32 12, i32 6, i32 14>

vmovaps (%rdx), %ymm0
vshufps $8, %ymm0, %ymm0, %ymm0
vmovaps (%rcx), %ymm1
vshufps $8, %ymm0, %ymm1, %ymm1
vunpcklps       %ymm0, %ymm1, %ymm0

vmovaps (%rcx), %ymm0
vmovsldup       (%rdx), %ymm1
vblendps        $85, %ymm0, %ymm1, %ymm0

llvm-svn: 163134
2012-09-04 12:49:02 +00:00
Nadav Rotem
d57aaf8ae1 LICM may hoist an instruction with undefined behavior above a trap.
Scan the body of the loop and find instructions that may trap.
Use this information when deciding if it is safe to hoist or sink instructions.
Notice that we can optimize the search of instructions that may throw in the case of nested loops.

rdar://11518836

llvm-svn: 163132
2012-09-04 10:25:04 +00:00
Alexey Samsonov
0dd3d6f49e Add support for fetching inlining context (stack of source code locations)
by instruction address from DWARF.

Add --inlining flag to llvm-dwarfdump to demonstrate and test this functionality,
so that "llvm-dwarfdump --inlining --address=0x..." now works much like
"addr2line -i 0x...", provided that the binary has debug info
(Clang's -gline-tables-only *is* enough).

llvm-svn: 163128
2012-09-04 08:12:33 +00:00
Bob Wilson
8f6487007d Fix more fallout from r158919, similar to PR13547.
This code used to only handle malloc-like calls, which do not read memory.
r158919 changed it to check isNoAliasFn(), which includes strdup-like and
realloc-like calls, but it was not checking for dependencies on the memory
read by those calls.

llvm-svn: 163106
2012-09-03 05:15:15 +00:00
Nuno Lopes
79377a7289 escape special char when handling CXX_FOR_OCAMLOPT
llvm-svn: 163098
2012-09-02 15:16:51 +00:00
Nuno Lopes
12bc68545b fix test's RUN lines
llvm-svn: 163097
2012-09-02 15:07:25 +00:00
Nadav Rotem
d1815a0763 Not all targets have efficient ISel code generation for select instructions.
For example, the ARM target does not have efficient ISel handling for vector
selects with scalar conditions. This patch adds a TLI hook which allows the
different targets to report which selects are supported well and which selects
should be converted to CF duting codegen prepare.

llvm-svn: 163093
2012-09-02 12:10:19 +00:00
Benjamin Kramer
9e422987c3 LoopRotation: Make the brute force DomTree update more brute force.
We update until we hit a fixpoint. This is probably slow but also
slightly simplifies the code. It should also fix the occasional
invalid domtrees observed when building with expensive checking.

I couldn't find a case where this had a measurable slowdown, but
if someone finds a pathological case where it does we may have
to find a cleverer way of updating dominators here.

Thanks to Duncan for the test case.

llvm-svn: 163091
2012-09-02 11:57:22 +00:00
Nadav Rotem
3425b10f0f Generate better select code by allowing the target to use scalar select, and not sign-extend.
llvm-svn: 163086
2012-09-02 08:20:07 +00:00
Pete Cooper
78e01afae1 Revert "Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060"
This reverts commit 5dd9e214fb92847e947f9edab170f9b4e52b908f.

Thanks to Duncan for explaining how this should have been done.

Conflicts:

	test/CodeGen/X86/vec_select.ll

llvm-svn: 163064
2012-09-01 17:37:55 +00:00
Logan Chien
b022dbf7dc Fix Thumb2 fixup kind in the integrated-as.
llvm-svn: 163063
2012-09-01 15:06:36 +00:00
Owen Anderson
27ba45c764 Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode.
llvm-svn: 163051
2012-09-01 06:04:27 +00:00
NAKAMURA Takumi
528a7eb3d4 llvm/test/CodeGen/X86/fp-fast.ll: Suppress FMA4 on AMD Bulldozer host, corresponding to r162999.
llvm-svn: 163041
2012-09-01 00:26:28 +00:00
Manman Ren
ff50ef38c7 Fix Atom bots for r163036.
llvm-svn: 163040
2012-09-01 00:17:06 +00:00
Manman Ren
9afdad8207 SelectionDAG: when constructing VZEXT_LOAD from other loads, make sure its
output chain is correctly setup.

As an example, if the original load must happen before later stores, we need
to make sure the constructed VZEXT_LOAD is constrained to be before the stores.

rdar://11457792

llvm-svn: 163036
2012-08-31 23:16:57 +00:00
Craig Topper
2e53378ff6 Mark FMA4 instructions as commutable and add them to the folding tables.
llvm-svn: 163035
2012-08-31 23:10:34 +00:00
Michael Liao
6f4b3f358d Fix PR12359
- In addition to undefined, if V2 is zero vector, skip 2nd PSHUFB and POR as
  well as PSHUFB will zero elements with negative indices.

  Patch by Sriram Murali <sriram.murali@intel.com>

llvm-svn: 163018
2012-08-31 20:12:31 +00:00
Jack Carter
a986033975 The instruction DINS may be transformed into DINSU or DEXTM depending
on the size of the extraction and its position in the 64 bit word.

This patch allows support of the dext transformations with mips64 direct
object output.

0 <= msb < 32 0 <= lsb < 32 0 <= pos < 32 1 <= size <= 32
DINS
The field is entirely contained in the right-most word of the doubleword

32 <= msb < 64 0 <= lsb < 32 0 <= pos < 32 2 <= size <= 64
DINSM
The field straddles the words of the doubleword

32 <= msb < 64 32 <= lsb < 64 32 <= pos < 64 1 <= size <= 32
DINSU
The field is entirely contained in the left-most word of the doubleword

llvm-svn: 163010
2012-08-31 18:06:48 +00:00
Craig Topper
917333c8c7 Mark FMA3 instructions as commutable so that the operands to the multiply part can be commuted.
llvm-svn: 163001
2012-08-31 16:31:13 +00:00
Craig Topper
6bb3145d0d Add support for converting llvm.fma to fma4 instructions.
llvm-svn: 162999
2012-08-31 15:40:30 +00:00
Jakob Stoklund Olesen
7dda42fc61 Don't enforce ordered inline asm operands.
I was too optimistic, inline asm can have tied operands that don't
follow the def order.

Fixes PR13742.

llvm-svn: 162998
2012-08-31 15:34:59 +00:00
NAKAMURA Takumi
712abaaf6f llvm/test/CodeGen/X86/vec_select.ll: Fix failure on xmm-less hosts, to add -mattr=+sse2.
FIXME: Should this be tested with both +avx and -avx,+sse2?
llvm-svn: 162983
2012-08-31 10:02:22 +00:00
Jakob Stoklund Olesen
eb687a399c Fix a couple of typos in EmitAtomic.
Thumb2 instructions are mostly constrained to rGPR, not tGPR which is
for Thumb1.

rdar://problem/12203728

llvm-svn: 162968
2012-08-31 02:08:34 +00:00
Jim Grosbach
6d3cb70105 X86: Fix encoding of 'movd %xmm0, %rax'
The assembly string for the VMOVPQIto64rr instruction incorrectly lacked the 'v'
prefix, resulting in mis-assembly of the vanilla movd instruction.

llvm-svn: 162963
2012-08-31 00:30:30 +00:00
Pete Cooper
a83a3953f1 Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060
llvm-svn: 162960
2012-08-30 23:58:52 +00:00
Owen Anderson
115dc993c3 Try to make this test more generic to unbreak buildbots.
llvm-svn: 162958
2012-08-30 23:51:20 +00:00
Owen Anderson
d21ffd91bd Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants.
llvm-svn: 162956
2012-08-30 23:35:16 +00:00
Michael Gottesman
0a6d89b0cc [llvm] Updated the test fold-vector-select so that we test the vector selects exhaustively.
llvm-svn: 162953
2012-08-30 23:11:49 +00:00
Nadav Rotem
17b9d8cba2 Currently targets that do not support selects with scalar conditions and vector operands - scalarize the code. ARM is such a target
because it does not support CMOV of vectors. To implement this efficientlyi, we broadcast the condition bit and use a sequence of NAND-OR
to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2).

rdar://12201387

llvm-svn: 162926
2012-08-30 19:17:29 +00:00
Michael Liao
b6735b87b0 Introduce 'UseSSEx' to force SSE legacy encoding
- Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is
  enabled.

  As the penalty of inter-mixing SSE and AVX instructions, we need
  prevent SSE legacy insn from being generated except explicitly
  specified through some intrinsics. For patterns supported by both
  SSE and AVX, so far, we force AVX insn will be tried first relying on
  AddedComplexity or position in td file. It's error-prone and
  introduces bugs accidentally.

  'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
  by AVX, we need this predicate to force VEX encoding or SSE legacy
  encoding only.

  For insns not inherited by AVX, we still use the previous predicates,
  i.e. 'HasSSEx'. So far, these insns fall into the following
  categories:
  * SSE insns with MMX operands
  * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
    CRC, and etc.)
  * SSE4A insns.
  * MMX insns.
  * x87 insns added by SSE.

2 test cases are modified:

 - test/CodeGen/X86/fast-isel-x86-64.ll
   AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be
   selected by fast-isel due to complicated pattern and fast-isel
   fallback to materialize it from constant pool.

 - test/CodeGen/X86/widen_load-1.ll
   AVX code generation is different from SSE one after fixing SSE/AVX
   inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
   'vmovaps'.

llvm-svn: 162919
2012-08-30 16:54:46 +00:00
Benjamin Kramer
766538b9bb Fix test case.
llvm-svn: 162913
2012-08-30 15:42:45 +00:00
Benjamin Kramer
d473cfcfb4 LoopRotate: Also rotate loops with multiple exits.
The old PHI updating code in loop-rotate was replaced with SSAUpdater a while
ago, it has no problems with comples PHIs. What had to be fixed is detecting
whether a loop was already rotated and updating dominators when multiple exits
were present.

This change increases overall code size a bit, mostly due to additional loop
unrolling opportunities. Passes test-suite and selfhost with -verify-dom-info.
Fixes PR7447.

Thanks to Andy for the input on the domtree updating code.

llvm-svn: 162912
2012-08-30 15:39:42 +00:00
Nadav Rotem
5848719c42 It is illegal to transform (sdiv (ashr X c1) c2) -> (sdiv x (2^c1 * c2)),
because C always rounds towards zero.

Thanks Dirk and Ben.

llvm-svn: 162899
2012-08-30 11:23:20 +00:00
Tim Northover
627f946e05 Add support for moving pure S-register to NEON pipeline if desired
llvm-svn: 162898
2012-08-30 10:17:45 +00:00