1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

16879 Commits

Author SHA1 Message Date
Owen Anderson
d21ffd91bd Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants.
llvm-svn: 162956
2012-08-30 23:35:16 +00:00
Michael Gottesman
0a6d89b0cc [llvm] Updated the test fold-vector-select so that we test the vector selects exhaustively.
llvm-svn: 162953
2012-08-30 23:11:49 +00:00
Nadav Rotem
17b9d8cba2 Currently targets that do not support selects with scalar conditions and vector operands - scalarize the code. ARM is such a target
because it does not support CMOV of vectors. To implement this efficientlyi, we broadcast the condition bit and use a sequence of NAND-OR
to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2).

rdar://12201387

llvm-svn: 162926
2012-08-30 19:17:29 +00:00
Michael Liao
b6735b87b0 Introduce 'UseSSEx' to force SSE legacy encoding
- Add 'UseSSEx' to force SSE legacy insn not being selected when AVX is
  enabled.

  As the penalty of inter-mixing SSE and AVX instructions, we need
  prevent SSE legacy insn from being generated except explicitly
  specified through some intrinsics. For patterns supported by both
  SSE and AVX, so far, we force AVX insn will be tried first relying on
  AddedComplexity or position in td file. It's error-prone and
  introduces bugs accidentally.

  'UseSSEx' is disabled when AVX is turned on. For SSE insns inherited
  by AVX, we need this predicate to force VEX encoding or SSE legacy
  encoding only.

  For insns not inherited by AVX, we still use the previous predicates,
  i.e. 'HasSSEx'. So far, these insns fall into the following
  categories:
  * SSE insns with MMX operands
  * SSE insns with GPR/MEM operands only (xFENCE, PREFETCH, CLFLUSH,
    CRC, and etc.)
  * SSE4A insns.
  * MMX insns.
  * x87 insns added by SSE.

2 test cases are modified:

 - test/CodeGen/X86/fast-isel-x86-64.ll
   AVX code generation is different from SSE one. 'vcvtsi2sdq' cannot be
   selected by fast-isel due to complicated pattern and fast-isel
   fallback to materialize it from constant pool.

 - test/CodeGen/X86/widen_load-1.ll
   AVX code generation is different from SSE one after fixing SSE/AVX
   inter-mixing. Exec-domain fixing prefers 'vmovapd' instead of
   'vmovaps'.

llvm-svn: 162919
2012-08-30 16:54:46 +00:00
Benjamin Kramer
766538b9bb Fix test case.
llvm-svn: 162913
2012-08-30 15:42:45 +00:00
Benjamin Kramer
d473cfcfb4 LoopRotate: Also rotate loops with multiple exits.
The old PHI updating code in loop-rotate was replaced with SSAUpdater a while
ago, it has no problems with comples PHIs. What had to be fixed is detecting
whether a loop was already rotated and updating dominators when multiple exits
were present.

This change increases overall code size a bit, mostly due to additional loop
unrolling opportunities. Passes test-suite and selfhost with -verify-dom-info.
Fixes PR7447.

Thanks to Andy for the input on the domtree updating code.

llvm-svn: 162912
2012-08-30 15:39:42 +00:00
Nadav Rotem
5848719c42 It is illegal to transform (sdiv (ashr X c1) c2) -> (sdiv x (2^c1 * c2)),
because C always rounds towards zero.

Thanks Dirk and Ben.

llvm-svn: 162899
2012-08-30 11:23:20 +00:00
Tim Northover
627f946e05 Add support for moving pure S-register to NEON pipeline if desired
llvm-svn: 162898
2012-08-30 10:17:45 +00:00
Michael Liao
5c1090880a Should put test case under test/ExecutionEngine/MCJIT/
llvm-svn: 162885
2012-08-30 00:43:57 +00:00
Michael Liao
0e40defe86 Fix PR13727
- The root cause is that target constant materialization in X86 fast-isel
  creates a PC-rel addressing which may overflow 32-bit range in non-Small code
  model if .rodata section is allocated too far away from code segment in
  MCJIT, which uses Large code model so far.
- Follow the similar logic to fix non-Small code model in fast-isel by skipping
  non-Small code model.

llvm-svn: 162881
2012-08-30 00:30:16 +00:00
Hal Finkel
b356af14b1 Reserve space for the mandatory traceback fields on PPC64.
We need to reserve space for the mandatory traceback fields,
though leaving them as zero is appropriate for now.

Although the ABI calls for these fields to be filled in fully, no
compiler on Linux currently does this, and GDB does not read these
fields.  GDB uses the first word of zeroes during exception handling to
find the end of the function and the size field, allowing it to compute
the beginning of the function.  DWARF information is used for everything
else.  We need the extra 8 bytes of pad so the size field is found in
the right place.

As a comparison, GCC fills in a few of the fields -- language, number
of saved registers -- but ignores the rest.  IBM's proprietary OSes do
make use of the full traceback table facility.

Patch by Bill Schmidt.

llvm-svn: 162854
2012-08-29 20:22:24 +00:00
Benjamin Kramer
b92d13cc42 Make MemoryBuiltins aware of TargetLibraryInfo.
This disables malloc-specific optimization when -fno-builtin (or -ffreestanding)
is specified. This has been a problem for a long time but became more severe
with the recent memory builtin improvements.

Since the memory builtin functions are used everywhere, this required passing
TLI in many places. This means that functions that now have an optional TLI
argument, like RecursivelyDeleteTriviallyDeadFunctions, won't remove dead
mallocs anymore if the TLI argument is missing. I've updated most passes to do
the right thing.

Fixes PR13694 and probably others.

llvm-svn: 162841
2012-08-29 15:32:21 +00:00
Jush Lu
5a78c68e1d [arm-fast-isel] Add support for ARM PIC.
llvm-svn: 162823
2012-08-29 02:41:21 +00:00
NAKAMURA Takumi
46ead41fe9 Create llvm/test/Object/Mips/lit.local.cfg to check Mips in targets_to_build.
llvm-svn: 162819
2012-08-29 01:37:57 +00:00
NAKAMURA Takumi
6a9dea88a7 llvm/test: [CMake] Add profile_rt-shared to deps.
llvm-svn: 162813
2012-08-29 00:37:56 +00:00
NAKAMURA Takumi
dff0b7dd23 llvm/test/Analysis/Profiling: Mark 3 of them as REQUIRES: loadable_module.
FIXME: profile_rt.dll could be built on win32.
llvm-svn: 162811
2012-08-29 00:37:46 +00:00
Jack Carter
e48124ec30 Moved input for objdump test from Mips to Inputs.
llvm-svn: 162808
2012-08-29 00:10:48 +00:00
Manman Ren
478cc27601 Profile: set branch weight metadata with data generated from profiling.
This patch implements ProfileDataLoader which loads profile data generated by
-insert-edge-profiling and updates branch weight metadata accordingly.

Patch by Alastair Murray.

llvm-svn: 162799
2012-08-28 22:21:25 +00:00
Jack Carter
c918c7a81f The instruction DEXT may be transformed into DEXTU or DEXTM depending
on the size of the extraction and its position in the 64 bit word.

This patch allows support of the dext transformations with mips64 direct
object output.

0 <= msb < 32 0 <= lsb < 32 0 <= pos < 32 1 <= size <= 32
DINS
The field is entirely contained in the right-most word of the doubleword

32 <= msb < 64 0 <= lsb < 32 0 <= pos < 32 2 <= size <= 64
DINSM
The field straddles the words of the doubleword

32 <= msb < 64 32 <= lsb < 64 32 <= pos < 64 1 <= size <= 32
DINSU
The field is entirely contained in the left-most word of the doubleword

llvm-svn: 162782
2012-08-28 20:07:41 +00:00
Jack Carter
c0c7230823 Some of the instructions in the Mips instruction set are revision
delimited. llvm-mc -disassemble access these through the -mattr
option.

llvm-objdump -disassemble had no such way to set the attribute so
some instructions were just not recognized for disassembly.

This patch accepts llvm-mc mechanism for specifying the attributes.

llvm-svn: 162781
2012-08-28 19:24:49 +00:00
Jack Carter
a525a54e64 Some instructions are passed to the assembler to be
transformed to the final instruction variant. An
example would be dsrll which is transformed into 
dsll32 if the shift value is greater than 32.

For direct object output we need to do this transformation
in the codegen. If the instruction was inside branch
delay slot, it was being missed. This patch corrects this
oversight.

llvm-svn: 162779
2012-08-28 19:07:39 +00:00
Roman Divacky
7c3f29735a Emit word of zeroes after the last instruction as a start of the mandatory
traceback table on PowerPC64. This helps gdb handle exceptions. The other
mandatory fields are ignored by gdb and harder to implement so just add
there a FIXME.

Patch by Bill Schmidt. PR13641.

llvm-svn: 162778
2012-08-28 19:06:55 +00:00
Hal Finkel
0673920af6 Add PPC Freescale e500mc and e5500 subtargets.
Add subtargets for Freescale e500mc (32-bit) and e5500 (64-bit) to
the PowerPC backend.

Patch by Tobias von Koch.

llvm-svn: 162764
2012-08-28 16:12:39 +00:00
Benjamin Kramer
bc139a63fc InstCombine: Guard the transform introduced in r162743 against large ints and non-const shifts.
llvm-svn: 162751
2012-08-28 13:08:13 +00:00
Nadav Rotem
d0cd39e7c9 Make sure that we don't call getZExtValue on values > 64 bits.
Thanks Benjamin for noticing this.

llvm-svn: 162749
2012-08-28 12:23:22 +00:00
Nadav Rotem
9582c96aef Teach InstCombine to canonicalize [SU]div+[AL]shl patterns.
For example:
  %1 = lshr i32 %x, 2
  %2 = udiv i32 %1, 100

rdar://12182093

llvm-svn: 162743
2012-08-28 10:01:43 +00:00
Bill Wendling
6488dc22bb The commutative flag is already correctly set within the multiclass. If we set
it here, then a 'register-memory' version would wrongly get the commutative
flag.
<rdar://problem/12180135>

llvm-svn: 162741
2012-08-28 07:36:46 +00:00
Craig Topper
02bb8ce5e0 Merge AVX_SET0PSY/AVX_SET0PDY/AVX2_SET0 into a single post-RA pseudo.
llvm-svn: 162738
2012-08-28 07:05:28 +00:00
NAKAMURA Takumi
ec8af629fc llvm/test/CodeGen/X86/pr12312.ll: Add -mtriple=x86_64-unknown-unknown.
llvm-svn: 162736
2012-08-28 04:04:29 +00:00
Michael Liao
1f793b9c47 Fix PR12312
- Add a target-specific DAG optimization to recognize a pattern PTEST-able.
  Such a pattern is a OR'd tree with X86ISD::OR as the root node. When
  X86ISD::OR node has only its flag result being used as a boolean value and
  all its leaves are extracted from the same vector, it could be folded into an
  X86ISD::PTEST node.

llvm-svn: 162735
2012-08-28 03:34:40 +00:00
Jakob Stoklund Olesen
93b4cf4daf Remove extra MayLoad/MayStore flags from atomic_load/store.
These extra flags are not required to properly order the atomic
load/store instructions. SelectionDAGBuilder chains atomics as if they
were volatile, and SelectionDAG::getAtomic() sets the isVolatile bit on
the memory operands of all atomic operations.

The volatile bit is enough to order atomic loads and stores during and
after SelectionDAG.

This means we set mayLoad on atomic_load, mayStore on atomic_store, and
mayLoad+mayStore on the remaining atomic read-modify-write operations.

llvm-svn: 162733
2012-08-28 03:11:32 +00:00
Akira Hatanaka
ab45f57419 Fix mips' long branch pass.
Instructions emitted to compute branch offsets now use immediate operands
instead of symbolic labels. This change was needed because there were problems
when R_MIPS_HI16/LO16 relocations were used to make shared objects.

llvm-svn: 162731
2012-08-28 03:03:05 +00:00
Akira Hatanaka
02455586a1 Fix bug 13532.
In SelectionDAGLegalize::ExpandLegalINT_TO_FP, expand INT_TO_FP nodes without
using any f64 operations if f64 is not a legal type.

Patch by Stefan Kristiansson. 

llvm-svn: 162728
2012-08-28 02:12:42 +00:00
Hal Finkel
367c494415 Allow remat of LI on PPC.
Allow load-immediates to be rematerialised in the register coalescer for
PPC. This makes test/CodeGen/PowerPC/big-endian-formal-args.ll fail,
because it relies on a register move getting emitted. The immediate load is
equivalent, so change this test case.

Patch by Tobias von Koch.

llvm-svn: 162727
2012-08-28 02:10:33 +00:00
Hal Finkel
d28587407f Eliminate redundant CR moves on PPC32.
The 32-bit ABI requires CR bit 6 to be set if the call has fp arguments and
unset if it doesn't. The solution up to now was to insert a MachineNode to
set/unset the CR bit, which produces a CR vreg. This vreg was then copied
into CR bit 6. When the register allocator saw a bunch of these in the same
function, it allocated the set/unset CR bit in some random CR register (1
extra instruction) and then emitted CR moves before every vararg function
call, rather than just setting and unsetting CR bit 6 directly before every
vararg function call. This patch instead inserts a PPCcrset/PPCcrunset
instruction which are then matched by a dedicated instruction pattern.

Patch by Tobias von Koch.

llvm-svn: 162725
2012-08-28 02:10:27 +00:00
Hal Finkel
caa4701e37 Optimize zext on PPC64.
The zeroextend IR instruction is lowered to an 'and' node with an immediate
mask operand, which in turn gets legalised to a sequence of ori's & ands.
This can be done more efficiently using the rldicl instruction.

Patch by Tobias von Koch.

llvm-svn: 162724
2012-08-28 02:10:15 +00:00
Bill Wendling
d49e183a6f Make sure we add the predicate after all of the registers are added.
<rdar://problem/12183003>

llvm-svn: 162703
2012-08-27 22:12:44 +00:00
NAKAMURA Takumi
6fc90acd7c llvm/test/CodeGen/X86/fma.ll: Add -march=x86, or two tests would fail on non-x86 hosts.
llvm-svn: 162667
2012-08-27 11:50:26 +00:00
NAKAMURA Takumi
b1719f040a llvm/test/CodeGen/X86/fma_patterns.ll: Add -mtriple=x86_64. It was incompatible on i686 and Windows x64.
llvm-svn: 162664
2012-08-27 09:37:54 +00:00
Craig Topper
d27b76d897 Commit test change for r162658.
llvm-svn: 162660
2012-08-27 07:55:50 +00:00
Alexey Samsonov
e39e62d8e5 Add basic support for .debug_ranges section to LLVM's DebugInfo library.
This section (introduced in DWARF-3) is used to define instruction address
ranges for functions that are not contiguous and can't be described
by low_pc/high_pc attributes (this is the usual case for inlined subroutines).
The patch is the first step to support fetching complete inlining info from DWARF.

Reviewed by Benjamin Kramer.

llvm-svn: 162657
2012-08-27 07:17:47 +00:00
Anitha Boyapati
0ab56d29f6 FMA3 tests on bdver2 target for changes made in rev 162012. Also made
corresponding changes to existing tests for darwin triple to ensure that
same pattern is tested for bdver2 target.

llvm-svn: 162655
2012-08-27 06:59:01 +00:00
Craig Topper
1d3f44d696 Make sure that FMA3 is favored even when FMA4 is also enabled. Test case for r162454.
llvm-svn: 162653
2012-08-27 03:38:15 +00:00
Jakob Stoklund Olesen
43f9c539ae Infer instruction properties from single-instruction patterns.
Previously, instructions without a primary patterns wouldn't get their
properties inferred. Now, we use all single-instruction patterns for
inference, including 'def : Pat<>' instances.

This causes a lot of instruction flags to change.

- Many instructions no longer have the UnmodeledSideEffects flag because
  their flags are now inferred from a pattern.

- Instructions with intrinsics will get a mayStore flag if they already
  have UnmodeledSideEffects and a mayLoad flag if they already have
  mayStore. This is because intrinsics properties are linear.

- Instructions with atomic_load patterns get a mayStore flag because
  atomic loads can't be reordered. The correct workaround is to create
  pseudo-instructions instead of using normal loads. PR13693.

llvm-svn: 162614
2012-08-24 22:46:53 +00:00
Akira Hatanaka
8411cfdb72 Disable Mips' delay slot filler when optimization level is O0.
llvm-svn: 162589
2012-08-24 20:40:15 +00:00
Akira Hatanaka
8e8bb580a8 In MipsDAGToDAGISel::SelectAddr, fold add node into address operand, if its
second operand is MipsISD::GPRel.

llvm-svn: 162584
2012-08-24 20:21:49 +00:00
Manman Ren
6342812033 BranchProb: modify the definition of an edge in BranchProbabilityInfo to handle
the case of multiple edges from one block to another.

A simple example is a switch statement with multiple values to the same
destination. The definition of an edge is modified from a pair of blocks to
a pair of PredBlock and an index into the successors.

Also set the weight correctly when building SelectionDAG from LLVM IR,
especially when converting a Switch.
IntegersSubsetMapping is updated to calculate the weight for each cluster.

llvm-svn: 162572
2012-08-24 18:14:27 +00:00
Roman Divacky
eab620e38c Lower constant pools and jump tables via TOC on PPC64/SVR4.
In collaboration with Adhemerval Zanella.

llvm-svn: 162562
2012-08-24 16:26:02 +00:00
Eric Christopher
4e278d30eb Use DW_FORM_flag_present to save space in debug information if we're
not in darwin gdb compat mode.

Fixes rdar://10975088

llvm-svn: 162526
2012-08-24 01:14:27 +00:00
Eric Christopher
5e0b3cf4a6 Remove the DW_AT_MIPS_linkage name attribute when we don't need it
output (we're emitting a specification already and the information
isn't changing) and we're not in old gdb compat mode.

Saves 1% on the debug information for a build of llvm.

Fixes rdar://11043421

llvm-svn: 162493
2012-08-23 22:52:55 +00:00