1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

10067 Commits

Author SHA1 Message Date
NAKAMURA Takumi
acebc1e6ca X86JITInfo: [x86] Rework r206240, X86CompilationCallback_SSE() should be called for SSE-enabled code generator, even if LLVM is not built with -msse.
llvm-svn: 206261
2014-04-15 08:28:23 +00:00
Nick Lewycky
82ad9fc7c8 Break PseudoSourceValue out of the Value hierarchy. It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead.
llvm-svn: 206255
2014-04-15 07:22:52 +00:00
Lang Hames
91cdab6916 [MC] Require an MCContext when constructing an MCDisassembler.
This patch re-introduces the MCContext member that was removed from
MCDisassembler in r206063, and requires that an MCContext be passed in at
MCDisassembler construction time. (Previously the MCContext member had been
initialized in an ad-hoc fashion after construction). The MCCContext member
can be used by MCDisassembler sub-classes to construct constant or
target-specific MCExprs.

This patch updates disassemblers for in-tree targets, and provides the
MCRegisterInfo instance that some disassemblers were using through the
MCContext (previously those backends were constructing their own
MCRegisterInfo instances).

llvm-svn: 206241
2014-04-15 04:40:56 +00:00
NAKAMURA Takumi
11d1c480fb X86JITInfo: [x86] Use X86CompilationCallback_SSE() along;
*not* Subtarget->hasSSE1()
  *but* __SSE__, the flag that LLVM libraries are compiled

The callback calls internal LLVM JIT libraries. It may be built with -msse (or above).

FIXME: JIT may use "host" instead of "generic" by default.
llvm-svn: 206240
2014-04-15 04:12:21 +00:00
Jim Grosbach
20e09f25d7 X86: Nuke one more CPU autodetect blurb.
Missed one in r206094. This brings MC and TargetMachine back into sync.

llvm-svn: 206220
2014-04-14 22:23:30 +00:00
David Blaikie
5191eb8e9f Change argument order and add explanatory comment to r206130
Changes requested in code review by Eric Christopher of r206130.

llvm-svn: 206219
2014-04-14 22:23:06 +00:00
David Blaikie
cedce0ba4f Fix instruction debug info location during legalization
I found this from a particular GDB test suite case of inlining
(something similar is provided as a test case) but came across a few
other related cases (other callers of the same functions, and one other
instance of the same coding mistake in a separate function).

I'm not sure what the best way to test this is (let alone to cover the
other cases I discovered), so hopefully this sufficies - open to ideas.

llvm-svn: 206130
2014-04-13 06:39:55 +00:00
Lang Hames
f72ca1f7d5 [X86] unique_ptr'ify one of X86GenericDisassembler's members.
llvm-svn: 206127
2014-04-13 04:09:16 +00:00
Jim Grosbach
51592aeaa3 X86: Remove TargetMachine CPU auto-detection.
This logic is properly in the realm of whatever is creating the
TargetMachine. This makes plain 'llc foo.ll' consistent across
heterogenous machines.

llvm-svn: 206094
2014-04-12 01:34:29 +00:00
Reid Kleckner
f99741400f Move the segmented stack switch to a function attribute
This removes the -segmented-stacks command line flag in favor of a
per-function "split-stack" attribute.

Patch by Luqman Aden and Alex Crichton!

llvm-svn: 205997
2014-04-10 22:58:43 +00:00
NAKAMURA Takumi
d13a996dc0 LLVMBuild.txt: Add missing dependencies.
llvm-svn: 205962
2014-04-10 11:16:47 +00:00
Jim Grosbach
366b84c436 Add support for load folding of avx1 logical instructions
AVX supports logical operations using an operand from memory. Unfortunately
because integer operations were not added until AVX2 the AVX1 logical
operation's types were preventing the isel from folding the loads. In a limited
number of cases the peephole optimizer would fold the loads, but most were
missed. This patch adds explicit patterns with appropriate casts in order for
these loads to be folded.

The included test cases run on reduced examples and disable the peephole
optimizer to ensure the folds are being pattern matched.

Patch by Louis Gerbarg <lgg@apple.com>

rdar://16355124

llvm-svn: 205938
2014-04-09 23:39:25 +00:00
Elena Demikhovsky
56ab81fd87 AVX-512: insert element to mask vector; store i1 data
Implemented INSERT_VECTOR_ELT operation for v16i1 and v8i1 vectors;
Implemented "store" for i1 type

llvm-svn: 205850
2014-04-09 12:37:50 +00:00
NAKAMURA Takumi
99b861df4d X86MCAsmInfoGNUCOFF: Set PointerSize as 8 for targeting x64. It caused DW_LNE_set_address was misemitted on x64.
FIXME: I haven't investigate whether CalleeSaveStackSlotSize should be 8.
llvm-svn: 205772
2014-04-08 15:28:50 +00:00
Elena Demikhovsky
73f5b6faba AVX-512: Added fp_to_uint and uint_to_fp patterns.
llvm-svn: 205754
2014-04-08 07:24:02 +00:00
David Majnemer
5a37407a86 X86: Split the relocation selection up
Before, we would have conditional operators where one side of the
operator would be of type RelocationTypeAMD64 and the other is of type
RelocationTypeI386.  GCC would noisly warn with -Wenum-compare
diagnostic.

Instead, refactor the code so it is more like the X86 ELF object writer.

llvm-svn: 205752
2014-04-08 02:15:13 +00:00
Matt Arsenault
7b6a70a9cf Add DAG parameter to ComputeNumSignBitsForTargetNode
This way, you can check the number of sign bits in the
operands. The depth parameter it already has is pretty useless
without this.

llvm-svn: 205649
2014-04-04 20:13:13 +00:00
Craig Topper
694437e2ef Make consistent use of MCPhysReg instead of uint16_t throughout the tree.
llvm-svn: 205610
2014-04-04 05:16:06 +00:00
Quentin Colombet
419aeb287d Revert r205599, the commit was not intended to have so many changes
llvm-svn: 205600
2014-04-04 02:02:49 +00:00
Quentin Colombet
b4d3858ea5 [RegAllocGreedy][Last Chance Recoloring] Emit diagnostics when last chance
recoloring cut-offs are hit.

This is related to PR18747.

Patch by MAYUR PANDEY <mayur.p@samsung.com>

llvm-svn: 205599
2014-04-04 01:58:57 +00:00
Lang Hames
c739f18eda [X86] As per suggestion from Craig Topper and Hal Finkel, override
TargetInstrInfo::findCommutedOpIndices to enable VFMA*231 commutation, rather
than abusing commuteInstruction.

Thanks very much for the suggestion guys!

llvm-svn: 205489
2014-04-02 23:57:49 +00:00
Lang Hames
e2f8671084 [X86] Make the VFMA*231 variants commutable and relax the alignment restrictions
on FMA3 memory operands. FMA3 instructions are VEX encoded, so they can load
from unaligned memory.

Testcase to follow, along with related patch.

<rdar://problem/16478629>

llvm-svn: 205472
2014-04-02 22:06:16 +00:00
Juergen Ributzka
f7fd4ffde4 Add comments and test case for [X86TTI] Make constant base pointers for GetElementPtr opaque (r204739).
llvm-svn: 205468
2014-04-02 21:45:36 +00:00
Yaron Keren
18f37e5fc5 Added isTargetWindowsMSVC(), renamed isTargetMingw() to isTargetWindowsGNU()
and isTargetCygwin() to isTargetWindowsCygwin() to be consistent with the
four Windows environments in Triple.h.

Suggestion by Saleem Abdulrasool!

llvm-svn: 205393
2014-04-02 04:27:51 +00:00
Yaron Keren
5b97697cc3 If isKnownWindowsMSVCEnvironment then getOS == Triple::Win32 and
Environment == Triple::MSVC so it will never be MinGW or Cygwin.

llvm-svn: 205349
2014-04-01 18:52:55 +00:00
Hal Finkel
0ebd182c7a Implement X86TTI::getUnrollingPreferences
This provides an initial implementation of getUnrollingPreferences for x86.
getUnrollingPreferences is used by the generic (concatenation) unroller, which
is distinct from the unrolling done by the loop vectorizer. Many modern x86
cores have some kind of uop cache and loop-stream detector (LSD) used to
efficiently dispatch small loops, and taking full advantage of this requires
unrolling small loops (small here means 10s of uops).

These caches also have limits on the number of taken branches in the loop, and
so we also cap the loop unrolling factor based on the maximum "depth" of the
loop. This is currently calculated with a partial DFS traversal (partial
because it will stop early if the path length grows too much). This is still an
approximation, and one that is both conservative (because it does not account
for branches eliminated via block placement) and optimistic (because it is only
recording the maximum depth over minimum paths). Nevertheless, because the
loops that fit in these uop caches are so small, it is not clear how much the
details matter.

The original set of patches posted for review produced the following test-suite
performance results (from the TSVC benchmark) at that time:
  ControlLoops-dbl - 13% speedup
  ControlLoops-flt - 15% speedup
  Reductions-dbl - 7.5% speedup

llvm-svn: 205348
2014-04-01 18:50:34 +00:00
Reid Kleckner
1ad832f876 Support segmented stacks on Win64
Identical to Win32 method except the GS segment register is used for TLS
instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14.

llvm-svn: 205342
2014-04-01 18:34:21 +00:00
Yaron Keren
0524cf3d81 isTargetWindows() renamed to isTargetKnownWindowsMSVC()
to reflect its current functionality.

Based on Takumi NAKAMURA suggestion.

llvm-svn: 205338
2014-04-01 18:15:34 +00:00
Aaron Ballman
617b4f6afb Attempting to fix r205124, which had failed asserts when built with MSVC.
Suggestion from Yaron Keren.

llvm-svn: 205313
2014-04-01 13:56:35 +00:00
Alexey Volkov
f369538f31 [x86] Do not convert to cmp32 for Atom arch by Sergey Okunev
Differential Revision: http://llvm-reviews.chandlerc.com/D2824

llvm-svn: 205288
2014-04-01 08:13:07 +00:00
Adam Nemet
f8a90f1fdd [X86] Adjust cost of FP_TO_UINT v4f64->v4i32 as well
Pretty obvious follow-on to r205159 to also handle conversion from double
besides float.

Fixes <rdar://problem/16373208>

llvm-svn: 205253
2014-03-31 21:54:48 +00:00
Robert Khasanov
37729a0b8e Test commit.
llvm-svn: 205214
2014-03-31 16:01:38 +00:00
Yaron Keren
c87d06f1b7 Correct OS conditionals following r204977 and r204978.
Previously, MinGW OS was Triple::MinGW and Cygwin was Triple::Cygwin
and now it is Triple::Win32 with Environment being GNU or Cygwin.
So,

  TheTriple.getOS() == Triple::Win32 
  
is replaced by

  TheTriple.isWindowsMSVCEnvironment()

and

  (TheTriple.getOS() == Triple::MinGW32 || TheTriple.getOS() == Triple::Cygwin)
  
is replaced by

  TheTriple.isOSCygMing()

llvm-svn: 205170
2014-03-31 07:59:14 +00:00
Craig Topper
5cb50d052d [C++11] Mark more classes in the X86 target as 'final'.
llvm-svn: 205166
2014-03-31 06:53:13 +00:00
Craig Topper
cad9e27028 Mark a couple of the X86 target classes as final. Allows the compiler to de-virtualize some internal calls.
llvm-svn: 205165
2014-03-31 06:22:15 +00:00
Adam Nemet
de2e9c69f3 [X86] Adjust cost of FP_TO_UINT v8f32->v8i32
There is no direct AVX instruction to convert to unsigned.  I have some ideas
how we may be able to do this with three vector instructions but the current
backend just bails on this to get it scalarized.

See the comment why we need to adjust the cost returned by BasicTTI.

The test is a bit roundabout (and checks assembly rather than bit code) because
I'd like it to work even if at some point we could vectorize this conversion.

Fixes <rdar://problem/16371920>

llvm-svn: 205159
2014-03-30 18:07:13 +00:00
NAKAMURA Takumi
81cbcfd543 X86Subtarget.h: isTargetWindows() should tell whether he is targeting msvc.
FYI, !isWindowsGNUEnvironment() is insufficient. It missed cygwin.

FIXME: The name "isTargetWindows" should be fixed.
llvm-svn: 205124
2014-03-30 04:35:00 +00:00
Rafael Espindola
0b8859a3e5 Completely rewrite ELFObjectWriter::RecordRelocation.
I started trying to fix a small issue, but this code has seen a small fix too
many.

The old code was fairly convoluted. Some of the issues it had:

* It failed to check if a symbol difference was in the some section when
  converting a relocation to pcrel.
* It failed to check if the relocation was already pcrel.
* The pcrel value computation was wrong in some cases (relocation-pc.s)
* It was missing quiet a few cases where it should not convert symbol
  relocations to section relocations, leaving the backends to patch it up.
* It would not propagate the fact that it had changed a relocation to pcrel,
  requiring a quiet nasty work around in ARM.
* It was missing comments.

llvm-svn: 205076
2014-03-29 06:26:49 +00:00
Akira Hatanaka
cca54c2eea [x86] Fix printing of register operands with q modifier.
Emit 32-bit register names instead of 64-bit register names if the target does
not have 64-bit general purpose registers.

<rdar://problem/14653996>

llvm-svn: 205067
2014-03-28 23:28:07 +00:00
David Majnemer
11f5ba4322 X86: Disable IsLegalToCallImmediateAddr for Win32
WinCOFF cannot form PC relative relocations to support absolute
MCValues.  We should reenable this once WinCOFF supports emission of
IMAGE_REL_I386_REL32 relocations.

This fixes PR19272.

llvm-svn: 205058
2014-03-28 21:40:47 +00:00
Saleem Abdulrasool
d42d60171a Canonicalise Windows target triple spellings
Construct a uniform Windows target triple nomenclature which is congruent to the
Linux counterpart.  The old triples are normalised to the new canonical form.
This cleans up the long-standing issue of odd naming for various Windows
environments.

There are four different environments on Windows:

MSVC: The MS ABI, MSVCRT environment as defined by Microsoft
GNU: The MinGW32/MinGW32-W64 environment which uses MSVCRT and auxiliary libraries
Itanium: The MSVCRT environment + libc++ built with Itanium ABI
Cygnus: The Cygwin environment which uses custom libraries for everything

The following spellings are now written as:

i686-pc-win32 => i686-pc-windows-msvc
i686-pc-mingw32 => i686-pc-windows-gnu
i686-pc-cygwin => i686-pc-windows-cygnus

This should be sufficiently flexible to allow us to target other windows
environments in the future as necessary.

llvm-svn: 204977
2014-03-27 22:50:05 +00:00
Quentin Colombet
567636b4e3 [X86][Vector Cost Model] Add a comment to explain the workaround
in my previous commit (r204884).

<rdar://problem/16381225>

llvm-svn: 204972
2014-03-27 22:27:41 +00:00
Rafael Espindola
d78485af3e Remove another unused argument.
llvm-svn: 204961
2014-03-27 20:49:35 +00:00
Rafael Espindola
4e5d391691 Remove unused argument.
llvm-svn: 204956
2014-03-27 20:41:17 +00:00
Rafael Espindola
5c8926deed Prevent alias from pointing to weak aliases.
This adds back r204781.

Original message:

Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given

define void @my_func() {
  ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias

We produce without this patch:

        .weak   my_alias
my_alias = my_func
        .globl  my_alias2
my_alias2 = my_alias

That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a

@my_alias = alias void ()* @other_func

would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.

There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.

llvm-svn: 204934
2014-03-27 15:26:56 +00:00
Elena Demikhovsky
624ece9d50 AVX-512: Implemented masking for integer arithmetic & logic instructions.
By Robert Khasanov rob.khasanov@gmail.com

llvm-svn: 204906
2014-03-27 09:45:08 +00:00
Quentin Colombet
252d797b35 [X86][Vectorizer Cost Model] Correct vectorization cost model for v2i64->v2f64
and v4i64->v4f64.

The new costs match what we did for SSE2 and reflect the reality of our codegen.

<rdar://problem/16381225>

llvm-svn: 204884
2014-03-27 00:52:16 +00:00
Jim Grosbach
9f9b8f6709 X86: Correct vectorization cost model for v8f32->v8i8.
Fix the cost model to reflect the reality of our codegen.

rdar://16370633

llvm-svn: 204880
2014-03-27 00:04:11 +00:00
Hans Wennborg
ce9682473f Revert "X86 memcpy lowering: use "rep movs" even when esi is used as base pointer" (r204174)
>  For functions where esi is used as base pointer, we would previously fall ba
>  from lowering memcpy with "rep movs" because that clobbers esi.
>
>  With this patch, we just store esi in another physical register, and restore
>  it afterwards. This adds a little bit of register preassure, but the more
>  efficient memcpy should be worth it.
>
>  Differential Revision: http://llvm-reviews.chandlerc.com/D2968

This didn't work. I was ending up with code like this:

  lea     edi,[esi+38h]
  mov     ecx,0Fh
  mov     edx,esi
  mov     esi,ebx
  rep movs dword ptr es:[edi],dword ptr [esi]
  lea     ecx,[esi+74h] <-- Ooops, we're now using esi before restoring it from edx.
  add     ebx,3Ch
  mov     esi,edx

I guess if we want to do this we need stronger glue or something, or doing the expansion
much later.

llvm-svn: 204829
2014-03-26 16:30:54 +00:00
Cameron McInally
8872097c93 Fix AVX512 Gather and Scatter execution domains.
llvm-svn: 204804
2014-03-26 13:50:50 +00:00