1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 12:02:58 +02:00
Commit Graph

282 Commits

Author SHA1 Message Date
Jim Grosbach
51592aeaa3 X86: Remove TargetMachine CPU auto-detection.
This logic is properly in the realm of whatever is creating the
TargetMachine. This makes plain 'llc foo.ll' consistent across
heterogenous machines.

llvm-svn: 206094
2014-04-12 01:34:29 +00:00
David Majnemer
11f5ba4322 X86: Disable IsLegalToCallImmediateAddr for Win32
WinCOFF cannot form PC relative relocations to support absolute
MCValues.  We should reenable this once WinCOFF supports emission of
IMAGE_REL_I386_REL32 relocations.

This fixes PR19272.

llvm-svn: 205058
2014-03-28 21:40:47 +00:00
David Woodhouse
302d198064 [x86] Support i386-*-*-code16 triple for emitting 16-bit code
llvm-svn: 199648
2014-01-20 12:02:25 +00:00
Nico Rieck
964a13bb4e Decouple dllexport/dllimport from linkage
Representing dllexport/dllimport as distinct linkage types prevents using
these attributes on templates and inline functions.

Instead of introducing further mixed linkage types to include linkonce and
weak ODR, the old import/export linkage types are replaced with a new
separate visibility-like specifier:

  define available_externally dllimport void @f() {}
  @Var = dllexport global i32 1, align 4

Linkage for dllexported globals and functions is now equal to their linkage
without dllexport. Imported globals and functions must be either
declarations with external linkage, or definitions with
AvailableExternallyLinkage.

llvm-svn: 199218
2014-01-14 15:22:47 +00:00
Nico Rieck
e8a579c6bc Revert "Decouple dllexport/dllimport from linkage"
Revert this for now until I fix an issue in Clang with it.

This reverts commit r199204.

llvm-svn: 199207
2014-01-14 12:38:32 +00:00
Nico Rieck
6203d44313 Decouple dllexport/dllimport from linkage
Representing dllexport/dllimport as distinct linkage types prevents using
these attributes on templates and inline functions.

Instead of introducing further mixed linkage types to include linkonce and
weak ODR, the old import/export linkage types are replaced with a new
separate visibility-like specifier:

  define available_externally dllimport void @f() {}
  @Var = dllexport global i32 1, align 4

Linkage for dllexported globals and functions is now equal to their linkage
without dllexport. Imported globals and functions must be either
declarations with external linkage, or definitions with
AvailableExternallyLinkage.

llvm-svn: 199204
2014-01-14 11:55:03 +00:00
David Woodhouse
5bd74bdaf8 [x86] Kill gratuitous X86_{32,64}TargetMachine subclasses, use X86TargetMachine
llvm-svn: 198720
2014-01-08 00:08:50 +00:00
Craig Topper
201bd5add3 [x86] Add basic support for .code16
This is not really expected to work right yet. Mostly because we will
still emit the OpSize (0x66) prefix in all the wrong places, along with
a number of other corner cases. Those will all be fixed in the subsequent
commits.

Patch from David Woodhouse.

llvm-svn: 198584
2014-01-06 04:55:54 +00:00
Tim Northover
1b70928c82 X86: enable AVX2 under Haswell native compilation
Patch by Adam Strzelecki

llvm-svn: 195632
2013-11-25 09:52:59 +00:00
Ekaterina Romanova
eda4e2e4a7 SHLD/SHRD are VectorPath (microcode) instructions known to have poor latency on certain architectures. While generating SHLD/SHRD instructions is acceptable when optimizing for size, optimizing for speed on these platforms should be implemented using alternative sequences of instructions composed of add, adc, shr, shl, or and lea which are directPath instructions. These alternative instructions not only have a lower latency but they also increase the decode bandwidth by allowing simultaneous decoding of a third directPath instruction.
AMD's processors family K7, K8, K10, K12, K15 and K16 are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD's processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size.

It might be beneficial to disable this folding for some of the Intel's processors. However, since I couldn't find specific recommendations regarding using SHLD/SHRD instructions on Intel's processors, I haven't disabled this peephole for Intel.

llvm-svn: 195383
2013-11-21 23:21:26 +00:00
Yunzhong Gao
14609c71b2 Adding a feature flag to the llvm backend for x86 TBM instruction set.
Adding TBM feature to bdver2 processor; piledriver supports this instruction set
according to the following document:
http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf

Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1692

llvm-svn: 191324
2013-09-24 18:21:52 +00:00
Craig Topper
7ef60f689f Prevent extra calls to ToggleFeature for Feature64Bit and FeatureCMOV if they've already been enabled. The extra call ends up clearing the bit in FeatureBits since its a 'toggle'. Can't prove that anything was broken because of this since I don't think the FeatureBits for these are used.
llvm-svn: 190920
2013-09-18 06:01:53 +00:00
Craig Topper
01e805a531 Fix X86 subtarget to not overwrite the autodetected features by calling InitMCProcessorInfo right after detecting them. Instead add a new function that only updates the scheduling model and call that.
llvm-svn: 190919
2013-09-18 05:54:09 +00:00
Preston Gurd
0411803c14 Adds support for Atom Silvermont (SLM) - -march=slm
Implements Instruction scheduler latencies for Silvermont,
using latencies from the Intel Silvermont Optimization Guide.

Auto detects SLM.

Turns on post RA scheduler when generating code for SLM.

llvm-svn: 190717
2013-09-13 19:23:28 +00:00
Craig Topper
5c570ba6ec Move operator to end of previous line to match coding standards.
llvm-svn: 190659
2013-09-13 04:41:06 +00:00
Ben Langmuir
9981cd7cfe Partial support for Intel SHA Extensions (sha1rnds4)
Add basic assembly/disassembly support for the first Intel SHA
instruction 'sha1rnds4'. Also includes feature flag, and test cases.

Support for the remaining instructions will follow in a separate patch.

llvm-svn: 190611
2013-09-12 15:51:31 +00:00
Craig Topper
33a600320c Rename mattr names for AVX-512 to from avx-512 -> avx512f, avx-512-pfi -> av512pf, avx-512-cdi -> avx512cd, avx-512-eri->avx512er. This matches better with official docs and what gcc patches appearto be using. I didn't touch the has* functions or the feature flag names to avoid change the td and lowering file while commits are still happening.
llvm-svn: 188859
2013-08-21 03:57:57 +00:00
Craig Topper
3bda7ba32d Fix formatting. No functional change.
llvm-svn: 188746
2013-08-20 05:23:59 +00:00
Craig Topper
0c2625083b Add AVX-512 and related features to the CPUID detection code.
llvm-svn: 188745
2013-08-20 05:22:42 +00:00
Elena Demikhovsky
505373db43 Added encoding prefixes for KNL instructions (EVEX).
Added 512-bit operands printing.
Added instruction formats for KNL instructions.

llvm-svn: 187324
2013-07-28 08:28:38 +00:00
Michael Kuperstein
25056babb2 Re-enable AVX detection on x64 platforms.
llvm-svn: 181313
2013-05-07 14:05:33 +00:00
Aaron Ballman
e1ad194ed5 Unbreaking the non-x86 build bots by protecting the AVX test code properly.
llvm-svn: 180992
2013-05-03 02:52:21 +00:00
Aaron Ballman
38fdf1efd6 Correctly testing for AVX support in x86 based off code from Hosts.cpp.
llvm-svn: 180991
2013-05-03 02:39:21 +00:00
Preston Gurd
0547d81fdb This patch adds the X86FixupLEAs pass, which will reduce instruction
latency for certain models of the Intel Atom family, by converting
instructions into their equivalent LEA instructions, when it is both
useful and possible to do so.

llvm-svn: 180573
2013-04-25 20:29:37 +00:00
Eric Christopher
81bacb7670 Formatting.
llvm-svn: 178589
2013-04-02 23:06:40 +00:00
Michael Liao
427149cbcf Add support of RDSEED defined in AVX2 extension
llvm-svn: 178314
2013-03-28 23:41:26 +00:00
Michael Liao
30577169a4 Add ADX CPUID detection
llvm-svn: 178299
2013-03-28 22:29:53 +00:00
Preston Gurd
b6ed645cb6 For the current Atom processor, the fastest way to handle a call
indirect through a memory address is to load the memory address into
a register and then call indirect through the register.

This patch implements this improvement by modifying SelectionDAG to
force a function address which is a memory reference to be loaded
into a virtual register.

Patch by Sriram Murali.

llvm-svn: 178171
2013-03-27 19:14:02 +00:00
Michael Liao
3515920fbd Add HLE target feature
llvm-svn: 178082
2013-03-26 22:46:02 +00:00
Michael Liao
969ef73c31 Add PREFETCHW codegen support
- Add 'PRFCHW' feature defined in AVX2 ISA extension

llvm-svn: 178040
2013-03-26 17:47:11 +00:00
Nadav Rotem
6489b3c2b7 Revert r176166 because it broke one of the lit tests.
llvm-svn: 176171
2013-02-27 05:56:20 +00:00
Nadav Rotem
c64a5a0435 std::string to StringRef.
llvm-svn: 176166
2013-02-27 05:23:56 +00:00
Bill Wendling
fb87157cc8 Reinitialize the ivars in the subtarget so that they can be reset with the new features.
llvm-svn: 175336
2013-02-16 01:36:26 +00:00
Bill Wendling
3e17fc6664 Temporary revert of 175320.
llvm-svn: 175322
2013-02-15 23:22:32 +00:00
Bill Wendling
ecc7822c1e Reinitialize the ivars in the subtarget.
When we're recalculating the feature set of the subtarget, we need to have the
ivars in their initial state.

llvm-svn: 175320
2013-02-15 23:18:01 +00:00
Bill Wendling
cc20bdf27b Use the 'target-features' and 'target-cpu' attributes to reset the subtarget features.
If two functions require different features (e.g., `-mno-sse' vs. `-msse') then
we want to honor that, especially during LTO. We can do that by resetting the
subtarget's features depending upon the 'target-feature' attribute.

llvm-svn: 175314
2013-02-15 22:31:27 +00:00
Kay Tiong Khoo
45b3d90921 added basic support for Intel ADX instructions
-feature flag, instructions definitions, test cases

llvm-svn: 175196
2013-02-14 19:08:21 +00:00
Evan Cheng
4d1a496923 Restrict sin/cos optimization to 64-bit only for now. 32-bit is a bit messy and less critical.
llvm-svn: 173987
2013-01-30 22:56:35 +00:00
Evan Cheng
2e2cde560f Teach SDISel to combine fsin / fcos into a fsincos node if the following
conditions are met:
1. They share the same operand and are in the same BB.
2. Both outputs are used.
3. The target has a native instruction that maps to ISD::FSINCOS node or
   the target provides a sincos library call.

Implemented the generic optimization in sdisel and enabled it for
Mac OSX. Also added an additional optimization for x86_64 Mac OSX by
using an alternative entry point __sincos_stret which returns the two
results in xmm0 / xmm1.

rdar://13087969
PR13204

llvm-svn: 173755
2013-01-29 02:32:37 +00:00
Preston Gurd
4b0d66f924 Pad Short Functions for Intel Atom
The current Intel Atom microarchitecture has a feature whereby
when a function returns early then it is slightly faster to execute
a sequence of NOP instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction until
the return address is ready.

When compiling for X86 Atom only, this patch will run a pass,
called "X86PadShortFunction" which will add NOP instructions where less
than four cycles elapse between function entry and return.

It includes tests.

This patch has been updated to address Nadav's review comments
- Optimize only at >= O1 and don't do optimization if -Os is set
- Stores MachineBasicBlock* instead of BBNum
- Uses DenseMap instead of std::map
- Fixes placement of braces

Patch by Andy Zhang.

llvm-svn: 171879
2013-01-08 18:27:24 +00:00
Nadav Rotem
900cb45dec Revert revision 171524. Original message:
URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev
Log:
The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171603
2013-01-05 05:42:48 +00:00
Preston Gurd
b1c34fa73f The current Intel Atom microarchitecture has a feature whereby when a function
returns early then it is slightly faster to execute a sequence of NOP
instructions to wait until the return address is ready,
as opposed to simply stalling on the ret instruction
until the return address is ready.

When compiling for X86 Atom only, this patch will run a pass, called
"X86PadShortFunction" which will add NOP instructions where less than four
cycles elapse between function entry and return.

It includes tests.

Patch by Andy Zhang.

llvm-svn: 171524
2013-01-04 20:54:54 +00:00
Chandler Carruth
4c1f3c24db Move all of the header files which are involved in modelling the LLVM IR
into their new header subdirectory: include/llvm/IR. This matches the
directory structure of lib, and begins to correct a long standing point
of file layout clutter in LLVM.

There are still more header files to move here, but I wanted to handle
them in separate commits to make tracking what files make sense at each
layer easier.

The only really questionable files here are the target intrinsic
tablegen files. But that's a battle I'd rather not fight today.

I've updated both CMake and Makefile build systems (I think, and my
tests think, but I may have missed something).

I've also re-sorted the includes throughout the project. I'll be
committing updates to Clang, DragonEgg, and Polly momentarily.

llvm-svn: 171366
2013-01-02 11:36:10 +00:00
Chandler Carruth
d5a4e512ca Fix a typo in my previous commit -- bloomfield is 0x1A not 0x2A.
Thanks to the PaX folks for noticing in review! We need some tests here,
any sugestions welcome...

llvm-svn: 169739
2012-12-10 18:22:40 +00:00
Chandler Carruth
3e5f2c328d Address a FIXME and update the fast unaligned memory feature for newer
Intel chips.

The model number rules were determined by inspecting Intel's
documentation for their newer chip model numbers. My understanding is
that all of the newer Intel chips have fast unaligned memory access, but
if anyone is concerned about a particular chip, just shout.

No tests updated; it's not clear we have dedicated tests for the chips'
various features, but if anyone would like tests (or can point me at
some existing ones), I'm happy to oblige.

llvm-svn: 169730
2012-12-10 09:18:44 +00:00
Chandler Carruth
a490793037 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Roman Divacky
9196fa48cc Switch FreeBSD/i386 back to 4byte stack alignment. This partially
reverts r126226.

llvm-svn: 167632
2012-11-09 20:10:44 +00:00
Michael Liao
59114df23b Add support of RTM from TSX extension
- Add RTM code generation support throught 3 X86 intrinsics:
  xbegin()/xend() to start/end a transaction region, and xabort() to abort a
  tranaction region

llvm-svn: 167573
2012-11-08 07:28:54 +00:00
Andrew Trick
957ebf0670 misched: remove the unused getSpecialAddressLatency hook.
llvm-svn: 165418
2012-10-08 18:54:00 +00:00
Preston Gurd
becab37ab8 Set up MCSchedModel after detecting the CPU type in X86SubTarget.
Corrects a problem whereby MCSchedModel was not being set up when
the CPU type was auto-detected.

Patch by Andy Zhang.

llvm-svn: 165122
2012-10-03 15:55:13 +00:00