1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-21 20:12:56 +02:00
Commit Graph

23254 Commits

Author SHA1 Message Date
Reed Kotler
7545d4833e Remove the form field from Mips16 instruction formats and set things
up so that we can apply the direct object emitter patch. This patch
should be a nop right now and it's test is to not break what is already
there.
 

llvm-svn: 175126
2013-02-14 03:05:25 +00:00
Rafael Espindola
d214fef222 Don't assume the mangling of static functions.
llvm-svn: 175121
2013-02-14 02:49:18 +00:00
Rafael Espindola
cd4cd16ea7 Don't asume that a static function in an extern "C" block will not be mangled.
Since functions with internal linkage don't have language linkage, it is valid
to overload them:

extern "C" {
       static int foo();
       static int foo(int);
}

So we mangle them.

llvm-svn: 175120
2013-02-14 01:58:08 +00:00
Weiming Zhao
1159c1f3f0 temporarily revert the patch due to some conflicts
llvm-svn: 175107
2013-02-13 23:24:40 +00:00
Anshuman Dasgupta
a6322b9e0a Hexagon: add support for predicate-GPR copies.
llvm-svn: 175102
2013-02-13 22:56:34 +00:00
Tom Stellard
9cf905c167 R600: Add support for 128-bit parameters
NOTE: This is a candidate for the Mesa stable branch.
llvm-svn: 175096
2013-02-13 22:05:20 +00:00
Nick Lewycky
9a61e050d5 Don't build tail calls to functions with three inreg arguments on x86-32 PIC.
Fixes PR15250!

llvm-svn: 175092
2013-02-13 21:59:15 +00:00
Weiming Zhao
e51d6cf7ae Bug fix 13622: Add paired register support for inline asm with 64-bit data on ARM
llvm-svn: 175088
2013-02-13 21:43:02 +00:00
Jyotsna Verma
a382dbb48f Hexagon: Use absolute addressing mode loads/stores for global+offset
instead of redefining separate instructions for them.

llvm-svn: 175086
2013-02-13 21:38:46 +00:00
Chad Rosier
ed40f84fdc [ms-inline-asm] Add support for memory references that have non-immediate
displacements.
rdar://12974533

llvm-svn: 175083
2013-02-13 21:33:44 +00:00
Reed Kotler
c0c9bb9263 For Mips 16, add the optimization where the 16 bit form of addiu sp can be used
if the offset fits in 11 bits. This makes use of the fact that the abi
requires sp to be 8 byte aligned so the actual offset can fit in 8
bits. It will be shifted left and sign extended before being actually used.
The assembler or direct object emitter will shift right the 11 bit
signed field by 3 bits. We don't need to deal with that here.

llvm-svn: 175073
2013-02-13 20:28:27 +00:00
Andrew Trick
4cce0af4e9 MIsched: HazardRecognizers are created for each DAG. Free them.
llvm-svn: 175067
2013-02-13 19:22:27 +00:00
Krzysztof Parzyszek
b6d2a1c1ee Add registration for PPC-specific passes to allow the IR to be dumped
via -print-after-all.

llvm-svn: 175058
2013-02-13 17:40:07 +00:00
Benjamin Kramer
34ab81b7fa X86: Disable generation of rep;movsl when %esi is used as a base pointer.
This happens when there is both stack realignment and a dynamic alloca in the
function. If we overwrite %esi (rep;movsl uses fixed registers) we'll lose the
base pointer and the next register spill will write into oblivion.

Fixes PR15249 and unbreaks firefox on i386/freebsd. Mozilla uses dynamic allocas
and freebsd a 4 byte stack alignment.

llvm-svn: 175057
2013-02-13 13:40:35 +00:00
Reed Kotler
49229780c8 Make jumptables work for -static
llvm-svn: 175044
2013-02-13 08:32:14 +00:00
Elena Demikhovsky
a4a4bded4d Prevent insertion of "vzeroupper" before call that preserves YMM registers, since a caller uses preserved registers across the call.
llvm-svn: 175043
2013-02-13 08:02:04 +00:00
Eric Christopher
a2c85e433f Check i1 as well as i8 variables for 8 bit registers for x86 inline
assembly.

llvm-svn: 175036
2013-02-13 06:01:05 +00:00
David Peixotto
84c964ec93 Test commit. Fixed typo.
llvm-svn: 175020
2013-02-13 00:36:35 +00:00
Jyotsna Verma
abd979fd30 Hexagon: Add support to generate predicated absolute addressing mode
instructions.

llvm-svn: 174973
2013-02-12 16:06:23 +00:00
Justin Holewinski
9a248309f0 [NVPTX] Disable vector registers
Vectors were being manually scalarized by the backend.  Instead,
let the target-independent code do all of the work.  The manual
scalarization was from a time before good target-independent support
for scalarization in LLVM. However, this forces us to specially-handle
vector loads and stores, which we can turn into PTX instructions that
produce/consume multiple operands.

llvm-svn: 174968
2013-02-12 14:18:49 +00:00
Michel Danzer
6e93c3c0af R600: Fix regression with shadow array sampler on pre-SI GPUs.
'R600/SI: Use proper instructions for array/shadow samplers.' removed two
cases from TEX_SHADOW. Vincent Lejeune reported on IRC that this broke some
shadow array piglit tests with the r600g driver. Reinstating the removed
cases should fix this, and still works with radeonsi as well.

I will follow up with some lit tests which would have caught the regression.

NOTE: This is a candidate for the Mesa stable branch.

Tested-by: Vincent Lejeune <vljn@ovi.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174963
2013-02-12 12:11:23 +00:00
Arnold Schwaighofer
b7dd0ff204 ARM cost model: Add vector reverse shuffle costs
A reverse shuffle is lowered to a vrev and possibly a vext instruction (quad
word).

radar://13171406

llvm-svn: 174933
2013-02-12 02:40:39 +00:00
Arnold Schwaighofer
1ecca5fd68 ARM NEON: Handle v16i8 and v8i16 reverse shuffles
Lower reverse shuffles to a vrev64 and a vext instruction instead of the default
legalization of storing and loading to the stack. This is important because we
generate reverse shuffles in the loop vectorizer when we reverse store to an
array.

  uint8_t Arr[N];
  for (i = 0; i < N; ++i)
    Arr[N - i - 1] = ...

radar://13171760

llvm-svn: 174929
2013-02-12 01:58:32 +00:00
Kay Tiong Khoo
d299c572f3 Added 0x0D to 2-byte opcode extension table for prefetch* variants
Fixed decode of existing 3dNow prefetchw instruction
Intel is scheduled to add a compatible prefetchw (same encoding) to future CPUs

llvm-svn: 174920
2013-02-12 00:19:12 +00:00
Akira Hatanaka
10788fa4a6 [mips] Expand pseudo instructions before they are emitted in
MipsCodeEmitter.cpp.

JALR and NOP are expanded by function emitPseudoExpansionLowering, which is not
called when the old JIT is used.

This fixes the following tests which have been failing on
llvm-mips-linux builder:

LLVM :: ExecutionEngine__2003-01-04-LoopTest.ll
LLVM :: ExecutionEngine__2003-05-06-LivenessClobber.ll
LLVM :: ExecutionEngine__2003-06-04-bzip2-bug.ll
LLVM :: ExecutionEngine__2005-12-02-TailCallBug.ll
LLVM :: ExecutionEngine__2003-10-18-PHINode-ConstantExpr-CondCode-Failure.ll
LLVM :: ExecutionEngine__hello2.ll
LLVM :: ExecutionEngine__stubs.ll
LLVM :: ExecutionEngine__test-branch.ll
LLVM :: ExecutionEngine__test-call.ll
LLVM :: ExecutionEngine__test-common-symbols.ll
LLVM :: ExecutionEngine__test-loadstore.ll
LLVM :: ExecutionEngine__test-loop.ll

llvm-svn: 174912
2013-02-11 22:35:40 +00:00
Akira Hatanaka
23e95ae884 [mips] Fix indentation.
llvm-svn: 174907
2013-02-11 22:03:52 +00:00
Krzysztof Parzyszek
272abb00e0 Extend Hexagon hardware loop generation to handle various additional cases:
- variety of compare instructions,
- loops with no preheader,
- arbitrary lower and upper bounds.

llvm-svn: 174904
2013-02-11 21:37:55 +00:00
Krzysztof Parzyszek
b814fd5e20 Implement HexagonInstrInfo::analyzeCompare.
llvm-svn: 174901
2013-02-11 20:04:29 +00:00
Kay Tiong Khoo
09400e6c4a *fixed disassembly of some i386 system insts with intel syntax
*added file for test cases for i386 intel syntax

llvm-svn: 174900
2013-02-11 19:46:36 +00:00
Michel Danzer
14d128a6ef R600/SI: Use V_ADD_F32 instead of V_MOV_B32 for clamp/neg/abs modifiers.
The modifiers don't seem to have any effect with V_MOV_B32, supposedly it's
meant to just move bits untouched.

Fixes 46 piglit tests with radeonsi, though unfortunately 11 of those had
just regressed because they started using the clamp modifier.

NOTE: This is a candidate for the Mesa stable branch.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174890
2013-02-11 15:58:21 +00:00
Tim Northover
f2c4616277 AArch64: fix build on some MSVC versions
This does two things:

It removes a call to abs() which may have "long long" parameter on Windows,
which is not necessarily available in C++03.

It also corrects the signedness of Amount, which was relying on
implementation-defined conversions previously.

Code was already tested (albeit in an implemnetation defined way) so no extra
tests.

llvm-svn: 174885
2013-02-11 14:25:52 +00:00
Tim Northover
17c416da1f AArch64: Simplify logic in deciding whether bfi is valid
Previous code had a confusing comment which was mostly an implementation
detail. This condition corresponds to "lsb up to register width" and "width not
ridiculous".

llvm-svn: 174877
2013-02-11 12:32:18 +00:00
Tim Northover
349160133e Make use of DiagnosticType to provide better AArch64 diagnostics.
This gives a DiagnosticType to all AsmOperands in sight. This replaces all
"invalid operand" diagnostics with something more specific. The messages given
should still be sufficiently vague that they're not usually actively misleading
when LLVM guesses your instruction incorrectly.

llvm-svn: 174871
2013-02-11 09:29:37 +00:00
Evan Cheng
d6beb40882 Currently, codegen may spent some time in SDISel passes even if an entire
function is successfully handled by fast-isel. That's because function
arguments are *always* handled by SDISel. Introduce FastLowerArguments to
allow each target to provide hook to handle formal argument lowering.

As a proof-of-concept, add ARMFastIsel::FastLowerArguments to handle
functions with 4 or fewer scalar integer (i8, i16, or i32) arguments. It
completely eliminates the need for SDISel for trivial functions.

rdar://13163905

llvm-svn: 174855
2013-02-11 01:27:15 +00:00
Joel Jones
d3ff41937c Spelling correction
llvm-svn: 174852
2013-02-10 23:56:30 +00:00
Vincent Lejeune
0c989d9a63 Test Commit - Remove some trailing whitespace in R600Instructions.td
llvm-svn: 174839
2013-02-10 17:57:33 +00:00
Justin Holewinski
2f180cbb22 [NVPTX] Make address space errors more explicit (llvm_unreachable -> report_fatal_error)
llvm-svn: 174808
2013-02-09 13:34:15 +00:00
Tom Stellard
967740ea18 R600: Dump the function name when TargetLowering::LowerCall() fails
Also output a more useful error message.

NOTE: This is a candidate for the Mesa stable branch
llvm-svn: 174763
2013-02-08 22:24:40 +00:00
Tom Stellard
248e476b92 R600: rework flow creation in the structurizer v2
This fixes a couple of bugs and incorrect assumptions,
in total four more piglit tests now pass.

v2: fix small bug in the dominator updating

Patch by: Christian König

Signed-off-by: Christian König <christian.koenig@amd.com>
llvm-svn: 174762
2013-02-08 22:24:38 +00:00
Tom Stellard
ae8c01f0aa R600: fix loop analyses in the structurizer
Patch by: Christian König

Intersecting loop handling was wrong.

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 174761
2013-02-08 22:24:37 +00:00
Tom Stellard
e5d8498ede R600: fix PHI value adding in the structurizer
Otherwise we sometimes produce invalid code.

Patch by: Christian König

Signed-off-by: Christian König <christian.koenig@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 174760
2013-02-08 22:24:35 +00:00
Reed Kotler
fbd845c0d9 Add the 16 bit version of addiu. To the assembler, the 16 and 32 bit are the
same so we put in the comment field an indicator when we think we are
emitting the 16 bit version. For the direct object emitter, the difference is 
important as well as for other passes which need an accurate count of 
program size. There will be other similar putbacks to this for various
instructions.

llvm-svn: 174747
2013-02-08 21:42:56 +00:00
Bill Schmidt
53ad58d77a Refine fix to bug 15041.
Thanks to help from Nadav and Hal, I have a more reasonable (and even
correct!) approach.  This specifically penalizes the insertelement
and extractelement operations for the performance hit that will occur
on PowerPC processors.

llvm-svn: 174725
2013-02-08 18:19:17 +00:00
Arnold Schwaighofer
381c4a3e54 ARM cost model: Address computation in vector mem ops not free
Adds a function to target transform info to query for the cost of address
computation. The cost model analysis pass now also queries this interface.
The code in LoopVectorize adds the cost of address computation as part of the
memory instruction cost calculation. Only there, we know whether the instruction
will be scalarized or not.
Increase the penality for inserting in to D registers on swift. This becomes
necessary because we now always assume that address computation has a cost and
three is a closer value to the architecture.

radar://13097204

llvm-svn: 174713
2013-02-08 14:50:48 +00:00
Reed Kotler
434681ac07 When Mips16 frames grow large, the immediate field may exceed the maximum
allowed size for the instruction. This code uses RegScavenger to fix this.
We sometimes need 2 registers for Mips16 so we must handle things
differently than how register scavenger is normally used.

llvm-svn: 174696
2013-02-08 03:57:41 +00:00
Akira Hatanaka
6ea6486f83 [mips] Make Filler a class and reduce indentation.
llvm-svn: 174666
2013-02-07 21:32:32 +00:00
Bill Schmidt
73c6529ed8 Constrain PowerPC autovectorization to fix bug 15041.
Certain vector operations don't vectorize well with the current
PowerPC implementation.  Element insert/extract performs poorly
without VSX support because Altivec requires going through memory.
SREM, UREM, and VSELECT all produce bad scalar code.

There's a lot of work to do for the cost model before
autovectorization will be tuned well, and this is not an attempt to
address the larger problem.

llvm-svn: 174660
2013-02-07 20:33:57 +00:00
Akira Hatanaka
a989d1f25d [mips] Add definition of JALR instruction which has two register operands. Change the
original JALR instruction with one register operand to be a pseudo-instruction.

llvm-svn: 174657
2013-02-07 19:48:00 +00:00
Tom Stellard
075c61683a R600/SI: cleanup VGPR encoding
Remove all the unused code.

Patch by: Christian König

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174656
2013-02-07 19:39:45 +00:00
Tom Stellard
6d64a9f30f R600/SI: Handle VGPR64 destination in copyPhysReg().
Allows nexuiz to run with radeonsi.

Patch by: Michel Dänzer

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 174655
2013-02-07 19:39:43 +00:00