mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-22 04:22:57 +02:00

14959 Commits

Author SHA1 Message Date
Bruno Cardoso Lopes
0794b8ab3f Patterns to match vinsert, vbroadcast, vmovmask and vcvtdq2pd AVX intrinsics
llvm-svn: 110580
2010-08-09 18:03:43 +00:00
Evan Cheng
a04ba7588a Add an option to disable 32 -> 16-bit Thumb2 size reduction pass for experimentation.
llvm-svn: 110579
2010-08-09 17:16:10 +00:00
Kalle Raiskila
e2c0e66ff1 Have SPU handle halfvec stores aligned by 8 bytes.
llvm-svn: 110576
2010-08-09 16:33:00 +00:00
Nick Lewycky
3a15ba4d5e Add optimization to Target/README.txt.
llvm-svn: 110543
2010-08-08 07:04:25 +00:00
Bill Wendling
39c49e3e17 Use the "isCompare" machine instruction attribute instead of calling the
relatively expensive comparison analyzer on each instruction. Also rename the
comparison analyzer method to something more in line with what it actually does.

This pass will eventually be folded into the Machine CSE pass.

llvm-svn: 110539
2010-08-08 05:04:59 +00:00
Dale Johannesen
23f9086dd3 Use sdmem and sse_load_f64 (etc.) for the vector
form of CMPSD (etc.)  Matching a 128-bit memory
operand is wrong, the instruction uses only 64 bits
(same as ADDSD etc.)  8193553.

llvm-svn: 110491
2010-08-07 00:33:42 +00:00
Bruno Cardoso Lopes
5b602f8822 Patterns to match AVX 256-bit vzero intrinsics
llvm-svn: 110480
2010-08-06 22:10:01 +00:00
Bruno Cardoso Lopes
821eebf946 Patterns to match AVX 256-bit permutation intrinsics
llvm-svn: 110468
2010-08-06 20:03:27 +00:00
Jim Grosbach
9ef6362af1 Remove empty processFunctionBeforeFrameFinalized(). The default
implementation of the function is equivalent, so no need to provide
the target-specific version until/unless it needs to do something.

llvm-svn: 110465
2010-08-06 18:57:24 +00:00
Owen Anderson
f2fea95f2f Reapply r110396, with fixes to appease the Linux buildbot gods.
llvm-svn: 110460
2010-08-06 18:33:48 +00:00
Rafael Espindola
6d53fded19 Fix eabi calling convention when a 64 bit value shadows r3.
Without this what was happening was:

* R3 is not marked as "used"
* ARM backend thinks it has to save it to the stack because of vaarg
* Offset computation correctly ignores it
* Offsets are wrong

llvm-svn: 110446
2010-08-06 15:35:32 +00:00
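The register shadowing described in this commit can be sketched with a simplified model of ARM EABI argument assignment. This is an illustration only, not the backend's code: it assumes the procedure-call-standard rule that a 64-bit argument starts at an even register number, handles only 32-bit and 64-bit integer arguments, and uses the made-up name `assign_args`.

```python
def assign_args(sizes_in_bits):
    """Simplified sketch: place 32/64-bit integer args in r0-r3 or on the stack."""
    ncrn = 0            # next core register number; r0..r3 carry arguments
    stack_offset = 0    # next free byte in the outgoing argument area
    placement = []
    for bits in sizes_in_bits:
        if bits == 64:
            ncrn += ncrn % 2                    # round up to an even register
            if ncrn <= 2:
                placement.append((f"r{ncrn}", f"r{ncrn + 1}"))
                ncrn += 2
                continue
            ncrn = 4                            # r3 (if free) is skipped, i.e. shadowed
            stack_offset += stack_offset % 8    # 8-byte align the stack slot
            placement.append(f"stack+{stack_offset}")
            stack_offset += 8
        else:
            if ncrn <= 3:
                placement.append(f"r{ncrn}")
                ncrn += 1
                continue
            placement.append(f"stack+{stack_offset}")
            stack_offset += 4
    return placement

three_ints_then_i64 = assign_args([32, 32, 32, 64])
one_int_then_i64 = assign_args([32, 64])
```

In the first call the 64-bit value lands on the stack and r3 carries nothing, which is the case where r3 has to be marked as used so that later offset computations stay consistent.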
Bruno Cardoso Lopes
d186fba555 Patterns to match AVX 256-bit horizontal arithmetic intrinsics
llvm-svn: 110427
2010-08-06 02:10:30 +00:00
Bruno Cardoso Lopes
5e9f9c921e Patterns to match AVX 256-bit arithmetic intrinsics
llvm-svn: 110425
2010-08-06 01:52:29 +00:00
Bill Wendling
0cd2ae5158 Add the Optimize Compares pass (disabled by default).
This pass tries to remove comparison instructions when possible. For instance,
if you have this code:

   sub r1, 1
   cmp r1, 0
   bz  L1

and "sub" either sets the same flag as the "cmp" instruction or could be
converted to set the same flag, then we can eliminate the "cmp" instruction
altogether. This is important for ARM where the ALU instructions could set the
CPSR flag, but need a special suffix ('s') to do so.

llvm-svn: 110423
2010-08-06 01:32:48 +00:00
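The transformation described in this commit message can be sketched as a tiny peephole over a toy instruction list. This is a minimal illustration, not the pass itself: the tuple encoding, the `FLAG_CAPABLE` set, and the name `optimize_compares` are assumptions made for the example, and the real pass operates on machine instructions.

```python
# Assumed for illustration: opcodes that have a flag-setting 's' form.
FLAG_CAPABLE = {"sub", "add"}

def optimize_compares(instrs):
    """Drop 'cmp reg, 0' when the previous instruction defines reg and
    either already sets flags or can be converted to its 's' form."""
    out = []
    for op, reg, imm in instrs:
        prev = out[-1] if out else None
        if op == "cmp" and imm == 0 and prev and prev[1] == reg:
            prev_op = prev[0]
            if prev_op in FLAG_CAPABLE:
                # Convert e.g. 'sub' to 'subs' and drop the compare.
                out[-1] = (prev_op + "s", prev[1], prev[2])
                continue
            if prev_op.endswith("s") and prev_op[:-1] in FLAG_CAPABLE:
                # Flags are already set; the compare is redundant.
                continue
        out.append((op, reg, imm))
    return out

code = [("sub", "r1", 1), ("cmp", "r1", 0), ("bz", "L1", None)]
result = optimize_compares(code)
```

The sketch ignores the question of whether the two instructions set exactly the same condition flags, which a real implementation has to check per target.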
Owen Anderson
aadd8a89ca Revert r110396 to fix buildbots.
llvm-svn: 110410
2010-08-06 00:23:35 +00:00
Eric Christopher
cf17d8dfa7 Add an option to always emit realignment code for a particular module.
llvm-svn: 110404
2010-08-05 23:57:43 +00:00
Owen Anderson
b9762c07cb Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static
ID member as the sole unique type identifier.  Clean up APIs related to this change.

llvm-svn: 110396
2010-08-05 23:42:04 +00:00
Dan Gohman
8a813c4ded Remove IntrWriteMem, as it's the default. Rename IntrWriteArgMem
to IntrReadWriteArgMem, as it's for reading as well as writing.

llvm-svn: 110395
2010-08-05 23:36:21 +00:00
Bruno Cardoso Lopes
a26a97510a Support very basic (doesn't include ABI support in the front-end, varargs, ...) 256-bit argument passing and return for AVX
llvm-svn: 110394
2010-08-05 23:35:51 +00:00
Eric Christopher
b10ca25085 Handle the memory barrier pseudo that goes to nothing for the JIT.
llvm-svn: 110371
2010-08-05 20:04:36 +00:00
Eric Christopher
bc14450d15 Set hasSideEffects on the 64-bit no-sse memory barrier.
llvm-svn: 110369
2010-08-05 19:54:59 +00:00
Jim Grosbach
fb6af5329d For local variables in functions with a frame pointer, use FP as a base
register for local access when it's closer to the stack slot being referenced
than the stack pointer. Make sure to take into account any argument frame
SP adjustments that are in effect at the time.

rdar://8256090

llvm-svn: 110366
2010-08-05 19:27:37 +00:00
Bob Wilson
fbce203f20 Fix indentation.
llvm-svn: 110363
2010-08-05 19:00:21 +00:00
Bob Wilson
4ba3c0a5e1 Add an ARM RSCrr instruction for disassembly only.
Partial fix for PR7792.

llvm-svn: 110361
2010-08-05 18:59:36 +00:00
Eric Christopher
61f3059ee1 Be a little bit more specific about target for the memory barrier
instructions.

llvm-svn: 110360
2010-08-05 18:36:20 +00:00
Eric Christopher
904ec3a392 Handle the pseudo in MCInstLower.
llvm-svn: 110359
2010-08-05 18:34:30 +00:00
Bob Wilson
9fbaea3765 Add an ARM RSBrr instruction for disassembly only.
Partial fix for PR7792.

llvm-svn: 110358
2010-08-05 18:23:43 +00:00
Chandler Carruth
01c83a8512 Silence a GCC warning about && and || without explicit parentheses. This
preserves the existing behavior, as it seems a conscious choice to allow RS to
be null and BigStack marked true.

llvm-svn: 110307
2010-08-05 03:04:21 +00:00
Bob Wilson
214b004717 ARM "rrx" shift operands do not have an immediate. PR7790.
llvm-svn: 110292
2010-08-05 00:34:42 +00:00
Eric Christopher
0e09eb9f77 Make x86-64 membarriers work without sse and clean up some of the
uses.

llvm-svn: 110274
2010-08-04 23:03:04 +00:00
Jim Grosbach
511dbe9c8e and back in. false alarm on the tests from another unrelated local change.
llvm-svn: 110269
2010-08-04 22:46:09 +00:00
Eli Friedman
401dbe036d PR7814: Truncates cannot be ignored for signed comparisons.
llvm-svn: 110268
2010-08-04 22:40:58 +00:00
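The commit message does not spell out the failing case, so the following is only a general illustration of why a truncate changes the result of a signed comparison. The helper `to_signed` is a made-up name that reinterprets the low bits of a value as two's complement.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

x = 128
wide_is_negative = to_signed(x, 32) < 0    # compare done on the i32 value
narrow_is_negative = to_signed(x, 8) < 0   # compare done after truncating to i8
```

The two comparisons disagree because truncation moves the sign bit, so a signed compare on the truncated value cannot simply be replaced by one on the wider value.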
Devang Patel
53e2e4feae Implement target specific getDebugValueLocation().
llvm-svn: 110267
2010-08-04 22:39:39 +00:00
Jim Grosbach
497c60502c oops. revert for a moment to clean up tests first.
llvm-svn: 110259
2010-08-04 22:12:43 +00:00
Jim Grosbach
ece51f94db Reserve a stack slot if the function adjusts the stack but doesn't
simplify the call frame pseudo instructions. In that situation, the
calculations for estimating the stack size will be way off, leading to
not having an emergency spill slot when we need one. It should be possible
to be more precise about tracking the adjustment values, but not really
necessary for correctness. Upcoming cleanups for PEI in general will
render that moot.

llvm-svn: 110258
2010-08-04 22:10:15 +00:00
Devang Patel
97a93285f5 Implement target specific getDebugValueLocation().
llvm-svn: 110256
2010-08-04 22:07:50 +00:00
Torok Edwin
319c3f56c8 Use indirect calls in PowerPC JIT.
See PR5201. There is no way to know if direct calls will be within the allowed
range for BL. Hence emit all calls as indirect when in JIT mode.
Without this long-running applications will fail to JIT on PowerPC with a
relocation failure.

llvm-svn: 110246
2010-08-04 20:47:44 +00:00
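The range problem mentioned in this commit can be illustrated with a small check. The sketch relies on the PowerPC `bl` encoding, which holds a 24-bit word displacement and therefore reaches roughly 32 MiB in either direction; the function name and the sample addresses are made up for the example.

```python
def fits_bl_range(call_site, target):
    """True if target is reachable from call_site by a relative 'bl'."""
    disp = target - call_site
    # 24-bit field, shifted left by 2 and sign-extended: a 26-bit signed range.
    return disp % 4 == 0 and -(1 << 25) <= disp < (1 << 25)

near = fits_bl_range(0x1000, 0x1000 + 0x1FFFFFC)
far = fits_bl_range(0x10000000, 0x7F0000000000)
```

A JIT cannot know in advance where code and callees will end up in memory, so a check like this can fail long after startup, which is consistent with the commit's choice to emit every call as indirect.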
Dale Johannesen
53bc276b33 Remove switch for disabling ARM tail calls. They
seem to be working correctly.  No functional change.

llvm-svn: 110226
2010-08-04 18:07:17 +00:00
Devang Patel
e48431509c Add DEBUG message.
llvm-svn: 110224
2010-08-04 18:06:05 +00:00
Benjamin Kramer
8bce8e326c Enable COFF writer on mingw32 and cygwin.
llvm-svn: 110200
2010-08-04 15:32:40 +00:00
Kalle Raiskila
ce1e4d80cb Make SPU backend handle insertelement and
store for "half vectors"

llvm-svn: 110198
2010-08-04 13:59:48 +00:00
Benjamin Kramer
968cc0119f Print an error message when someone tries -integrated-as on an unsupported target.
- The COFF backend doesn't support MingW/Cygwin at the moment, it'll report an
  error, but it's still much better than random assertions from the MachO backend.
- We want to make ELF the default eventually, it's what the majority of targets use.

llvm-svn: 110197
2010-08-04 13:16:30 +00:00
Gabor Greif
50fb0419ea by Alexander Herz:
"The CWriter::GetValueName() method does not check if a value is an alias
and emits the alias name which will never be defined in the output .c 
file (so the output file fails to compile). This can happen if you have 
multiple inheritance with several destructors defined by clang (...D0Ev, 
...D1Ev, ...D2Ev)."

-- applied with minor tweaks. Thanks!

llvm-svn: 110194
2010-08-04 10:00:52 +00:00
Bob Wilson
6a2437480a Combine NEON VABD (absolute difference) intrinsics with ADDs to make VABA
(absolute difference with accumulate) intrinsics.  Radar 8228576.

llvm-svn: 110170
2010-08-04 00:12:08 +00:00
Chris Lattner
a2e36c6b18 fix a win64 encoding problem, patch by Cameron Esfahani!
llvm-svn: 110164
2010-08-03 22:49:22 +00:00
Nate Begeman
b506e13a32 Add support for getting & setting the FPSCR application register on ARM when VFP is enabled.
Add support for using the FPSCR in conjunction with the vcvtr instruction, for controlling fp to int rounding.
Add support for the FLT_ROUNDS_ node now that the FPSCR is exposed.

llvm-svn: 110152
2010-08-03 21:31:55 +00:00
Oscar Fuentes
7186be986b CMake: Change some target library names:
XCore->XCoreGen
PIC16->PIC16CodeGen

After updating your working copy, the first build will fail because it
is using the old library dependencies. Start the build again and it
will work fine.

llvm-svn: 110127
2010-08-03 17:40:31 +00:00
Kalle Raiskila
014c93befb More SPU v2f32 stuff added: insertelement and shuffle.
llvm-svn: 110038
2010-08-02 11:22:10 +00:00
Kalle Raiskila
766fd434df Add preliminary v2f32 support for SPU. Like with v2i32, we just
duplicate the instructions and operate on half vectors. 

Also reorder code in SPUInstrInfo.td for better coherency.

llvm-svn: 110037
2010-08-02 10:25:47 +00:00
Kalle Raiskila
21615cb06e Add preliminary v2i32 support for SPU backend. As there are no
such registers in SPU, this support boils down to "emulating" 
them by duplicating instructions on the general purpose registers. 

This adds the most basic operations on v2i32: passing parameters,
addition, subtraction, multiplication and a few others.

llvm-svn: 110035
2010-08-02 08:54:39 +00:00
Eli Friedman
40cb7d9994 PR7781: Fix incorrect shifting in PPCTargetLowering::LowerBUILD_VECTOR.
llvm-svn: 109998
2010-08-02 00:18:19 +00:00
Eli Friedman
3c5289c381 PR7774: Fix undefined shifts in Alpha backend. As a bonus, this actually
improves the generated code in some cases.

llvm-svn: 109985
2010-08-01 21:13:28 +00:00
Daniel Dunbar
e0737ebae3 Silence some -Asserts uninitialized variable warnings.
llvm-svn: 109956
2010-07-31 21:08:54 +00:00
Michael J. Spencer
1f89cc1fe5 MC: Remove HasAbsolutizedSet from WindowsX86AsmBackend.
llvm-svn: 109949
2010-07-31 07:21:44 +00:00
Bob Wilson
58c8a5da9e Move newlines before inline jumptables from the asm strings in .td files to
the jtblock_operand print methods.  This avoids extra newlines in the
disassembler's output.  PR7757.

llvm-svn: 109948
2010-07-31 06:28:10 +00:00
Michael J. Spencer
b52ff1ba41 Add relax all support to the COFF object streamer.
llvm-svn: 109947
2010-07-31 06:22:29 +00:00
Bob Wilson
439e7b1d73 Add support for disassembling VMVN (immediate) instructions. PR7747.
llvm-svn: 109946
2010-07-31 05:57:44 +00:00
Evan Cheng
ee59acf6dd Add -disable-shifter-op to disable isel of shifter ops. On Cortex-a9 the shifts cost extra instructions so it might be better to emit them separately to take advantage of dual-issues.
llvm-svn: 109934
2010-07-30 23:33:54 +00:00
Bob Wilson
6ce71251cc Add a check in the ARM disassembler for NEON instructions that would
reference registers past the end of the NEON register file, and report them
as invalid instead of asserting when trying to print them.  PR7746.

llvm-svn: 109933
2010-07-30 23:27:59 +00:00
Dale Johannesen
eb251be031 PPC doesn't support VLA with large alignment. This was
formerly rejected by the FE, so asserted in the BE; now the FE only
warns, so we treat it as a legitimate fatal error in PPC BE.
This means the test for the feature won't pass, so it's xfail'd.

llvm-svn: 109892
2010-07-30 21:09:48 +00:00
Bob Wilson
bd1dc153a5 Add the __TEXT,__StaticInit section to the list of sections emitted at the
beginning of ARM Darwin assembly files so that it won't be placed after
debug sections.  Radar 8252813.

llvm-svn: 109879
2010-07-30 19:55:47 +00:00
Bruno Cardoso Lopes
0c0dd2173c Support all 128-bit AVX vector intrinsics. I already declared most
of them during the addition of the assembler support, the additional
changes are:
- Add missing intrinsics
- Move all SSE conversion instructions in X86InstInfo64.td to the SSE.td file.
- Duplicate some patterns to AVX mode.
- Step into PCMPEST/PCMPIST custom inserter and add AVX versions.

llvm-svn: 109878
2010-07-30 19:54:33 +00:00
Bruno Cardoso Lopes
5d4afd0cb9 Fix typo!
llvm-svn: 109877
2010-07-30 19:41:24 +00:00
Jim Grosbach
1718345a30 Many Thumb2 instructions can reference the full ARM register set (i.e.,
have 4 bits per register in the operand encoding), but have undefined
behavior when the operand value is 13 or 15 (SP and PC, respectively).
The trivial coalescer in linear scan sometimes will merge a copy from
SP into a subsequent instruction which uses the copy, and if that
instruction cannot legally reference SP, we get bad code such as:
  mls r0,r9,r0,sp
instead of:
  mov r2, sp
  mls r0, r9, r0, r2

This patch adds a new register class for use by Thumb2 that excludes
the problematic registers (SP and PC) and is used instead of GPR
for those operands which cannot legally reference PC or SP. The
trivial coalescer explicitly requires that the register class
of the destination for the COPY instruction contain the source
register for the COPY to be considered for coalescing. This prevents
errant instructions like that above.

PR7499

llvm-svn: 109842
2010-07-30 02:41:01 +00:00
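The coalescing rule this commit relies on can be sketched with plain sets. This is a toy model, not LLVM's data structures: `RESTRICTED` is a hypothetical name for the new register class (the message does not name it), and `can_coalesce_copy` stands in for the coalescer's class-membership check.

```python
# Toy register classes; RESTRICTED is a hypothetical name for the new class.
GPR = {f"r{i}" for i in range(13)} | {"sp", "lr", "pc"}
RESTRICTED = GPR - {"sp", "pc"}

def can_coalesce_copy(src_reg, dest_class):
    """The trivial coalescer only merges a COPY if the destination's
    register class contains the source register."""
    return src_reg in dest_class

sp_into_gpr = can_coalesce_copy("sp", GPR)
sp_into_restricted = can_coalesce_copy("sp", RESTRICTED)
```

With the operand typed as the restricted class, the copy from SP is left in place, so the instruction receives an ordinary register instead of SP.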
Nate Begeman
0b0f838c32 Add builtins for ssat/usat, similar to RealView's __ssat and __usat intrinsics.
llvm-svn: 109813
2010-07-29 22:48:09 +00:00
Bob Wilson
d70ec880ea Refactor ARM-specific DAG combining in preparation for adding some more
transformations.

llvm-svn: 109800
2010-07-29 20:34:14 +00:00
Dale Johannesen
717fbb2b32 Implement vector constants which are splat of
integers with mov + vdup.  8003375.  This is
currently disabled by default because LICM will
not hoist a VDUP, so it pessimizes the code if
the construct occurs inside a loop (8248029).

llvm-svn: 109799
2010-07-29 20:10:08 +00:00
Bob Wilson
823182c3e5 Don't assert on an unrecognized BrMiscFrm instruction.
PR7745.

llvm-svn: 109788
2010-07-29 18:29:28 +00:00
Nate Begeman
b24fa8b8ae Add intrinsics __builtin_arm_qadd & __builtin_arm_qsub to allow access to the QADD & QSUB instructions.
Behave identically to __qadd & __qsub RealView instruction intrinsics.

llvm-svn: 109770
2010-07-29 17:56:55 +00:00
Jakob Stoklund Olesen
1dee3913d6 Revert r109652, and remove the offending assert in loadRegFromStackSlot instead.
We do sometimes load from a too small stack slot when dealing with x86 arguments
(varargs and smaller-than-32-bit args). It looks like we know what we are doing
in those cases, so I am going to remove the assert instead of artificially
enlarging stack slot sizes.

The assert in storeRegToStackSlot stays in. We don't want to write beyond the
bounds of a stack slot.

llvm-svn: 109764
2010-07-29 17:42:27 +00:00
Jim Grosbach
8764c0127d ARM mode version of r109693. Remove incorrect substitution pattern for UXTB16. It wrongly assumed the input shift was actually a rotate. rdar://8240138
llvm-svn: 109696
2010-07-28 23:25:44 +00:00
Jim Grosbach
17bec0f609 Remove incorrect substitution pattern for UXTB16. It wrongly assumed the input shift was actually a rotate. rdar://8240138
llvm-svn: 109693
2010-07-28 23:17:45 +00:00
Jim Grosbach
03b130774b Remove dead prototype
llvm-svn: 109691
2010-07-28 23:16:12 +00:00
Jakob Stoklund Olesen
d4c60eed5e Create a fixed stack object for varargs that is as large as any register.
The size of this object isn't used for anything - technically it is of variable
size.

This avoids a false positive from the assert in
X86InstrInfo::loadRegFromStackSlot, and fixes PR7735.

llvm-svn: 109652
2010-07-28 20:55:38 +00:00
Dan Gohman
a4186ab5f0 Fix this code to avoid decrementing an iterator past the beginning
of a std::vector.

llvm-svn: 109597
2010-07-28 17:15:36 +00:00
Dan Gohman
041bd99662 Do GEP offset calculations with unsigned math rather than signed math
to avoid undefined behavior on overflow, noticed by John Regehr.

llvm-svn: 109594
2010-07-28 17:11:36 +00:00
Nate Begeman
133820e806 Implement a vectorized algorithm for <16 x i8> << <16 x i8>
This is about 4x faster and smaller than the existing scalarization.

llvm-svn: 109566
2010-07-28 00:21:48 +00:00
Nate Begeman
068e932975 ~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types coming in future patches.
For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                                   ## @shl
	pslld	$23, %xmm1
	paddd	LCPI0_0, %xmm1
	cvttps2dq	%xmm1, %xmm1
	pmulld	%xmm1, %xmm0
	ret

Instead of:

_shl:                                   ## @shl
	pshufd	$3, %xmm0, %xmm2
	movd	%xmm2, %eax
	pshufd	$3, %xmm1, %xmm2
	movd	%xmm2, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	pshufd	$1, %xmm0, %xmm3
	movd	%xmm3, %eax
	pshufd	$1, %xmm1, %xmm3
	movd	%xmm3, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm3
	punpckldq	%xmm2, %xmm3
	movd	%xmm0, %eax
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	movhlps	%xmm0, %xmm0
	movd	%xmm0, %eax
	movhlps	%xmm1, %xmm1
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm0
	punpckldq	%xmm0, %xmm2
	movdqa	%xmm2, %xmm0
	punpckldq	%xmm3, %xmm0
	ret

llvm-svn: 109549
2010-07-27 22:37:06 +00:00
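The four-instruction sequence in this commit appears to build 2^a per lane through the floating-point exponent field and then multiply. The scalar model below is a sketch under that reading: it assumes the constant `LCPI0_0` is the bit pattern of 1.0f (0x3F800000), which the message does not state, and it only covers shift amounts 0 through 31.

```python
import struct

ONE_BITS = 0x3F800000  # IEEE-754 single-precision 1.0; assumed value of LCPI0_0

def pow2_via_float(a):
    """Compute 2**a for 0 <= a <= 31 the way the SSE sequence would per lane."""
    bits = ((a << 23) + ONE_BITS) & 0xFFFFFFFF            # pslld $23 ; paddd
    (f,) = struct.unpack("<f", struct.pack("<I", bits))   # reinterpret bits as float
    # cvttps2dq: truncate to integer. For a == 31 the hardware result is the
    # bit pattern 0x80000000, which matches 2**31 as an unsigned 32-bit value.
    return int(f) & 0xFFFFFFFF

def vector_shl(r, a):
    """Lane-wise r << a on 32-bit lanes, via multiply (pmulld)."""
    return [(x * pow2_via_float(s)) & 0xFFFFFFFF for x, s in zip(r, a)]

shifted = vector_shl([1, 2, 3, 0xFFFFFFFF], [0, 1, 4, 31])
```

Adding `a << 23` to the bits of 1.0 bumps the exponent by `a`, so the float equals 2^a exactly, and a low 32-bit multiply by 2^a is the same as a left shift by `a`.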
Michael J. Spencer
33ac353ce4 Make MC use Windows COFF on Windows and add tests.
llvm-svn: 109494
2010-07-27 06:46:15 +00:00
Jakob Stoklund Olesen
b056152ccf The isLoadFromStackSlot and isStoreToStackSlot have no way of reporting
subregister operands like this:

%reg1040:sub_32bit<def> = MOV32rm <fi#-2>, 1, %reg0, 0, %reg0, %reg1040<imp-def>; mem:LD4[FixedStack-2](align=8)

Make them return false when subreg operands are present. VirtRegRewriter is
making bad assumptions otherwise.

This fixes PR7713.

llvm-svn: 109489
2010-07-27 04:17:01 +00:00
Jakob Stoklund Olesen
fa4bcde9d9 Add assertions that expose the PR7713 miscompilation: Accessing a stack slot
with a too-big register class.

llvm-svn: 109488
2010-07-27 04:16:58 +00:00
Eli Friedman
94c9f00dd6 And a bit more non-ASCII stuff.
llvm-svn: 109458
2010-07-26 22:28:18 +00:00
Anton Korobeynikov
b71ee4ebbb Drop some non-ascii stuff
llvm-svn: 109456
2010-07-26 22:23:07 +00:00
Evan Cheng
4cee8136a7 On x86, f32 / f64 nodes share the same registers as 128-bit vector values.
llvm-svn: 109450
2010-07-26 21:50:05 +00:00
Anton Korobeynikov
5e1e95aed9 Add a note
llvm-svn: 109448
2010-07-26 21:48:35 +00:00
Bruno Cardoso Lopes
cb0f921ca4 Temporary hack to let codegen assert or generate poor code in case
we are using AVX and no AVX version of the desired instruction is present,
this is better for incremental dev (without fallbacks it's easier to spot
what's missing). Not sure this is the best hack though (we can also disable
all HasSSE* predicates by dynamically marking them 'false' if AVX is present)

llvm-svn: 109434
2010-07-26 21:01:18 +00:00
Anton Korobeynikov
5e3d50ec58 Currently EH lowering code expects typeinfo to be global only.
This assumption is not satisfied due to global merging.
Work around the issue by temporarily disabling merging of const globals.
Also, ignore LLVM "special" globals. This fixes PR7716

llvm-svn: 109423
2010-07-26 18:45:39 +00:00
Evan Cheng
1566171daa ARM fastisel isn't ready.
llvm-svn: 109421
2010-07-26 18:32:55 +00:00
Douglas Gregor
caa8768635 Remove extraneous semicolon
llvm-svn: 109373
2010-07-25 17:34:42 +00:00
Douglas Gregor
dc72ace097 Unbreak CMake build
llvm-svn: 109372
2010-07-25 17:10:14 +00:00
Anton Korobeynikov
7ae895e007 Hook in GlobalMerge pass
llvm-svn: 109359
2010-07-24 21:52:08 +00:00
Evan Cheng
a0b74d8804 Add an ILP scheduler. This is a register pressure aware scheduler that's
appropriate for targets without detailed instruction itineraries.
The scheduler schedules for increased instruction level parallelism in
low register pressure situation; it schedules to reduce register pressure
when the register pressure becomes high.

On x86_64, this is a win for all tests in CFP2000. It also sped up 256.bzip2
by 16%.

llvm-svn: 109300
2010-07-24 00:39:05 +00:00
Bruno Cardoso Lopes
632295a03c Support x86 "eiz" and "riz" pseudo index registers in the assembler.
llvm-svn: 109295
2010-07-24 00:06:39 +00:00
Jim Grosbach
4b7545413d Use the appropriate register class for an i32 when adding ARM::LR to the
function live in set. This will give us tGPR for Thumb1 and GPR otherwise,
so the copy will be spillable. rdar://8224931

llvm-svn: 109293
2010-07-23 23:50:35 +00:00
Dale Johannesen
50d2bc2942 Revert 109076. It is wrong and was causing regressions. Add some
comments explaining why it was wrong.  8225024.

Fix the real problem in 8213383: the code that splits very large
blocks when no other place to put constants can be found was not
considering the case that the block contained a Thumb tablejump.

llvm-svn: 109282
2010-07-23 22:50:23 +00:00
Evan Cheng
f215e55d5f - Allow target to specify when register pressure is "too high". In most cases,
  it's too late to start backing off aggressive latency scheduling when most
  of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live out's and extract_subreg etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
  For ARM, this is almost always a win on # of instructions. It's runtime
  neutral for most of the tests. But for some kernels with high register
  pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
  54 and sped up by 20%.

llvm-svn: 109279
2010-07-23 22:39:59 +00:00
Bruno Cardoso Lopes
06fcdd6563 Remove trailing whitespace
llvm-svn: 109276
2010-07-23 22:15:26 +00:00
Bruno Cardoso Lopes
a80b57e5fc Add AVX version of CLMUL instructions
llvm-svn: 109248
2010-07-23 18:41:12 +00:00
Gabor Greif
30f8e2112c fix constness warnings
llvm-svn: 109224
2010-07-23 13:28:47 +00:00
Gabor Greif
a04ffe0391 do not (implicitly) dereference iterator many times, cache it instead
llvm-svn: 109222
2010-07-23 10:23:01 +00:00
Bruno Cardoso Lopes
b5374c4b69 Declare CLMUL as a subtarget feature
llvm-svn: 109207
2010-07-23 01:22:45 +00:00
Bruno Cardoso Lopes
b034ffa291 Add x86 CLMUL (Carry-less multiplication) cpu feature
llvm-svn: 109206
2010-07-23 01:17:51 +00:00
Bruno Cardoso Lopes
b9182e3051 Add complete assembler support for FMA3 instructions, with descriptions and encodings taken from the AVX manual
llvm-svn: 109204
2010-07-23 00:54:35 +00:00
Dale Johannesen
1ee436f417 The only supported calling convention for X86-64 uses
SSE, so we can't return floating point values if this
is disabled.  Detect this error for clang.

With SSE1 only, f64 is a problem; it can be done, but
neither llvm-gcc nor clang has ever generated correct
code for it.  Since nobody noticed this I think it's
OK to treat it as an error for now.

This also handles SSE-sized vectors of floating point.
8207686, 8204109.

llvm-svn: 109201
2010-07-23 00:30:35 +00:00
Bruno Cardoso Lopes
93fd8bdf6a Fix some AVX instructions which didn't have the HasAVX prefix. And also a problem with PINSRW, which was totally wrong because of a typo I introduced previously
llvm-svn: 109198
2010-07-23 00:14:54 +00:00
Chris Lattner
a1dcdf0bd4 eliminate the TargetInstrInfo::GetInstSizeInBytes hook.
ARM/PPC/MSP430-specific code (which are the only targets that
implement the hook) can directly reference their target-specific
instrinfo classes.

llvm-svn: 109171
2010-07-22 21:27:00 +00:00
Bruno Cardoso Lopes
7722724eee Add remaining AVX instructions (most of them dealing with GR64 destinations). This completes the assembler support for the general AVX ISA. But we still miss instructions from FMA3 and CLMUL specific feature flags, which are now the next step
llvm-svn: 109168
2010-07-22 21:18:49 +00:00
Chris Lattner
65ad913bec remove the JIT "NeedsExactSize" feature and supporting logic.
llvm-svn: 109167
2010-07-22 21:17:55 +00:00
Chris Lattner
9c30949ae1 switch to a private implementation of GetFunctionSizeInBytes.
This is probably not the best way to implement "Force LR to
be spilled if the Thumb function size is > 2048". To do this,
it should use the branch shortening infrastructure, but I'm
just preserving functionality here.

llvm-svn: 109165
2010-07-22 21:14:33 +00:00
Chris Lattner
7704d56b21 X86MCInstLower now depends on AsmPrinter being around.
llvm-svn: 109154
2010-07-22 21:10:04 +00:00
Chris Lattner
367aa754b1 instead of migrating it to the MC instruction encoder, just
rip out the implementation of X86InstrInfo::GetInstSizeInBytes.
The code being ripped out just implemented a copy and hacked up
version of the (old) instruction encoder, and is buggy and 
terrible in other ways.  Since "GetInstSizeInBytes" is really 
only there to support the JIT's "NeedsExactSize" hook (which
no one is using), just rip out the code.  I will rip out the
NeedsExactSize hook next.

This resolves rdar://7617809 - switch X86InstrInfo::GetInstSizeInBytes to use X86MCCodeEmitter

llvm-svn: 109149
2010-07-22 21:05:13 +00:00
Xerxes Ranby
c7a04bcaa5 ARMv4 JIT forgets to set the lr register when making an indirect function call. Fixes PR7608
llvm-svn: 109125
2010-07-22 17:28:34 +00:00
Gabor Greif
a7509fca78 undo 80 column trespassing I caused
llvm-svn: 109092
2010-07-22 10:37:47 +00:00
Chandler Carruth
e10953d673 Mark an assert-only variable as used.
llvm-svn: 109091
2010-07-22 08:02:25 +00:00
Chandler Carruth
4810e179e9 Fix the generated file name for CMake.
llvm-svn: 109090
2010-07-22 08:00:52 +00:00
Chandler Carruth
66edf31b8e Attempt to fix linking issues with CMake. Please review other CMake users,
especially on other platforms. Is there a better way to fix this?

llvm-svn: 109084
2010-07-22 06:27:45 +00:00
Owen Anderson
01e73ac583 Update CMake files.
llvm-svn: 109081
2010-07-22 06:00:01 +00:00
Eric Christopher
4924d5fb93 Custom lower the memory barrier instructions and add support
for lowering without sse2.  Add a couple of new testcases.

Fixes a few libgomp tests and latent bugs.  Remove a few todos.

llvm-svn: 109078
2010-07-22 02:48:34 +00:00
Evan Cheng
32f6aba7d8 Fix constant island pass's handling of tBR_JTr. The offset of the instruction does not have to be 4-byte aligned. Rather, it's the offset + 2 that must be aligned since the instruction expands into:
mov     pc, r1
        .align  2
LJTI0_0_0:
        .long    LBB0_14

This fixes rdar://8213383. No test case since it's not possible to come up with a suitable small one.

llvm-svn: 109076
2010-07-22 02:09:47 +00:00
Eric Christopher
5901214b6a 80-columns.
llvm-svn: 109070
2010-07-22 00:26:08 +00:00
Nate Begeman
c50bef0df7 Make fast isel win64-aware w.r.t. call-clobbered regs
llvm-svn: 109069
2010-07-22 00:09:39 +00:00
Evan Cheng
5aa6a25102 More register pressure aware scheduling work.
llvm-svn: 109064
2010-07-21 23:53:58 +00:00
Bruno Cardoso Lopes
5920e38cd2 Add more 256-bit forms for a bunch of regular AVX instructions
Add 64-bit (GR64) versions of some instructions (which are not
described in their SSE forms, but are described in AVX)

llvm-svn: 109063
2010-07-21 23:53:50 +00:00
Rafael Espindola
c8342b43c4 Fixes win64. It was broken by a previous patch where I missed the !isWin64
and then forced every register to be a vr128 on win64.

llvm-svn: 109060
2010-07-21 23:19:57 +00:00
Jim Grosbach
489d758ea8 For ARM/Darwin, add a dwarf entry indicating whether a function is arm or thumb
rdar://8202967

llvm-svn: 109057
2010-07-21 23:03:52 +00:00
Chris Lattner
418c190d93 add some rough support for making mcinst lowering work without an
asmprinter or mangler around.  This is option #B for killing off 
X86InstrInfo::GetInstSizeInBytes.  Option #A (killing 
"needsexactsize") was sent for consideration to llvmdev.

llvm-svn: 109056
2010-07-21 23:03:35 +00:00
Eric Christopher
3d118d5e8a Baby steps towards ARM fast-isel.
llvm-svn: 109047
2010-07-21 22:26:11 +00:00
Owen Anderson
f8addbb0a1 Fix batch of converting RegisterPass<> to INITIALIZE_PASS().
llvm-svn: 109045
2010-07-21 22:09:45 +00:00
Bruno Cardoso Lopes
eea3b7ed83 Add missing AVX convert instructions. Those instructions are not described in their SSE forms (although they exist), but add the AVX forms anyway, so the assembler can benefit from it
llvm-svn: 109039
2010-07-21 21:37:59 +00:00
Nate Begeman
e7ca21ab3b Fix a couple issues with Win64 ABI
1) all registers were spilled as xmm, regardless of actual size
2) win64 abi doesn't do the varargs-size-in-%al thing

Still to look into:

xmm6-15 are marked as clobbered by call instructions on win64 even though they aren't.

llvm-svn: 109035
2010-07-21 20:49:52 +00:00
Bruno Cardoso Lopes
1284fdc932 Prevent AVX instructions from being selected instead of their SSE forms
llvm-svn: 109032
2010-07-21 20:38:42 +00:00
Rafael Espindola
9aab8413b8 Fix calling convention on ARM if vfp2+ is enabled.
llvm-svn: 109009
2010-07-21 11:38:30 +00:00
Eric Christopher
959481ec87 Pulling out previous patch, must've run the tests in
the wrong directory.

llvm-svn: 109005
2010-07-21 09:23:56 +00:00
Eric Christopher
5c12ad2a4b Lower MEMBARRIER on x86 and support processors without SSE2.
Fixes a pile of libgomp failures in the llvm-gcc testsuite due
to the libcall not existing.

llvm-svn: 109004
2010-07-21 09:05:23 +00:00
Bruno Cardoso Lopes
d13d8c2562 Add AVX only vzeroall and vzeroupper instructions
llvm-svn: 109002
2010-07-21 08:56:24 +00:00
Evan Cheng
df725c25dd Teach bottom up pre-ra scheduler to track register pressure. Work in progress.
llvm-svn: 108991
2010-07-21 06:09:07 +00:00
Bruno Cardoso Lopes
c4d93a5a34 Add new AVX vpermilps, vpermilpd and vperm2f128 instructions
llvm-svn: 108984
2010-07-21 03:07:42 +00:00
Bruno Cardoso Lopes
a7efb29695 Add new AVX vmaskmov instructions, and also fix the VEX encoding bits to support it
llvm-svn: 108983
2010-07-21 02:46:58 +00:00
Bruno Cardoso Lopes
e0dce1c741 Add new AVX vextractf128 instructions
llvm-svn: 108964
2010-07-20 23:19:02 +00:00
Chris Lattner
fa1479a813 make asmprinter optional, even though passing in null will cause things to explode right now.
llvm-svn: 108955
2010-07-20 22:45:33 +00:00
Chris Lattner
535d070a96 continue pushing dependencies around.
llvm-svn: 108952
2010-07-20 22:35:40 +00:00
Chris Lattner
607e343ee1 reduce X86MCInstLower dependencies on asmprinter.
llvm-svn: 108950
2010-07-20 22:30:53 +00:00
Chris Lattner
2c7b47d3bf pass around MF, not MMI.
llvm-svn: 108949
2010-07-20 22:26:07 +00:00
Chris Lattner
5834a54ac3 cleanups.
llvm-svn: 108947
2010-07-20 22:23:57 +00:00
Chris Lattner
cbca96b513 move two asmprinter methods into the asmprinter .cpp file.
llvm-svn: 108945
2010-07-20 22:18:19 +00:00
Chris Lattner
fd071c0bd8 prune #includes a little.
llvm-svn: 108929
2010-07-20 21:17:29 +00:00
Bruno Cardoso Lopes
b677cbc9b2 Add new AVX instruction vinsertf128
llvm-svn: 108892
2010-07-20 19:44:51 +00:00
Jim Grosbach
30f1b06af3 Using BIC for immediates needs an extra bump for its complexity to get
instruction selection to prefer it when possible. rdar://7903972

llvm-svn: 108844
2010-07-20 16:07:04 +00:00
Jim Grosbach
fa61724ac3 Removed un-used code.
llvm-svn: 108841
2010-07-20 14:51:32 +00:00
Bruno Cardoso Lopes
0fa595f073 Fix PR7174, a couple of Mips fixes:
- Fix a typo for PIC check during jmp table lowering
- Also fix the "first jump table basic block is not
considered only reachable by fall through" problem, use this
ad-hoc solution until I come up with something better.

Patch by stetorvs@gmail.com

llvm-svn: 108820
2010-07-20 08:37:04 +00:00