1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-20 19:42:54 +02:00
Commit Graph

24384 Commits

Author SHA1 Message Date
Owen Anderson
5a2e4b2e02 Create an FPOW SDNode opcode def in the target independent .td file rather than in a specific backend.
llvm-svn: 182450
2013-05-22 06:36:09 +00:00
Rafael Espindola
9fa2758841 Attempt to fix the mingw32 bot.
This should hopefully fix
http://lab.llvm.org:8011/builders/clang-x86_64-darwin11-self-mingw32

llvm-svn: 182446
2013-05-22 02:30:47 +00:00
Rafael Espindola
ff498d2344 s/u_int32_t/uint32_t/
llvm-svn: 182444
2013-05-22 01:36:19 +00:00
Rafael Espindola
c584e29854 Fix warning in non-assert build.
llvm-svn: 182443
2013-05-22 01:29:38 +00:00
Reed Kotler
dc9a2e4d7b Mips16 does not use register scavenger from TargetRegisterInfo. It allocates
a RegScavenger object on it's own.
 

llvm-svn: 182430
2013-05-21 22:06:02 +00:00
Akira Hatanaka
4da68c1676 [mips] Rename option to make it compatible with gcc.
llvm-svn: 182397
2013-05-21 17:17:59 +00:00
Akira Hatanaka
6123e22ce0 [mips] Add instruction selection patterns for blez and bgez.
llvm-svn: 182396
2013-05-21 17:13:47 +00:00
Justin Holewinski
2a53cbfbe1 [NVPTX] Add @llvm.nvvm.sqrt.f() intrinsic
llvm-svn: 182394
2013-05-21 16:51:30 +00:00
Jyotsna Verma
5bb9f9d4f2 Hexagon: SelectionDAG should not use MVT::Other to check the legality of BR_CC.
llvm-svn: 182390
2013-05-21 15:54:32 +00:00
Hal Finkel
ba37c38241 Fix PPC branch selection for counter-based branches
Although I had added some support for the BDZ/BDNZ branches into the selector
(in r158204), I had not correctly adjusted the condition at the top of the
loop. As a result, these branches were still essentially unsupported.

This fixes PR16086. Unfortunately, any test case would be very large (because
it would need to force the loop backedge to exceed the range of the 16-bit
immediate).

llvm-svn: 182385
2013-05-21 14:21:09 +00:00
Elena Demikhovsky
981934409d removed commented lines
llvm-svn: 182377
2013-05-21 13:27:44 +00:00
Elena Demikhovsky
9523e885e5 Removed SSEPacked domain from all forms (AVX, SSE, signed, unsigned) scalar compare instructions, like COMISS, COMISD.
No functional changes.

llvm-svn: 182371
2013-05-21 12:04:22 +00:00
Benjamin Kramer
6b339bee4e X86: When emulating unsigned PCMPGTQ with PCMPGTD, fix the sign bit for the smaller type.
Otherwise we'll get a mix of signed and unsigned compares.
Fixes PR15977.

llvm-svn: 182364
2013-05-21 09:58:54 +00:00
Richard Sandiford
bb9bbbf6cc Fix indentation
llvm-svn: 182356
2013-05-21 08:48:24 +00:00
Reed Kotler
2d5f41cb35 Add some additional functions to the list of helper functions for
pic calls. These need to be there so we don't try and use helper
functions when we call those.

As part of this, make sure that we properly exclude helper functions in pic
mode when indirect calls are involved.

llvm-svn: 182343
2013-05-21 00:50:30 +00:00
Hal Finkel
cf203aeadb Rename LoopSimplify.h to LoopUtils.h
As discussed, LoopUtils.h is a better name.

llvm-svn: 182314
2013-05-20 20:46:30 +00:00
Akira Hatanaka
05711091c4 [mips] Add (setne $lhs, 0) instruction selection pattern.
llvm-svn: 182307
2013-05-20 18:18:07 +00:00
Akira Hatanaka
96de87d87c [mips] Trap on integer division by zero.
By default, a teq instruction is inserted after integer divide. No divide-by-zero
checks are performed if option "-mnocheck-zero-division" is used.

llvm-svn: 182306
2013-05-20 18:07:43 +00:00
Hal Finkel
077bcf0963 Remove copied preheader insertion logic from PPCCTRLoops
Now that the preheader insertion logic in LoopSimplify is externally exposed,
use it, and remove the copy-and-pasted version.

No functionality change intended.

llvm-svn: 182300
2013-05-20 16:47:10 +00:00
Justin Holewinski
2d2a08ee5e [NVPTX] Fix mis-use of CurrentFnSym in NVPTXAsmPrinter. This was causing a symbol name error in the output PTX.
llvm-svn: 182298
2013-05-20 16:42:18 +00:00
Justin Holewinski
3c0938687d [NVPTX] Add programmatic interface to NVVMReflect pass
llvm-svn: 182297
2013-05-20 16:42:16 +00:00
Hal Finkel
b008faddc8 Rename PPC MTCTRse to MTCTRloop
As the pairing of this instruction form with the bdnz/bdz branches is now
enforced by the verification pass, make it clear from the name that these
are used only for counter-based loops.

No functionality change intended.

llvm-svn: 182296
2013-05-20 16:08:37 +00:00
Hal Finkel
e230e28ec5 Add a PPCCTRLoops verification pass
When asserts are enabled, this adds a verification pass for PPC counter-loop
formation. Unfortunately, without sacrificing code quality, there is no better
way of forming counter-based loops except at the (late) IR level. This means
that we need to recognize, at the IR level, anything which might turn into a
function call (or indirect branch). Because this is currently a finite set of
things, and because SelectionDAG lowering is basic-block local, this can be
done. Nevertheless, it is fragile, and failure results in a miscompile. This
verification pass checks that all (reachable) counter-based branches are
dominated by a loop mtctr instruction, and that no instructions in between
clobber the counter register. If these conditions are not satisfied, then an
ICE will be triggered.

In short, this is to help us sleep better at night.

llvm-svn: 182295
2013-05-20 16:08:17 +00:00
Benjamin Kramer
66d4343951 R600: Fix bug detected by GCC warning.
R600TextureIntrinsicsReplacer.cpp:232: warning: the address of ‘ArgsType’ will always evaluate as ‘true’

This doesn't have any effect on the output as a vararg intrinsic behaves the
same way as a non-vararg one.

llvm-svn: 182293
2013-05-20 15:58:43 +00:00
Tom Stellard
8c433f74be R600/SI: Use a multiclass for MUBUF_Load_Helper
This will simplify the instructions and also the pattern definitions.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182288
2013-05-20 15:02:31 +00:00
Tom Stellard
20519ba9b1 R600/SI: Add a pattern for S_LOAD_DWORDX2_* instructions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182287
2013-05-20 15:02:28 +00:00
Tom Stellard
d72698331e R600/SI: Add pattern for rotr
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182286
2013-05-20 15:02:24 +00:00
Tom Stellard
5ca265d214 R600: Swap the legality of rotl and rotr
The hardware supports rotr and not rotl.

llvm-svn: 182285
2013-05-20 15:02:19 +00:00
Tom Stellard
1609774b15 R600/SI: Add patterns for 64-bit shift operations
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182284
2013-05-20 15:02:12 +00:00
Tom Stellard
ad64547aac R600/SI: Use the same names for VOP3 operands and encoding fields
This makes it possible to reorder the operands without breaking the
encoding.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182283
2013-05-20 15:02:08 +00:00
Tom Stellard
9e5dc799d9 R600/SI: Make fitsRegClass() operands const
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182282
2013-05-20 15:02:01 +00:00
Mihai Popa
d42b1e0685 VSTn instructions have a number of encoding constraints which are not implemented. I have added these using wrapper methods around the original custom decoder (incidentally - this is a huge poorly written method that should be cleaned up. I have left it as is since the changes would be much to hard to review).
llvm-svn: 182281
2013-05-20 14:57:05 +00:00
Mihai Popa
dbf8d46d3c Q registers are encoded in fields of the same length as D registers. As Q registers are half as many, the ARM reference manual mandates the least significant bit to be zeroed out. Failure to do so should result in an undefined instruction. With this change test/MC/Disassembler/ARM/invalid-VQADD-arm.txt is passing (removed XFAIL).
llvm-svn: 182279
2013-05-20 14:42:43 +00:00
Richard Sandiford
cc815ef1d8 [SystemZ] Add long branch pass
Before this change, the SystemZ backend would use BRCL for all branches
and only consider shortening them to BRC when generating an object file.
E.g. a branch on equal would use the JGE alias of BRCL in assembly output,
but might be shortened to the JE alias of BRC in ELF output.  This was
a useful first step, but it had two problems:

(1) The z assembler isn't traditionally supposed to perform branch shortening
    or branch relaxation.  We followed this rule by not relaxing branches
    in assembler input, but that meant that generating assembly code and
    then assembling it would not produce the same result as going directly
    to object code; the former would give long branches everywhere, whereas
    the latter would use short branches where possible.

(2) Other useful branches, like COMPARE AND BRANCH, do not have long forms.
    We would need to do something else before supporting them.

    (Although COMPARE AND BRANCH does not change the condition codes,
    the plan is to model COMPARE AND BRANCH as a CC-clobbering instruction
    during codegen, so that we can safely lower it to a separate compare
    and long branch where necessary.  This is not a valid transformation
    for the assembler proper to make.)

This patch therefore moves branch relaxation to a pre-emit pass.
For now, calls are still shortened from BRASL to BRAS by the assembler,
although this too is not really the traditional behaviour.

The first test takes about 1.5s to run, and there are likely to be
more tests in this vein once further branch types are added.  The feeling
on IRC was that 1.5s is a bit much for a single test, so I've restricted
it to SystemZ hosts for now.

The patch exposes (and fixes) some typos in the main CodeGen/SystemZ tests.
A later patch will remove the {{g}}s from that directory.

llvm-svn: 182274
2013-05-20 14:23:08 +00:00
Justin Holewinski
d5636664a4 [NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputs
This converter currently only handles global variables in address space 0. For
these variables, they are promoted to address space 1 (global memory), and all
uses are updated to point to the result of a cvta.global instruction on the new
variable.

The motivation for this is address space 0 global variables are illegal since we
cannot declare variables in the generic address space.  Instead, we place the
variables in address space 1 and explicitly convert the pointer to address
space 0. This is primarily intended to help new users who expect to be able to
place global variables in the default address space.

llvm-svn: 182254
2013-05-20 12:13:32 +00:00
Justin Holewinski
fda22b94b1 [NVPTX] Fix i1 kernel parameters and global variables. ABI rules say we need to use .u8 for i1 parameters for kernels.
llvm-svn: 182253
2013-05-20 12:13:28 +00:00
Stepan Dyatkovskiy
adb91e7ed9 PR15868 fix.
Introduction:
In case when stack alignment is 8 and GPRs parameter part size is not N*8:
we add padding to GPRs part, so part's last byte must be recovered at
address K*8-1.
We need to do it, since remained (stack) part of parameter starts from
address K*8, and we need to "attach" "GPRs head" without gaps to it:

Stack:
|---- 8 bytes block ----| |---- 8 bytes block ----| |---- 8 bytes...
[ [padding] [GPRs head] ] [ ------ Tail passed via stack  ------ ...

FIX:
Note, once we added padding we need to correct *all* Arg offsets that are going
after padded one. That's why we need this fix: Arg offsets were never corrected
before this patch. See new test-cases included in patch.

We also don't need to insert padding for byval parameters that are stored in GPRs
only. We need pad only last byval parameter and only in case it outsides GPRs
and stack alignment = 8.
Though, stack area, allocated for recovered byval params, must satisfy
"Size mod 8 = 0" restriction.

This patch reduces stack usage for some cases:
We can reduce ArgRegsSaveArea since inner N*4 bytes sized byval params my be
"packed" with alignment 4 in some cases.

llvm-svn: 182237
2013-05-20 08:01:34 +00:00
Jakob Stoklund Olesen
dad0022424 Also expand 64-bit bitcasts.
llvm-svn: 182229
2013-05-20 01:01:43 +00:00
Jakob Stoklund Olesen
e033165fd4 Implement spill and fill of I64Regs.
llvm-svn: 182228
2013-05-20 00:53:25 +00:00
Jakob Stoklund Olesen
f777e6d131 Mark i64 SETCC as expand so it is turned into a SELECT_CC.
llvm-svn: 182227
2013-05-20 00:28:36 +00:00
Benjamin Kramer
1e23dfa473 Replace some bit operations with simpler ones. No functionality change.
llvm-svn: 182226
2013-05-19 22:01:57 +00:00
Jakob Stoklund Olesen
50d419ac93 Don't use %g0 to materialize 0 directly.
The wired physreg doesn't work on tied operands like on MOVXCC.

Add a README note to fix this later.

llvm-svn: 182225
2013-05-19 21:47:13 +00:00
Jakob Stoklund Olesen
f4ec84c2d4 Select i64 values with %icc conditions.
llvm-svn: 182224
2013-05-19 20:38:21 +00:00
Jakob Stoklund Olesen
1948cf9ca7 Add floating point selects on %xcc predicates.
llvm-svn: 182222
2013-05-19 20:33:11 +00:00
Jakob Stoklund Olesen
c53b3587a3 Implement SPselectfcc for i64 operands.
Also clean up the arguments to all the MOVCC instructions so the
operands always are (true-val, false-val, cond-code).

llvm-svn: 182221
2013-05-19 20:20:54 +00:00
Venkatraman Govindaraju
d432ed013f [Sparc] Rearrange integer registers' allocation order so that register allocator will use I and G registers before using L and O registers.
Also, enable registers %g2-%g4 to be used in application and %g5 in 64 bit mode.

llvm-svn: 182219
2013-05-19 20:07:20 +00:00
Jakob Stoklund Olesen
702a7c68c9 Handle i64 FrameIndex nodes in SPARC v9 mode.
llvm-svn: 182216
2013-05-19 19:14:24 +00:00
Hal Finkel
d4eb9291fa Check InlineAsm clobbers in PPCCTRLoops
We don't need to reject all inline asm as using the counter register (most does
not). Only those that explicitly clobber the counter register need to prevent
the transformation.

llvm-svn: 182191
2013-05-18 09:20:39 +00:00
Tim Northover
5256d08973 AArch64: add CMake dependency to fix very parallel builds
llvm-svn: 182190
2013-05-18 08:17:47 +00:00
David Majnemer
a21386b571 X86: Bad peephole interaction between adc, MOV32r0
The peephole tries to reorder MOV32r0 instructions such that they are
before the instruction that modifies EFLAGS.

The problem is that the peephole does not consider the case where the
instruction that modifies EFLAGS also depends on the previous state of
EFLAGS.

Instead, walk backwards until we find an instruction that has a def for
EFLAGS but does not have a use.
If we find such an instruction, insert the MOV32r0 before it.
If it cannot find such an instruction, skip the optimization.

llvm-svn: 182184
2013-05-18 01:02:03 +00:00
Matt Arsenault
118196f0ca Add LLVMContext argument to getSetCCResultType
llvm-svn: 182180
2013-05-18 00:21:46 +00:00
JF Bastien
cbcaf8db77 Support unaligned load/store on more ARM targets
This patch matches GCC behavior: the code used to only allow unaligned
load/store on ARM for v6+ Darwin, it will now allow unaligned load/store
for v6+ Darwin as well as for v7+ on Linux and NaCl.

The distinction is made because v6 doesn't guarantee support (but LLVM
assumes that Apple controls hardware+kernel and therefore have
conformant v6 CPUs), whereas v7 does provide this guarantee (and
Linux/NaCl behave sanely).

The patch keeps the -arm-strict-align command line option, and adds
-arm-no-strict-align. They behave similarly to GCC's -mstrict-align and
-mnostrict-align.

I originally encountered this discrepancy in FastIsel tests which expect
unaligned load/store generation. Overall this should slightly improve
performance in most cases because of reduced I$ pressure.

llvm-svn: 182175
2013-05-17 23:49:01 +00:00
Rafael Espindola
aabd77b198 Fix the build in c++11 mode.
The errors were:

non-constant-expression cannot be narrowed from type 'int64_t' (aka 'long') to 'uint32_t' (aka 'unsigned int') in initializer list

and

non-constant-expression cannot be narrowed from type 'long' to 'uint32_t' (aka 'unsigned int') in initializer list

llvm-svn: 182168
2013-05-17 22:45:52 +00:00
Vincent Lejeune
c8aad4509a R600: Lower int_load_input to copyFromReg instead of Register node
It solves a bug uncovered by dot4 patch where the register class of
int_load_input use was ignored.

llvm-svn: 182130
2013-05-17 16:51:06 +00:00
Vincent Lejeune
5a2e018ab6 R600: Use bottom up scheduling algorithm
llvm-svn: 182129
2013-05-17 16:50:56 +00:00
Vincent Lejeune
2bf65b1826 R600: Use depth first scheduling algorithm
It should increase PV substitution opportunities and lower gpr
usage (pending computations path are "flushed" sooner)

llvm-svn: 182128
2013-05-17 16:50:44 +00:00
Vincent Lejeune
a140216a0a R600: Replace big texture opcode switch in scheduler by usesTC/usesVC
llvm-svn: 182127
2013-05-17 16:50:37 +00:00
Vincent Lejeune
152473c61c R600: Relax some vector constraints on Dot4.
Dot4 now uses 8 scalar operands instead of 2 vectors one which allows register
coalescer to remove some unneeded COPY.
This patch also defines some structures/functions that can be used to handle
every vector instructions (CUBE, Cayman special instructions...) in a similar
fashion.

llvm-svn: 182126
2013-05-17 16:50:32 +00:00
Vincent Lejeune
0c663b698a R600: Improve texture handling
llvm-svn: 182125
2013-05-17 16:50:20 +00:00
Vincent Lejeune
b57cb76b6d R600: Rename 128 bit registers.
Almost all instructions that takes a 128 bits reg as input (fetch, export...)
have the abilities to swizzle their argument and output. Instead of printing
default swizzle for each 128 bits reg, rename T*.XYZW to T* and let instructions
print potentially optimized swizzles themselves.

llvm-svn: 182124
2013-05-17 16:50:09 +00:00
Vincent Lejeune
d391d51989 R600: Some factorization
llvm-svn: 182123
2013-05-17 16:50:02 +00:00
Vincent Lejeune
bf991c018d R600: Factorize Fetch size limit inside AMDGPUSubTarget
llvm-svn: 182122
2013-05-17 16:49:55 +00:00
Vincent Lejeune
d39a89783b R600: prettier dump of clamp
llvm-svn: 182121
2013-05-17 16:49:49 +00:00
Tom Stellard
a4cc081e08 R600: Fix encoding for R600 family GPUs
Reviewed-by: Vincent Lejeune <vljn@ovi.com>

https://bugs.freedesktop.org/show_bug.cgi?id=64193
https://bugs.freedesktop.org/show_bug.cgi?id=64257
https://bugs.freedesktop.org/show_bug.cgi?id=64320

NOTE: This is a candidate for the 3.3 branch.
llvm-svn: 182113
2013-05-17 15:23:21 +00:00
Tom Stellard
b91da0601d R600: Pass MCSubtargetInfo reference to R600CodeEmitter
llvm-svn: 182112
2013-05-17 15:23:12 +00:00
Venkatraman Govindaraju
989fb74a1c [Sparc] Implements hasReservedCallFrame and hasFP.
This is to generate correct framesetup code when the function
 has variable sized allocas.

llvm-svn: 182108
2013-05-17 15:14:34 +00:00
Benjamin Kramer
e9efb3252f X86: Make shuffle -> shift conversion more aggressive about undefs.
Shuffles that only move an element into position 0 of the vector are common in
the output of the loop vectorizer and often generate suboptimal code when SSSE3
is not available. Lower them to vector shifts if possible.

We still prefer palignr over psrldq because it has higher throughput on
sandybridge.

llvm-svn: 182102
2013-05-17 14:48:34 +00:00
Ulrich Weigand
e299bf2813 [PowerPC] Fix hi/lo encoding in old-style code emitter
This patch implements the equivalent change to r182091/r182092
in the old-style code emitter.  Instead of having two separate
16-bit immediate encoding routines depending on the instruction,
this patch introduces a single encoder that checks the machine
operand flags to decide whether the low or high half of a
symbol address is required.

Since now both encoders make no further distinction between
"symbolLo" and "symbolHi", the .td operand can now use a
single getS16ImmEncoding method.

Tested by running the old-style JIT tests on 32-bit Linux.

llvm-svn: 182097
2013-05-17 14:14:12 +00:00
Ulrich Weigand
24cfcf8f49 [PowerPC] Merge/rename PPC fixup types
Now that fixup_ppc_ha16 and fixup_ppc_lo16 are being treated exactly
the same everywhere, it no longer makes sense to have two fixup types.

This patch merges them both into a single type fixup_ppc_half16,
and renames fixup_ppc_lo16_ds to fixup_ppc_half16ds for consistency.
(The half16 and half16ds names are taken from the description of
relocation types in the PowerPC ABI.)

No change in code generation expected.

llvm-svn: 182092
2013-05-17 12:37:21 +00:00
Ulrich Weigand
89ebba5af6 [PowerPC] Fix processing of ha16/lo16 fixups
The current PowerPC MC back end distinguishes between fixup_ppc_ha16
and fixup_ppc_lo16, which are determined by the instruction the fixup
applies to, and uses this distinction to decide whether a fixup ought
to resolve to the high or the low part of a symbol address.

This isn't quite correct, however.  It is valid -if unusual- assembler
to use, e.g.
  li 1, symbol@ha
or
  lis 1, symbol@l
Whether the high or the low part of the address is used depends solely
on the @ suffix, not on the instruction.

In addition, both
  li 1, symbol
and
  lis 1, symbol
are valid, assuming the symbol address fits into 16 bits; again, both
will then refer to the actual symbol value (so li will load the value
itself, while lis will load the value shifted by 16).


To fix this, two places need to be adapted.  If the fixup cannot be
resolved at assembler time, a relocation needs to be emitted via
PPCELFObjectWriter::getRelocType.  This routine already looks at
the VK_ type to determine the relocation.  The only problem is that
will reject any _LO modifier in a ha16 fixup and vice versa.  This
is simply incorrect; any of those modifiers ought to be accepted
for either fixup type.

If the fixup *can* be resolved at assembler time, adjustFixupValue
currently selects the high bits of the symbol value if the fixup
type is ha16.  Again, this is incorrect; see the above example
  lis 1, symbol

Now, in theory we'd have to respect a VK_ modifier here.  However,
in fact common code never even attempts to resolve symbol references
using any nontrivial VK_ modifier at assembler time; it will always
fall back to emitting a reloc and letting the linker handle it.

If this ever changes, presumably there'd have to be a target callback
to resolve VK_ modifiers.  We'd then have to handle @ha etc. there.

llvm-svn: 182091
2013-05-17 12:36:29 +00:00
Benjamin Kramer
5781bb221a Don't cast away constness.
llvm-svn: 182086
2013-05-17 11:39:41 +00:00
Christian Konig
34f0d6eaf2 R600/SI: return undef instead of null for skipped arguments
This is a candidate for the stable branch.

Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=64694

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
llvm-svn: 182084
2013-05-17 09:46:48 +00:00
Venkatraman Govindaraju
62af2fad30 [Sparc] Prevent instructions that defines or uses %o7 to be in call's delay slot.
llvm-svn: 182063
2013-05-16 23:53:29 +00:00
Akira Hatanaka
3848727973 [mips] Improve instruction selection for pattern (store (fp_to_sint $src), $ptr).
Previously, three instructions were needed:

trunc.w.s $f0, $f2
mfc1 $4, $f0
sw $4, 0($2)

Now we need only two:

trunc.w.s $f0, $f2
swc1 $f0, 0($2)

llvm-svn: 182053
2013-05-16 21:17:15 +00:00
Rafael Espindola
21effc7220 Remove addFrameMove.
Now that we have good testing, remove addFrameMove and create cfi
instructions directly.

llvm-svn: 182052
2013-05-16 21:02:15 +00:00
Akira Hatanaka
af33bc35ec [mips] Factor out unaligned store lowering code.
llvm-svn: 182050
2013-05-16 20:45:17 +00:00
Jack Carter
0c75cd3c10 Mips assembler: Add TwoOperandConstraint definitions
This patch removes alias definition for addiu $rs,$imm 
and instead uses the TwoOperandAliasConstraint field in 
the ArithLogicI instruction class. 

This way all instructions that inherit ArithLogicI class 
have the same macro defined. 

The usage examples are added to test files.

Patch by Vladimir Medic

llvm-svn: 182048
2013-05-16 20:24:27 +00:00
Jack Carter
27313685a3 Mips td file formatting: white space and long lines
llvm-svn: 182047
2013-05-16 20:08:49 +00:00
Hal Finkel
80fddc9af7 Create an new preheader in PPCCTRLoops to avoid counter register clobbers
Some IR-level instructions (such as FP <-> i64 conversions) are not chained
w.r.t. the mtctr intrinsic and yet may become function calls that clobber the
counter register. At the selection-DAG level, these might be reordered with the
mtctr intrinsic causing miscompiles. To avoid this situation, if an existing
preheader has instructions that might use the counter register, create a new
preheader for the mtctr intrinsic. This extra block will be remerged with the
old preheader at the MI level, but will prevent unwanted reordering at the
selection-DAG level.

llvm-svn: 182045
2013-05-16 19:58:38 +00:00
Akira Hatanaka
ba455f200e [mips] Test case for r182042. Add comment.
llvm-svn: 182044
2013-05-16 19:57:23 +00:00
Akira Hatanaka
fe7cc5cbd7 [mips] Fix instruction selection pattern for sint_to_fp node to avoid emitting an
invalid instruction sequence.

Rather than emitting an int-to-FP move instruction and an int-to-FP conversion
instruction during instruction selection, we emit a pseudo instruction which gets
expanded post-RA. Without this change, register allocation can possibly insert a
floating point register move instruction between the two instructions, which is not
valid according to the ISA manual.

mtc1 $f4, $4         # int-to-fp move instruction.
mov.s $f2, $f4       # move contents of $f4 to $f2.
cvt.s.w $f0, $f2     # int-to-fp conversion.

llvm-svn: 182042
2013-05-16 19:48:37 +00:00
Jack Carter
8986125dda Mips assembler: Add branch macro definitions
This patch adds bnez and beqz instructions which represent alias definitions for bne and beq instructions as follows:
bnez $rs,$imm => bne $rs,$zero,$imm
beqz $rs,$imm => beq $rs,$zero,$imm

The corresponding test cases are added.

Patch by Vladimir Medic

llvm-svn: 182040
2013-05-16 19:40:19 +00:00
Akira Hatanaka
8513e9139f [mips] Fix indentation.
llvm-svn: 182036
2013-05-16 18:42:42 +00:00
Akira Hatanaka
8857b3a45c [mips] Delete unused enum value.
llvm-svn: 182035
2013-05-16 18:40:12 +00:00
Ulrich Weigand
7b22c7a38a [PowerPC] Use true offset value in "memrix" machine operands
This is the second part of the change to always return "true"
offset values from getPreIndexedAddressParts, tackling the
case of "memrix" type operands.

This is about instructions like LD/STD that only have a 14-bit
field to encode immediate offsets, which are implicitly extended
by two zero bits by the machine, so that in effect we can access
16-bit offsets as long as they are a multiple of 4.

The PowerPC back end currently handles such instructions by
carrying the 14-bit value (as it will get encoded into the
actual machine instructions) in the machine operand fields
for such instructions.  This means that those values are
in fact not the true offset, but rather the offset divided
by 4 (and then truncated to an unsigned 14-bit value).

Like in the case fixed in r182012, this makes common code
operations on such offset values not work as expected.
Furthermore, there doesn't really appear to be any strong
reason why we should encode machine operands this way.

This patch therefore changes the encoding of "memrix" type
machine operands to simply contain the "true" offset value
as a signed immediate value, while enforcing the rules that
it must fit in a 16-bit signed value and must also be a
multiple of 4.

This change must be made simultaneously in all places that
access machine operands of this type.  However, just about
all those changes make the code simpler; in many cases we
can now just share the same code for memri and memrix
operands.

llvm-svn: 182032
2013-05-16 17:58:02 +00:00
Hal Finkel
7daa616e35 PPC32 cannot form counter loops around i64 FP conversions
On PPC32, i64 FP conversions are implemented using runtime calls (which clobber
the counter register). These must be excluded.

llvm-svn: 182023
2013-05-16 16:52:41 +00:00
Aaron Ballman
42af887d8c Fixing a 64-bit conversion warning in MSVC.
llvm-svn: 182018
2013-05-16 16:03:36 +00:00
Rafael Espindola
4c7120e048 Remove dead calls to addFrameMove.
Without a PROLOG_LABEL present, the cfi instructions are never printed.

llvm-svn: 182016
2013-05-16 15:08:37 +00:00
Ulrich Weigand
08228b8354 [PowerPC] Report true displacement value from getPreIndexedAddressParts
DAGCombiner::CombineToPreIndexedLoadStore calls a target routine to
decompose a memory address into a base/offset pair.  It expects the
offset (if constant) to be the true displacement value in order to
perform optional additional optimizations; in particular, to convert
other uses of the original pointer into uses of the new base pointer
after pre-increment.

The PowerPC implementation of getPreIndexedAddressParts, however,
simply calls SelectAddressRegImm, which returns a TargetConstant.
This value is appropriate for encoding into the instruction, but
it is not always usable as true displacement value:

- Its type is always MVT::i32, even on 64-bit, where addresses
  ought to be i64 ... this causes the optimization to simply
  always fail on 64-bit due to this line in DAGCombiner:

      // FIXME: In some cases, we can be smarter about this.
      if (Op1.getValueType() != Offset.getValueType()) {

- Its value is truncated to an unsigned 16-bit value if negative.
  This causes the above opimization to generate wrong code.

This patch fixes both problems by simply returning the true
displacement value (in its original type).  This doesn't
affect any other user of the displacement.

llvm-svn: 182012
2013-05-16 14:53:05 +00:00
Richard Sandiford
cb335bb295 [SystemZ] Tweak register array comment
llvm-svn: 182007
2013-05-16 13:39:02 +00:00
Patrik Hagglund
a0ea76e714 Removed unused variable, detected by gcc
-Wunused-but-set-variable. Leftover from r181979.

llvm-svn: 181993
2013-05-16 08:37:22 +00:00
Rafael Espindola
537821c785 Delete dead code.
llvm-svn: 181982
2013-05-16 04:59:17 +00:00
Rafael Espindola
24bf7876c2 Don't call addFrameMove on XCore.
getExceptionHandlingType is not ExceptionHandling::DwarfCFI on xcore, so
etFrameInstructions is never called. There is no point creating cfi
instructions if they are never used.

llvm-svn: 181979
2013-05-16 04:16:25 +00:00
Rafael Espindola
92a3518a62 Removed dead code.
llvm-svn: 181975
2013-05-16 03:34:58 +00:00
Reed Kotler
fb71c30979 Patch number 2 for mips16/32 floating point interoperability stubs.
This creates stubs that help Mips32 functions call Mips16 
functions which have floating point parameters that are normally passed
in floating point registers.
 

llvm-svn: 181972
2013-05-16 02:17:42 +00:00
Derek Schuff
8af06f2ba6 Revert "Support unaligned load/store on more ARM targets"
This reverts r181898.

llvm-svn: 181944
2013-05-15 23:07:43 +00:00
Rafael Espindola
e1f815ebd9 Delete dead code.
llvm-svn: 181941
2013-05-15 22:27:35 +00:00
Hal Finkel
ae8f6158eb undef setjmp in PPCCTRLoops
Trying to unbreak the VS build by copying some undef code from
Utils/LowerInvoke.cpp.

llvm-svn: 181938
2013-05-15 22:20:24 +00:00
David Majnemer
8ce4c34d1d X86: Remove redundant test instructions
Increase the number of instructions LLVM recognizes as setting the ZF
flag. This allows us to remove test instructions that redundantly
recalculate the flag.

llvm-svn: 181937
2013-05-15 22:03:08 +00:00
Hal Finkel
91bd48d046 Implement PPC counter loops as a late IR-level pass
The old PPCCTRLoops pass, like the Hexagon pass version from which it was
derived, could only handle some simple loops in canonical form. We cannot
directly adapt the new Hexagon hardware loops pass, however, because the
Hexagon pass contains a fundamental assumption that non-constant-trip-count
loops will contain a guard, and this is not always true (the result being that
incorrect negative counts can be generated). With this commit, we replace the
pass with a late IR-level pass which makes use of SE to calculate the
backedge-taken counts and safely generate the loop-count expressions (including
any necessary max() parts). This IR level pass inserts custom intrinsics that
are lowered into the desired decrement-and-branch instructions.

The most fragile part of this new implementation is that interfering uses of
the counter register must be detected on the IR level (and, on PPC, this also
includes any indirect branches in addition to function calls). Also, to make
all of this work, we need a variant of the mtctr instruction that is marked
as having side effects. Without this, machine-code level CSE, DCE, etc.
illegally transform the resulting code. Hopefully, this can be improved
in the future.

This new pass is smaller than the original (and much smaller than the new
Hexagon hardware loops pass), and can handle many additional cases correctly.
In addition, the preheader-creation code has been copied from LoopSimplify, and
after we decide on where it belongs, this code will be refactored so that it
can be explicitly shared (making this implementation even smaller).

The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for
the new Hexagon pass. There are a few classes of loops that this pass does not
transform (noted by FIXMEs in the files), but these deficiencies can be
addressed within the SE infrastructure (thus helping many other passes as well).

llvm-svn: 181927
2013-05-15 21:37:41 +00:00