1
0
mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-10-26 06:22:56 +02:00
Commit Graph

21852 Commits

Author SHA1 Message Date
Akira Hatanaka
6b71c77901 Move the code that creates instances of MipsInstrInfo and MipsFrameLowering out
of MipsTargetMachine.cpp.

llvm-svn: 161191
2012-08-02 18:21:47 +00:00
Akira Hatanaka
a1ef6b9f53 Set transient stack alignment in constructor of MipsFrameLowering and re-enable
test o32_cc_vararg.ll.

llvm-svn: 161189
2012-08-02 18:15:13 +00:00
Jiangning Liu
b12a230a17 Support fpv4 for ARM Cortex-M4.
llvm-svn: 161163
2012-08-02 08:35:55 +00:00
Jiangning Liu
a806f72610 Fix #13035, a bug around Thumb instruction LDRD/STRD with negative #0 offset index issue.
llvm-svn: 161162
2012-08-02 08:29:50 +00:00
Jiangning Liu
69f0770c21 Fix #13138, a bug around ARM instruction DSB encoding and decoding issue.
llvm-svn: 161161
2012-08-02 08:21:27 +00:00
Jiangning Liu
e4c91e239f Fix #13241, a bug around shift immediate operand for ARM instruction ADR.
llvm-svn: 161159
2012-08-02 08:13:13 +00:00
Manman Ren
78b8d454cc X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276

llvm-svn: 161152
2012-08-02 00:56:42 +00:00
Manman Ren
1c73b43c93 X86: mark GATHER instructios as mayLoad
llvm-svn: 161143
2012-08-01 23:28:59 +00:00
Jim Grosbach
b60d4c1110 ARM: Remove redundant instalias.
llvm-svn: 161134
2012-08-01 20:33:05 +00:00
Jim Grosbach
32cbdce015 Clean up formatting.
llvm-svn: 161133
2012-08-01 20:33:02 +00:00
Jim Grosbach
c4b3246b76 Tidy up.
llvm-svn: 161132
2012-08-01 20:33:00 +00:00
Chad Rosier
078db59861 Whitespace.
llvm-svn: 161122
2012-08-01 18:39:17 +00:00
Elena Demikhovsky
0fec7026d9 Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Craig Topper
3f200a7638 Add more indirection to the disassembler tables to reduce amount of space used to store the operand types and encodings. Store only the unique combinations in a separate table and store indices in the instruction table. Saves about 32K of static data.
llvm-svn: 161101
2012-08-01 07:39:18 +00:00
Akira Hatanaka
731add334a Implement MipsJITInfo::replaceMachineCodeForFunction.
No new test case is added.
This patch makes test JITTest.FunctionIsRecompiledAndRelinked pass on mips
platform.

Patch by Petar Jovanovic.

llvm-svn: 161098
2012-08-01 02:29:24 +00:00
Akira Hatanaka
7c93354aff Remove unused variable.
llvm-svn: 161095
2012-08-01 00:37:53 +00:00
Akira Hatanaka
c43e6b2166 Implement MipsSERegisterInfo::eliminateCallFramePseudoInstr. The function emits
instructions that decrement and increment the stack pointer before and after a
call when the function does not have a reserved call frame.

llvm-svn: 161093
2012-07-31 23:52:55 +00:00
Akira Hatanaka
24dddbed36 Add definitions of two subclasses of MipsRegisterInfo, Mips16RegisterInfo and
MipsSERegisterInfo.

llvm-svn: 161092
2012-07-31 23:41:32 +00:00
Akira Hatanaka
87fd9a992a Add definitions of two subclasses of MipsFrameLowering, Mips16FrameLowering and
MipsSEFrameLowering.

Implement MipsSEFrameLowering::hasReservedCallFrame. Call frames will not be
reserved if there is a call with a large call frame or there are variable sized
objects on the stack.

llvm-svn: 161090
2012-07-31 22:50:19 +00:00
Akira Hatanaka
45e66997d4 Add Mips16InstrInfo.cpp and MipsSEInstrInfo.cpp to CMakeLists.txt.
llvm-svn: 161083
2012-07-31 22:11:05 +00:00
Akira Hatanaka
46388c74c0 Add definitions of two subclasses of MipsInstrInfo, MipsInstrInfo (for mips16),
and MipsSEInstrInfo (for mips32/64).

llvm-svn: 161081
2012-07-31 21:49:49 +00:00
Akira Hatanaka
44f5fec97d Delete mips64 target machine classes. mips target machines can be used in place
of them.

llvm-svn: 161080
2012-07-31 21:39:17 +00:00
Akira Hatanaka
ad80f510bc Let PEI::calculateFrameObjectOffsets compute the final stack size rather than
computing it in MipsFrameLowering::emitPrologue.

llvm-svn: 161078
2012-07-31 21:28:49 +00:00
Akira Hatanaka
d43e99897c Expand DYNAMIC_STACKALLOC nodes rather than doing custom-lowering.
The frame object which points to the dynamically allocated area will not be
needed after changes are made to cease reserving call frames.

llvm-svn: 161076
2012-07-31 20:54:48 +00:00
Akira Hatanaka
e1beddb7e8 Define ADJCALLSTACKDOWN/UP nodes. These nodes are emitted regardless of whether
or not it is in mips16 mode. Define MipsPseudo (mode-independant pseudo) and
PseudoSE (mips32/64 pseudo) classes.

llvm-svn: 161071
2012-07-31 19:13:07 +00:00
Akira Hatanaka
4a17cb84f3 Change name of class MipsInst to InstSE to distinguish it from mips16's
instruction class. SE stands for standard encoding.

llvm-svn: 161069
2012-07-31 18:55:01 +00:00
Akira Hatanaka
85ddbf2e38 When store nodes or memcpy nodes are created to copy the function call
arguments to the stack in MipsISelLowering::LowerCall, use stack pointer and
integer offset operands rather than frame object operands.

llvm-svn: 161068
2012-07-31 18:46:41 +00:00
Chad Rosier
9a4ff99710 [x86 frame lowering] In 32-bit mode, use ESI as the base pointer.
Previously, we were using EBX, but PIC requires the GOT to be in EBX before 
function calls via PLT GOT pointer.

llvm-svn: 161066
2012-07-31 18:29:21 +00:00
Akira Hatanaka
aab47c049b Fix type of LUXC1 and SUXC1. These instructions were incorrectly defined as
single-precision load and store.

Also avoid selecting LUXC1 and SUXC1 instructions during isel. It is incorrect
to map unaligned floating point load/store nodes to these instructions.

llvm-svn: 161063
2012-07-31 18:16:49 +00:00
Craig Topper
4f038a7a70 Make INSTRUCTION_SPECIFIER_FIELDS match X86DisassemblerCommon.h. Also remove trailing whitespace.
llvm-svn: 161029
2012-07-31 05:18:26 +00:00
Craig Topper
62ece8c8d5 Tidy up trailing whitespace
llvm-svn: 161027
2012-07-31 04:58:05 +00:00
Craig Topper
676ae779ef Tidy up trailing whitespace
llvm-svn: 161026
2012-07-31 04:38:27 +00:00
Kevin Enderby
cde92a2741 Fix a bug in ARMMachObjectWriter::RecordRelocation() in ARMMachObjectWriter.cpp
where the other_half of the movt and movw relocation entries needs to get set
and only with the 16 bits of the other half.

rdar://10038370

llvm-svn: 160978
2012-07-30 18:46:15 +00:00
Craig Topper
f374fd0e17 Mark MOVZX16/MOVSX16 as neverHasSideEffects/mayLoad
llvm-svn: 160953
2012-07-30 07:14:07 +00:00
Craig Topper
19fc5055ea Mark MOVZX32_NOREX as isCodeGenOnly and neverHasSideEffects. The isCodeGenOnly change allows special detection of _NOREX instructions to be removed from tablegen disassembler code.
llvm-svn: 160951
2012-07-30 06:48:11 +00:00
Craig Topper
80fdfb7f56 Give VCVTTPD2DQ priority over CVTTPD2DQ.
llvm-svn: 160942
2012-07-30 02:20:32 +00:00
Craig Topper
492a7af190 Fix patterns for CVTTPS2DQ to specify SSE2 instead of SSE1.
llvm-svn: 160941
2012-07-30 02:14:02 +00:00
Craig Topper
9050c15c71 Fix up patterns for VCVTSS2SD. Specifically give it priority over SSE form. Add an OptForSpeed to explicitly pair up with an OptForSize that was already on another pattern.
llvm-svn: 160939
2012-07-30 01:38:57 +00:00
Craig Topper
147248a6a0 Fix load types on intrinsic forms of SS2SD and SD2SS AVX/SSE convert instruction patterns.
llvm-svn: 160938
2012-07-29 23:26:34 +00:00
Craig Topper
293e781ba6 Move more SSE/AVX convert instruction patterns into their definitions.
llvm-svn: 160937
2012-07-29 22:30:06 +00:00
Manman Ren
ceef7c4d9b Revert r160920 and r160919 due to dragonegg and clang selfhost failure
llvm-svn: 160927
2012-07-29 02:44:09 +00:00
Craig Topper
e75418242a Fold patterns for some of the SSE/AVX convert instructions into their instruction definitions.
llvm-svn: 160922
2012-07-28 18:59:19 +00:00
Craig Topper
189349dab2 Mark some of the SSE/AVX convert instructions as mayLoad/neverHasSideEffects.
llvm-svn: 160921
2012-07-28 18:36:39 +00:00
Manman Ren
ea77f9076b X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276

llvm-svn: 160919
2012-07-28 16:48:01 +00:00
Craig Topper
3c15b4afd4 Make CVTSS2SI instruction definition consistent with CVTSD2SI.
llvm-svn: 160914
2012-07-28 08:28:23 +00:00
Craig Topper
8121932592 Fix up memory load types for SSE scalar convert intrinsic patterns.
llvm-svn: 160913
2012-07-28 07:59:59 +00:00
Manman Ren
fbc9fcdbf2 X86 Peephole: fix PR13475 in optimizeCompare.
It is possible that an instruction can use and update EFLAGS.
When checking the safety, we should check the usage of EFLAGS first before
declaring it is safe to optimize due to the update.

llvm-svn: 160912
2012-07-28 03:15:46 +00:00
Akira Hatanaka
9a1f4df56a Pass the correct call frame size to callseq_start node. This is needed to
replace uses of function getMaxCallFrameSize defined in MipsFunctionInfo with
the one MachineFrameInfo has.

llvm-svn: 160841
2012-07-26 23:27:01 +00:00
Jakob Stoklund Olesen
4b1105c720 Remove the X86 sub_ss and sub_sd sub-register indexes completely.
llvm-svn: 160833
2012-07-26 23:07:20 +00:00
Jakob Stoklund Olesen
008b36037d Remove the last mentions of sub_ss and sub_sd from patterns.
I'll remove these two sub-register indexes shortly.

llvm-svn: 160831
2012-07-26 23:03:08 +00:00
Jakob Stoklund Olesen
c1dd7a213d Eliminate sub_ss, sub_sd from broadcast patterns.
The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but
copyPhysReg does the right thing with it. (The old pattern would
eventually produce the same cross-class copy).

llvm-svn: 160830
2012-07-26 22:59:06 +00:00
Jakob Stoklund Olesen
419ccae442 Eliminate more sub_ss / sub_sd patterns.
This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns,
simplifying the emitted code a bit.

llvm-svn: 160820
2012-07-26 22:30:18 +00:00
Jakob Stoklund Olesen
0a41a45c36 Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd.
The SUBREG_TO_REG instruction has magic semantics asserting that the
source value was defined by an instruction that cleared the high half of
the register. Those semantics are never actually exploited for xmm
registers.

llvm-svn: 160818
2012-07-26 22:03:21 +00:00
Jakob Stoklund Olesen
44868927b9 Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves.  They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.

This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.

The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.

llvm-svn: 160816
2012-07-26 21:40:42 +00:00
Craig Topper
9667d599eb Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms.
llvm-svn: 160775
2012-07-26 07:48:28 +00:00
Akira Hatanaka
92df819965 Fix call setup for PIC.
Patch by Reed Kotler.

llvm-svn: 160774
2012-07-26 02:24:43 +00:00
Jim Grosbach
a7de0b586b ARM: Don't assume an SDNode is a constant.
Before accessing a node as a ConstandSDNode, make sure it actually is one.
No testcase of non-trivial size.

rdar://11948669

llvm-svn: 160735
2012-07-25 17:02:47 +00:00
Nuno Lopes
537a3395e5 make all Emit*() functions consult the TargetLibraryInfo information before creating a call to a library function.
Update all clients to pass the TLI information around.
Previous draft reviewed by Eli.

llvm-svn: 160733
2012-07-25 16:46:31 +00:00
Rafael Espindola
062cf9ccf9 Fix typos. Thanks to Matt Beaumont-Gay for noticing it.
llvm-svn: 160731
2012-07-25 15:42:45 +00:00
Rafael Espindola
8dfc815c3f When a return struct pointer is passed in registers, the called has nothing
to pop.

llvm-svn: 160725
2012-07-25 13:41:10 +00:00
Rafael Espindola
e58779ca1b Factor a long list of conditions into a predicate function. No functionality
change.

llvm-svn: 160724
2012-07-25 13:35:45 +00:00
Akira Hatanaka
f52e519898 Eliminate the stack slot used to save the global base register.
The long branch pass (fixed in r160601) no longer uses the global base register
to compute addresses of branch destinations, so it is not necessary to reserve
a slot on the stack.

llvm-svn: 160703
2012-07-25 03:16:47 +00:00
Kevin Enderby
9a7bc24c01 Fix a bug in the x86 disassembler's symbolic disassembly support for Jcc-Jump
if Condition Is Met instuctions that was not correctly determining the target
instruction.

So for a jne rel32 instruction:

% cat x.s
.byte 0x0f, 0x85, 0x09, 0x00, 0x00, 0x00
% as x.s

it was incorrectly deterining the target:

% otool -q -tv a.out 
a.out:
(__TEXT,__text) section
0000000000000000	jne	0xd

and with the fix it gets this correct as:

% otool -q -tv a.out
a.out:
(__TEXT,__text) section
0000000000000000	jne	0xf

rdar://11505997

llvm-svn: 160694
2012-07-24 21:40:01 +00:00
Nuno Lopes
6319e86c31 add a few more functions to TargetLibraryInfo:
fputc, memchr, memcmp, putchar, puts, strchr, strncmp

llvm-svn: 160690
2012-07-24 21:00:36 +00:00
David Chisnall
c21e3b3ce1 ELF does not imply GNU/Linux. Do not assume GNU conventions just because we
are targeting an ELF platform.  Only fold gs-relative (and fs-relative) loads
if it is actually sensible to do so for the target platform.

This fixes PR13438.

llvm-svn: 160687
2012-07-24 20:04:16 +00:00
Nuno Lopes
aa177bbdbc TargetLibraryInfo: add strn?cat, strn?cpy, and strn?len
llvm-svn: 160678
2012-07-24 17:25:06 +00:00
Akira Hatanaka
dba3c8d511 Fix function MipsCodeEmitter::emitExternalSymbolAddress to pass test
ExecutionEngine/test-fp.ll.
 
Patch by Petar Jovanovic.

llvm-svn: 160653
2012-07-24 00:08:26 +00:00
Akira Hatanaka
b1ba835c42 Add basic ability to setup call frame, and make procedure calls.
Hello world will compile and execute with this patch.

Patch by Reed Kotler.

llvm-svn: 160651
2012-07-23 23:45:54 +00:00
Akira Hatanaka
50170a31a8 Add comment for relocations MO_HIGHER and HIGHEST in MipsBaseInfo.h.
llvm-svn: 160636
2012-07-23 19:19:20 +00:00
Micah Villmow
759c8c46ac Test revert of test changes.
llvm-svn: 160632
2012-07-23 16:42:45 +00:00
Micah Villmow
0482f60db4 Test commit.
llvm-svn: 160631
2012-07-23 16:37:24 +00:00
Sylvestre Ledru
bf8acb65ac Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Akira Hatanaka
68623fbfd7 Fix Mips long branch pass.
This pass no longer requires that the global pointer value be saved to the
stack or register since it uses bal instruction to compute branch distance.

llvm-svn: 160601
2012-07-21 03:30:44 +00:00
Akira Hatanaka
508804a633 Add HIGHER and HIGHEST relocations to Mips backend.
llvm-svn: 160599
2012-07-21 03:09:04 +00:00
Akira Hatanaka
734a4a8569 Revert accidental commit.
llvm-svn: 160598
2012-07-21 02:20:33 +00:00
Akira Hatanaka
e87a027c19 Add VK_Mips_HIGHER and VK_Mips_HIGHEST to MCSymbolRefExpr::VariantKind.
Test case will be added later when long branch patch is checked in.

llvm-svn: 160597
2012-07-21 02:15:19 +00:00
Craig Topper
015d52b2dc Don't use implicit register operands to calculate L-bit for AVX instructions. Needed because super reg defs and kills are added as implicit operands on 128-bit instructions. Fixes PR13349. Patch by Jose Fonseca.
llvm-svn: 160543
2012-07-20 07:03:46 +00:00
Preston Gurd
6d82adeada Adds the family codes for the Midview Atom processors so that the
Atom buildbot will auto-detect Atom.

llvm-svn: 160521
2012-07-19 19:05:37 +00:00
Sebastian Pop
07b782daeb default to use -mv4 when no version of Hexagon has been specified
This fixes a bunch of make check failures of the form:

Unknown Architecture Version.
UNREACHABLE executed at ../lib/Target/Hexagon/HexagonSubtarget.cpp:60!

llvm-svn: 160518
2012-07-19 18:24:50 +00:00
Jush Lu
54c7329b88 [arm-fast-isel] Add support for vararg function calls.
llvm-svn: 160500
2012-07-19 09:49:00 +00:00
Bill Wendling
d091d1863b Remove tabs.
llvm-svn: 160483
2012-07-19 00:25:04 +00:00
Bill Wendling
0b007e009e Remove tabs.
llvm-svn: 160479
2012-07-19 00:15:11 +00:00
Bill Wendling
17b12b72bc Remove tabs.
llvm-svn: 160477
2012-07-19 00:11:40 +00:00
Bill Wendling
b1bd365dfc Remove tabs.
llvm-svn: 160476
2012-07-19 00:06:06 +00:00
Manman Ren
dd8d9c10a3 X86: remove redundant cmp against zero.
Updated OptimizeCompare in peephole to remove redundant cmp against zero.
We only remove Compare if CF and OF are not used.

rdar://11855129

llvm-svn: 160454
2012-07-18 21:40:01 +00:00
Preston Gurd
d2b344c685 This patch fixes 8 out of 20 unexpected failures in "make check"
when run on an Intel Atom processor. The failures have arisen due
to changes elsewhere in the trunk over the past 8 weeks or so.

These failures were not detected by the Atom buildbot because the
CPU on the Atom buildbot was not being detected as an Atom CPU.
The fix for this problem is in Host.cpp and X86Subtarget.cpp, but
shall remain commented out until the current set of Atom test failures
are fixed.

Patch by Andy Zhang and Tyler Nowicki!

llvm-svn: 160451
2012-07-18 20:49:17 +00:00
Andrew Trick
f21192c005 Fix ARMTargetLowering::isLegalAddImmediate to consider thumb encodings.
Based on Evan's suggestion without a commitable test.

llvm-svn: 160441
2012-07-18 18:34:27 +00:00
Andrew Trick
b611feef0c whitespace
llvm-svn: 160440
2012-07-18 18:34:24 +00:00
Nadav Rotem
03d2729392 The vbroadcast family of instructions has 'fallback patterns' in case where the
load source operand is used by multiple nodes. The v2i64 broadcast was emulated
by shuffling the two lower i32 elements to the upper two.
We had a bug in the immediate used for the broadcast.
Replacing 0 to 0x44.
0x44 means [01|00|01|00] which corresponds to the correct lane.

Patch by Michael Kuperstein.

llvm-svn: 160430
2012-07-18 08:14:48 +00:00
Jack Carter
7f725ae6fe Mips specific inline asm operand modifier 'M':
Print the high order register of a double word register operand.

In 32 bit mode, a 64 bit double word integer will be represented
by 2 32 bit registers. This modifier causes the high order register
to be used in the asm expression. It is useful if you are using 
doubles in assembler and continue to control register to variable
relationships.

This patch also fixes a related bug in a previous patch:

    case 'D': // Second part of a double word register operand
    case 'L': // Low order register of a double word register operand
    case 'M': // High order register of a double word register operand

I got 'D' and 'M' confused. The second part of a double word operand
will only match 'M' for one of the endianesses. I had 'L' and 'D'
be the opposite twins when 'L' and 'M' are.

llvm-svn: 160429
2012-07-18 06:41:36 +00:00
Craig Topper
6150f43b28 Remove tab characters.
llvm-svn: 160425
2012-07-18 04:59:16 +00:00
Craig Topper
b086c8faf2 Fix typo in error message and remove some tab characters.
llvm-svn: 160423
2012-07-18 04:36:35 +00:00
Craig Topper
b144f3b6db Make x86 asm parser to check for xmm vs ymm for index register in gather instructions. Also fix Intel syntax for gather instructions to use 'DWORD PTR' or 'QWORD PTR' to match gas.
llvm-svn: 160420
2012-07-18 04:11:12 +00:00
Joel Jones
4ce75efda5 More replacing of target-dependent intrinsics with target-indepdent
intrinsics.  The second instruction(s) to be handled are the vector versions 
of count set bits (ctpop).

The changes here are to clang so that it generates a target independent 
vector ctpop when it sees an ARM dependent vector bits set count.  The changes 
in llvm are to match the target independent vector ctpop and in 
VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM 
dependent vector pop counts with target-independent ctpops.  There are also 
changes to an existing test case in llvm for ARM vector count instructions and 
to a test for the bitcode upgrade.

<rdar://problem/11892519>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>

llvm-svn: 160410
2012-07-18 00:02:16 +00:00
Akira Hatanaka
25d4c684e9 Clean up Mips16InstrFormats.td and Mips16InstrInfo.td.
Patch by Reed Kotler.

llvm-svn: 160403
2012-07-17 22:55:34 +00:00
Evan Cheng
5e82ad04d5 Back out r160101 and instead implement a dag combine to recover from instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
Evan Cheng
f84dd0cf40 Implement r160312 as target indepedenet dag combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng
0b6bcb6e06 This is another case where instcombine demanded bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediates doesn't fit in cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that's immediates which are not a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
Evan Cheng
b409a61574 For something like
uint32_t hi(uint64_t res)
{
        uint_32t hi = res >> 32;
        return !hi;
}

llvm IR looks like this:
define i32 @hi(i64 %res) nounwind uwtable ssp {
entry:
  %lnot = icmp ult i64 %res, 4294967296
  %lnot.ext = zext i1 %lnot to i32
  ret i32 %lnot.ext
}

The optimizer has optimize away the right shift and truncate but the resulting
constant is too large to fit in the 32-bit immediate field. The resulting x86
code is worse as a result:
        movabsq $4294967296, %rax       ## imm = 0x100000000
        cmpq    %rax, %rdi
        sbbl    %eax, %eax
        andl    $1, %eax

This patch teaches the x86 lowering code to handle ult against a large immediate
with trailing zeros. It will issue a right shift and a truncate followed by
a comparison against a shifted immediate.
        shrq    $32, %rdi
        testl   %edi, %edi
        sete    %al
        movzbl  %al, %eax

It also handles a ugt comparison against a large immediate with trailing bits
set. i.e. X >  0x0ffffffff -> (X >> 32) >= 1

rdar://11866926

llvm-svn: 160312
2012-07-16 19:35:43 +00:00
Tom Stellard
39f7e52397 Revert "AMDGPU: Add core backend files for R600/SI codegen v6"
This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea.

llvm-svn: 160303
2012-07-16 18:19:53 +00:00