large nested types over and over again to determine if they are sized or not.
Now, isSized() is able to make snap decisions about all concrete types, which
are a common occurance (and includes all primitives).
On 177.mesa, this speeds up DSE from 39.5s -> 21.3s and GCSE from
13.2s -> 11.3s, reducing gccas time from 80s -> 61s (this is a debug build).
DSE and GCSE are still too slow on this testcase, but this is a simple
improvement.
llvm-svn: 19800
all. This should speed up the X86 backend fairly significantly on integer
codes. Now if only we didn't have to compute livevar still... ;-)
llvm-svn: 19796
doesn't support certain directives and symbols on cygwin are prefixed with
an underscore. This patch makes the necessary adjustments to the output.
llvm-svn: 19775
differences, which means that identical instructions (after stripping off
the first literal string) do not run any different code at all. On the X86,
this turns this code:
switch (MI->getOpcode()) {
case X86::ADC32mi: printOperand(MI, 4, MVT::i32); break;
case X86::ADC32mi8: printOperand(MI, 4, MVT::i8); break;
case X86::ADC32mr: printOperand(MI, 4, MVT::i32); break;
case X86::ADD32mi: printOperand(MI, 4, MVT::i32); break;
case X86::ADD32mi8: printOperand(MI, 4, MVT::i8); break;
case X86::ADD32mr: printOperand(MI, 4, MVT::i32); break;
case X86::AND32mi: printOperand(MI, 4, MVT::i32); break;
case X86::AND32mi8: printOperand(MI, 4, MVT::i8); break;
case X86::AND32mr: printOperand(MI, 4, MVT::i32); break;
case X86::CMP32mi: printOperand(MI, 4, MVT::i32); break;
case X86::CMP32mr: printOperand(MI, 4, MVT::i32); break;
case X86::MOV32mi: printOperand(MI, 4, MVT::i32); break;
case X86::MOV32mr: printOperand(MI, 4, MVT::i32); break;
case X86::OR32mi: printOperand(MI, 4, MVT::i32); break;
case X86::OR32mi8: printOperand(MI, 4, MVT::i8); break;
case X86::OR32mr: printOperand(MI, 4, MVT::i32); break;
case X86::ROL32mi: printOperand(MI, 4, MVT::i8); break;
case X86::ROR32mi: printOperand(MI, 4, MVT::i8); break;
case X86::SAR32mi: printOperand(MI, 4, MVT::i8); break;
case X86::SBB32mi: printOperand(MI, 4, MVT::i32); break;
case X86::SBB32mi8: printOperand(MI, 4, MVT::i8); break;
case X86::SBB32mr: printOperand(MI, 4, MVT::i32); break;
case X86::SHL32mi: printOperand(MI, 4, MVT::i8); break;
case X86::SHLD32mrCL: printOperand(MI, 4, MVT::i32); break;
case X86::SHR32mi: printOperand(MI, 4, MVT::i8); break;
case X86::SHRD32mrCL: printOperand(MI, 4, MVT::i32); break;
case X86::SUB32mi: printOperand(MI, 4, MVT::i32); break;
case X86::SUB32mi8: printOperand(MI, 4, MVT::i8); break;
case X86::SUB32mr: printOperand(MI, 4, MVT::i32); break;
case X86::TEST32mi: printOperand(MI, 4, MVT::i32); break;
case X86::TEST32mr: printOperand(MI, 4, MVT::i32); break;
case X86::TEST8mi: printOperand(MI, 4, MVT::i8); break;
case X86::XCHG32mr: printOperand(MI, 4, MVT::i32); break;
case X86::XOR32mi: printOperand(MI, 4, MVT::i32); break;
case X86::XOR32mi8: printOperand(MI, 4, MVT::i8); break;
case X86::XOR32mr: printOperand(MI, 4, MVT::i32); break;
}
into this:
switch (MI->getOpcode()) {
case X86::ADC32mi:
case X86::ADC32mr:
case X86::ADD32mi:
case X86::ADD32mr:
case X86::AND32mi:
case X86::AND32mr:
case X86::CMP32mi:
case X86::CMP32mr:
case X86::MOV32mi:
case X86::MOV32mr:
case X86::OR32mi:
case X86::OR32mr:
case X86::SBB32mi:
case X86::SBB32mr:
case X86::SHLD32mrCL:
case X86::SHRD32mrCL:
case X86::SUB32mi:
case X86::SUB32mr:
case X86::TEST32mi:
case X86::TEST32mr:
case X86::XCHG32mr:
case X86::XOR32mi:
case X86::XOR32mr: printOperand(MI, 4, MVT::i32); break;
case X86::ADC32mi8:
case X86::ADD32mi8:
case X86::AND32mi8:
case X86::OR32mi8:
case X86::ROL32mi:
case X86::ROR32mi:
case X86::SAR32mi:
case X86::SBB32mi8:
case X86::SHL32mi:
case X86::SHR32mi:
case X86::SUB32mi8:
case X86::TEST8mi:
case X86::XOR32mi8: printOperand(MI, 4, MVT::i8); break;
}
After this, the generated asmwriters look pretty much as though they were
generated by hand. This shrinks the X86 asmwriter.inc files from 55101->39669
and 55429->39551 bytes each, and PPC from 16766->12859 bytes.
llvm-svn: 19760
strings starts out with a constant string, we emit the string first, using
a table lookup (instead of a switch statement).
Because this is usually the opcode portion of the asm string, the differences
between the instructions have now been greatly reduced. This allows many
more case statements to be grouped together.
This patch also allows instruction cases to be grouped together when the
instruction patterns are exactly identical (common after the opcode string
has been ripped off), and when the differing operand is a MachineInstr
operand that needs to be formatted.
The end result of this is a mean and lean generated AsmPrinter!
llvm-svn: 19759
emitting code like this:
case PPC::ADD: O << "add "; printOperand(MI, 0, MVT::i64); O << ", "; prin
tOperand(MI, 1, MVT::i64); O << ", "; printOperand(MI, 2, MVT::i64); O << '\n
'; break;
case PPC::ADDC: O << "addc "; printOperand(MI, 0, MVT::i64); O << ", "; pr
intOperand(MI, 1, MVT::i64); O << ", "; printOperand(MI, 2, MVT::i64); O << '
\n'; break;
case PPC::ADDE: O << "adde "; printOperand(MI, 0, MVT::i64); O << ", "; pr
intOperand(MI, 1, MVT::i64); O << ", "; printOperand(MI, 2, MVT::i64); O << '
\n'; break;
...
Emit code like this:
case PPC::ADD:
case PPC::ADDC:
case PPC::ADDE:
...
switch (MI->getOpcode()) {
case PPC::ADD: O << "add "; break;
case PPC::ADDC: O << "addc "; break;
case PPC::ADDE: O << "adde "; break;
...
}
printOperand(MI, 0, MVT::i64);
O << ", ";
printOperand(MI, 1, MVT::i64);
O << ", ";
printOperand(MI, 2, MVT::i64);
O << "\n";
break;
This shrinks the PPC asm writer from 24785->15205 bytes (even though the new
asmwriter has much more whitespace than the old one), and the X86 printers shrink
quite a bit too. The important implication of this is that GCC no longer hits swap
when building the PPC backend in optimized mode. Thus this fixes PR448.
-Chris
llvm-svn: 19755