Now it will factor things like this:
CheckType i32
...
CheckOpcode ISD::AND
CheckType i64
...
into:
SwitchType:
i32: ...
i64:
CheckOpcode ISD::AND
...
This shrinks hte table by a few bytes, nothing spectacular.
llvm-svn: 97908
IF(condition(value)):
If the value satisfies the condition, the line is processed by lit; otherwise
it is skipped. A test with no unignored directives is resolved as Unsupported.
The test suite is responsible for defining conditions; conditions are unary
functions over strings. I've defined two conditions in the LLVM test suite,
TARGET (with values like those in TARGETS_TO_BUILD) and BINDING (with values
like those in llvm_bindings). So for example you can write:
IF(BINDING(ocaml)): RUN: %blah %s -o -
and the RUN line will only execute if LLVM was configured with the ocaml
bindings.
llvm-svn: 97726
sequence, just emit instruction predicates right before them. This
exposes yet more factoring opportunitites, shrinking the X86 table
to 79144 bytes.
llvm-svn: 97704
as the very last thing before node emission. This should
dramatically reduce the number of times we do 'MatchAddress'
on X86, speeding up compile time. This also improves comments
in the tables and shrinks the table a bit, now down to
80506 bytes for x86.
llvm-svn: 97703
SwitchOpcodeMatcher) and have DAGISelMatcherOpt form it. This
speeds up selection, particularly for X86 which has lots of
variants of instructions with only type differences.
llvm-svn: 97645
stuff now that we don't care about emulating the old broken
behavior of the old isel. This eliminates the
'CheckChainCompatible' check (along with IsChainCompatible) which
did an incorrect and inefficient scan *up* the chain nodes which
happened as the pattern was being formed and does the validation
at the end in HandleMergeInputChains when it forms a structural
pattern. This scans "down" the graph, which means that it is
quickly bounded by nodes already selected. This also handles
token factors that get "trapped" in the dag.
Removing the CheckChainCompatible nodes also shrinks the
generated tables by about 6K for X86 (down to 83K).
There are two pieces remaining before I can nuke PreprocessRMW:
1. I xfailed a test because we're now producing worse code in a
case that has nothing to do with the change: it turns out that
our use of MorphNodeTo will leave dead nodes in the graph
which (depending on how the graph is walked) end up causing
bogus uses of chains and blocking matches. This is really
bad for other reasons, so I'll fix this in a follow-up patch.
2. CheckFoldableChainNode needs to be improved to handle the TF.
llvm-svn: 97539
EmitMergeInputChainsMatcher node up into EmitResultCode. This
doesn't have much of an effect on the generated code, the X86
table is exactly the same size.
llvm-svn: 97514
ordered correctly. Previously it would get in trouble when
two patterns were too similar and give them nondet ordering.
We force this by using the record ID order as a fallback.
The testsuite diff is due to alpha patterns being ordered
slightly differently, the change is a semantic noop afaict:
< lda $0,-100($16)
---
> subq $16,100,$0
llvm-svn: 97509
This allows formation of OpcodeSwitch for top level patterns, in
particular on X86. This saves about 1K of data space in the x86
table and makes the dispatch much more efficient.
llvm-svn: 97440
ComplexPattern at the root be generated multiple times, once
for each opcode they are part of. This encourages factoring
because the opcode checks get treated just like everything
else in the matcher.
llvm-svn: 97439
to a scope where every child starts with a CheckOpcode, but
executes more efficiently. Enhance DAGISelMatcherOpt to
form it.
This also fixes a bug in CheckOpcode: apparently the SDNodeInfo
objects are not pointer comparable, we have to compare the
enum name.
llvm-svn: 97438
so that we get grouping at the top level.
Add an optimization to reorder type check & record nodes
after opcode checks. We prefer to expose tree shape
matching which improves grouping and will enhance the next
optimization.
llvm-svn: 97432
dispatcher method. This eliminates the dependence of the new isel's
generated code on the old isel's predicates, however some random
hand written isel code still uses them.
llvm-svn: 97431
specifies whether there is an output flag or not. Use this
instead of redundantly encoding the chain/flag results in the
output vtlist.
llvm-svn: 97419
even some the old isel didn't. There are several parts of
this that make me feel dirty, but it's no worse than the
old isel. I'll clean up the parts I can do without ripping
out the old one next.
llvm-svn: 97415
node is always guaranteed to have a particular type
instead of hacking in ISD::STORE explicitly. This allows
us to use implied types for a broad range of nodes, even
target specific ones.
llvm-svn: 97355
with getType() == MVT::i32 etc. Teach it that two different
integer constants are contradictory. This cuts 1K off the X86
table, down to 98k
llvm-svn: 97314
predicates. For example if we have:
Scope:
CheckType i32
ABC
CheckType f32
DEF
CheckType i32
GHI
Then we know that we can transform this into:
Scope:
CheckType i32
Scope
ABC
GHI
CheckType f32
DEF
This reorders the check for the 'GHI' predicate above
the check for the 'DEF' predidate. However it is safe to do this
in this situation because we know that a node cannot have both an
i32 and f32 type.
We're now doing more factoring that the old isel did.
llvm-svn: 97312
as deeply into the pattern as we can get away with. In pratice, this
means "all the way to to the emitter code, but not across
ComplexPatterns". This substantially increases the amount of factoring
we get.
llvm-svn: 97305
longer than 80 columns. This replaces the heavy-handed "textwidth"
mechanism, and makes the trailing-whitespace highlighting lazy so
that it isn't constantly jumping on the user during typing.
llvm-svn: 97267
gross little neighbor merging implementation. This one has
the benefit of not violating the ordering of patterns, so it
generates code that passes tests again.
llvm-svn: 97218
current design. This generates a matcher that successfully
runs, but it turns out that the factoring we're doing violates
the ordering of patterns, so we end up matching (e.g.) movups
where we want movaps. This won't due, but I'll address this in
a follow on patch. It's nice to not be on by default yet! :)
llvm-svn: 97215
instead of to have a chained series of scope nodes. This makes
the generated table smaller, improves the efficiency of the
interpreter, and make the factoring optimization much more
reasonable to implement.
llvm-svn: 97160
splitting all the patterns under scope nodes into equality sets
based on their first node. The second step is to rewrite the
graph info a form that exposes the sharing. Before I do this,
I want to redesign the Scope node.
llvm-svn: 97130
movechild/record -> recordchild/movechild and
movechild/moveparent -> noop xforms. This slightly shrinks the tables
(x86 to 117454) and enables adding future improvements.
llvm-svn: 97051
the old one around for comparative purposes: have the
ENABLE_NEW_ISEL #define (which is not enabled on mainline) stop
emitting the old isel at all, yay for build time win.
llvm-svn: 97033
the new isel: fold movechild+record+moveparent into a
single recordchild N node. This shrinks the X86 table
from 125443 to 117502 bytes.
llvm-svn: 97031
internal nodes with flag results. Record these with a new
OPC_MarkFlagResults opcode and use this to update the interior
nodes' flag results properly. This fixes CodeGen/X86/i256-add.ll
with the new isel.
llvm-svn: 97021
Needed to correctly handle things like 'llvmc -framework Foo foo.o -framework
Bar bar.o' - before this commit all '-framework' options would've been grouped
together in the beginning.
Due to our dependence on CommandLine this turned out to be a giant hack; we will
migrate away from CommandLine eventually.
llvm-svn: 96922
input/output patterns have the same type. It turns out that
this triggers all the time because we don't infer types
between these boundaries. Until we do, don't turn this on.
llvm-svn: 96905
ridiculously ginormous patterns and need more than one byte
of displacement for encodings. This fixes CellSPU/fdiv.ll.
SPU is still doing something else ridiculous though.
llvm-svn: 96833
well as the operands produced when the pattern is matched. This
allows CheckSame to work correctly when matching replicated
names involving ComplexPatterns. This fixes a bunch of MSP430
failures, we're down to 13 failures, two of which are
due to a sched bug.
llvm-svn: 96824
sure to only run the complex pattern on nodes where the target opts in.
This patch only handles targets with one opcode specified so far, but
fixes 16 failures, only 34 left.
llvm-svn: 96813
result nodes correctly. Note that this includes a horrible hack
in DAGISelHeader which cannot be fixed reasonably without
eliminating (parallel) from input patterns. That, in turn,
can't be done until we support writing multiple result patterns
for the X86and_flag and related multiple-result nodes.
llvm-svn: 96767
the point where it is to the 95% feature complete mark, it just
needs result updating to be done (then testing, optimization
etc).
More specificallly, this adds support for chain and flag handling
on the result nodes, support for sdnodexforms, support for variadic
nodes, memrefs, pinned physreg inputs, and probably lots of other
stuff.
In the old DAGISelEmitter, this deletes the dead code related to
OperatorMap, cleans up a variety of dead stuff handling "implicit
remapping" from things like globaladdr -> targetglobaladdr (which
is no longer used because globaladdr always needs to be legalized),
and some minor formatting fixes.
llvm-svn: 96716
'ischaincompatible' when a pattern has more than one input chain. Need
to do some commenting and cleanup now that I understand how this works.
llvm-svn: 96443
into a roundss intrinsic, producing a cyclic dag. The root cause
of this is badness handling ComplexPattern nodes in the old dagisel
that I noticed through inspection. Eliminate a copy of the of the
code that handled ComplexPatterns by making EmitChildMatchCode call
into EmitMatchCode.
llvm-svn: 96408
use and only call IsProfitableToFold/IsLegalToFold on the load
being folded, like the old dagiselemitter does. This
substantially simplifies the code and improves opportunities for
sharing.
llvm-svn: 96368
with chains. On interior nodes that lead up to them, we just directly
check that there is a single use. This generates slightly more
efficient code.
llvm-svn: 96366
IsLegalToFold and IsProfitableToFold. The generic version of the later simply checks whether the folding candidate has a single use.
This allows the target isel routines more flexibility in deciding whether folding makes sense. The specific case we are interested in is folding constant pool loads with multiple uses.
llvm-svn: 96255
produce a table based matcher instead of gobs of C++ Code.
Though it's not done yet, the shrinkage seems promising,
the table for the X86 ISel is 75K and still has a lot of
optimization to come (compare to the ~1.5M of .o generated
the old way, much of which will go away).
The code is currently disabled by default (the #if 0 in
DAGISelEmitter.cpp). When enabled it generates a dead
SelectCode2 function in the DAGISel Header which will
eventually replace SelectCode.
There is still a lot of stuff left to do, which are
documented with a trail of FIXMEs.
llvm-svn: 96215
that predated -fast-isel which attempted to speed up the dag pattern
matchers at -O0. Since fast-isel is around, this is basically
obsolete and removing it shrinks the generated dag isels.
llvm-svn: 96188
whose opcodes extend into the ModR/M field using the
Form field of the instruction rather than by special
casing each instruction. Commented out the special
casing of VMCALL, which is the first instruction to use
this special form. While I was in the neighborhood,
added a few comments for people modifying the Intel
disassembler.
llvm-svn: 96043
matcher is now free of implicit operands!
- Still need to clean up the code now that we don't to worry about implicit
operands, and to make it a hard error if an instruction fails to specify all
of its operands for some reason.
llvm-svn: 95956
for representing constraint info semantically instead of
as a c expression that will be blatted out to the .inc
file. Fix X86RecognizableInstr to use this instead of
parsing C code :).
llvm-svn: 95753
out of the AsmWriterEmitter. This patch does the physical
code movement, but leaves the implementation unchanged. I'll
make any changes necessary to generalize the code in a
separate patch.
llvm-svn: 95697
into TargetOpcodes.h. #include the new TargetOpcodes.h
into MachineInstr. Add new inline accessors (like isPHI())
to MachineInstr, and start using them throughout the
codebase.
llvm-svn: 95687
This time it's for real! I am going to hook this up in the frontends as well.
The inliner has some experimental heuristics for dealing with the inline hint.
When given a -respect-inlinehint option, functions marked with the inline
keyword are given a threshold just above the default for -O3.
We need some experiments to determine if that is the right thing to do.
llvm-svn: 95466
than DEBUG_VALUE :( ) into the target indep AsmPrinter.cpp
file. This allows elimination of the
NO_ASM_WRITER_BOILERPLATE hack among other things.
llvm-svn: 95177
is still deterministic even amongst ambiguous instructions (eventually ambiguous
match orders will be a hard error, but we aren't there yet).
llvm-svn: 95157
Before:
<stdin>:94:1: note: possible intended match here
movsd 4096(%rsi), %xmm0
^
After:
<stdin>:94:2: note: possible intended match here
movsd 4096(%rsi), %xmm0
^
llvm-svn: 94847
be static. Also made it possible for clients to get it
and no other functions from ...GenAsmMatcher.inc by
defining REGISTERS_ONLY before including GenAsmMatcher.inc.
This sets the stage for target-specific lexers that can
identify registers and return AsmToken::Register as
appropriate.
llvm-svn: 94266
directory when building the llvmCore_Embedded project. Fix this by putting
the iPhone platform directory into DEST_DIR instead of DEST_ROOT. I also
noticed what appears to be an unintentional use of DEVELOPER_BIN instead of
DEVELOPER_DIR, so I fixed that and changed to use DEVELOPER_DIR in some places
that were hardcoded to "Developer". Finally, the other changes here allowed
some refactoring and simplification, which I have done.
llvm-svn: 93878
the new ParseInstruction method just parses and returns a list of
target operands. A new MatchInstruction interface is used to
turn the operand list into an MCInst.
This requires new/deleting all the operands, but it also gives
targets the ability to use polymorphic operands if they want to.
llvm-svn: 93469
- getToken is modeled after StringRef::split but it can split on multiple
separator chars and skips leading seperators.
- SplitString is a StringRef::split variant for more than 2 elements with the
same behaviour as getToken.
llvm-svn: 93161
memcpy, memset and other intrinsics that only access their arguments
to be readnone if the intrinsic's arguments all point to local memory.
This improves the testcase in the README to readonly, but it could in
theory be made readnone, however this would involve more sophisticated
analysis that looks through the memcpy.
llvm-svn: 92829
clear what information these functions are actually using.
This is also a micro-optimization, as passing a SDNode * around is
simpler than passing a { SDNode *, int } by value or reference.
llvm-svn: 92564
incarnations), integrated into the MC framework.
The disassembler is table-driven, using a custom TableGen backend to
generate hierarchical tables optimized for fast decode. The disassembler
consumes MemoryObjects and produces arrays of MCInsts, adhering to the
abstract base class MCDisassembler (llvm/MC/MCDisassembler.h).
The disassembler is documented in detail in
- lib/Target/X86/Disassembler/X86Disassembler.cpp (disassembler runtime)
- utils/TableGen/DisassemblerEmitter.cpp (table emitter)
You can test the disassembler by running llvm-mc -disassemble for i386
or x86_64 targets. Please let me know if you encounter any problems
with it.
llvm-svn: 91749
Checks that the code generated by 'tblgen --emit-llvmc' can be actually
compiled. Also fixes two bugs found in this way:
- forward_transformed_value didn't work with non-list arguments
- cl::ZeroOrOne is now called cl::Optional
llvm-svn: 91404
Note that "hasDotLocAndDotFile"-style debug info was already broken;
people wanting this functionality should implement it in the
AsmPrinter/DwarfWriter code.
llvm-svn: 89711
good nearby fuzzy match. Frequently the input is nearly correct, and just
showing the user the a nearby sensible match is enough to diagnose the problem.
- The "fuzzyness" is pretty simple and arbitrary, but worked on my three test
cases. If you encounter problems, or places you think FileCheck should have
guessed but didn't, please add test cases to PR5239.
For example, previously FileCheck would report this:
--
t.cpp:21:55: error: expected string not found in input
// CHECK: define void @_Z2f25f2_s1([[i64_i64_ty]] %a0)
^
<stdin>:19:30: note: scanning from here
define void @_Z2f15f1_s1(%1) nounwind {
^
<stdin>:19:30: note: with variable "i64_i64_ty" equal to "%0"
--
and now it also reports this:
--
<stdin>:27:1: note: possible intended match here
define void @_Z2f25f2_s1(%0) nounwind {
^
--
which makes it clear that the CHECK just has an extra ' %a0' in it, without
having to check the input.
llvm-svn: 89631
values, resolving references to them, and then removing the definitions.
If a template argument is set to an undefined value, we need to resolve
references to that argument to an explicit undefined value. The current code
leaves the reference to the template argument as it is, which causes an
assertion failure later when the definition of the template argument is
removed.
llvm-svn: 89581