the rough idea sketched out in http://nondot.org/sabre/LLVMNotes/CallGraphClass.txt,
allowing new spiffy implementations of the callgraph interface to be built.
Many thanks to Saem Ghani for contributing this!
llvm-svn: 24944
dependent portion of the lib/Support/SlowOperationTimer code into the
lib/System implementation where it can be ported to different platforms.
llvm-svn: 24937
the module being constructed. This is used to correctly name the module.
Previously the name of the linker tool was used which produces confusing
output when the module identifier is used in an error message.
llvm-svn: 24699
matching code that is not currently auto-generated by tblgen, e.g. X86
addressing mode. Selection routines for complex patterns can return multiple operands, e.g. X86 addressing mode returns 4.
llvm-svn: 24634
def SHL8rCL : I<0xD2, MRM4r, (ops R8 :$dst, R8 :$src),
"shl{b} {%cl, $dst|$dst, %CL}",
[(set R8:$dst, (shl R8:$src, CL))]>, Imp<[CL],[]>;
This generates a CopyToReg operand and added its 2nd result to the shl as
a flag operand.
llvm-svn: 24557
changes allow us to generate the following code:
_foo:
li r2, 0
lvx v0, r2, r3
vaddfp v0, v0, v0
stvx v0, r2, r3
blr
for this llvm:
void %foo(<4 x float>* %a) {
entry:
%tmp1 = load <4 x float>* %a
%tmp2 = add <4 x float> %tmp1, %tmp1
store <4 x float> %tmp2, <4 x float>* %a
ret void
}
llvm-svn: 24534
file to become corrupted due to interactions between mmap'd memory segments
and file descriptors closing. The problem is completely avoiding by using
a third temporary file.
Patch provided by Evan Jones
llvm-svn: 24527
The code is organized into 3 parts (2 passes)
1) a linked set of profiling passes, which implement an analysis group (linked, like alias analysis are). These insert profiling into the program, and remember what they inserted, so that at a later time they can be queried about any instruction.
2) a pass that handles inserting the random sampling framework. This also has options to control how random samples are choosen. Currently implemented are Global counters, register allocated global counters, and read cycle counter (see? there was a reason for it).
The profiling passes are almost identical to the existing ones (block, function, and null profiling is supported right now), and they are valid passes without the sampling framework (hence the existing passes can be unified with the new ones, not done yet).
Some things are a bit ugly still, but that should be fixed up soon enough.
Other todo? making the counter values not "magic 2^16 -1" values, but dynamically choosable.
llvm-svn: 24493
packed types with an element count of 1, although more generic support is
coming. This allows LLVM to turn the following code:
void %foo(<1 x float> * %a) {
entry:
%tmp1 = load <1 x float> * %a;
%tmp2 = add <1 x float> %tmp1, %tmp1
store <1 x float> %tmp2, <1 x float> *%a
ret void
}
Into:
_foo:
lfs f0, 0(r3)
fadds f0, f0, f0
stfs f0, 0(r3)
blr
llvm-svn: 24416
this and have it in about the same form, I think this makes sense.
on X86, you do a RDTSC (64bit result, from any ring since the P5MMX)
on Alpha, you do a RDCC
on PPC, there is a sequence which may or may not work depending on how things
are setup by the OS. Or something like that. Maybe someone who knows PPC
can add support. Something about the time base register.
on Sparc, you read %tick, which in some solaris versions (>=8) is readable by
userspace
on IA64 read ar.itc
So I think the ulong is justified since all of those are 64bit.
Support is slighly flaky on old chips (P5 and lower) and sometimes
depends on OS (PPC, Sparc). But for modern OS/Hardware (aka this decade),
we should be ok.
I am still not sure what to do about lowering. I can either see a lower to 0, to
gettimeofday (or the target os equivalent), or loudly complaining and refusing to
continue.
I am commiting an Alpha implementation. I will add the X86 implementation if I
have to (I have use of it in the near future), but if someone who knows that
backend (and the funky multi-register results) better wants to add it, it would
take them a lot less time ;)
TODO: better lowering and legalizing, and support more platforms
llvm-svn: 24299
names. This also changes the default to allow all of "$_." in addition
to letters and numbers as symbol names. If you don't want this, use
markCharUnacceptable to remove one of these or markCharAcceptable to add
to the set. This corresponds with what GAS accepts by default.
llvm-svn: 24291
doesn't support .asciz, just set AscizDirective to null in your asmprinter.
This compiles C strings to:
l1__2E_str_1: ; '.str_1'
.asciz "foo"
instead of:
l1__2E_str_1: ; '.str_1'
.ascii "foo\000"
llvm-svn: 24271
This eliminates the vector, allows constant time removal of a node from
a graph, and makes iteration over the all nodes list stable when adding
nodes to the graph.
llvm-svn: 24262
allocated. Further, in the common case where a node has a single value, just
reference an element from a small array. This is a small compile-time wi.
llvm-svn: 24250
This saves 12 bytes from SDNode, but doesn't speed things up substantially
(our graphs apparently already fit within the cache on my g5). In any case
this reduces memory usage.
llvm-svn: 24248
alignment information appropriately. Includes code for PowerPC to support
fixed-size allocas with alignment larger than the stack. Support for
arbitrarily aligned dynamic allocas coming soon.
llvm-svn: 24224
Add support for specifying alignment and size of setjmp jmpbufs.
No targets currently do anything with this information, nor is it presrved
in the bytecode representation. That's coming up next.
llvm-svn: 24196
Remove Function::aiterator and Module::giterator typedefs (and const versions)
as they should have been removed when abegin/gbegin were removed. Thanks to
alkis for bringing this to my attn.
llvm-svn: 23978
pointer marking the end of the list, the zero *must* be cast to the pointer
type. An un-cast zero is a 32-bit int, and at least on x86_64, gcc will
not extend the zero to 64 bits, thus allowing the upper 32 bits to be
random junk.
The new END_WITH_NULL macro may be used to annotate a such a function
so that GCC (version 4 or newer) will detect the use of un-casted zero
at compile time.
llvm-svn: 23888
Add a new flag to TargetLowering indicating if the target has really cheap
signed division by powers of two, make ppc use it. This will probably go
away in the future.
Implement some more ISD::SDIV folds in the dag combiner
Remove now dead code in the x86 backend.
llvm-svn: 23853
allows us to lower legal return types to something else, to meet ABI
requirements (such as that i64 be returned in two i32 regs on Darwin/ppc).
llvm-svn: 23802
a different enum value. This allows 'classof' for these to be really simple,
not needing to call getType() anymore.
This speeds up isa/dyncast/etc for constants, and also makes them smaller.
For example, the text section of a release build of InstCombine.cpp shrinks
from 230037 bytes to 216363 bytes, a 6% reduction.
llvm-svn: 23466
this, it is a requirement on PPC, which can have an f32 value in r3 at one
point in a function and a f64 value in r3 at another point. :(
llvm-svn: 23160
putting it into the constant pool. This allows the isel machinery to
create constants that it will end up deciding are not needed, without them
ending up in the resultant function constant pool.
llvm-svn: 23081
These patches make threading optional in LLVM. The configuration scripts are now
modified to accept a --disable-threads switch. If this is used, the Mutex class
will be implemented with all functions as no-op. Furthermore, linking against
libpthread will not be done. Finally, the ParallelJIT example needs libpthread
so its makefile was changed to always add -lpthread to the link line.
llvm-svn: 23003
binary search to test for membership. This speeds up LLC a bit more on KC++,
e.g. on itanium from 16.6974s to 14.8272s, PPC from 11.4926s to 10.7089s and
X86 from 10.8128s to 9.7943s, with no difference in generated code (like all
of the RA patches).
With these changes, isel is the slowest pass for PPC/X86, but linscan+live
intervals is still > 50% of the compile time for itanium. More work could
be done, but this is the last for now.
llvm-svn: 22993
rearrange some of the accessors to be more efficient.
This makes it much more efficient to iterate over all of the things with the
same value. This speeds up liveintervals analysis from 8.63s to 3.79s with
a release build of llc on kc++ with -march=ia64. This also speeds up live
var from 1.66s -> 0.87s as well, reducing total llc time from 20.1s->15.2s.
This also speeds up other targets slightly, e.g. llc time on X86 from 16.84
-> 16.45s, and PPC from 17.64->17.03s.
llvm-svn: 22990
is greater than the range of building selection dag node types, and
getTargetOpcode(), which returns the node opcode less the value of
isd::builtin_op_end, which specifies the end of the builtin types.
llvm-svn: 22844
used to tack a register number onto the node.
Instead of doing this, make a new node, RegisterSDNode, which is a leaf
containing a register number. These three operations just become normal
DAG nodes now, instead of requiring special handling.
Note that with this change, it is no longer correct to make illegal
CopyFromReg/CopyToReg nodes. The legalizer will not touch them, and this
is bad, so don't do it. :)
llvm-svn: 22806
1. move assertions for node creation to getNode()
2. legalize the values returned in ExpandOp immediately
3. Move select_cc optimizations from SELECT's getNode() to SELECT_CC's,
allowing them to be cleaned up significantly.
This paves the way to pick up additional optimizations on SELECT_CC, such
as sum-of-absolute-differences.
llvm-svn: 22757
CC out of the SetCC operation, making SETCC a standard ternary operation and
CC's a standard DAG leaf. This will make it possible for other node to use
CC's as operands in the future...
llvm-svn: 22728
BasicBlock's removePredecessor routine. This requires shuffling around
the definition and implementation of hasContantValue from Utils.h,cpp into
Instructions.h,cpp
llvm-svn: 22664
near the GOT, which new doesn't do. So break out the allocate into a new function.
Also move GOT index handling into JITResolver. This lets it update the mapping when a Lazy
function is JITed. It doesn't managed the table, just the mapping. Note that this is
still non-ideal, as any function that takes a function address should also take a GOT
index, but that is a lot of changes. The relocation resolve process updates any GOT entry
it sees is out of date.
llvm-svn: 22537
This is the first incremental patch to implement this feature. It adds no
functionality to LLVM but setup up the information needed from targets in
order to implement the optimization correctly. Each target needs to specify
the maximum number of store operations for conversion of the llvm.memset,
llvm.memcpy, and llvm.memmove intrinsics into a sequence of store operations.
The limit needs to be chosen at the threshold of performance for such an
optimization (generally smallish). The target also needs to specify whether
the target can support unaligned stores for multi-byte store operations.
This helps ensure the optimization doesn't generate code that will trap on
an alignment errors.
More patches to follow.
llvm-svn: 22468
vector that represents the .o file at once, build up a vector for each
section of the .o file. This is needed because the .o file writer needs
to be able to switch between sections as it emits them (e.g. switch
between the .text section and the .rel section when emitting code).
This patch has no functionality change.
llvm-svn: 22453
GRAPHVIZ will contain the path to the program if its found (or "echo Graphviz"
if not) and the #define HAVE_GRAPHVIZ will be defined if its found.
llvm-svn: 22424
This patch completes the changes for making lli thread-safe. Here's the list
of changes:
* The Support/ThreadSupport* files were removed and replaced with the
MutexGuard.h file since all ThreadSupport* declared was a Mutex Guard.
The implementation of MutexGuard.h is now based on sys::Mutex which hides
its implementation and makes it unnecessary to have the -NoSupport.h and
-PThreads.h versions of ThreadSupport.
* All places in ExecutionEngine that previously referred to "Mutex" now
refer to sys::Mutex
* All places in ExecutionEngine that previously referred to "MutexLocker"
now refer to MutexGuard (this is frivolous but I believe the technically
correct name for such a class is "Guard" not a "Locker").
These changes passed all of llvm-test. All we need now are some test cases
that actually use multiple threads.
llvm-svn: 22404
Add a Mutex class for thread synchronization in a platform-independent way.
The current implementation only supports pthreads. Win32 use of Critical
Sections will be added later. The design permits other threading models to
be used if (and only if) pthreads is not available.
llvm-svn: 22403
Implement the X86 Subtarget.
This consolidates the checks for target triple, and setting options based
on target triple into one place. This allows us to convert the asm printer
and isel over from being littered with "forDarwin", "forCygwin", etc. into
just having the appropriate flags for each subtarget feature controlling
the code for that feature.
This patch also implements indirect external and weak references in the
X86 pattern isel, for darwin. Next up is to convert over the asm printers
to use this new interface.
llvm-svn: 22389
MVTSDNode class. This class is used to provide an operand to operators
that require an extra type. We start by converting FP_ROUND_INREG and
SIGN_EXTEND_INREG over to using it.
llvm-svn: 22364
This chagne just renames some sys::Path methods to ensure they are not
misused. The Path documentation now divides methods into two dimensions:
Path/Disk and accessor/mutator. Path accessors and mutators only operate
on the Path object itself without making any disk accesses. Disk accessors
and mutators will also access or modify the file system. Because of the
potentially destructive nature of disk mutators, it was decided that all
such methods should end in the work "Disk" to ensure the user recognizes
that the change will occur on the file system. This patch makes that
change. The method name changes are:
makeReadable -> makeReadableOnDisk
makeWriteable -> makeWriteableOnDisk
makeExecutable -> makeExecutableOnDisk
setStatusInfo -> setStatusInfoOnDisk
createDirectory -> createDirectoryOnDisk
createFile -> createFileOnDisk
createTemporaryFile -> createTemporaryFileOnDisk
destroy -> eraseFromDisk
rename -> renamePathOnDisk
These changes pass the Linux Deja Gnu tests.
llvm-svn: 22354
Get rid of the difference between file paths and directory paths. The Path
class now simply stores a path that can refer to either a file or a
directory. This required various changes in the implementation and interface
of the class with the corresponding impact to its users. Doxygen comments were
also updated to reflect these changes. Interface changes are:
appendDirectory -> appendComponent
appendFile -> appendComponent
elideDirectory -> eraseComponent
elideFile -> eraseComponent
elideSuffix -> eraseSuffix
renameFile -> rename
setDirectory -> set
setFile -> set
Changes pass Dejagnu and llvm-test/SingleSource tests.
llvm-svn: 22349
1. Pass Value*'s into lowering methods so that the proper pointers can be
added to load/stores from the valist
2. Intrinsics that return void should only return a token chain, not a token
chain/retval pair.
3. Rename LowerVAArgNext -> LowerVAArg, because VANext is long gone.
llvm-svn: 22338
For now, the elf writer is only capable of emitting an empty elf file, with
a section table and a section table string table. This will be enhanced
in the future :)
llvm-svn: 22290
enhancement for ffs, ffsl, and ffsll optimizations. We can't do the opt
unless we also have the at least ffsll function. Notably SVR4 doesn't.
llvm-svn: 22033
population (ctpop). Generic lowering is implemented, however only promotion
is implemented for SelectionDAG at the moment.
More coming soon.
llvm-svn: 21676
(TRUNK)Stores and (EXT|ZEXT|SEXT)Loads have an extra SDOperand which is a SrcValueSDNode which contains the Value*. Note that if the operation is introduced by the backend, it will still have the operand, but the value* will be null.
llvm-svn: 21599