instead of stored. This reduces memdep memory usage, and also eliminates a bunch of
weakvh's. This speeds up gvn on gcc.c-torture/20001226-1.c from 23.9s to 8.45s (2.8x)
on a different machine than earlier.
llvm-svn: 91885
return partial registers. This affected the back-end lowering code some.
Also patch up some places I missed before in the "get" functions.
llvm-svn: 91880
This fixes an in-place update bug where code inserted at the end of basic blocks may not be covered by existing intervals which were live across the entire block. It is also consistent with the way ranges are specified for live intervals.
llvm-svn: 91859
by allowing backends to override routines that will default
the JIT and Static code generation to an appropriate code model
for the architecture.
Should fix PR 5773.
llvm-svn: 91824
it changes raw_fd_ostream::preferred_buffer_size to return zero on
a scary stat failure instead of setting the stream to an error state.
This method really should not mutate the stream.
llvm-svn: 91740
- Move DisableScheduling flag into TargetOption.h
- Move SDNodeOrdering into its own header file. Give it a minimal interface that
doesn't conflate construction with storage.
- Move assigning the ordering into the SelectionDAGBuilder.
This isn't used yet, so there should be no functional changes.
llvm-svn: 91727
- an MDNode is designated as function-local when created, and continues to be even if its operands are modified not to refer to function-local IR
- function-localness is designated via lowest bit in SubclassData
- getLocalFunction() descends MDNode tree to see if it is consistently function-local
Add verification of MDNodes to checks that MDNodes are consistently function-local.
Update AsmWriter to use isFunctionLocal().
llvm-svn: 91708
contains another loop, or an instruction. The loop form is
substantially more efficient on large loops than the typical
code it replaces.
llvm-svn: 91654
of 91296 that caused trouble -- the Processed list needs to be
preserved for the livetime of the pass, as AddUsersIfInteresting
is called from other passes.
llvm-svn: 91641
LegalizeDAG.cpp. Unlike the code it replaces, which simply decrements the simple
type by one, getHalfSizedIntegerVT() searches for the smallest simple integer
type that is at least half the size of the type it is called on. This approach
has the advantage that it will continue working if a new value type (such as
i24) is added to MVT.
Also, in preparation for new value types, remove the assertions that
non-power-of-2 8-bit-mutiple types are Extended when legalizing extload and
truncstore operations.
llvm-svn: 91614
and there is a new SmallVectorTemplateBase class in between it and SmallVectorImpl.
SmallVectorTemplateBase can be specialized based on isPodLike.
llvm-svn: 91518
remove start/finishGVStub and the BufferState helper class from the
MachineCodeEmitter interface. It has the side-effect of not setting the
indirect global writable and then executable on ARM, but that shouldn't be
necessary.
llvm-svn: 91464
1. Use std::equal instead of reinventing it.
2. don't run dtors in destroy_range if element is pod-like.
3. Use isPodLike to decide between memcpy/uninitialized_copy
instead of is_class. isPodLike is more generous in some cases.
llvm-svn: 91427
isPodLike type trait. This is a generally useful type trait for
more than just DenseMap, and we really care about whether something
acts like a pod, not whether it really is a pod.
llvm-svn: 91421
Checks that the code generated by 'tblgen --emit-llvmc' can be actually
compiled. Also fixes two bugs found in this way:
- forward_transformed_value didn't work with non-list arguments
- cl::ZeroOrOne is now called cl::Optional
llvm-svn: 91404
stuff isn't used just yet.
We want to model the GCC `-fno-schedule-insns' and `-fno-schedule-insns2'
flags. The hypothesis is that the people who use these flags know what they are
doing, and have hand-optimized the C code to reduce latencies and other
conflicts.
The idea behind our scheme to turn off scheduling is to create a map "on the
side" during DAG generation. It will order the nodes by how they appeared in the
code. This map is then used during scheduling to get the ordering.
llvm-svn: 91392
This change removes the DefaultConstructible
and CopyAssignable constraints on the template
parameter T (the first one).
The second template parameter (R) is defaulted to be
identical to the first and controls the result type.
By specifying it to be (const T&) additionally the
CopyConstructible constraint on T can be removed.
This allows to use StringSwitch e.g. for llvm::Constant
instances.
Regarding the other review feedback regarding performance
because of taking pointers, this class should be completely
optimizable like before, since all methods are inline and
the pointer dereferencing and result value caching should be
possible behind the scenes by the "as-if" rule.
llvm-svn: 91123
- Loosen the restrictions when checking of it branches to a landing pad.
- Make the loop more efficient by checking the '.insert' return value.
- Do cheaper checks first.
llvm-svn: 91101
more than one successor. Normally, these extra successors are dead. However,
some of them may branch to exception handling landing pads. If we remove those
successors, then the landing pads could go away if all predecessors to it are
removed. Before, it was checking if the direct successor was the landing
pad. But it could be the result of jumping through multiple basic blocks to get
to it. If we were to only check for the existence of an EH_LABEL in the basic
block and not remove successors if it's in there, then it could stop actually
dead basic blocks from being removed.
llvm-svn: 91092
The coalescer is supposed to clean these up, but when setting up parameters
for a function call, there may be copies to physregs. If the defining
instruction has been LICM'ed far away, the coalescer won't touch it.
The register allocation hint does not always work - when the register
allocator is backtracking, it clears the hints.
This patch is more conservative than r90502, and does not break
483.xalancbmk/i686. It still breaks the PowerPC bootstrap, so it is disabled
by default, and can be enabled with the -trivial-coalesce-ends option.
llvm-svn: 91049
When a call is placed to spill an interval this spiller will first try to
break the interval up into its component values. Single value intervals and
intervals which have already been split (or are the result of previous splits)
are spilled by the default spiller.
Splitting intervals as described above may improve the performance of generated
code in some circumstances. This work is experimental however, and it still
miscompiles many benchmarks. It's not recommended for general use yet.
llvm-svn: 90951
phi translation of complex expressions like &A[i+1]. This has the
following benefits:
1. The phi translation logic is all contained in its own class with
a strong interface and verification that it is self consistent.
2. The logic is more correct than before. Previously, if intermediate
expressions got PHI translated, we'd miss the update and scan for
the wrong pointers in predecessor blocks. @phi_trans2 is a testcase
for this.
3. We have a lot less code in memdep.
We can handle phi translation across blocks of things like @phi_trans3,
which is pretty insane :).
This patch should fix the miscompiles of 255.vortex, and I tested it
with a bootstrap of llvm-gcc, llvm-test and dejagnu of course.
llvm-svn: 90926
The semantics of llvm.dbg.value are that starting from where it is executed, an offset into the specified user source variable is specified to get a new value.
An example:
call void @llvm.dbg.value(metadata !{ i32 7 }, i64 0, metadata !2)
Here the user source variable associated with metadata #2 gets the value "i32 7" at offset 0.
llvm-svn: 90788
sys::cas_flag should be long on this platform, InterlockedAdd() is
defined only for the Itanium architecture (according to MSDN).
Patch by Michael Beck!
llvm-svn: 90748
The coalescer is supposed to clean these up, but when setting up parameters
for a function call, there may be copies to physregs. If the defining
instruction has been LICM'ed far away, the coalescer won't touch it.
The register allocation hint does not always work - when the register
allocator is backtracking, it clears the hints.
This patch takes care of a few more cases that r90163 missed.
llvm-svn: 90502
that it doesn't have dangling pointers when abstract types are resolved. This
modifies it somewhat to address comments: making the "StructLayoutMap" an
anonymous structure, calling "removeAbstractTypeUser" when appropriate, and
adding asserts where helpful.
llvm-svn: 90362
We want LiveVariables clients to use methods rather than accessing the
getVarInfo data structure directly. That way it will be possible to change the
LiveVariables representation.
llvm-svn: 90240
for all the processors where I have tried it, and even when it might not help
performance, the cost is quite low. The opportunities for duplicating
indirect branches are limited by other factors so code size does not change
much due to tail duplicating indirect branches aggressively.
llvm-svn: 90144
If no destination label is available, just point to the node itself
instead of pointing to some source label. Source and destination labels are
not related in any way.
llvm-svn: 90132
Graphviz can layout the graphs better if a node does not contain source
ports. Therefore only print the ports if the source ports are useful,
that means are not labeled with the empty string "".
This patch also simplifies graphs without any edgeSourceLabels e.g. the
dominance trees.
llvm-svn: 90131
running tail duplication when doing branch folding for if-conversion, and
we also want to be able to run tail duplication earlier to fix some
reg alloc problems. Move the CanFallThrough function from BranchFolding
to MachineBasicBlock so that it can be shared by TailDuplication.
llvm-svn: 89904
Make tail duplication of indirect branches much more aggressive (for targets
that indicate that it is profitable), based on further experience with
this transformation. I compiled 3 large applications with and without
this more aggressive tail duplication and measured minimal changes in code
size. ("size" on Darwin seems to round the text size up to the nearest
page boundary, so I can only say that any code size increase was less than
one 4k page.) Radar 7421267.
llvm-svn: 89814
way for each TargetJITInfo subclass to allocate its own stubs. This
means stubs aren't as exactly-sized anymore, but it lets us get rid of
TargetJITInfo::emitFunctionStubAtAddr(), which lets ARM and PPC
support the eager JIT, fixing http://llvm.org/PR4816.
* Rename the JITEmitter's stub creation functions to describe the kind
of stub they create. So far, all of them create lazy-compilation
stubs, but they sometimes get used when far-call stubs are needed.
Fixing http://llvm.org/PR5201 will involve fixing this.
llvm-svn: 89715
Note that "hasDotLocAndDotFile"-style debug info was already broken;
people wanting this functionality should implement it in the
AsmPrinter/DwarfWriter code.
llvm-svn: 89711
the body to not pass the name for the isSigned parameter. However it
seems that changing prototypes is a big-no-no, so here I revert the
previous change and pass "true" for isSigned, meaning this always does
a signed cast, which was the previous behaviour assuming the name was
not NULL! Some other C function needs to be introduced for the general
case of signed or unsigned casts. This hopefully unbreaks the ocaml
binding.
llvm-svn: 89648
tell debug info which base register to use to reference a frame index on a
per-index basis. This is useful, for example, in the presence of dynamic
stack realignment when local variables are indexed via the stack pointer and
stack-based arguments via the frame pointer.
llvm-svn: 89620
When splitting a critical edge, the registers live through the edge are:
- Used in a PHI instruction, or
- Live out from the predecessor, and
- Live in to the successor.
This allows the coalescer to eliminate even more phi joins.
llvm-svn: 89530
it may be used in contexts where preheader insertion may have failed due
to an indirectbr.
Make LoopSimplify's LoopSimplify::SeparateNestedLoop properly fail in
the case that it would require splitting an indirectbr edge.
These fix PR5502.
llvm-svn: 89484
if it is not ultimately captured. Teach BasicAliasAnalysis that a
local object address which does not escape and is never stored does
not alias with a value resulting from a load.
llvm-svn: 89398
4.2.4, 4.3.4, 4.4.2.
The workaround is to use a local min/max implementation that takes an integer
param, and not a reference to integer param (like std::min does).
llvm-svn: 89352
contents of the block to be duplicated. Use this for ARM Cortex A8/9 to
be more aggressive tail duplicating indirect branches, since it makes it
much more likely that they will be predicted in the branch target buffer.
Testcase coming soon.
llvm-svn: 89187
This is probably not confined to *just* these two things.
Anyway, the llvm-gcc front-end may look up the structure layout information for
an abstract type. That information will be stored into a table with the FE's
TD. Instruction combine can come along and also ask for information on that
abstract type, but for a separate TD (the one associated with the pass manager).
After the type is refined, the old structure layout information in the pass
manager's TD file is out of date. If a new type is allocated in the same space
as the old-unrefined type, then the structure type information in the pass
manager's TD file will be wrong, but won't know it.
Fix this by making the TD's structure type information an abstract type user.
llvm-svn: 89176
2. Allow SCCIterator to work with inverse graphs.
3. Fix an incorrect comment in GraphTraits.h (the type in the comment
was given as GraphType* when it is actually const GraphType &).
Patch by Patrick Alexander Simmons.
llvm-svn: 89091
parameter of CreateIntCast then they get an error from the compiler
(or from the linker with a non-gcc compiler). Another possibility
is to flip the order of the DestTy and isSigned parameters, since you
should then get a compiler warning if you try to use a char* for a
Type*.
llvm-svn: 88913
(like DbgDeclareInst's) to shrink substantially. It sucks that we have
to pull Compiler.h into such a public header, but at least Compiler.h
doesn't pull anything else in.
llvm-svn: 88863
- This is an initial step towards -march=native support in Clang, and towards
eliminating host dependencies in the targets. See PR5389.
- Patch by Roman Divacky!
llvm-svn: 88768
so that isa<Instructon> doesn't return true for FixedStackPseudoSourceValue
values. This fixes a variety of problems, including crashes with -debug
and -print-machineinstrs. Also, add a comment to warn about this.
llvm-svn: 88711
Provide special isLoadFromStackSlotPostFE and isStoreToStackSlotPostFE
interfaces to explicitly request checking for post-frame ptr elimination
operands. This uses a heuristic so it isn't reliable for correctness.
llvm-svn: 87047
machine instruction loads or stores from/to a stack slot. Unlike
isLoadFromStackSlot and isStoreFromStackSlot, the instruction may be
something other than a pure load/store (e.g. it may be an arithmetic
operation with a memory operand). This helps AsmPrinter determine when
to print a spill/reload comment.
This is only a hint since we may not be able to figure this out in all
cases. As such, it should not be relied upon for correctness.
Implement for X86. Return false by default for other architectures.
llvm-svn: 87026
slots. The AsmPrinter will use this information to determine whether to
print a spill/reload comment.
Remove default argument values. It's too easy to pass a wrong argument
value when multiple arguments have default values. Make everything
explicit to trap bugs early.
Update all targets to adhere to the new interfaces..
llvm-svn: 87022
making it visible to clients and adding LLVM-style cast capability.
This will be used by AsmPrinter to determine when to emit spill comments
for an instruction.
llvm-svn: 87019
cannot be folded into target cmp instruction.
- Avoid a phase ordering issue where early cmp optimization would prevent the
later count-to-zero optimization.
- Add missing checks which could cause LSR to reuse stride that does not have
users.
- Fix a bug in count-to-zero optimization code which failed to find the pre-inc
iv's phi node.
- Remove, tighten, loosen some incorrect checks disable valid transformations.
- Quite a bit of code clean up.
llvm-svn: 86969
functions like floorf, ceilf, ... Add test for detecting nearbyintf.
This change was prompted by test/Transforms/SimplifyLibCalls/floor.ll
llvm-svn: 86954
This allows StringRef to skip controversial if(str) check in constructor.
Buildbots, wait for corresponding clang and llvm-gcc FE check-ins!
llvm-svn: 86914
- Edges are split before any phis are eliminated, so the code is SSA.
- Create a proper IR BasicBlock for the split edges.
- LiveVariables::addNewBlock now has same syntax as
MachineDominatorTree::addNewBlock. Algorithm calculates predecessor live-out
set rather than successor live-in set.
This feature still causes some miscompilations.
llvm-svn: 86867
llvm.invariant.start to be used without necessarily being paired with a call
to llvm.invariant.end. If you run the entire optimization pipeline then such
calls are in fact deleted (adce does it), but that's actually a good thing since
we probably do want them to be zapped late in the game. There should really be
an integration test that checks that the llvm.invariant.start call lasts long
enough that all passes that do interesting things with it get to do their stuff
before it is deleted. But since no passes do anything interesting with it yet
this will have to wait for later.
llvm-svn: 86840
start using them in a trivial way when -enable-jump-threading-lvi
is passed. enable-jump-threading-lvi will be my playground for
awhile.
llvm-svn: 86789