generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.
For properties like mayLoad / mayStore, look into the bundle and if any of the
bundled instructions has the property it would return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.
llvm-svn: 146026
This flag is used when bundling machine instructions. It indicates
whether the operand reads a value defined inside or outside its bundle.
llvm-svn: 145997
For example, ARM allows:
vmov.u32 s4, #0 -> vmov.i32, #0
'u32' is a more specific designator for the 32-bit integer type specifier
and is legal for any instruction which accepts 'i32' as a datatype suffix.
We want to say,
def : TokenAlias<".u32", ".i32">;
This works by marking the match class of 'From' as a subclass of the
match class of 'To'.
rdar://10435076
llvm-svn: 145992
1. Added opcode BUNDLE
2. Taught MachineInstr class to deal with bundled MIs
3. Changed MachineBasicBlock iterator to skip over bundled MIs; added an iterator to walk all the MIs
4. Taught MachineBasicBlock methods about bundled MIs
llvm-svn: 145975
This was actually a bit of a mess. TLI.setPrefLoopAlignment was clearly
documented as taking log2(bytes) units, but the x86 target would still
set a preferred loop alignment of '16'.
CodePlacementOpt passed this number on to the basic block, and
AsmPrinter interpreted it as bytes.
Now both MachineFunction and MachineBasicBlock use logarithmic
alignments.
Obviously, MachineConstantPool still measures alignments in bytes, so we
can emulate the thrill of using as.
llvm-svn: 145889
Whether a fixup needs relaxation for the associated instruction is a
target-specific function, as the FIXME indicated. Create a hook for that
and use it.
llvm-svn: 145881
This is a patch by Guoping Long!
As part of utilizing LLVM Dominator computation in Clang, made two changes to LLVM dominators tree implementation:
- (1) Change the recalculate() template function to only rely on GraphTraits.
- (2) Add a size() method to GraphTraits template class to query the number of nodes in the graph.
llvm-svn: 145837
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.
One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.
llvm-svn: 145714
It was getting ignored after r144788.
Also fix an accidental implicit cast from the OptLevel enum
to an optional bool argument. MSVC warned on this, but gcc
didn't.
llvm-svn: 145633
as MC is the only assembler we support.
This splits MS/Windows and GNU/Windows ASM infos into two seperate classes.
While there is currently only one difference, full MS C++ ABI support will
require many more.
llvm-svn: 145409
Now that it needs to be exported in a public header (Valgrind.h)
it should be prefixed to avoid collision with other projects.
Add it to llvm-config.h as well.
This'll require regenerating the configure script after this
commit, but I don't have the required autoconf version.
llvm-svn: 145214
It was out of sync with the description in configure.ac/config.h.in.
Also re-alphabetize it from its position when it was LLVM_HOST_TRIPLE.
llvm-svn: 145213
and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.
llvm-svn: 144788
Two new TargetInstrInfo hooks lets the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.
The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previoius N instructions.
The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.
llvm-svn: 144602
and stores capture) to permit the caller to see each capture point and decide
whether to continue looking.
Use this inside memdep to do an analysis that basicaa won't do. This lets us
solve another devirtualization case, fixing PR8908!
llvm-svn: 144580
These annotations are disabled entirely when either ENABLE_THREADS is off, or
building a release build. When enabled, they add calls to functions with no
statements to ManagedStatic's getters.
Use these annotations to inform tsan that the race used inside ManagedStatic
initialization is actually benign. Thanks to Kostya Serebryany for helping
write this patch!
llvm-svn: 144567
time it is queried to compute the probability of a single successor.
This makes computing the probability of every successor of a block in
sequence... really really slow. ;] This switches to a linear walk of the
successors rather than a quadratic one. One of several quadratic
behaviors slowing this pass down.
I'm not really thrilled with moving the sum code into the public
interface of MBPI, but I don't (at the moment) have ideas for a better
interface. My direction I'm thinking in for a better interface is to
have MBPI actually retain much more state and make *all* of these
queries cheap. That's a lot of work, and would require invasive changes.
Until then, this seems like the least bad (ie, least quadratic)
solution. Suggestions welcome.
llvm-svn: 144530
correctly handle blocks whose successor weights sum to more than
UINT32_MAX. This is slightly less efficient, but the entire thing is
already linear on the number of successors. Calling it within any hot
routine is a mistake, and indeed no one is calling it. It also
simplifies the code.
llvm-svn: 144527
the sum of the edge weights not overflowing uint32, and crashed when
they did. This is generally safe as BranchProbabilityInfo tries to
provide this guarantee. However, the CFG can get modified during codegen
in a way that grows the *sum* of the edge weights. This doesn't seem
unreasonable (imagine just adding more blocks all with the default
weight of 16), but it is hard to come up with a case that actually
triggers 32-bit overflow. Fortuately, the single-source GCC build is
good at this. The solution isn't very pretty, but its no worse than the
previous code. We're already summing all of the edge weights on each
query, we can sum them, check for an overflow, compute a scale, and sum
them again.
I've included a *greatly* reduced test case out of the GCC source that
triggers it. It's a pretty lame test, as it clearly is just barely
triggering the overflow. I'd like to have something that is much more
definitive, but I don't understand the fundamental pattern that triggers
an explosion in the edge weight sums.
The buggy code is duplicated within this file. I'll colapse them into
a single implementation in a subsequent commit.
llvm-svn: 144526
The old naming scheme (load/use/def/store) can be traced back to an old
linear scan article, but the names don't match how slots are actually
used.
The load and store slots are not needed after the deferred spill code
insertion framework was deleted.
The use and def slots don't make any sense because we are using
half-open intervals as is customary in C code, but the names suggest
closed intervals. In reality, these slots were used to distinguish
early-clobber defs from normal defs.
The new naming scheme also has 4 slots, but the names match how the
slots are really used. This is a purely mechanical renaming, but some
of the code makes a lot more sense now.
llvm-svn: 144503
RegAllocGreedy has been the default for six months now.
Deleting RegAllocLinearScan makes it possible to also delete
VirtRegRewriter and clean up the spiller code.
llvm-svn: 144475
When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses
llvm-svn: 144100
the mailing list. Suggestions for other statistics to collect would be
awesome. =]
Currently these are implemented as a separate pass guarded by a separate
flag. I'm not thrilled by that, but I wanted to be able to collect the
statistics for the old code placement as well as the new in order to
have a point of comparison. I'm planning on folding them into the single
pass if / when there is only one pass of interest.
llvm-svn: 143537
-g flag. In this part we generate the .file for the source being assembled and
the .loc's for the assembled instructions.
The next part will be to generate the dwarf Compile Unit DIE and a dwarf
subprogram DIE for each non-temporary label.
Once the next part is done test cases will be added. rdar://9275556
llvm-svn: 143509