A * (1 - (uitofp i1 C)) -> select C, 0, A
B * (uitofp i1 C) -> select C, B, 0
select C, 0, A + select C, B, 0 -> select C, B, A
These come up in code that has been hand-optimized from a select to a linear blend,
on platforms where that may have mattered. We want to undo such changes
with the following transform:
A*(1 - uitofp i1 C) + B*(uitofp i1 C) -> select C, A, B
llvm-svn: 181216
We used to disable constant merging not only if a constant is llvm.used, but
also if an alias of a constant is llvm.used. This change fixes that.
llvm-svn: 181175
This gets exception handling working on ELF and Macho (x86-64 at least).
Other than the EH frame registration, this patch also implements support
for GOT relocations which are used to locate the personality function on
MachO.
llvm-svn: 181167
Now even the small structures could be passed within byval (small enough
to be stored in GPRs).
In regression tests next function prototypes are checked:
PR15293:
%artz = type { i32 }
define void @foo(%artz* byval %s)
define void @foo2(%artz* byval %s, i32 %p, %artz* byval %s2)
foo: "s" stored in R0
foo2: "s" stored in R0, "s2" stored in R2.
Next AAPCS rules are checked:
5.5 Parameters Passing, C.4 and C.5,
"ParamSize" is parameter size in 32bit words:
-- NSAA != 0, NCRN < R4 and NCRN+ParamSize > R4.
Parameter should be sent to the stack; NCRN := R4.
-- NSAA != 0, and NCRN < R4, NCRN+ParamSize < R4.
Parameter stored in GPRs; NCRN += ParamSize.
llvm-svn: 181148
Add support for matching 'ordered' and 'unordered' floating point min/max
constructs.
In LLVM we can express min/max functions as a combination of compare and select.
We have support for matching such constructs for integers but not for floating
point. In floating point math there is no total order because of the presence of
'NaN'. Therefore, we have to be careful to preserve the original fcmp semantics
when interpreting floating point compare select combinations as a minimum or
maximum function. The resulting 'ordered/unordered' floating point maximum
function has to select the same value as the select/fcmp combination it is based
on.
ordered_max(x,y) = max(x,y) iff x and y are not NaN, y otherwise
unordered_max(x,y) = max(x,y) iff x and y are not NaN, x otherwise
ordered_min(x,y) = min(x,y) iff x and y are not NaN, y otherwise
unordered_min(x,y) = min(x,y) iff x and y are not NaN, x otherwise
This matches the behavior of the underlying select(fcmp(olt/ult/.., L, R), L, R)
construct.
Any code using this predicate has to preserve this semantics.
A follow-up patch will use this to implement floating point min/max reductions
in the vectorizer.
radar://13723044
llvm-svn: 181143
The intended semantics mirror autoconf, where the user is able to
specify a host triple, but if it's left to the build system then
"config.guess" is invoked for the default.
This also renames the LLVM_HOSTTRIPLE define to LLVM_HOST_TRIPLE to
fit in with the style of the surrounding defines.
llvm-svn: 181112
Now that we hava a convinient place to keep it, remeber the set of
identified structs as we merge modules.
This speeds up the linking of all the bitcode files in clang with the
gold plugin and -plugin-opt=emit-llvm (i.e., link only, no codegen) from
5:25 minutes to 13.6 seconds!
Patch by Xiaofei Wan!
llvm-svn: 181104
Update comments, fix * placement, fix method names that are not
used in clang, add a linkInModule that takes a Mode and put it
in Linker.cpp.
llvm-svn: 181099
Build attribute sections can now be read if they exist via ELFObjectFile, and
the llvm-readobj tool has been extended with an option to dump this information
if requested. Regression tests are also included which exercise these features.
Also update the docs with a fixed ARM ABI link and a new link to the Addenda
which provides the build attributes specification.
llvm-svn: 181009
the "identifier" parsed by the frontend callback by skipping forward
until we've consumed a token that ends at the point dictated by the
callback.
In addition, inform the callback when it's parsing an unevaluated
operand (e.g. mov eax, LENGTH A::x) as opposed to an evaluated one
(e.g. mov eax, [A::x]).
This commit depends on a clang commit.
llvm-svn: 180978
to emitted instructions. Use this if you want an instruction to be
counted towards the prologue or if there is no useful source location.
rdar://problem/13442648
llvm-svn: 180929
CodeModel: It's now possible to create an MCJIT instance with any CodeModel you like. Previously it was only possible to
create an MCJIT that used CodeModel::JITDefault.
EnableFastISel: It's now possible to turn on the fast instruction selector.
The CodeModel option required some trickery. The problem is that previously, we were ensuring future binary compatibility in
the MCJITCompilerOptions by mandating that the user bzero's the options struct and passes the sizeof() that he saw; the
bindings then bzero the remaining bits. This works great but assumes that the bitwise zero equivalent of any field is a
sensible default value.
But this is not the case for LLVMCodeModel, or its internal equivalent, llvm::CodeModel::Model. In both of those, the default
for a JIT is CodeModel::JITDefault (or LLVMCodeModelJITDefault), which is not bitwise zero.
Hence this change introduces LLVMInitializeMCJITCompilerOptions(), which will initialize the user's options struct with
defaults. The user will use this in the same way that they would have previously used memset() or bzero(). MCJITCAPITest.cpp
illustrates the change, as does the comment in ExecutionEngine.h.
llvm-svn: 180893
the things, and renames it to CBindingWrapping.h. I also moved
CBindingWrapping.h into Support/.
This new file just contains the macros for defining different wrap/unwrap
methods.
The calls to those macros, as well as any custom wrap/unwrap definitions
(like for array of Values for example), are put into corresponding C++
headers.
Doing this required some #include surgery, since some .cpp files relied
on the fact that including Wrap.h implicitly caused the inclusion of a
bunch of other things.
This also now means that the C++ headers will include their corresponding
C API headers; for example Value.h must include llvm-c/Core.h. I think
this is harmless, since the C API headers contain just external function
declarations and some C types, so I don't believe there should be any
nasty dependency issues here.
llvm-svn: 180881
The cause of the windows failures was fixed by r180791. Revert to the state
after Sabre's original revert.
Original message:
revert r179735, it has no testcases, and doesn't really make sense.
llvm-svn: 180844
register-indirect address with an offset of 0.
It used to be that a DBG_VALUE is a register-indirect value if the offset
(operand 1) is nonzero. The new convention is that a DBG_VALUE is
register-indirect if the first operand is a register and the second
operand is an immediate. For plain registers use the combination reg, reg.
rdar://problem/13658587
llvm-svn: 180816
The actual storage was already using unsigned, but the interface was using
uint64_t. This is wasteful on 32 bits and looks to be the root causes of
a miscompilation on Windows where a value was being sign extended to 64bits
to compare with the result of getSlotIndex.
Patch by Pasi Parviainen!
llvm-svn: 180791
The `llvm.tls_init_funcs' (created by the front-end) holds pointers to the TLS
initialization functions. These need to be placed into the correct section so
that they are run before `main()'.
<rdar://problem/13733006>
llvm-svn: 180737
For regular object files this is only meaningful for common symbols. An object
file format with direct support for atoms should be able to provide alignment
information for all symbols.
This replaces getCommonSymbolAlignment and fixes
test-common-symbols-alignment.ll on darwin. This also includes a fix to
MachOObjectFile::getSymbolFlags. It was marking undefined symbols as common
(already tested by existing mcjit tests now that it is used).
llvm-svn: 180736
This un-reverts r179735 and reverts commit r180574.
This fixes assertion failures for me locally and should fix the failures
on Windows reported widely on llvm-dev. We should check if the bots
caught this and if so why not.
llvm-svn: 180722
This seems to me an obvious place to allow target passes to annotate
memory operations. There are plenty of bits, and I'm not aware of
another good way for early target passes to propagate hints along to
later passes. Target independent transforms can simply preserve them,
the way they preserve the other flags. Like MachineMemOperands in
general, if the target flags are lost we must still generate correct
code.
This has lots of uses, but I want this flexibility now to make it
easier to work with the new MachineTraceMetrics
analysis. MachineTraceMetrics can gather a lot of information about
instructions based on the surrounding code. This information can be
used to influence postRA machine passes that don't work on SSA form.
llvm-svn: 180666
to determine whether or not we're on a darwin platform for debug code
emitting.
Solves the problem of a module with no triple on the command line
and no triple in the module using non-gdb ok features on darwin. Fix
up the member-pointers test to check the correct things for cross
platform (DW_FORM_flag is a good prefix).
Unfortunately no testcase because I have no ideas how to test something
without a triple and without a triple in the module yet check
precisely on two platforms. Ideas welcome.
llvm-svn: 180660
We switch the order of offset and field type to make TBAAStructType node
(name, parent node, offset) similar to scalar TBAA node (name, parent node).
TypeIsImmutable is added to TBAAStructTag node.
llvm-svn: 180654
Clarify documentation and API to make the difference between register and
register-indirect addressed locations more explicit. Put in a comment
to point out that with the current implementation we cannot specify
a register-indirect location with offset 0 (a breg 0 in DWARF).
No functionality change intended.
rdar://problem/13658587
llvm-svn: 180641
For Mach-O there were 2 implementations for parsing object files. A
standalone llvm/Object/MachOObject.h and llvm/Object/MachO.h which
implements the generic interface in llvm/Object/ObjectFile.h.
This patch adds the missing features to MachO.h, moves macho-dump to
use MachO.h and removes ObjectFile.h.
In addition to making sure that check-all is clean, I checked that the
new version produces exactly the same output in all Mach-O files in a
llvm+clang build directory (including executables and shared
libraries).
To test the performance, I ran macho-dump over all the files in a
llvm+clang build directory again, but this time redirecting the output
to /dev/null. Both the old and new versions take about 4.6 seconds
(2.5 user) to finish.
llvm-svn: 180624
Since we can't guarantee that the original dbg.declare instrinsic
is removed by LowerDbgDeclare(), we need to make sure that we are
not inserting the same dbg.value intrinsic over and over.
This removes tons of redundant DIEs when compiling optimized code.
rdar://problem/13056109
llvm-svn: 180615
Summary:
This is modelled on the Mach-O linker options implementation and should
support a Clang implementation of #pragma comment(lib/linker).
Reviewers: rafael
CC: llvm-commits
Differential Revision: http://llvm-reviews.chandlerc.com/D724
llvm-svn: 180569
getRelocationAddress is for dynamic libraries and executables,
getRelocationOffset for relocatable objects.
Mark the getRelocationAddress of COFF and MachO as not implemented yet. Add a
test of ELF's. llvm-readobj -r now prints the same values as readelf -r.
llvm-svn: 180259
For now, we just reschedule instructions that use the copied vregs and
let regalloc elliminate it. I would really like to eliminate the
copies on-the-fly during scheduling, but we need a complete
implementation of repairIntervalsInRange() first.
The general strategy is for the register coalescer to eliminate as
many global copies as possible and shrink live ranges to be
extended-basic-block local. The coalescer should not have to worry
about resolving local copies (e.g. it shouldn't attemp to reorder
instructions). The scheduler is a much better place to deal with local
interference. The coalescer side of this equation needs work.
llvm-svn: 180193
This reverts commit r179840 with a fix to test/DebugInfo/two-cus-from-same-file.ll
I'm not sure why that test only failed on ARM & MIPS and not X86 Linux, even
though the debug info was clearly invalid on all of them, but this ought to fix
it.
llvm-svn: 179996
I think it's almost impossible to fold atomic fences profitably under
LLVM/C++11 semantics. As a result, this is now unused and just
cluttering up the target interface.
llvm-svn: 179940
Adding another CU-wide list, in this case of imported_modules (since they
should be relatively rare, it seemed better to add a list where each element
had a "context" value, rather than add a (usually empty) list to every scope).
This takes care of DW_TAG_imported_module, but to fully address PR14606 we'll
need to expand this to cover DW_TAG_imported_declaration too.
llvm-svn: 179836
When the SlotIndexes pass was introduced it was intended to support insertion
of code during register allocation. Removal of code was a minor consideration
(and raised the question of what to do about dangling SlotIndex objects pointing
to the erased index), so I opted to keep all indexes around indefinitely and
simply null out those that weren't being used.
Nowadays people are moving more code around (e.g. via HandleMove), which means
more zombie indexes. I want to start killing off indexes when we're done with
them to reclaim the resources they use up.
llvm-svn: 179834
variant/dialect. Addresses a FIXME in the emitMnemonicAliases function.
Use and test case to come shortly.
rdar://13688439 and part of PR13340.
llvm-svn: 179804
Semantics of parameters named Index and Idx were inconsistent between
"include/llvm/IR/Attributes.h", "lib/IR/AttributeImpl.h" and
"lib/IR/Attributes.cpp": sometimes these were fixed 1-based indexes of IR
parameters (or AttributeSet::ReturnIndex for IR return values or
AttributeSet::FunctionIndex for IR functions), other times they were the
internal slot for storage in the underlying AttributeSetImpl. I renamed usage of
the former to "Index" and usage of the latter to "Slot" ("Slot" was already
being used consistently for the latter in a subset of cases)
Patch by Stephen Lin!
llvm-svn: 179791
1. Verify::VerifyParameterAttrs in "lib/IR/Verifier.cpp" and
AttrBuilder::removeFunctionOnlyAttrs in "lib/IR/Attributes.cpp" (only called
by Verify::VerifyFunctionAttrs) separately maintained a list of function-only
attribute types. I've consolidated the logic into a new function used for
both cases in "lib/IR/Verifier.cpp", so this logic is in one place (other
than the AsmParser front-end)
2. Various functions in "lib/IR/Verifier.cpp" passed AttributeSet around by
reference needlessly, as it's just a handle to an immutable pimpl body.
Patch by Stephen Lin!
llvm-svn: 179790
* We only ever specialize these templates with an instantiation of ELFType,
so we don't need a template template.
* Replace LLVM_ELF_COMMA with just passing the individual parameters to the
macro. This requires a second macro for when we only have ELFT, but that
is still a small win.
llvm-svn: 179726
I will remove the isBigEndianHost function once I update clang.
The ifdef logic is designed to
* not use configure/cmake to avoid breaking -arch i686 -arch ppc.
* default to little endian
* be as small as possible
It looks like sys/endian.h is the preferred header on most modern BSD systems,
but it is better to change this in a followup patch as machine/endian.h is
available on FreeBSD, OpenBSD, NetBSD and OS X.
llvm-svn: 179527
This is a rework of the broken parts in r179373 which were subsequently reverted in r179374 due to incompatibility with C++98 compilers. This version should be ok under C++98.
llvm-svn: 179520
The register allocator expects minimal physreg live ranges. Schedule
physreg copies accordingly. This is slightly tricky when they occur in
the middle of the scheduling region. For now, this is handled by
rescheduling the copy when its associated instruction is
scheduled. Eventually we may instead bundle them, but only if we can
preserve the bundles as parallel copies during regalloc.
llvm-svn: 179449
MIPS64EL relocation entries have up to three relocation operations. Because
libObject only exposes a single relocation name, use the concatenation of
the individual relocation type names.
llvm-svn: 179357
Original message:
Print more information about relocations.
With this patch llvm-readobj now prints if a relocation is pcrel, its length,
if it is extern and if it is scattered.
It also refactors the code a bit to use bit fields instead of shifts and
masks all over the place.
llvm-svn: 179345
With this patch llvm-readobj now prints if a relocation is pcrel, its length,
if it is extern and if it is scattered.
It also refactors the code a bit to use bit fields instead of shifts and
masks all over the place.
llvm-svn: 179294
It was returning the loaded address of the section containing the relocation,
which really doesn't seem to be the intent of this function.
llvm-svn: 179255
Add support for the COFF relocation types IMAGE_REL_I386_DIR32NB and
IMAGE_REL_AMD64_ADDR32NB for 32- and 64-bit respectively. These are
similar to normal 4-byte relocations except that they do not include
the base address of the image.
Image-relative relocations are used for debug information (32-bit) and
SEH unwind tables (64-bit).
A new MCSymbolRef variant called 'VK_COFF_IMGREL32' is introduced to
specify such relocations. For AT&T assembly, this variant can be accessed
using the symbol suffix '@imgrel'.
llvm-svn: 179240
Compact unwind has an encoding for when we're not able to generate compact
unwind and must generate an EH frame instead. Track that, but still emit that CU
encoding.
llvm-svn: 179220
Test cases that regressed due to r179115, plus a few more, were added in
r179182. Original commit message below:
[ms-inline asm] Use parsePrimaryExpr in lieu of parseExpression if we need to
parse an identifier. Otherwise, parseExpression may parse multiple tokens,
which makes it impossible to properly compute an immediate displacement.
An example of such a case is the source operand (i.e., [Symbol + ImmDisp]) in
the below example:
__asm mov eax, [Symbol + ImmDisp]
Part of rdar://13611297
llvm-svn: 179187
The target hooks are getting out of hand. What does it mean to run
before or after regalloc anyway? Allowing either Pass* or AnalysisID
pass identification should make it much easier for targets to use the
substitutePass and insertPass APIs, and create less need for badly
named target hooks.
llvm-svn: 179140