This first cut is pretty conservative. The final argument register (R6)
is call-saved, so we would need to make sure that the R6 argument to a
sibling call is the same as the R6 argument to the calling function,
which seems worth keeping as a separate patch.
Saying that integer truncations are free means that we no longer
use the extending instructions LGF and LLGF for spills in int-conv-09.ll
and int-conv-10.ll. Instead we treat the registers as 64 bits wide and
truncate them to 32-bits where necessary. I think it's unlikely we'd
use LGF and LLGF for spills in other situations for the same reason,
so I'm removing the tests rather than replacing them. The associated
code is generic and applies to many more instructions than just
LGF and LLGF, so there is no corresponding code removal.
llvm-svn: 188669
We had previously been asserting when faced with a FCOPYSIGN f64, ppcf128 node
because there was no way to expand the FCOPYSIGN node. Because ppcf128 is the
sum of two doubles, and the first double must have the larger magnitude, we
can take the sign from the first double. As a result, in addition to fixing the
crash, this is also an optimization.
llvm-svn: 188655
Modern PPC cores support a floating-point copysign instruction, and we can use
this to lower the FCOPYSIGN node (which is created from calls to the libm
copysign function). A couple of extra patterns are necessary because the
operand types of FCOPYSIGN need not agree.
llvm-svn: 188653
This reduces the noise in diffs making it more likely that, at least for
LLVM revision-over-revision, diffs will actually yield usable results.
This is consistent with objdump's DWARF dumping behavior.
llvm-svn: 188650
We check this in many/all other cases, just missed this one it seems.
Perhaps it'd be worth unifying this so we never emit zero-length
DW_AT_names.
llvm-svn: 188649
When patching inlineasm nodes to use GPRPair for 64-bit values, we
were dropping the information that two operands were tied, which
effectively broke the live-interval of vregs affected.
llvm-svn: 188643
project's autoconf. This is the last of the missing optional checks used
by libSupport that seemed to be missing from the sample project, but
I could easily have missed some as this was done by inspection when
Craig asked me to add the terminfo support.
llvm-svn: 188618
allocated by setupterm. Without this, some folks are seeing leaked
memory whenever this routine is called more than once. Thanks to Craig
Topper for the report.
llvm-svn: 188615
This fixes SCEVExpander so that it does not create multiple distinct induction
variables for duplicate PHI entries. Specifically, given some code like this:
do.body6: ; preds = %do.body6, %do.body6, %if.then5
%end.0 = phi i8* [ undef, %if.then5 ], [ %incdec.ptr, %do.body6 ], [ %incdec.ptr, %do.body6 ]
...
Note that it is legal to have multiple entries for a basic block so long as the
associated value is the same. So the above input is okay, but expanding an
AddRec in this loop could produce code like this:
do.body6: ; preds = %do.body6, %do.body6, %if.then5
%indvar = phi i64 [ %indvar.next, %do.body6 ], [ %indvar.next1, %do.body6 ], [ 0, %if.then5 ]
%end.0 = phi i8* [ undef, %if.then5 ], [ %incdec.ptr, %do.body6 ], [ %incdec.ptr, %do.body6 ]
...
%indvar.next = add i64 %indvar, 1
%indvar.next1 = add i64 %indvar, 1
And this is not legal because there are two PHI entries for %do.body6 each with
a distinct value.
Unfortunately, I don't have an in-tree test case.
llvm-svn: 188614
builtin. The GCC builtin expects the arguments to be passed by val,
whereas the LLVM intrinsic expects a pointer instead.
This is related to PR 16581 and rdar:14747994.
llvm-svn: 188608
Properly constrain the operand register class for instructions used
in [sz]ext expansion. Update more tests to use the verifier now that
we're getting the register classes correct.
rdar://12594152
llvm-svn: 188594
Teach the generic instruction selection helper functions to constrain
the register classes of their input operands. For non-physical register
references, the generic code needs to be careful not to mess that up
when replacing references to result registers. As the comment indicates
for MachineRegisterInfo::replaceRegWith(), it's important to call
constrainRegClass() first.
rdar://12594152
llvm-svn: 188593
Lots of machine verifier errors result from using a plain GPR regclass
for incoming argument copies. A more restrictive rGPR class is more
appropriate since it more accurately represents what's happening, plus
it lines up better with isel later on so the verifier is happier.
Reduces the number of ARM fast-isel tests not running with the verifier
enabled by over half.
rdar://12594152
llvm-svn: 188592
This regards how mips16 is viewed. It's not really a target type but
there has always been a target for it in the td files. It's more properly
-mcpu=mips32 -mattr=+mips16 . This is how clang treats it but we have
always had the -mcpu=mips16 which I probably should delete now but it will
require updating all the .ll test cases for mips16. In this case it changed
how we decide if we have a count bits instruction and whether instruction
lowering should then expand ctlz. Now that we have dual mode compilation,
-mattr=+mips16 really just indicates the inital processor mode that
we are compiling for. (It is also possible to have -mcpu=64 -mattr=+mips16
but as far as I know, nobody has even built such a processor, though there
is an architecture manual for this).
llvm-svn: 188586
safe on PPC32 SVR4 ABI
[Patch and following text by Mark Minich; committing on his behalf.]
There are FIXME's in PowerPC/PPCFrameLowering.cpp, method
PPCFrameLowering::emitPrologue() related to "negative offsets of R1"
on PPC32 SVR4. They're true, but the real issue is that on PPC32 SVR4
(and any ABI without a Red Zone), no spills may be made until after
the stackframe is claimed, which also includes the LR spill which is
at a positive offset. The same problem exists in emitEpilogue(),
though there's no FIXME for it. I intend to fix this issue, making
LLVM-compiled code finally safe for use on SVR4/EABI/e500 32-bit
platforms (including in particular, OS-free embedded systems & kernel
code, where interrupts may share the same stack as user code).
In preparation for making these changes, to make the diffs for the
functional changes less cluttered, I am providing the non-functional
refactorings in two stages:
Stage 1 does some minor fluffy refactorings to pull multiple method
calls up into a single bool, creating named bools for repeated uses of
obscure logic, moving some code up earlier because either stage 2 or
my final version will require it earlier, and rewording/adding some
comments. My stage 1 changes can be characterized as primarily fluffy
cleanup, the purpose of which may be unclear until the stage 2 or
final changes are made.
My stage 2 refactorings combine the separate PPC32 & PPC64 logic,
which is currently performed by largely duplicate code, into a single
flow, with the differences handled by a group of constants initialized
early in the methods.
This submission is for my stage 1 changes. There should be no
functional changes whatsoever; this is a pure refactoring.
llvm-svn: 188573
If an ELF relocation is pointed at an absolute address, it will have a symbol ID of zero.
RuntimeDyldELF::processRelocationRef was not previously handling this case, and was instead trying to handle it as a section-relative fixup.
I think this is the right fix here, but my elf-fu is poor on some of the more exotic platforms, so I'd appreciate it if anyone with greater knowledge could verify this.
llvm-svn: 188572
The logic in SIInsertWaits::getHwCounts() only really made sense for SMRD
instructions, and trying to shoehorn it into handling DS_WRITE_B32 caused
it to corrupt the encoding of that by clobbering the first operand with
the second one.
Undo that damage and only apply the SMRD logic to that.
Fixes some derivates related piglit regressions with radeonsi.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
llvm-svn: 188558