Change inalloca rules to make it only apply to the last parameter

This makes things a lot easier, because we can now talk about the "argument allocation", which allocates all the memory for the call in one shot. The only functional change is to the verifier for a feature that hasn't shipped yet.

llvm-svn: 199434
2025-01-31 12:41:49 +01:00 · 2014-01-16 22:59:24 +00:00 · e0ad0fd826
commit e0ad0fd826
parent b42dbc5117
6 changed files with 124 additions and 105 deletions
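
The placement rule that this change adds to the verifier can be sketched outside of LLVM. The block below is a minimal standalone model, not the code in `Verifier.cpp`: the function name `inallocaOnlyOnLast` and the `std::vector<bool>` representation (one flag per parameter, `true` meaning the parameter carries `inalloca`) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: Flags[i] is true when parameter i is marked inalloca.
// Returns true when inalloca appears on no parameter other than the last.
bool inallocaOnlyOnLast(const std::vector<bool> &Flags) {
  for (std::size_t I = 0; I < Flags.size(); ++I) {
    // Any inalloca flag before the final slot violates the rule.
    if (Flags[I] && I + 1 != Flags.size())
      return false;
  }
  return true;
}

// Shapes that mirror the declarations exercised by the tests in this change.
inline void inallocaExamples() {
  assert(inallocaOnlyOnLast({false, true}));          // inalloca on the last slot
  assert(!inallocaOnlyOnLast({true, false}));         // like @g(i32* inalloca %p, i32 %p2)
  assert(!inallocaOnlyOnLast({false, true, false}));  // like the vararg call to @h
}
```

The same predicate covers both the declaration check and the vararg call-site check in this sketch, because each one only compares the attribute's index against the index of the last slot.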
--- a/docs/InAlloca.rst
+++ b/docs/InAlloca.rst
@@ -7,19 +7,19 @@ Introduction

 .. Warning:: This feature is unstable and not fully implemented.

-The :ref:`attr_inalloca` attribute is designed to allow taking the
-address of an aggregate argument that is being passed by value through
-memory.  Primarily, this feature is required for compatibility with the
-Microsoft C++ ABI.  Under that ABI, class instances that are passed by
-value are constructed directly into argument stack memory.  Prior to the
-addition of inalloca, calls in LLVM were indivisible instructions.
-There was no way to perform intermediate work, such as object
-construction, between the first stack adjustment and the final control
-transfer.  With inalloca, each argument is modelled as an alloca, which
-can be stored to independently of the call.  Unfortunately, this
-complicated feature comes with a large set of restrictions designed to
-bound the lifetime of the argument memory around the call, which are
-explained in this document.
+The :ref:`inalloca <attr_inalloca>` attribute is designed to allow
+taking the address of an aggregate argument that is being passed by
+value through memory.  Primarily, this feature is required for
+compatibility with the Microsoft C++ ABI.  Under that ABI, class
+instances that are passed by value are constructed directly into
+argument stack memory.  Prior to the addition of inalloca, calls in LLVM
+were indivisible instructions.  There was no way to perform intermediate
+work, such as object construction, between the first stack adjustment
+and the final control transfer.  With inalloca, all arguments passed in
+memory are modelled as a single alloca, which can be stored to prior to
+the call.  Unfortunately, this complicated feature comes with a large
+set of restrictions designed to bound the lifetime of the argument
+memory around the call.

 For now, it is recommended that frontends and optimizers avoid producing
 this construct, primarily because it forces the use of a base pointer.
@@ -30,48 +30,60 @@ passing by value with a copy.
 Intended Usage
 ==============

-In the example below, ``f`` is attempting to pass a default-constructed
-``Foo`` object to ``g`` by value.
+The example below is the intended LLVM IR lowering for some C++ code
+that passes a default-constructed ``Foo`` object to ``g`` in the 32-bit
+Microsoft C++ ABI.
+
+.. code-block:: c++
+
+    // Foo is non-trivial.
+    struct Foo { int a, b; Foo(); ~Foo(); Foo(const &Foo); };
+    void g(Foo a, Foo b);
+    void f() {
+      f(1, Foo(), 3);
+    }

 .. code-block:: llvm

-    %Foo = type { i32, i32 }
+    %struct.Foo = type { i32, i32 }
+    %callframe.f = type { %struct.Foo, %struct.Foo }
    declare void @Foo_ctor(%Foo* %this)
-    declare void @g(%Foo* inalloca %arg)
+    declare void @Foo_dtor(%Foo* %this)
+    declare void @g(%Foo* inalloca %memargs)

    define void @f() {
-      ...
-
-    bb1:
+    entry:
      %base = call i8* @llvm.stacksave()
-      %arg = alloca %Foo
-      invoke void @Foo_ctor(%Foo* %arg)
+      %memargs = alloca %callframe.f
+      %b = getelementptr %callframe.f*, i32 0
+      %a = getelementptr %callframe.f*, i32 1
+      call void @Foo_ctor(%struct.Foo* %b)
+
+      ; If a's ctor throws, we must destruct b.
+      invoke void @Foo_ctor(%struct.Foo* %arg1)
          to label %invoke.cont unwind %invoke.unwind

    invoke.cont:
-      call void @g(%Foo* inalloca %arg)
+      store i32 1, i32* %arg0
+      call void @g(%callframe.f* inalloca %memargs)
      call void @llvm.stackrestore(i8* %base)
      ...

    invoke.unwind:
+      call void @Foo_dtor(%struct.Foo* %b)
      call void @llvm.stackrestore(i8* %base)
      ...
    }

-The alloca in this example is dynamic, meaning it is not in the entry
-block, and it can be executed more than once.  Due to the restrictions
-against allocas between an alloca used with inalloca and its associated
-call site, all allocas used with inalloca are considered dynamic.
-
-To avoid any stack leakage, the frontend saves the current stack pointer
-with a call to :ref:`llvm.stacksave <int_stacksave>`.  Then, it
-allocates the argument stack space with alloca and calls the default
-constructor.  One important consideration is that the default
-constructor could throw an exception, so the frontend has to create a
-landing pad.  At this point, if there were any other inalloca arguments,
-the frontend would have to destruct them before restoring the stack
-pointer.  If the constructor does not unwind, ``g`` is called, and then
-the stack is restored.
+To avoid stack leaks, the frontend saves the current stack pointer with
+a call to :ref:`llvm.stacksave <int_stacksave>`.  Then, it allocates the
+argument stack space with alloca and calls the default constructor.  The
+default constructor could throw an exception, so the frontend has to
+create a landing pad.  The frontend has to destroy the already
+constructed argument ``b`` before restoring the stack pointer.  If the
+constructor does not unwind, ``g`` is called.  In the Microsoft C++ ABI,
+``g`` will destroy its arguments, and then the stack is restored in
+``f``.

 Design Considerations
 =====================
@@ -81,31 +93,43 @@ Lifetime

 The biggest design consideration for this feature is object lifetime.
 We cannot model the arguments as static allocas in the entry block,
-because all calls need to use the memory that is at the end of the call
-frame to pass arguments.  We cannot vend pointers to that memory at
-function entry because after code generation they will alias.  In the
-current design, the rule against allocas between the inalloca alloca
-values and the call site avoids this problem, but it creates a cleanup
-problem.  Cleanup and lifetime is handled explicitly with stack save and
-restore calls.  In the future, we may be able to avoid this by using
-:ref:`llvm.lifetime.start <int_lifestart>` and :ref:`llvm.lifetime.end
-<int_lifeend>` instead.
+because all calls need to use the memory at the top of the stack to pass
+arguments.  We cannot vend pointers to that memory at function entry
+because after code generation they will alias.
+
+The rule against allocas between argument allocations and the call site
+avoids this problem, but it creates a cleanup problem.  Cleanup and
+lifetime is handled explicitly with stack save and restore calls.  In
+the future, we may want to introduce a new construct such as ``freea``
+or ``afree`` to make it clear that this stack adjusting cleanup is less
+powerful than a full stack save and restore.

 Nested Calls and Copy Elision
 -----------------------------

-The next consideration is the ability for the frontend to perform copy
-elision in the face of nested calls.  Consider the evaluation of
-``foo(foo(Bar()))``, where ``foo`` takes and returns a ``Bar`` object by
-value and ``Bar`` has non-trivial constructors.  In this case, we want
-to be able to elide copies into ``foo``'s argument slots.  That means we
-need to have more than one set of argument frames active at the same
-time.  First, we need to allocate the frame for the outer call so we can
-pass it in as the hidden struct return pointer to the middle call.  Then
-we do the same for the middle call, allocating a frame and passing its
-address to ``Bar``'s default constructor.  By wrapping the evaluation of
-the inner ``foo`` with stack save and restore, we can have multiple
-overlapping active call frames.
+We also want to be able to support copy elision into these argument
+slots.  This means we have to support multiple live argument
+allocations.
+
+Consider the evaluation of:
+
+.. code-block:: c++
+
+    // Foo is non-trivial.
+    struct Foo { int a; Foo(); Foo(const &Foo); ~Foo(); };
+    Foo bar(Foo b);
+    int main() {
+      bar(bar(Foo()));
+    }
+
+In this case, we want to be able to elide copies into ``bar``'s argument
+slots.  That means we need to have more than one set of argument frames
+active at the same time.  First, we need to allocate the frame for the
+outer call so we can pass it in as the hidden struct return pointer to
+the middle call.  Then we do the same for the middle call, allocating a
+frame and passing its address to ``Foo``'s default constructor.  By
+wrapping the evaluation of the inner ``bar`` with stack save and
+restore, we can have multiple overlapping active call frames.

 Callee-cleanup Calling Conventions
 ----------------------------------
--- a/docs/LangRef.rst
+++ b/docs/LangRef.rst
@@ -727,29 +727,27 @@ Currently, only the following parameter attributes are defined:

 .. Warning:: This feature is unstable and not fully implemented.

-    The ``inalloca`` argument attribute allows the caller to get the
-    address of an outgoing argument to a ``call`` or ``invoke`` before
-    it executes.  It is similar to ``byval`` in that it is used to pass
-    arguments by value, but it guarantees that the argument will not be
-    copied.
+    The ``inalloca`` argument attribute allows the caller to take the
+    address of all stack-allocated arguments to a ``call`` or ``invoke``
+    before it executes.  It is similar to ``byval`` in that it is used
+    to pass arguments by value, but it guarantees that the argument will
+    not be copied.

-    To be :ref:`well formed <wellformed>`, the caller must pass in an
-    alloca value into an ``inalloca`` parameter, and an alloca may be
-    used as an ``inalloca`` argument at most once.  The attribute can
-    only be applied to parameters that would be passed in memory and not
-    registers.  The ``inalloca`` attribute cannot be used in conjunction
-    with other attributes that affect argument storage, like ``inreg``,
-    ``nest``, ``sret``, or ``byval``.  The ``inalloca`` stack space is
-    considered to be clobbered by any call that uses it, so any
+    To be :ref:`well formed <wellformed>`, an alloca may be used as an
+    ``inalloca`` argument at most once.  The attribute can only be
+    applied to the last parameter, and it guarantees that they are
+    passed in memory.  The ``inalloca`` attribute cannot be used in
+    conjunction with other attributes that affect argument storage, like
+    ``inreg``, ``nest``, ``sret``, or ``byval``.  The ``inalloca`` stack
+    space is considered to be clobbered by any call that uses it, so any
    ``inalloca`` parameters cannot be marked ``readonly``.

-    Allocas passed with ``inalloca`` to a call must be in the opposite
-    order of the parameter list, meaning that the rightmost argument
-    must be allocated first.  If a call has inalloca arguments, no other
-    allocas can occur between the first alloca used by the call and the
-    call site, unless they are are cleared by calls to
-    :ref:`llvm.stackrestore <int_stackrestore>`.  Violating these rules
-    results in undefined behavior at runtime.
+    When the call site is reached, the argument allocation must have
+    been the most recent stack allocation that is still live, or the
+    results are undefined.  It is possible to allocate additional stack
+    space after an argument allocation and before its call site, but it
+    must be cleared off with :ref:`llvm.stackrestore
+    <int_stackrestore>`.

    See :doc:`InAlloca` for more information on how to use this
    attribute.
--- a/lib/IR/Verifier.cpp
+++ b/lib/IR/Verifier.cpp
@@ -910,6 +910,11 @@ void Verifier::VerifyFunctionAttrs(FunctionType *FT, AttributeSet Attrs,

    if (Attrs.hasAttribute(Idx, Attribute::StructRet))
      Assert1(Idx == 1, "Attribute sret is not on first parameter!", V);
+
+    if (Attrs.hasAttribute(Idx, Attribute::InAlloca)) {
+      Assert1(Idx == FT->getNumParams(),
+              "inalloca isn't on the last parameter!", V);
+    }
  }

  if (!Attrs.hasAttributes(AttributeSet::FunctionIndex))
@@ -1541,15 +1546,6 @@ void Verifier::VerifyCallSite(CallSite CS) {
  // Verify call attributes.
  VerifyFunctionAttrs(FTy, Attrs, I);

-  // Verify that values used for inalloca parameters are in fact allocas.
-  for (unsigned i = 0, e = CS.arg_size(); i != e; ++i) {
-    if (!Attrs.hasAttribute(1 + i, Attribute::InAlloca))
-      continue;
-    Value *Arg = CS.getArgument(i);
-    Assert2(isa<AllocaInst>(Arg), "Inalloca argument is not an alloca!", I,
-            Arg);
-  }
-
  if (FTy->isVarArg()) {
    // FIXME? is 'nest' even legal here?
    bool SawNest = false;
@@ -1583,6 +1579,10 @@ void Verifier::VerifyCallSite(CallSite CS) {

      Assert1(!Attrs.hasAttribute(Idx, Attribute::StructRet),
              "Attribute 'sret' cannot be used for vararg call arguments!", I);
+
+      if (Attrs.hasAttribute(Idx, Attribute::InAlloca))
+        Assert1(Idx == CS.arg_size(), "inalloca isn't on the last argument!",
+                I);
    }
  }

@@ -1888,21 +1888,6 @@ void Verifier::visitAllocaInst(AllocaInst &AI) {
  Assert1(AI.getArraySize()->getType()->isIntegerTy(),
          "Alloca array size must have integer type", &AI);

-  // Verify that an alloca instruction is not used with inalloca more than once.
-  unsigned InAllocaUses = 0;
-  for (User::use_iterator UI = AI.use_begin(), UE = AI.use_end(); UI != UE;
-       ++UI) {
-    CallSite CS(*UI);
-    if (!CS)
-      continue;
-    unsigned ArgNo = CS.getArgumentNo(UI);
-    if (CS.isInAllocaArgument(ArgNo)) {
-      InAllocaUses++;
-      Assert1(InAllocaUses <= 1,
-              "Allocas can be used at most once with inalloca!", &AI);
-    }
-  }
-
  visitInstruction(AI);
 }

--- a/test/Verifier/inalloca-vararg.ll
+++ b/test/Verifier/inalloca-vararg.ll
@@ -0,0 +1,9 @@
+; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
+
+declare void @h(i32, ...)
+define void @i() {
+  %args = alloca i32
+  call void (i32, ...)* @h(i32 1, i32* inalloca %args, i32 3)
+; CHECK: inalloca isn't on the last argument!
+  ret void
+}
--- a/test/Verifier/inalloca1.ll
+++ b/test/Verifier/inalloca1.ll
@@ -17,3 +17,6 @@ declare void @e(i64* readonly inalloca %p)

 declare void @f(void ()* inalloca %p)
 ; CHECK: do not support unsized types
+
+declare void @g(i32* inalloca %p, i32 %p2)
+; CHECK: inalloca isn't on the last parameter!
--- a/test/Verifier/inalloca2.ll
+++ b/test/Verifier/inalloca2.ll
@@ -1,4 +1,6 @@
-; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
+; This used to be invalid, but now it's valid.  Ensure the verifier
+; doesn't reject it.
+; RUN: llvm-as %s -o /dev/null

 declare void @doit(i64* inalloca %a)

@@ -7,7 +9,6 @@ entry:
  %a = alloca [2 x i32]
  %b = bitcast [2 x i32]* %a to i64*
  call void @doit(i64* inalloca %b)
-; CHECK: Inalloca argument is not an alloca!
  ret void
 }

@@ -16,6 +17,5 @@ entry:
  %a = alloca i64
  call void @doit(i64* inalloca %a)
  call void @doit(i64* inalloca %a)
-; CHECK: Allocas can be used at most once with inalloca!
  ret void
 }
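
The new LangRef wording replaces the statically checked rules with a runtime condition: at the call site, the argument allocation must be the most recent stack allocation that is still live. That condition can be illustrated with a toy stack model. Everything below is a hypothetical sketch; `StackModel` and its members are not LLVM APIs, and the integer ids merely stand in for `alloca` results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of a function's dynamic stack allocations (illustrative only).
class StackModel {
public:
  // Models `alloca`: pushes a new live allocation and returns its id.
  int allocate() {
    Live.push_back(NextId);
    return NextId++;
  }

  // Models `llvm.stacksave`: records how many allocations are live now.
  std::size_t save() const { return Live.size(); }

  // Models `llvm.stackrestore`: discards everything allocated after the mark.
  void restore(std::size_t Mark) {
    assert(Mark <= Live.size() && "mark refers to allocations already freed");
    Live.resize(Mark);
  }

  // True when ArgMem is the most recent allocation that is still live,
  // which is the condition the LangRef text places on the call site.
  bool canCallWith(int ArgMem) const {
    return !Live.empty() && Live.back() == ArgMem;
  }

private:
  std::vector<int> Live;
  int NextId = 0;
};
```

In this model, a later allocation makes the call invalid until a restore clears it off, which matches the LangRef statement that additional stack space allocated after the argument allocation must be cleared with `llvm.stackrestore` before the call. It also shows why `inalloca2.ll` is now accepted: the verifier no longer rejects such IR statically, and correctness is left to the runtime rule.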