From 3eca9c36a63595baee0659ac818fd5bedc528db1 Mon Sep 17 00:00:00 2001 From: Kim Barrett Date: Tue, 28 Sep 2021 05:56:34 +0000 Subject: [PATCH] 8264707: HotSpot Style Guide should permit use of lambda Reviewed-by: stefank, dholmes, coleenp, iklam, sjohanss, eosterlund, iveresov, kvn --- doc/hotspot-style.html | 121 +++++++++++++++++- doc/hotspot-style.md | 280 +++++++++++++++++++++++++++++++++++++++-- 2 files changed, 388 insertions(+), 13 deletions(-) diff --git a/doc/hotspot-style.html b/doc/hotspot-style.html index 8f421de54f0..fe72cbbdf80 100644 --- a/doc/hotspot-style.html +++ b/doc/hotspot-style.html @@ -50,6 +50,7 @@
  • nullptr
  • <atomic>
  • Uniform Initialization
  • +
  • Local Function Objects
  • Additional Permitted Features
  • Excluded Features
  • Undecided Features
  • @@ -254,8 +255,7 @@ while ( test_foo(args...) ) { // No, excess spaces around control

    Function argument deduction. This is always permitted, and indeed encouraged. It is nearly always better to allow the type of a function template argument to be deduced rather than explicitly specified.

  • auto variable declarations (n1984)
    For local variables, this can be used to make the code clearer by eliminating type information that is obvious or irrelevant. Excessive use can make code much harder to understand.

  • Function return type deduction (n3638)
    Only use if the function body has a very small number of return statements, and generally relatively little other code.

  • -
  • Generic lambdas. Lambdas are not (yet) permitted.

  • -
  • Lambda init captures. Lambdas are not (yet) permitted.

  • +
  • Also see lambda expressions.

  • Expression SFINAE

    Substitution Failure Is Not An Error (SFINAE) is a template metaprogramming technique that makes use of template parameter substitution failures to make compile-time decisions.

    @@ -288,6 +288,121 @@ while ( test_foo(args...) ) { // No, excess spaces around controlaggregate initialization

    Although related, the use of std::initializer_list remains forbidden, as part of the avoidance of the C++ Standard Library in HotSpot code.

    +

    Local Function Objects

    + +

    Single-use function objects can be defined locally within a function, directly at the point of use. This is an alternative to having a function or function object class defined at class or namespace scope.

    +

    This usage was somewhat limited by C++03, which does not permit such a class to be used as a template parameter. That restriction was removed by C++11 (n2657). Use of this feature is permitted.

    +

    Many HotSpot protocols involve "function-like" objects that involve some named member function rather than a call operator. For example, a function that performs some action on all threads might be written as

    +
    void do_something() {
    +  struct DoSomething : public ThreadClosure {
    +    virtual void do_thread(Thread* t) {
    +      ... do something with t ...
    +    }
    +  } closure;
    +  Threads::threads_do(&closure);
    +}
    +

    HotSpot code has historically usually placed the DoSomething class at namespace (or sometimes class) scope. This separates the function's code from its use, often to the detriment of readability. It requires giving the class a globally unique name (if at namespace scope). It also loses the information that the class is intended for use in exactly one place, and does not have any subclasses. (However, the latter can now be indicated by declaring it final.) Often, for simplicity, a local class will skip things like access control and accessor functions, giving the enclosing function direct access to the implementation and eliminating some boilerplate that might be provided if the class is in some outer (more accessible) scope. On the other hand, if there is a lot of surrounding code in the function body or the local class is of significant size, defining it locally can increase clutter and reduce readability.

    +

    C++11 added lambda expressions as a new way to write a function object. Simple lambda expressions can be significantly more concise than a function object, eliminating a lot of boiler-plate. On the other hand, a complex lambda expression may not provide much, if any, readability benefit compared to an ordinary function object. Also, while a lambda can encapsulate a call to a "function-like" object, it cannot be used in place of such.

    +

    A common use for local functions is as one-use RAII objects. The amount of boilerplate for a function object class (local or not) makes such usage somewhat clumsy and verbose. But with the help of a small amount of supporting utility code, lambdas work particularly well for this use case.

    +

    Another use for local functions is partial application. Again here, lambdas are typically much simpler and less verbose than function object classes.

    +

    Because of these benefits, lambda expressions are permitted in HotSpot code, with some restrictions and usage guidance. An anonymous lambda is one which is passed directly as an argument. A named lambda is the value of a variable, which is its name.

    +

    Lambda expressions should only be passed downward. In particular, a lambda should not be returned from a function or stored in a global variable, whether directly or as the value of a member of some other object. Lambda capture is syntactically subtle (by design), and propagating a lambda in such ways can easily pass references to captured values to places where they are no longer valid. In particular, members of the enclosing this object are effectively captured by reference, even if the default capture is by-value. For such uses-cases a function object class should be used to make the desired value capturing and propagation explicit.

    +

    Limiting the capture list to [&] (implicitly capture by reference) is a simplifying restriction that still provides good support for HotSpot usage, while reducing the cases a reader must recognize and understand.

    + +

    The use of mutable lambda expressions is forbidden because there don't seem to be many, if any, good use-cases for them in HotSpot. A lambda expression needs to be mutable in order to modify a by-value captured value. But with only downward lambdas, such usage seems likely to be rare and complicated. It is better to use a function object class in any such cases that arise, rather than requiring all HotSpot developers to understand this relatively obscure feature.

    +

    While it is possible to directly invoke an anonymous lambda expression, that feature should not be used, as such a form can be confusing to readers. Instead, name the lambda and call it by name.

    +

    Some reasons to prefer a named lambda instead of an anonymous lambda are

    + +

    Lambda expressions, and particularly anonymous lambda expressions, should be simple and compact. One-liners are good. Anonymous lambdas should usually be limited to a couple lines of body code. More complex lambdas should be named. A named lambda should not clutter the enclosing function and make it long and complex; do continue to break up large functions via the use of separate helper functions.

    +

    An anonymous lambda expression should either be a one-liner in a one-line expression, or isolated in its own set of lines. Don't place part of a lambda expression on the same line as other arguments to a function. The body of a multi-line lambda argument should be indented from the start of the capture list, as if that were the start of an ordinary function definition. The body of a multi-line named lambda should be indented one step from the variable's indentation.

    +

    Some examples:

    +
      +
    1. foo([&] { ++counter; });
    2. +
    3. foo(x, [&] { ++counter; });
    4. +
    5. foo([&] { if (predicate) ++counter; });
    6. +
    7. foo([&] { auto tmp = process(x); tmp.f(); return tmp.g(); })
    8. +
    9. Separate one-line lambda from other arguments:

      +
      foo(c.begin(), c.end(),
      +    [&] (const X& x) { do_something(x); return x.value(); });
    10. +
    11. Indentation for multi-line lambda:

      +
      c.do_entries([&] (const X& x) {
      +               do_something(x, a);
      +               do_something1(x, b);
      +               do_something2(x, c);
      +             });
    12. +
    13. Separate multi-line lambda from other arguments:

      +
      foo(c.begin(), c.end(),
      +    [&] (const X& x) {
      +      do_something(x, a);
      +      do_something1(x, b);
      +      do_something2(x, c);
      +    });
    14. +
    15. Multi-line named lambda:

      +
      auto do_entry = [&] (const X& x) {
      +  do_something(x, a);
      +  do_something1(x, b);
      +  do_something2(x, c);
      +};
    16. +
    +

    Item 4, and especially items 6 and 7, are pushing the simplicity limits for anonymous lambdas. Item 6 might be better written using a named lambda:

    +
    c.do_entries(do_entry);
    +

    Note that C++11 also added bind expressions as a way to write a function object for partial application, using std::bind and related facilities from the Standard Library. std::bind generalizes and replaces some of the binders from C++03. Bind expressions are not permitted in HotSpot code. They don't provide enough benefit over lambdas or local function classes in the cases where bind expressions are applicable to warrant the introduction of yet another mechanism in this space into HotSpot code.

    +

    References:

    + +

    References from C++17

    + +

    References from C++20

    + +

    References from C++23

    +

    Additional Permitted Features

    Excluded Features

    @@ -337,7 +451,6 @@ while ( test_foo(args...) ) { // No, excess spaces around control

    Member initializers and aggregates (n3653)

  • [[noreturn]] attribute (n2761)

  • Rvalue references and move semantics

  • -
  • Lambdas

  • diff --git a/doc/hotspot-style.md b/doc/hotspot-style.md index f02954347ed..6d167cad9d6 100644 --- a/doc/hotspot-style.md +++ b/doc/hotspot-style.md @@ -596,9 +596,7 @@ use can make code much harder to understand. Only use if the function body has a very small number of `return` statements, and generally relatively little other code. -* Generic lambdas. Lambdas are not (yet) permitted. - -* Lambda init captures. Lambdas are not (yet) permitted. +* Also see [lambda expressions](#lambdaexpressions). ### Expression SFINAE @@ -703,6 +701,273 @@ Some relevant sections from cppreference.com: Although related, the use of `std::initializer_list` remains forbidden, as part of the avoidance of the C++ Standard Library in HotSpot code. +### Local Function Objects + +* Local function objects, including lambda expressions, may be used. +* Lambda expressions must only be used as a downward value. +* Prefer `[&]` as the capture list of a lambda expression. +* Return type deduction for lambda expressions is permitted, and indeed encouraged. +* An empty parameter list for a lambda expression may be elided. +* A lambda expression must not be `mutable`. +* Generic lambda expressions are permitted. +* Lambda expressions should be relatively simple. +* Anonymous lambda expressions should not overly clutter the enclosing expression. +* An anonymous lambda expression must not be directly invoked. +* Bind expressions are forbidden. + +Single-use function objects can be defined locally within a function, +directly at the point of use. This is an alternative to having a function +or function object class defined at class or namespace scope. + +This usage was somewhat limited by C++03, which does not permit such a class +to be used as a template parameter. That restriction was removed by C++11 +([n2657]). Use of this feature is permitted. + +Many HotSpot protocols involve "function-like" objects that involve some +named member function rather than a call operator. For example, a function +that performs some action on all threads might be written as + +``` +void do_something() { + struct DoSomething : public ThreadClosure { + virtual void do_thread(Thread* t) { + ... do something with t ... + } + } closure; + Threads::threads_do(&closure); +} +``` + +HotSpot code has historically usually placed the DoSomething class at +namespace (or sometimes class) scope. This separates the function's code +from its use, often to the detriment of readability. It requires giving the +class a globally unique name (if at namespace scope). It also loses the +information that the class is intended for use in exactly one place, and +does not have any subclasses. (However, the latter can now be indicated by +declaring it `final`.) Often, for simplicity, a local class will skip +things like access control and accessor functions, giving the enclosing +function direct access to the implementation and eliminating some +boilerplate that might be provided if the class is in some outer (more +accessible) scope. On the other hand, if there is a lot of surrounding code +in the function body or the local class is of significant size, defining it +locally can increase clutter and reduce readability. + + +C++11 added _lambda expressions_ as a new way to write a function object. +Simple lambda expressions can be significantly more concise than a function +object, eliminating a lot of boiler-plate. On the other hand, a complex +lambda expression may not provide much, if any, readability benefit compared +to an ordinary function object. Also, while a lambda can encapsulate a call +to a "function-like" object, it cannot be used in place of such. + +A common use for local functions is as one-use [RAII] objects. The amount +of boilerplate for a function object class (local or not) makes such usage +somewhat clumsy and verbose. But with the help of a small amount of +supporting utility code, lambdas work particularly well for this use case. + +Another use for local functions is [partial application][PARTIALAPP]. Again +here, lambdas are typically much simpler and less verbose than function +object classes. + +Because of these benefits, lambda expressions are permitted in HotSpot code, +with some restrictions and usage guidance. An anonymous lambda is one which +is passed directly as an argument. A named lambda is the value of a +variable, which is its name. + +Lambda expressions should only be passed downward. In particular, a lambda +should not be returned from a function or stored in a global variable, +whether directly or as the value of a member of some other object. Lambda +capture is syntactically subtle (by design), and propagating a lambda in +such ways can easily pass references to captured values to places where they +are no longer valid. In particular, members of the enclosing `this` object +are effectively captured by reference, even if the default capture is +by-value. For such uses-cases a function object class should be used to +make the desired value capturing and propagation explicit. + +Limiting the capture list to `[&]` (implicitly capture by reference) is a +simplifying restriction that still provides good support for HotSpot usage, +while reducing the cases a reader must recognize and understand. + +* Many common lambda uses require reference capture. Not permitting it +would substantially reduce the utility of lambdas. + +* Referential transparency. Implicit reference capture makes variable +references in the lambda body have the same meaning they would have in the +enclosing code. There isn't a semantic barrier across which the meaning of +a variable changes. + +* Explicit reference capture introduces significant clutter, especially when +lambda expressions are relatively small and simple, as they should be in +HotSpot code. + +* There are a number of reasons why by-value capture might be used, but for +the most part they don't apply to HotSpot code, given other usage restrictions. + + * A primary use-case for by-value capture is to support escaping uses, + where values captured by-reference might become invalid. That use-case + doesn't apply if only downward lambdas are used. + + * By-value capture can also make a lambda-local copy for mutation, which + requires making the lambda `mutable`; see below. + + * By-value capture might be viewed as an optimization, avoiding any + overhead for reference capture of cheap to copy values. But the + compiler can often eliminate any such overhead. + + * By-value capture by a non-`mutable` lambda makes the captured values + const, preventing any modification by the lambda and making the captured + value unaffected by modifications to the outer variable. But this only + applies to captured auto variables, not member variables, and is + inconsistent with referential transparency. + +* Non-capturing lambdas (with an empty capture list - `[]`) have limited +utility. There are cases where no captures are required (pure functions, +for example), but if the function is small and simple then that's obvious +anyway. + +* Capture initializers (a C++14 feature - [N3649]) are not permitted. +Capture initializers inherently increase the complexity of the capture list, +and provide little benefit over an additional in-scope local variable. + +The use of `mutable` lambda expressions is forbidden because there don't +seem to be many, if any, good use-cases for them in HotSpot. A lambda +expression needs to be mutable in order to modify a by-value captured value. +But with only downward lambdas, such usage seems likely to be rare and +complicated. It is better to use a function object class in any such cases +that arise, rather than requiring all HotSpot developers to understand this +relatively obscure feature. + +While it is possible to directly invoke an anonymous lambda expression, that +feature should not be used, as such a form can be confusing to readers. +Instead, name the lambda and call it by name. + +Some reasons to prefer a named lambda instead of an anonymous lambda are + +* The body contains non-trivial control flow or declarations or other nested +constructs. + +* Its role in an argument list is hard to guess without examining the +function declaration. Give it a name that indicates its purpose. + +* It has an unusual capture list. + +* It has a complex explicit return type or parameter types. + +Lambda expressions, and particularly anonymous lambda expressions, should be +simple and compact. One-liners are good. Anonymous lambdas should usually +be limited to a couple lines of body code. More complex lambdas should be +named. A named lambda should not clutter the enclosing function and make it +long and complex; do continue to break up large functions via the use of +separate helper functions. + +An anonymous lambda expression should either be a one-liner in a one-line +expression, or isolated in its own set of lines. Don't place part of a +lambda expression on the same line as other arguments to a function. The +body of a multi-line lambda argument should be indented from the start of +the capture list, as if that were the start of an ordinary function +definition. The body of a multi-line named lambda should be indented one +step from the variable's indentation. + +Some examples: + +1. `foo([&] { ++counter; });` +2. `foo(x, [&] { ++counter; });` +3. `foo([&] { if (predicate) ++counter; });` +4. `foo([&] { auto tmp = process(x); tmp.f(); return tmp.g(); })` +5. Separate one-line lambda from other arguments: + + ``` + foo(c.begin(), c.end(), + [&] (const X& x) { do_something(x); return x.value(); }); + ``` +6. Indentation for multi-line lambda: + + ``` + c.do_entries([&] (const X& x) { + do_something(x, a); + do_something1(x, b); + do_something2(x, c); + }); + ``` +7. Separate multi-line lambda from other arguments: + + ``` + foo(c.begin(), c.end(), + [&] (const X& x) { + do_something(x, a); + do_something1(x, b); + do_something2(x, c); + }); + ``` +8. Multi-line named lambda: + + ``` + auto do_entry = [&] (const X& x) { + do_something(x, a); + do_something1(x, b); + do_something2(x, c); + }; + ``` + +Item 4, and especially items 6 and 7, are pushing the simplicity limits for +anonymous lambdas. Item 6 might be better written using a named lambda: +``` +c.do_entries(do_entry); +``` + +Note that C++11 also added _bind expressions_ as a way to write a function +object for partial application, using `std::bind` and related facilities +from the Standard Library. `std::bind` generalizes and replaces some of the +binders from C++03. Bind expressions are not permitted in HotSpot code. +They don't provide enough benefit over lambdas or local function classes in +the cases where bind expressions are applicable to warrant the introduction +of yet another mechanism in this space into HotSpot code. + +References: + +* Local and unnamed types as template parameters ([n2657]) +* New wording for C++0x lambdas ([n2927]) +* Generalized lambda capture (init-capture) ([N3648]) +* Generic (polymorphic) lambda expressions ([N3649]) + +[n2657]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2657.htm +[n2927]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2927.pdf +[N3648]: https://isocpp.org/files/papers/N3648.html +[N3649]: https://isocpp.org/files/papers/N3649.html + +References from C++17 + +* Wording for constexpr lambda ([p0170r1]) +* Lambda capture of *this by Value ([p0018r3]) + +[p0170r1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0170r1.pdf +[p0018r3]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0018r3.html + +References from C++20 + +* Allow lambda capture [=, this] ([p0409r2]) +* Familiar template syntax for generic lambdas ([p0428r2]) +* Simplifying implicit lambda capture ([p0588r1]) +* Default constructible and assignable stateless lambdas ([p0624r2]) +* Lambdas in unevaluated contexts ([p0315r4]) +* Allow pack expansion in lambda init-capture ([p0780r2]) ([p2095r0]) +* Deprecate implicit capture of this via [=] ([p0806r2]) + +[p0409r2]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0409r2.html +[p0428r2]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0428r2.pdf +[p0588r1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0588r1.html +[p0624r2]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0624r2.pdf +[p0315r4]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0315r4.pdf +[p0780r2]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0780r2.html +[p2095r0]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2095r0.html +[p0806r2]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0806r2.html + +References from C++23 + +* Make () more optional for lambdas ([p1102r2]) + +[p1102r2]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1102r2.html + ### Additional Permitted Features * `constexpr` @@ -757,9 +1022,6 @@ part of the avoidance of the C++ Standard Library in HotSpot code. ([n3206](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3206.htm)), ([n3272](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3272.htm)) -* Local and unnamed types as template parameters -([n2657](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2657.htm)) - * Range-based `for` loops ([n2930](http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2009/n2930.html)) ([range-for](https://en.cppreference.com/w/cpp/language/range-for)) @@ -837,9 +1099,6 @@ features that have not yet been discussed. * Rvalue references and move semantics -* Lambdas - - [ADL]: https://en.cppreference.com/w/cpp/language/adl "Argument Dependent Lookup" @@ -854,3 +1113,6 @@ features that have not yet been discussed. [SFINAE]: https://en.cppreference.com/w/cpp/language/sfinae "Substitution Failure Is Not An Error" + +[PARTIALAPP]: https://en.wikipedia.org/wiki/Partial_application + "Partial Application"