Ruby Methods, Procs and Blocks
09 Feb 2017
This is the first article in the Ruby's Dark Corners series.
Here are a few questions this article will try to answer:
- Which combinations of parameters are legal in a method definition?
- How are arguments assigned to parameters when calling a method?
- How does a proc relate to a block?
- What are the differences between lambdas and procs?
- What is the difference between { ... } and do ... end?
Declaring Parameters
In everything that follows we make the distinction between parameters (or formal parameters), which are the parameters as they appear in method definitions, and arguments (or actual parameters), which are the values passed to method calls.
A method definition admits the following types of parameters:
type | example |
---|---|
required | a |
optional | b = 2 |
array decomposition | (c, *d) |
splat | *args |
post-required | f |
keyword | g: , h: 7 |
double splat | **kwargs |
block | &blk |
A kitchen sink example:
def foo a, b = 2, *c, d, e:, f: 7, **g, &blk; end
Here are quick explanations:
Optional parameters can have a default value.
Array decomposition parameters are required parameters that decompose an array argument into parts. Here are some examples, assuming the argument is [[1, 2], 3]:
parameter | assignment |
---|---|
(*a) | a = [[1, 2], 3] |
(a, b) | a = [1, 2], b = 3 |
(a, *b) | a = [1, 2], b = [3] |
(a, b, *c) | a = [1, 2], b = 3, c = [] |
((a, b), c) | a = 1, b = 2, c = 3 |
((a, *b), c) | a = 1, b = [2], c = 3 |
The splat parameter enables variable length argument lists and receives all extra arguments.
Post-required parameters are required parameters that occur after a splat parameter.
Keyword parameters have to be named explicitly when calling the method: foo(f: 6). They can have default values.
The double splat parameter acts like the splat parameter, but for extra keyword arguments.
The block parameter gives a name to the block passed to the method, allowing you to pass it to other methods, and call it by name instead of through yield. See more on blocks in the section on Blocks and Procs.
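As a small illustration of array decomposition and block parameters working together, here is a sketch built around a hypothetical method named describe (it is not part of the kitchen sink example above).

```ruby
# Hypothetical method, for illustration only.
# (first, *rest) is an array decomposition parameter; &blk names the block.
def describe(label, (first, *rest), &blk)
  summary = "#{label}: first=#{first.inspect}, rest=#{rest.inspect}"
  # Call the block by name if one was given, otherwise return the summary as-is.
  blk ? blk.call(summary) : summary
end

puts describe("list", [1, 2, 3])
puts describe("list", [1, 2, 3]) { |s| s.upcase }
```

The first call returns the summary unchanged, while the second passes it through the block, which is reachable by name because of &blk.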
You can get a list of a method's parameters with Method#parameters, which will show the type of each parameter. Here it is, running over our foo method.
method(:foo).parameters
# [[:req, :a], [:opt, :b], [:rest, :c], [:req, :d], [:keyreq, :e], [:key, :f], [:keyrest, :g], [:block, :blk]]
Note that this glitches for array decomposition parameters, indicating just [:req].
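To see that last point in isolation, the following sketch inspects a hypothetical method whose first parameter is an array decomposition. The class and method names are made up for the example.

```ruby
class ParamsDemo
  # Hypothetical method: (a, b) is an array decomposition parameter.
  def pair((a, b), c, d = 1); end
end

# Method#parameters reports the decomposition parameter as a nameless [:req].
p ParamsDemo.new.method(:pair).parameters
```

The first entry carries only the type, while the other two entries also carry the parameter name.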
Ordering
You can't mix and match these parameters as you please. All types of parameters are optional, but those that are present must respect the following ordering:
- required parameters and optional parameters
- splat parameter (at most one)
- post-required parameters
- keyword parameters
- double splat parameter (at most one)
- block parameter (at most one)
As a matter of best practices, you should not mix required and optional parameters: put all required parameters first. You should also avoid post-required parameters, and using both optional parameters and a splat parameter. If you require default values, use keyword parameters instead.
See some rationale for these practices.
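The sketch below illustrates both points, under the assumption that a definition violating the ordering is rejected at parse time with a SyntaxError. All method names, including the parses? helper, are hypothetical.

```ruby
# Hypothetical methods: the same defaults, positional versus keyword.
def connect_positional(host, port = 80, timeout = 5)
  [host, port, timeout]
end

def connect_keywords(host, port: 80, timeout: 5)
  [host, port, timeout]
end

# Changing only the timeout: the positional version must restate the port.
p connect_positional("example.org", 80, 10)
p connect_keywords("example.org", timeout: 10)

# Hypothetical helper: does this method definition respect the ordering?
# The definition is evaluated inside a throwaway anonymous class.
def parses?(source)
  Class.new.class_eval(source)
  true
rescue SyntaxError
  false
end

p parses?("def ok(a, b = 1, *c, d, e:, **f, &g); end")  # respects the ordering
p parses?("def bad(*c, b = 1); end")                    # optional after splat
p parses?("def bad(&g, a); end")                        # block parameter not last
```

With keyword parameters, a caller can override one default without knowing the position or value of the others.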
Assigning Arguments to Parameters
The real complexity of Ruby's methods is determining how arguments are mapped to parameters.
Here are the different types of arguments you can pass to a method call:
type | example |
---|---|
regular | v |
keyword | b: , b: v or :b => v |
hash argument | v1 => v2 |
splat | *v |
double splat | **v |
block conversion | &v |
block | { .. } or do .. end |
You can substitute v (and v1, v2) with almost any expression, as long as you respect operator precedence.
A kitchen sink example:
foo 1, *array, 2, c: 3, "d" => 4, **hash
Here are quick explanations:
Keyword arguments are equivalent to hash arguments whose key is a symbol. If you are using keyword arguments with the intent to fill keyword parameters, you should use the keyword syntax. In what follows, when we say keyword arguments we always refer to both keyword arguments and hash arguments with a symbol key: the two are indistinguishable.
Hash arguments will be aggregated into a hash that will be assigned to one of the method's parameters. Keyword arguments may or may not be added to this hash (see below).
If the splat argument is an Array or an object that supports the to_a method, it is expanded inside the method call: foo(1, *[2, 3], 4) is equivalent to foo(1, 2, 3, 4). Otherwise the splat expands to a regular argument.
A double splat works the same, but for hashes: foo(a: 1, **{b: 2, c: 3}, d: 4) is equivalent to foo(a: 1, b: 2, c: 3, d: 4). It performs implicit conversion to Hash for objects that support the to_hash method. Note in passing that by convention to_hash represents implicit conversion to a Hash, while to_h represents explicit (on-demand only) conversion.
A block conversion argument must be a Proc, or an object that can be implicitly converted to one via to_proc.
Finally, a method can be passed a block explicitly. The block will appear after the argument list. We'll touch on the difference between the two block syntaxes in the section about blocks.
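The expansions described above can be observed with a small sketch. The collect and call_block methods and the Settings class are hypothetical helpers that exist only to show what the callee receives.

```ruby
# Hypothetical helper that reports the regular and keyword arguments it received.
def collect(*args, **opts)
  [args, opts]
end

# Hypothetical helper that calls its block with a fixed string.
def call_block(&blk)
  blk.call("ruby")
end

# Hypothetical class supporting implicit conversion to Hash.
class Settings
  def to_hash
    { verbose: true }
  end
end

# A splat argument is expanded in place.
p collect(1, *[2, 3], 4) == collect(1, 2, 3, 4)
# A Range supports to_a, so it is expanded; an Integer does not, so it passes through.
p collect(*(1..3))
p collect(*5)

# A double splat argument is expanded in place too.
p collect(a: 1, **{b: 2, c: 3}, d: 4) == collect(a: 1, b: 2, c: 3, d: 4)
# Objects supporting to_hash are implicitly converted.
p collect(**Settings.new)

# Block conversion: a Symbol is converted to a Proc via to_proc.
p call_block(&:upcase)
```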
Ordering
Again, arguments cannot be supplied in any order. All regular and splat arguments must appear before any keyword and double splat arguments. A block conversion or block argument (only one allowed) must appear last.
If the order is not respected, an error ensues.
Assigning Arguments to Parameters
Where an error is mentioned, it is most likely an ArgumentError. I haven't re-checked everything, but it should always be the case.
Here is the full procedure for figuring out how arguments are assigned to parameters:
1. Expand any splat or double splat arguments. The expanded content is taken into account when we talk about regular, keyword or hash arguments later on.

2. We first handle all keyword and hash arguments.

   If there is exactly one more required parameter than there are regular arguments, all keyword and hash arguments are collected into a hash, which is assigned to that parameter. If there are any required keyword parameters (those without a default value), an error ensues. If there is a double splat parameter, it is assigned an empty hash.

   Otherwise, if all of the following conditions hold, implicit hash conversion is performed.

   - There are keyword or double splat parameters.
   - There are no keyword or hash arguments.
   - There are more regular arguments than required parameters.
   - The last regular argument is a hash, or can be converted to one via to_hash and that method doesn't return nil.

   Implicit hash conversion simply consists of treating all the (key, value) pairs from the last regular argument as additional keyword or hash arguments. These new arguments are taken into account in the rest of the assignment procedure.

   We now assign keyword arguments to the corresponding keyword parameters. If a keyword parameter without default value cannot be assigned, an error ensues.

   If there is a double splat parameter, assign it a hash aggregating all remaining keyword arguments.

   If there remain keyword or hash arguments, aggregate them in a hash, which is to be treated as the last regular argument.

   Note: if two values are supplied for the same key, the last one wins and the previous one disappears. However, if the two values are visible in the method call (e.g. foo(a: 1, **{a: 2})), a warning is emitted.

3. We now consider regular arguments, which by now consist of the supplied regular arguments (including the result of splat expansion), minus the last argument if implicit hash conversion was performed in step 2, plus a final argument containing a hash if arguments remained in step 2.

   If there are fewer regular arguments than required parameters, an error ensues. Otherwise, let x be the number of regular arguments and n the number of required and optional parameters.

   - If x <= n, the last n-x optional parameters get assigned their default values, while the remaining x parameters get assigned the x regular arguments.
   - If x > n, the n first arguments are assigned to the n parameters. The x-n remaining arguments are collected in an array, which is assigned to the splat parameter. If there is no splat parameter, an error ensues.

4. If a block or block conversion argument is passed, make a proc from it and assign it to the block parameter, if any; otherwise it becomes the implicit block parameter.
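A few of the cases from this procedure can be checked with a short sketch. The methods below are hypothetical and only report where each argument landed. The calls deliberately avoid implicit hash conversion, whose behaviour has changed across Ruby versions.

```ruby
# Hypothetical methods that simply report where each argument landed.
def positional(a, b = :default, *rest)
  [a, b, rest]
end

def keywords(a, k: 0, **others)
  [a, k, others]
end

def two_required(a, b)
  [a, b]
end

def block_taker(&blk)
  blk
end

# x <= n: the unfilled optional parameter gets its default value.
p positional(1)
# x > n: the extra regular arguments go to the splat parameter.
p positional(1, 2, 3, 4)

# Keyword arguments fill keyword parameters; the rest go to the double splat.
p keywords(1, k: 2, x: 3, y: 4)

# One more required parameter than regular arguments:
# the keyword arguments are collected into a hash assigned to that parameter.
p two_required(1, x: 2)

# Fewer regular arguments than required parameters is an ArgumentError.
begin
  two_required
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end

# A block is turned into a proc and assigned to the block parameter.
p(block_taker { 42 }.call)
p block_taker.nil?
```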
Blocks and Procs
The following is in general much better understood than assignment from arguments to parameters.
Procs are first-class functions, while blocks are a syntactic notation to pass a proc to a method. It's not quite that simple however. You can pass procs as regular arguments, but the block parameter is special. All methods can accept a block implicitly, or explicitly via a block parameter. This parameter must be assigned a (syntactic) block (which becomes a proc) or a "regular" proc marked with &.
Other than with blocks, procs can be instantiated with proc, Proc.new, lambda or -> (lambda literal or dash rocket or stab operator). The first three forms simply take a block argument, while the last form looks like this:
-> (a, b) { ... }
-> (a, b) do .. end
-> { ... }
-> do .. end
When a method accepts an implicit block, you can call it with yield (you can also call the explicit block parameter with yield). Somewhat peculiarly, the yield notation cannot be passed a block of its own. A proc or block parameter named x can be called using x.call(1, 2), x.(1, 2) or x[1, 2].
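Here is a short sketch of these calling conventions. The method names and the ADDERS constant are hypothetical and exist only for the example.

```ruby
# Hypothetical methods showing the different ways to invoke a block.
def implicit_block
  yield(1, 2) # calls the implicit block
end

def explicit_block(&x)
  # A named block parameter can be called in several equivalent ways,
  # and yield still works too.
  [x.call(1, 2), x.(1, 2), x[1, 2], yield(1, 2)]
end

# The four ways of creating a proc outside of a block parameter.
ADDERS = [
  proc     { |a, b| a + b },
  Proc.new { |a, b| a + b },
  lambda   { |a, b| a + b },
  ->(a, b) { a + b }
]

p(implicit_block { |a, b| a + b })
p(explicit_block { |a, b| a * b })
p ADDERS.map { |f| f.call(1, 2) }
# An existing proc can be passed as the block by marking it with &.
p implicit_block(&ADDERS.first)
```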
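A lesser-known trick is that you can also get a reference to the proc backing the block argument by using proc or Proc.new (without passing them a block) inside the method. Note that this only works on older Rubies: since Ruby 3.0, calling proc or Proc.new without a block raises an ArgumentError.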
Procs vs Lambdas
Lambdas are a special, stricter kind of proc.
- Regular procs can perform non-local control flow: if a proc uses retry, return, break, next or redo, those will be interpreted as though the proc's code had been inlined at the point where the proc was called. This can only happen in the method in which the proc was originally defined. If called outside the method, a proc that performs non-local control flow will cause a LocalJumpError.
- If regular procs are passed too many arguments, the extra arguments are ignored instead of causing an error (the behaviour is otherwise the same as for methods, as outlined above). Lambdas behave like methods for parameter assignment.
- Regular procs with multiple parameters can be passed a single array argument, which will be automatically "splatted".
proc and Proc.new are used to create regular procs, while lambda and -> create lambdas.
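The differences listed above can be observed with the following sketch. The constants and method names are hypothetical, chosen only for this example.

```ruby
LENIENT = proc   { |a, b| [a, b] } # regular proc
STRICT  = lambda { |a, b| [a, b] } # lambda

p LENIENT.lambda?        # false
p STRICT.lambda?         # true

p LENIENT.call(1, 2, 3)  # the extra argument is ignored
p LENIENT.call([1, 2])   # a single array argument is auto-splatted

# A lambda checks its arguments like a method does.
begin
  STRICT.call(1, 2, 3)
rescue ArgumentError => e
  puts "lambda: #{e.message}"
end

# return inside a regular proc returns from the enclosing method...
def return_from_proc
  proc { return :from_proc }.call
  :after_call # never reached
end

# ...while return inside a lambda only returns from the lambda itself.
def return_from_lambda
  lambda { return :from_lambda }.call
  :after_call
end

p return_from_proc
p return_from_lambda

# Once the defining method has returned, the proc's return has nowhere to go.
def escaped_proc
  proc { return :too_late }
end

begin
  escaped_proc.call
rescue LocalJumpError
  puts "LocalJumpError raised"
end
```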
It seems possible to convert between proc and lambda like this: lambda &my_proc or proc &my_lambda. However, these conversions don't do anything: lambda &my_proc returns my_proc, and the regular proc behaviour is preserved. The converse is true for the reverse conversion.
Curly Brackets ({}) vs do .. end
There is no semantic difference between the two forms.
There is one syntactic difference because the two forms have different precedences:
foo bar { ... } # foo(bar { ... })
foo bar do ... end # foo(bar) { ... }
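The precedence difference can be made visible with two hypothetical methods that report whether they received the block:

```ruby
# Hypothetical methods that report whether they were given a block.
def outer(arg)
  [block_given? ? :outer_has_block : :outer_no_block, arg]
end

def inner
  block_given? ? :inner_has_block : :inner_no_block
end

# Curly brackets bind to the nearest method call: outer(inner { ... })
def with_curlies
  outer inner { :unused }
end

# do ... end binds to the outermost method call: outer(inner) do ... end
def with_do_end
  outer inner do
    :unused
  end
end

p with_curlies
p with_do_end
```

In the first form the block goes to inner, in the second it goes to outer.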
If you don't need to use this particularity, there are two popular ways to choose one form over the other:
- Use curlies for single-line blocks, do ... end for multi-line blocks.
- Use curlies when the return value is used, do ... end when only the side-effects matter.
Edits
1 (2017/02/09)
Thanks to Benoit Daloze, who pointed out many small mistakes in the article, as well as the fact that the proper names for what I called regular and default parameters were required and optional parameters. He also mentioned Method#parameters, and inspired me to improve the section on blocks and procs.
2 (2017/02/11)
Following a conversation with Tom Enebo on Twitter, I realized that I forgot to account for hash arguments with non-symbol keys! This led to some more investigation and the uncovering of a few errors. The assignment procedure has been revised and is now much simpler.
3 (2017/03/27)
Reader A Quiet Immanence pointed out in the comments that a conversion between regular proc and lambda doesn't do anything, and that an array argument to a regular proc with multiple parameters is auto-splatted.
4 (2017/03/27)
Will Spurgin pointed out in the comments that you can have array decomposition, splat and keyword parameters in the same method, as long as you respect the parameter ordering.