Need help with software development? I’m available for hire.

# Data Structures as Functions (or, Implementing `Set#to_proc` and `Hash#to_proc` in Ruby) At the March 2015 meeting of the London Ruby User Group, I delivered a presentation based on this blog post with some supplemental material. I’ve since tweaked some of the examples to closer match the ones in the final talk but the content should largely be the same.

Reading Henrik Nyh’s “`Array#to_proc` for hash access” made me think about a similar concept in Clojure: data structures being used as functions.

## Functions

First of all, let’s be clear what we mean by “functions”. The definition of a function in mathematics is as follows:

A function is a relation between a set of inputs and a set of permissible outputs with the property that each input is related to exactly one output.

So a function `f` can be considered as a way of mapping from some input (its “domain”) to some output (its “codomain”), e.g.

``````f(1)
# => 2
``````

Crucially, whenever the function is called with a specific input, it should always produce the same output. This means `f` in the following example would not be a function because it produces two different outputs given the same input:

``````f(1)
# => 2

f(1)
# => 3
``````

Whole programming languages are built out of this concept (it’s called functional programming for a reason) but they also appear in languages that aren’t purely functional, such as Ruby.

A common place you might find them is when using methods from `Enumerable` such as `all?` and `map`:

``````(1..10).all? { |x| x > 4 }
# => false

(1..10).map { |x| x * 2 }
# => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
``````

In both of the above cases, the blocks passed to `all?` and `map` are actually simple functions.

``````proc { |x| x > 4 }
``````

No matter how many times you call this function with the number 0, it will always return false.

``````proc { |x| x * 2 }
``````

Similarly, no matter how many times you call this function with the number 1, it will always return 2.

With this in mind, let’s take a look at functions in Clojure.

## Using sets as functions

In Clojure (as in many languages, including Ruby), there are data structures for sets and maps (not to be confused with the function of the same name but something akin to a Hash in Ruby or a dictionary in Python) but, unlike other languages, these structures can be used in places where you would expect a function.

For example, Clojure’s `filter` takes two arguments: a function `pred` and a collection `coll`. As you might expect from its name, it goes through `coll` and only returns elements that return true when fed into `pred`. We can use this to filter even numbers from a range using `even?`:

``````(filter even? (range 1 30))
;; => (2 4 6 8 10 12 14 16 18 20 22 24 26 28)
``````

Of course, `even?` can be used by itself too:

``````(even? 5)
;; => false
``````

Nothing too surprising there but what if we used a set instead of an existing function?

Let’s define a set of numbers we like:

``````(def perfect-numbers #{6 28 496})
``````

Now let’s filter our range again but using the set instead of `even?`:

``````(filter perfect-numbers (range 1 30))
;; => (6 28)
``````

To understand what happened here, let’s just try using the set as a function as we did `even?`:

``````(perfect-numbers 1)
;; => nil

(perfect-numbers 6)
;; => 6
``````

It turns out that sets can be used as functions of their members: when given some input, they will return that input if (and only if) it is a member of the set; if it isn’t, it returns `nil`. This behaviour makes for a nifty shortcut when testing whether a value is in some whitelist:

``````(def valid-attributes #{:name :age})

(defn foo [attribute]
(when (valid-attributes attribute)
;; insert money-making business logic here
(println "PROFIT")))
``````

The way this actually works in Clojure is that sets implement the `IFn` interface meaning that they have an `invoke` method that can be used to call them as if they were ordinary functions. There’s a very similar `__invoke` magic method in PHP that allows objects to be called as functions:

``````<?php
class Monkey
{
public function __invoke(\$name)
{
return "Ook, {\$name}!";
}
}

\$monkey = new Monkey;
\$monkey('Bob');
// => 'Ook, Bob!'
``````

We can achieve similar behaviour in Ruby by implementing `Set#to_proc` (in the same way Henrik implemented `to_proc` on `Array`):

``````class Set
def to_proc
proc { |element| element if member?(element) }
end
end
``````

This means we can do the same sort of filtering using `select`:

``````require "set"

perfect_numbers = Set[6, 28, 496]

(1..30).select(&perfect_numbers)
# => [6, 28]
``````

If you’re unfamiliar with the syntax above, let’s take a quick detour (if this is old news to you, feel free to skip ahead).

## What’s with the ampersand?

In Ruby, there are methods (such as `select` above) that take blocks. A good example of this is `map` which evaluates a given block for each item in a collection and returns a new collection with the results:

``````(1..30).map { |x| x.to_s }
# => ["1", "2", "3", ..., "30"]
``````

Wherever you can pass a block in Ruby, you can also pass a `Proc` using an ampersand like so:

``````stringify = proc { |x| x.to_s }

(1..30).map(&stringify)
# => ["1", "2", "3", ..., "30"]
``````

Note that `stringify` is already a `Proc` here but if it wasn’t, Ruby would internally call `to_proc` on it in an attempt to coerce it to the right type.

As mapping over a collection and calling a method on each element is quite a common task and armed with the knowledge of Ruby’s internal use of `to_proc`, a technique emerged of using symbols as a shortcut:

``````(1..30).map(&:to_s)
# => ["1", "2", "3", ..., "30"]
``````

This works by implementing `Symbol#to_proc` with something like so:

``````class Symbol
def to_proc
proc { |x| x.send(self) }
end
end
``````

(The real implementation actually features a cache to reduce some of the performance overhead of this approach.)

You can think of this as expanding like so:

``````(1..30).map(&:to_s)
(1..30).map(&:to_s.to_proc)
(1..30).map(&proc { |x| x.send(:to_s) })
(1..30).map { |x| x.send(:to_s) }
(1..30).map { |x| x.to_s }
``````

## Calling Procs

With `to_proc` implemented on sets, can we use them as we did in Clojure?

``````perfect_numbers.call(1)
# NoMethodError: undefined method `call' for #<Set: {6, 28, 496}>
``````

Sadly not, as sets themselves are not `Proc`s. Perhaps we could use the ampersand trick?

``````&perfect_numbers.call(1)
# SyntaxError: unexpected &, expecting end-of-input
# &perfect_numbers.call(1)
#  ^
``````

No luck there either. In fact, we’d have to do the following:

``````perfect_numbers.to_proc.call(1)
# => nil
``````

This is where our efforts diverge from Clojure: simply implementing `to_proc` doesn’t mean that sets are now functions. You could maybe make them more `Proc`-like with something like the following but there is no equivalent to Clojure’s `IFn` or PHP’s `__invoke` in Ruby:

``````class Object
def call(*args)
to_proc.call(*args)
end
end

perfect_numbers.call(3)
# => 6
``````

## Using hashes as functions

A slightly more interesting case is that of maps in Clojure:

``````({:a 2 :b 3} :a)
;; => 2

({:a 2 :b 3} :c)
;; => nil
``````

Like the behaviour of sets, maps are functions of their keys: given the input of the key, they return the corresponding value (or `nil` if there is no such entry). This fits quite nicely with our original mathematical functions: the keys being the domain and the values being the codomain.

As for how this might be useful, Jay Fields wrote up an interesting use case when comparing and filtering two maps. It also gives us an alternate way to access hash keys as in Henrik’s original post.

First, let’s implement `Hash#to_proc` to mirror Clojure’s behaviour:

``````class Hash
def to_proc
method(:[]).to_proc
end
end
``````

Here we simply take `Hash`’s existing `[]` method for accessing values and convert it to a `Proc`. Now we can use it like so:

``````%w(a b c d).map(&{"a" => 1, "b" => 2, "c" => 3})
# => [1, 2, 3, nil]
``````

Or, to use a more realistic example:

``````person = { name: "Robert Paulson", age: 43 }

name, age = [:name, :age].map(&person)
# => ["Robert Paulson", 43]
``````

In this second example, it’s almost the reverse of Henrik’s `Array#to_proc` behaviour where the hash is passed into a function created from the keys:

``````[{ name: "A" }, { name: "B" }].map(&[:name])
# => ["A", "B"]
``````

In our case, it is the key that is passed into a function created from the hash:

``````[:name].map(&{ name: "A" })
# => "A"
``````

## Other data structures as functions

Maps and sets are not the only data structures to implement the `IFn` interface in Clojure.

There are vectors:

``````([1 2 3] 0)
;; => 1
``````

Which would be equivalent to the following `Array#to_proc`:

``````class Array
def to_proc
method(:[]).to_proc
end
end

(1..3).map(&["A", "B", "C", "D", "E"])
# => ["B", "C", "D"]
``````

There are also keywords which are similar to Ruby’s symbols:

``````(:name {:name "Robert Paulson", :age 42})
;; => "Robert Paulson"
``````

To implement this in Ruby, we’d have to override the now standard `Symbol#to_proc` like so:

``````class Symbol
def to_proc
proc { |coll| coll[self] }
end
end

[{ name: "Robert Paulson", age: 42 }].map(&:name)
# => ["Robert Paulson"]
``````

Of course, this would mean you can no longer use it for method access.

In conclusion, we shouldn’t be afraid to experiment with concepts from other languages. After all, as Alan Perlis said:

A language that doesn’t affect the way you think about programming is not worth knowing.