Translate

Thursday, September 27, 2018

Functional Style: Lambda Functions and Map

Source: https://dzone.com/articles/functional-style-lambda-functions-and-map

Functional Style: Lambda Functions and Map


Want to learn more about using the function style in Java? Check out this post where we compare these first-class functions!             

by Richard Wild





First-Class Functions: Lambda Functions and Map

What Is a First-Class Function?

You may have heard it said before that a particular language is functional because it has “first-class functions.” As I said in the first article of this series on functional programming, I do not share this popular belief. I  agree that first-class functions are an essential feature of any functional language, but I don’t consider that this to be a sufficient condition for a language to be functional. There are lots of imperative languages that have this feature as well. But, what is a first-class function? Functions are described as first-class when they can be treated like any other value — that is to say that they can be dynamically assigned to a name or symbol at runtime. They can be stored in data structures, passed in via function arguments, and returned as function return values.
This is not actually a novel idea. Function pointers have been a feature of C since its beginning in 1972. Before that, procedure references were a feature of Algol 68, implemented in 1970, and at the time, they were considered a procedural programming feature. Going back further still, Lisp (first implemented in 1963) was built on the very concept that program code and data are interchangeable.
These are not obscure features either. In C, we routinely use functions as first-class objects. For example, when sorting:
char **array = randomStrings();
printf("Before sorting:\n");
for (int s = 0; s < NO_OF_STRINGS; s++)
    printf("%s\n", array[s]);
qsort(array, NO_OF_STRINGS, sizeof(char *), compare);
printf("After sorting:\n");
for (int s = 0; s < NO_OF_STRINGS; s++)
    printf("%s\n", array[s]);

The stdlib library in C has a collection of functions for different types of the sort routine. All of them are capable of sorting any kind of data: the only assistance they need from the programmer is to be provided a function that compares two elements of the data set and returns -1, 1, or 0, indicating which element is greater than the other or that they are equal.
This is essentially the Strategy pattern!
The comparator function for our array of pointers to strings could be:
int compare(const void *a, const void *b)
{
    char *str_a = *(char **) a;
    char *str_b = *(char **) b;
    return strcmp(str_a, str_b);
}

And, we pass it to the sorting function like this:
qsort(array, NO_OF_STRINGS, sizeof(char *), compare);

The absence of parentheses on the compare function name makes the compiler emit a function pointer instead of a function call. So, treating a function as a first-class object in C is very easy, although the signature for a function that accepts a function pointer is quite ugly:
qsort(void *base, size_t nel, size_t width, int (*compar)(const void *, const void *));

Function pointers are not only used in sorting. Long before .NET was invented, there was the Win32 API for writing Microsoft Windows applications. And before that, the there was the Win16 API. It made liberal use of function pointers to be used as callbacks. An application provided pointers to its own functions when calling into the window manager to be called back by the window manager when the application needed a notification of some event that had happened. You could think of this as an Observer pattern relationship between the application (the observer) and its windows (the observables) — the application received notifications of events like mouse clicks and keyboard presses occurring on its windows. The job of managing the windows — moving them about, stacking them on top of one another, deciding which application is the recipient of user actions — is abstracted in the window manager. The applications know nothing about the other applications they share the environment with. In object-oriented programming, we usually achieve this kind of decoupling by abstract classes and interfaces, but it can also be achieved using first-class functions.
So, we’ve been using first-class functions for a long time. But, it is probably fair to say that no language has done more for widely promoting the idea of functions as first-class citizens than humble Javascript.

Lambda Expressions

In Javascript, it has long been a standard practice to pass functions to other functions used as callbacks, just as in the Win32 API. This idea is integral to the HTML DOM, where first-class functions can be added as event listeners to DOM elements:
function myEventListener() {
    alert("I was clicked!")
}
...
var myBtn = document.getElementById("myBtn")
myBtn.addEventListener("click", myEventListener)

Just as in C, the lack of parentheses on the myEventListener function name when it is referenced in the call to addEventListener means that it is not executed immediately. Instead, the function is associated with the click event on the DOM element in question. When the element is clicked, then the function is invoked and the alert happens.
The popular jQuery library streamlines the process by proving a function that selects DOM elements by means of a query string and presents useful functions for manipulating the elements and adding event listeners to them:
$("#myBtn").click(function() {
    alert("I was clicked!")
})

First-class functions are also the means for achieving asynchronous I/O, as used on the XMLHttpRequest object that is the basis for Ajax. The same idea is ubiquitous in Node.js, too. When you want to make a non-blocking function call, you pass it a reference to a function for it to call you back on when it is done.
But, there is something else here, too. The second of these is not only an example of a first-class function. It is also an example of a lambda function. Specifically, this part:
function() {
    alert("I was clicked!");
}

A lambda function (often just called a lambda) is an unnamed function. They could have just called them anonymous functions, and then everyone would have known straight away what they are. But, that doesn’t sound as impressive, so lambda functions it is! The point of a lambda function is where you need a function in that place and only there; since it is not needed anywhere else, you simply define it right there. It doesn’t need a name. If you did need to reuse it somewhere else, then you would consider defining it as a named function and referencing it by name instead, like I did in the first Javascript example. Without lambda functions, programming with jQuery and Node would be very tiresome, indeed.
Lambda functions are defined in various ways in different languages:
In Javascript: function(a, b) { return a + b }
In Java: (a, b) -> a + b
In C#: (a, b) => a + b
In Clojure: (fn [a b] (+ a b))
In Clojure — the shorthand version: #(+ %1 %2)
In Groovy: { a, b -> a + b }
In F#: fun a b -> a + b
In Ruby, so-called “stabby” syntax: -> (a, b) { return a + b }
As we can see, most languages have gone for a more concise way of expressing lambdas than Javascript.

Map

You probably already use the term “map” in your programming to mean to the data structure that stores the objects as the key-value pairs (If your language calls this a “dictionary” instead, then great — no problem). In functional programming, the term has an additional meaning. In fact, the basic concept is really the same. In both cases, one set of things is being mapped to another set of things. In the sense of the data structure, the map is a noun — keys are mapped to values. In the programming sense, a map is a verb — a function maps an array of values to another array of values.
Say you have a function f and an array of values A = [a1, a2, a3, a4]. To map f over A means applying f to every element in A:
  • a1f (a1) = a1’
  • a2f (a2) = a2’
  • a3f (a3) = a3’
  • a4f (a4) = a4’
Then, assembling an array of the results in the same order as the inputs:
A’ = map( f, A ) = [a1’, a2’, a3’, a4’]

Map by Example

Ok, so this is interesting but a bit mathematical. How often would you do this anyway? Actually, it is a lot more often than you might think. As usual, an example explains things best, so let’s take a look at a simple exercise I lifted from exercism.io when I was learning Clojure. The exercise is called "RNA Transcription," and it is very simple. We will take a look at an input string of bases that need to be transcribed to an output string. The bases translate like this:
  • C → G
  • G → C
  • A → U
  • T → A
Any input other than C, G, A, T is invalid. The test in JUnit5 could look like this:
class TranscriberShould {
    @ParameterizedTest
    @CsvSource({
            "C,G",
            "G,C",
            "A,U",
            "T,A",
            "ACGTGGTCTTAA,UGCACCAGAAUU"
    })
    void transcribe_dna_to_rna(String dna, String rna) {
        var transcriber = new Transcriber();
        assertThat(transcriber.transcribe(dna), is(rna));
    }
    @Test
    void reject_invalid_bases() {
        var transcriber = new Transcriber();
        assertThrows(
                IllegalArgumentException.class,
                () -> transcriber.transcribe("XCGFGGTDTTAA"));
    }
}

And, we can make the tests pass with this Java implementation:
class Transcriber {
    private Map<Character, Character> pairs = new HashMap<>();
    Transcriber() {
        pairs.put('C', 'G');
        pairs.put('G', 'C');
        pairs.put('A', 'U');
        pairs.put('T', 'A');
    }
    String transcribe(String dna) {
        var rna = new StringBuilder();
        for (var base: dna.toCharArray()) {
            if (pairs.containsKey(base)) {
                var pair = pairs.get(base);
                rna.append(pair);
            } else
                throw new IllegalArgumentException("Not a base: " + base);
        }
        return rna.toString();
    }
}

The key to programming in the functional style is, unsurprisingly, to turn everything that can possibly be expressed as a function into one. So, let’s do that:
char basePair(char base) {
    if (pairs.containsKey(base))
        return pairs.get(base);
    else
        throw new IllegalArgumentException("Not a base " + base);
}
String transcribe(String dna) {
    var rna = new StringBuilder();
    for (var base : dna.toCharArray()) {
        var pair = basePair(base);
        rna.append(pair);
    }
    return rna.toString();
}

Now, we’re in a position to use the map as a verb. In Java, there is a function provided for this in the Streams API:
char basePair(char base) {
    if (pairs.containsKey(base))
        return pairs.get(base);
    else
        throw new IllegalArgumentException("Not a base " + base);
}
String transcribe(String dna) {
    return dna.codePoints()
            .mapToObj(c -> (char) c)
            .map(base -> basePair(base))
            .collect(
                    StringBuilder::new,
                    StringBuilder::append,
                    StringBuilder::append)
            .toString();
}

Hmmmm

So, let’s critique that solution. The best thing that can be said about it is that the loop has gone. If you think about it, looping is a kind of clerical activity that we really shouldn’t have to be concerned with most of the time. Usually, we loop because we want to do something for every element in a collection. What we really want to do here is take this input sequence and generate an output sequence from it. Streaming takes care of the basic admin work of iterating for us. It is, in fact, a design pattern — a functional design pattern — but, I shall not mention its name just yet. I don’t want to scare you off just yet.
I have to admit that the rest of the code is not that great, and it is mostly due to the fact that primitives in Java are not objects. The first bit of non-greatness is this:
mapToObj(c -> (char) c)

We had to do this because Java treats primitives and objects differently, and although the language does have wrapper classes for the primitives, there is no way to directly get a collection of Character objects from a String.
The other bit of less-than-awesomeness is this:
.collect(
        StringBuilder::new,
        StringBuilder::append,
        StringBuilder::append)

It is far from obvious why it is necessary to call append twice. I will explain that later, but the time is not right for that now.
I’m not going to try to defend this code — it sucks. If there were a convenient way of getting a Stream of Character objects from a String, or even an array of Characters, then there would be no problem, but we haven't been blessed with one. Dealing with primitives is not the sweet spot for FP in Java. Come to think of it, it’s not even good for OO programming. So, maybe we shouldn’t be so obsessed with primitives. What if we designed them out of the code? We could create an enumeration for the bases:
enum Base {
    C, G, A, T, U;
}
And, we have a class to act as a first-class collection holding a sequence of bases:
class Sequence {
    List<Base> bases;
    Sequence(List<Base> bases) {
        this.bases = bases;
    }
    Stream<Base> bases() {
        return bases.stream();
    }
}

Now, the Transcriber  looks like this:
class Transcriber {
    private Map<Base, Base> pairs = new HashMap<>();
    Transcriber() {
        pairs.put(C, G);
        pairs.put(G, C);
        pairs.put(A, U);
        pairs.put(T, A);
    }
    Sequence transcribe(Sequence dna) {
        return new Sequence(dna.bases()
                .map(pairs::get)
                .collect(toList()));
    }
}

This is a lot better. The pairs::get is a method reference; it refers to the get method of the instance assigned to the pairs variable. By creating a type for the bases, we have designed the possibility of the invalid input, so the need for the basePair method disappears, as does the exception. This is one advantage Java has over Clojure, which, by itself, cannot enforce types in function contracts. What’s more, the StringBuilder has disappeared as well. Java Streams are excellent for when you need to iterate a collection, process each element in some way, and build up a new collection containing the results. This probably accounts for quite a large proportion of the loops you have written in your life. Most of the housekeeping, which is not part of the real job at hand, is done for you.

In Clojure

Lack of typing aside, Clojure is somewhat more concise than the Java version, and it gives us no difficulty in mapping over the characters of a string. The most important abstraction in Clojure is the sequence; all the collection types can be treated as sequences, and strings are no exception:
(def pairs {\C, "G",
            \G, "C",
            \A, "U",
            \T, "A"})
(defn- base-pair [base]
  (if-let [pair (get pairs base)]
    pair
    (throw (IllegalArgumentException. (str "Not a base: " base)))))
(defn transcribe [dna]
  (map base-pair dna))
The business end of this code is the last line (map base-pair dna) — it is worth pointing out because you might have missed it. It means map the base-pair function over the dna string (which behaves as a sequence). If we want it to return a string instead of a list, which is what map gives us, the only change required would be:
(apply str (map base-pair dna))

In C#

Let’s try another language. An imperative approach to the solution in C# looks like this:
namespace RnaTranscription
{
    public class Transcriber
    {
        private readonly Dictionary<char, char> _pairs = new Dictionary<char, char>
        {
            {'C', 'G'},
            {'G', 'C'},
            {'A', 'U'},
            {'T', 'A'}
        };
        public string Transcribe(string dna)
        {
            var rna = new StringBuilder();
            foreach (char b in dna)
                rna.Append(_pairs[b]);
            return rna.ToString();
        }
    }
}

Again, C# does not present us with the problems we encountered in Java, because a string in C# is enumerable and all 'primitives' can be treated as objects with behavior.
We can rewrite the program in a more functional way, like this, and it turns out to be considerably less verbose than the Java Streams version. For “map” in a Java stream, read “select” in C# instead:
public string Transcribe(string dna)
{
    return String.Join("", dna.Select(b => _pairs[b]));
}

Or, if you like, you can use LINQ for its syntactic sugar:
public string Transcribe(string dna)
{
    return String.Join("", from b in dna select _pairs[b]);
}

Why Do we Loop?

You probably get the idea. If you think of the times before when you have written a loop, most often you will have been trying to accomplish one of the following:
  • Mapping an array of one type into an array of another type.
  • Filtering by finding all the items in an array satisfying some predicate.
  • Determining whether any or none of the items in an array satisfy some predicate.
  • Accumulating a count, sum or some other kind of cumulative result from an array.
  • Sorting the elements of an array into a particular order.
The functional programming features available in most modern languages let you accomplish all these things without resorting to writing loops or creating collections to store the results in. The functional style allows you to dispense with those housekeeping operations and concentrate on the real work. What’s more, the functional style allows you to chain operations together, for example, if you needed to:
  1. Map the elements of an array into another type.
  2. Filter out some of the mapped elements.
  3. Sort the filtered elements.
In the imperative style, this would require either multiple loops or one loop with much code inside it. Either way, it involves lots of administrative work that obscures the real purpose of the program. In the functional style, you can dispense the admin work and express directly what you mean. Later on, we shall see more examples of how the functional style can make your life easier.

Next Time

While I was learning functional programming and getting used to the Java streams API, every time I wrote a loop, the very next thing I would do is consider how I could rewrite it as a stream. It is usually possible. In C#, the ReSharper plugin for Visual Studio automatically suggests this kind of refactoring for you. Now that I have internalized the functional style, I just go straight for the stream and don’t even bother with the loop unless I really need one. In the next article, we will continue our exploration of first-class functions and how we can use the functional style to make our code more expressive, with a look at  filter  and reduce. Stay tuned!