Tuesday, April 8, 2014

Kotlin: Pattern matching

The past year I have evolved a simple pattern matching scheme that I now automatically incorporate into new Kotlin class hierarchies. The scheme is trivial to include, costs nothing to program and offers some beneficial features. Its development is thanks to the many nice features offered by Kotlin.

Consider the development of an immutable list type. The architecture comprises the trait ListIF, the abstract class ListAB, and the two concrete classes Nil and Cons. The ListIF trait operates as an interface, introducing the behaviours supported. The abstract class ListAB is the home for the implementation of most of the common behaviours. The concrete classes Nil and Cons are used to create list instances.

The implementation for this architecture is shown in Example 1. Note the match member function introduced in the trait. This is the pattern matching scheme for this class hierarchy. The function is given two parameters, one for each concrete class. The first parameter is a function to convert a Nil into the result type B. The second parameter is a function from a Cons to the result type B.

The concrete subclasses Nil and Cons override the definition for match. In the class Nil the first parameter is invoked against the Nil instance. In the class Cons the second parameter is invoked against the Cons instance. This pattern matching scheme is effectively the strategy design pattern.

Example 1: Pattern matching

package example1

import kotlin.test.*

trait ListIF<A> {
    fun size(): Int
    fun length(): Int
    fun interleave(xs: ListIF<A>): ListIF<A>

    fun <B> match(nil: (Nil<A>) -> B, cons: (Cons<A>) -> B): B
}

abstract class ListAB<A> : ListIF<A> {
    override fun size(): Int {...}

    override fun length(): Int {...}
    
    override fun interleave(xs: ListIF<A>): ListIF<A> {...}
}

class Nil<A> : ListAB<A>() {
    override fun <B> match(nil: (Nil<A>) -> B, cons: (Cons<A>) -> B): B = nil(this)
}

class Cons<A>(val hd: A, val tl: ListIF<A>) : ListAB<A>() {
    override fun <B> match(nil: (Nil<A>) -> B, cons: (Cons<A>) -> B): B = cons(this)
}

fun main(args: Array<String>) {
    val xs: ListIF<Int> = Cons(1, Cons(3, Cons(5, Nil<Int>())))
    val ys: ListIF<Int> = Cons(2, Cons(4, Nil<Int>()))

    assertEquals(3, xs.size())
    assertEquals(2, ys.length())
    assertEquals(4, xs.interleave(ys).length())
}

The definition for member function size is:

override fun size(): Int {
    return this.match(
        {(nil: Nil<A>) -> 0},
        {(cons: Cons<A>) -> 1 + cons.tl.size()}
    )
}

A match is performed against the recipient list and if it is empty then zero is returned. If the list is non empty then we return 1 plus the size of the remaining list. Of course we could achieve the same using a when clause and a number of is selector clauses. However, it is incumbent on the programmer to provide all the necessary choices. With the match function if you omit a choice you get a compiler error.

The second function literal given to match reveals another feature. The formal parameter cons is of type Cons and is effectively a smart cast of this. Hence in the body of this function literal we can reference the tl property of cons.

The definition for member function interleave shows how we handle nested pattern matching:

override fun interleave(xs: ListIF<A>): ListIF<A> {
    return this.match(
        {(nil1: Nil<A>) -> Nil<A>()},
        {(cons1: Cons<A>) ->
            xs.match(
                {(nil2: Nil<A>) -> Nil<A>()},
                {(cons2: Cons<A>) -> Cons(cons1.hd, Cons(cons2.hd, cons1.tl.interleave(cons2.tl)))}
            )
        }
    )
}

Interleaving the values from two lists requires a number of considerations. They are fully captured by the nested pattern matching. If the first list (this) is empty then an empty list is delivered. If the first list is not empty then we look to the second list (xs). If this is empty then an empty list is returned. Otherwise, for two non empty lists we prefix the head of the first and the head of the second on to interleaving their tails. As well as a type safe implementation of our logic it also allows us to reason about its correctness before we test it.

The recursive definition of the list data type naturally leads to a recursive definition of the classes and its member functions. Here is member function length:

override fun length(): Int {
    fun recLength(xs: ListIF<A>, acc: Int): Int {
        return xs.match(
            {(nil: Nil<A>) -> acc},
            {(cons: Cons<A>) -> recLength(cons.tl, 1 + acc)}
        )
    }

    return recLength(this, 0)
}

The nested function recLength has an accumulating parameter so that the recursive call might exploit tail call optimization. If I understand the Kotlin annotation tailRecursive this does not apply here since the recursive call is nested in the function literal. Perhaps the clever people at IntelliJ could find a way so that I can have my cake and eat it.

Of course not all my class hierarchies are defined recursively and so this issue does not arise. The pattern matching scheme is inexpensive to implement, is type safe, supports smart casts and a clarity in function bodies that we can reason about their correctness.

Friday, February 14, 2014

Kotlin: An Option type #2

The first blog introduced the OptionIF<T> trait and some of the supporting member functions. The package level functions inject and bind are used to make the OptionIF<T> trait into a monad. A monad is a way to structure computations in terms of values and sequences of computations using those values. Monads allow the programmer to build up computations using sequential building blocks, which can themselves be sequences of computations. The monad determines how combined computations form a new computation and frees the programmer from having to code the combination manually each time it is required. It is useful to think of a monad as a strategy for combining computations into more complex computations. The inject and bind functions have the signatures:

fun <T> inject(t: T): OptionIF<T>
fun <T, V> bind(op: OptionIF<T>, f: (T) -> OptionIF<V>): OptionIF<V>

The inject function is simply a factory method to create an instance of OptionIF<T>.

The basic intuition behind the bind operation is that it allows us to combine two computations into one more complex computation and allows them to interact with one another. The first parameter type OptionIF<T> represents the first computation. Significantly, the second parameter type is T -> OptionIF<V> which can, given the result of the first computation, produce a second computation to run. In other words bind(op){t -> ... } is a computation which runs op and binds the value wrapped in op to the function literal parameter t. The function body then decides what computation to run using the value for t.

Example 5 demonstrated the map member function of OptionIF<T>. Example 6 shows how we might achieve the same using inject and bind. The mapped function applies the function parameter f to the content of the option type parameter op. Significantly the definition of mapped does not have to distinguish between the concrete OptionIF<A> types of op.

Example 6: The monadic option type

package example6

import kotlin.test.*
import com.adt.kotlin.data.immutable.option.*

fun <A, B> mapped(f: (A) -> B, op: OptionIF<A>): OptionIF<B> = bind(op){(a: A) -> inject(f(a))}

fun main(args: Array<String>) {
    val name: OptionIF<String> = Some("Ken Barclay")
    val absentName: OptionIF<String> = None<String>()

    assertEquals(Some(11), mapped({(str: String) -> str.length()}, name))
    assertEquals(None<Int>(), mapped({str -> str.length()}, absentName))
}

Example 7 demonstrates how we might use the monadic option type. Running the program with the inputs 18, 27 and 9 delivers the result Some(Pair(2, 3)), demonstrating that 18 and 27 are exactly divisible by 9. The inputs 18, 27 and 6 produce the result None, showing that 18 and 27 and not both exactly divisible by 6.

Example 7: Staircasing

package example7

import kotlin.test.*
import com.adt.kotlin.data.immutable.option.*


fun divide(num: Int, den: Int): OptionIF<Int> {
    return if (num % den != 0) None<Int>() else Some(num / den)
}


fun division(a: Int, b: Int, c: Int): OptionIF<Pair<Int, Int>> {
    val ac: OptionIF<Int> = divide(a, c)
    return when (ac) {
        is None<Int> -> None<Pair<Int, Int>>()
        is Some<Int> -> {
            val bc: OptionIF<Int> = divide(b, c)
            when (bc) {
                is None<Int> -> None<Pair<Int, Int>>()
                is Some<Int> -> Some(Pair(ac.get(), bc.get()))
                else -> None<Pair<Int, Int>>()
            }
        }
        else -> None<Pair<Int, Int>>()
    }
}


fun bindDivision(a: Int, b: Int, c: Int): OptionIF<Pair<Int, Int>> {
    return bind(divide(a, c)){(ac: Int) -> bind(divide(b, c)){(bc: Int) -> inject(Pair(ac, bc))}}
}


fun main(args: Array<String>) {
    assertEquals(Some(Pair(2, 3)), division(18, 27, 9))
    assertEquals(None<Pair<Int, Int>>(), division(18, 27, 6))
    assertEquals(Some(Pair(2, 3)), bindDivision(18, 27, 9))
    assertEquals(None<Pair<Int, Int>>(), bindDivision(18, 27, 6))
}

Function division threatens to march off to the right side of the listing if it became any more complicated. It is very typical imperative code that is crying out for some refactoring.

We can bring the staircasing effect under control with bind. Consider the implementation of bindDivision. The outer call to bind(divide(a, c)){ ac -> ...} has an OptionIF value produced from the expression divide(a, c). Should this be a Some value, then the function literal is called and its formal parameter ac is bound to the result from the Some value. In the inner call bind(divide(b, c)){ bc -> ...} the inner function literal is called with the formal parameter bc bound to the Some value produced from divide(b, c). If the two calls to divide both deliver Some values then a final Some result is produced carrying the pair we seek. If a None value is delivered by either call to divide then a None value is the final result.

This simplification is also present in the two following examples. Often we make lookups of a map data structure, a database or a web service and receive an optional response. In some scenarios we take the result of a first lookup and use it to perform a second lookup. Example 8a is an application based on a set of products differentiated by a unique code. The products are organized by country, city, supplier and code. A product stock is realized by overlapping maps.

The immutable MapIF<K, V> trait and its lookUpKey member function provide the implementation. The function getProductFrom aims to deliver a Product given the country, city, supplier and code, and the product stock:

fun getProductFrom(country: String, ...,
productsByCountry: MapIF<String, MapIF<String, MapIF<String, MapIF<String, Product>>>>): OptionIF<Product>

The product, if it exists, is obtained by making repeated lookUpKey function calls down through the nested maps. If anyone fails than we bail out with None<Product>.

Example 8a: Product searching

package example8a

import kotlin.test.*
import com.adt.kotlin.data.immutable.option.*
import com.adt.kotlin.data.immutable.map.*

data class Product(val code: String, val name: String)

fun getProductFrom(country: String, city: String, supplier: String, code: String, productsByCountry: MapIF<String, MapIF<String, MapIF<String, MapIF<String, Product>>>>): OptionIF<Product> {
    val productsByCity: OptionIF<MapIF<String, MapIF<String, MapIF<String, Product>>>> = productsByCountry.lookUpKey(country)
    if (productsByCity.isDefined()) {
        val productsBySupplier: OptionIF<MapIF<String, MapIF<String, Product>>> = productsByCity.get().lookUpKey(city)
        if (productsBySupplier.isDefined()) {
            val productsByCode: OptionIF<MapIF<String, Product>> = productsBySupplier.get().lookUpKey(supplier)
            if (productsByCode.isDefined()) {
                return productsByCode.get().lookUpKey(code)
            } else
                return None<Product>()
        } else
            return None<Product>()
    } else
        return None<Product>()
}

// By country, by city, by supplier and by code
val productsByCountry: MapIF<String, MapIF<String, MapIF<String, MapIF<String, Product>>>> =
        fromSequence(
                "USA" to fromSequence(
                        "San Francisco" to fromSequence(
                                "Graham" to fromSequence(
                                        "SH" to Product("SH", "Shiraz"),
                                        "ZI" to Product("ZI", "Zinfandel")
                                )
                        )
                ),
                "France" to fromSequence(
                        "Bordeaux" to fromSequence(
                                "Jacques" to fromSequence(
                                        "SE" to Product("SE", "Saint Emilion"),
                                        "ME" to Product("ME", "Medoc")
                                )
                        ),
                        "Beaune" to fromSequence(
                                "Marcel" to fromSequence(
                                        "AC" to Product("AC", "Aloxe-Corton"),
                                        "ME" to Product("ME", "Mersault"),
                                        "PO" to Product("PO", "Pommard"),
                                        "SA" to Product("SA", "Savigny"),
                                        "VO" to Product("VO", "Volnay")
                                )
                        )
                )
        )

fun main(args: Array<String>) {
    assertEquals(Some(Product("PO", "Pommard")), getProductFrom("France", "Beaune", "Marcel", "PO", productsByCountry))
    assertEquals(Some(Product("SH", "Shiraz")), getProductFrom("USA", "San Francisco", "Graham", "SH", productsByCountry))
    assertEquals(None<Product>(), getProductFrom("France", "Beaune", "Marcel", "ZZ", productsByCountry))

}

The implementation of function getProductFrom just does not look great. It looks ugly and not code you want to be associated with in your place of work.

However, since our OptionIF<T> is monadic, we can sequence the lookUpKey calls much more gracefully. The implementation of the function getProductFrom in Example 8b is now much simpler and is reminiscent of the bindDivision function from Example 7.

Example 8b: Product searching

package example8b

import kotlin.test.*
import com.adt.kotlin.data.immutable.option.*
import com.adt.kotlin.data.immutable.map.*


data class Product(val code: String, val name: String)


fun getProductFrom(country: String, city: String, supplier: String, code: String, productsByCountry: MapIF<String, MapIF<String, MapIF<String, MapIF<String, Product>>>>): OptionIF<Product> {
    return bind(productsByCountry.lookUpKey(country)) {productsByCity ->
        bind(productsByCity.lookUpKey(city)){productsBySupplier ->
            bind(productsBySupplier.lookUpKey(supplier)){productsByCode ->
                productsByCode.lookUpKey(code)
            }
        }
    }
}


// By country, by city, by supplier and by code
val productsByCountry: MapIF<String, MapIF<String, MapIF<String, MapIF<String, Product>>>> = ...


fun main(args: Array<String>) {
    assertEquals(Some(Product("PO", "Pommard")), getProductFrom("France", "Beaune", "Marcel", "PO", productsByCountry))
    assertEquals(Some(Product("SH", "Shiraz")), getProductFrom("USA", "San Francisco", "Graham", "SH", productsByCountry))
    assertEquals(None<Product>(), getProductFrom("France", "Beaune", "Marcel", "ZZ", productsByCountry))
}

Alongside the OptionIF<T> trait and its concrete classes None<T> and Some<T> there are a number of package level utility functions that make working with options very easy and natural. The sequenceM function combines a list of monadic options into one option monad that has a list of the results of those options inside it:

fun <T> sequenceM(bs: ListIF<OptionIF<T>>): OptionIF<ListIF<T>>

Example 9 demonstrates how this might be used. Consider we have a MapIF<String, Int> for the frequency distribution of words in a document, and a ListIF<String> of words whose total occurrences we wish to obtain.

Example 9: Utility function sequenceM

package example9

import kotlin.test.*
import com.adt.kotlin.data.immutable.option.*
import com.adt.kotlin.data.immutable.list.*
import com.adt.kotlin.data.immutable.map.*

val frequencies: MapIF<String, Int> = fromSequence(
        "software" to 5,
        "engineering" to 3,
        "kotlin" to 9,
        "object" to 8,
        "oriented" to 3,
        "functional" to 3,
        "programming" to 20
)
val words: ListIF<String> = fromSequence("object", "functional")

fun getWordFrequencies(words: ListIF<String>, frequencies: MapIF<String, Int>): ListIF<OptionIF<Int>> =
        words.map{(word: String) -> frequencies.lookUpKey(word)}

fun distributions(frequency: ListIF<OptionIF<Int>>): Int {
    fun ListIF<Int>.sum(): Int = this.foldLeft(0){(res: Int) -> {(elem: Int) -> res + elem}}
    val freq: OptionIF<ListIF<Int>> = sequenceM(frequency)
    return freq.getOrElse(emptyList<Int>()).sum()
}

fun main(args: Array<String>) {
    val wordFrequencies: ListIF<OptionIF<Int>> = getWordFrequencies(words, frequencies)
    val dist: Int = distributions(wordFrequencies)
    assertEquals(11, dist)
}

Given the words and the frequencies the function getWordFrequencies maps each word into a ListIF of OptionIF<Int> through application of the lookUpKey member function. The function distributions then finds the total occurences in the ListIF<Int> (wrapped in an OptionIF) obtained from sequenceM. Note that if any of the OptionIF<Int> in the ListIF passed to sequenceM is a None<Int> then a None<ListIF<Int>> is returned.

The OptionIF<T> provides a type for representing optional values. It also acted as a vehicle for demonstrating powerful abstractions that can hide mundane implementation details. Using higher order abstractions such as higher order functions, monads, etc. we can simplify the implementation of function bodies. 

These observations have inspired me to reconsider how I implement function logic. I aim to reduce the complexity of function bodies by reducing or removing my use of control structures such as if statements nested in while statements, representative of much of my existing imperative code. By presenting function bodies as simple sequential actions they are much easier to write and much easier to reason about their correctness. Functions that are much more trustworthy. Functions such as bindDivision and distributions are simple illustrations of what I seek to achieve.

Kotlin: An Option type #1

This pair of blogs introduces the Kotlin Option type that allows the programmer to specify where something is present or absent. It can be used in place of null to denote absence of a result. The Option type offers more than a new data type: more powerful abstractions, such as higher order functions, that hide many of the details of mundane operations such as mapping a function across a data type. We get to focus on the what, not the how making our code more declarative: incentivizing us to make our code blocks simpler.

Probably all Java developers have experienced a NullPointerException. Usually this occurs because some function returns it when not expected and when not dealing with that possibility in your client code. The value null is also used to represent the absence of a value such as when performing the member function get on a Map.

Kotlin, of course, allows you to work safely with null. The null-safe operator for accessing properties is foo?.bar?.baz will not throw an exception if either foo or its bar property is null. Instead, it returns null as its result.

An alternative approach that seeks to reduce the need for null is to provide a type for representing optional values, i.e. values that may be present or not: the OptionIF<T> trait. By stating that a value may or mat not be present on the type level, you and other developers are required by the compiler to deal with this possibility. Further, by providing various combinators we can chain together function calls on option values.

Kotlin OptionIF<T> is a container for zero or one element of a given type. An OptionIF<T> can be either a Some<T> wrapping a value of type T, or can be a None<T> which represents a missing value. You create an OptionIF<T> by instantiating the Some or None classes:

val name: OptionIF<String> = Some("Ken Barclay")
val absentName: OptionIF<String> = None<String>()

Given an instance of OptionIF<T> and the need to do something with it, how is this done? One way would be to check if a value is present by means of the isDefined member function, and if that is the case obtain the value with the get member function. This is shown in Example 1.

Example 1Accessing options

package example1

import com.adt.kotlin.data.immutable.option.*

fun main(args: Array<String>) {
    val op: OptionIF<Int> = Some(5)
    if (op.isDefined())
        println("op: ${op.get()}")  // => op: 5
}

Deliberately, I have chosen to include types explicitly as an aid to the reader. Of course, Kotlin's type inferencing would allow me to omit some and, in turn, make the code more compact.

The most common way to take optional values apart is through a pattern match. Example 2 introduces the show function which pattern matches on the OptionIF parameter and for a Some instance the member function get is used to retrieve the enclosed value.

Example 2: Pattern matching options

package example2

import com.adt.kotlin.data.immutable.option.*

fun show(op: OptionIF<String>) {
    when (op) {
        is Some<*> -> println(op.get())
        else -> println("Missing")
    }
}

fun main(args: Array<String>) {
    val name: OptionIF<String> = Some("Ken Barclay")
    val absentName: OptionIF<String> = None<String>()

    show(name)              // => Ken Barclay
    show(absentName)    // => Missing
}

If you think this is clunky and expect something more elegant from Kotlin OptionIF<T> you are correct. If you use the member function get, you may forget about checking with isDefined leading to a runtime exception, and this scenario is no different from using null. You should avoid using options this way. One simple improvement we can make to these use cases is provided by the getOrElse member function as shown in Example 3.

Example 3: Provide a default

package example3

import com.adt.kotlin.data.immutable.option.*

fun main(args: Array<String>) {
    val name: OptionIF<String> = Some("Ken Barclay")
    val absentName: OptionIF<String> = None<String>()

    println(name.getOrElse("Missing"))              // => Ken Barclay
    println(absentName.getOrElse("Missing"))    // => Missing

}

I suggested that we consider an option as a collection of zero or one elements. Consequently it is provided with many of the behaviors we associate with containers such as filtering, mapping and folding. Example 4 illustrates transforming an OptionIF<String> into an OptionIF<Int>. When you map an OptionIF<String> that is a None<String> then you get a None<Int>.

Example 4: Mapping

package example4

import com.adt.kotlin.data.immutable.option.*

fun main(args: Array<String>) {
    val name: OptionIF<String> = Some("Ken Barclay")
    val absentName: OptionIF<String> = None<String>()

    println(name.map{(str: String) -> str.length()}.getOrElse(0))         // => 11
    println(absentName.map{str -> str.length()}.getOrElse(0))           // => 0
}

The member function map is a higher order function, accepting a transformer function as parameter. Using customizable higher order functions allows us to think about solutions differently. Function map allows us to apply a generic operation to the data type and means we can focus on the result.

A somewhat more elaborate illustration is given in Example 5 which implements a repository of users. We need to be able to find a user by their unique id. A request made with a non-existent id suggests a return type of OptionIF<User>.

Example 5: User repository

package example5

import com.adt.kotlin.data.immutable.option.*

class User(val id: Int, val firstName: String, val lastName: String, val age: Int)

class UserRepository {

    fun findById(id: Int): OptionIF<User> {
        return if (users.containsKey(id)) {
            val user: User = users.get(id)!!
            Some(user)
        } else
            None<User>()
    }

// ---------- properties ----------------------------------

    val users: Map<Int, User> = hashMapOf(
            1 to User(1, "Ken", "Barclay", 25),
            2 to User(2, "John", "Savage", 31)
    )
}

fun main(args: Array<String>) {
    val repository: UserRepository = UserRepository()
    val user: OptionIF<User> = repository.findById(1)
    val userName: OptionIF<String> = user.map{u -> "${u.firstName} ${u.lastName}"}
    if (userName.isDefined())
        println("User: ${userName.get()}")      // => User: Ken Barclay
}

Our dummy implementation uses a HashMap. Perhaps we might develop a Map<K, V> implementation with a lookUpKey member function that returns an OptionIF<V>.

In the follow-on blog I will consider some more advanced use-cases with OptionIF<T>. I aim to show how more powerful abstractions over the OptionIF<T> type can reduce the complexity of functions, encouraging us to cede control to these abstractions and remove the need to code each atomic step in an algorithm.


Thursday, June 21, 2012

function(groovy) #5


function(groovy) #5

If all you have is a hammer, everything looks like a nail.

Abstract: One of the challenges of a new paradigm is learning to think in an entirely new way. The ability to pass code as a parameter introduces a new style to the imperative programmer. A declarative style abstracts out generic pieces of machinery and customizes them via higher-order functions.

Ken Barclay
kenbarc@googlemail.com

Previous:           The iterator methods
Resources:        Chapter 12 Higher Order Functions

The iterator functions

Many functions to perform functional style List manipulation are defined as static methods and closures in the companion class GListF. The class is defined as abstract so there is no intention of creating an instance. It simply acts as a repository for these useful functions. The arrangement is not unlike the latest Groovy extension methods.

For example, this class defines the function map that transforms a list of values in the manner of the GDK's collect method using a function as its transformer. The map function is provided in two flavors: a curried form and an uncurried form. Of course, the function C is used to define the curried version from its uncurried version. The curried version is given the suffix C in its name:

/**
 * Function map takes two parameters: a function and a list. Function map
 *   applies the function parameter to each item in the list, delivering
 *   a new list.
 *
 * map:: (X -> Y) * [X] -> [Y]
 * mapC:: (X -> Y) -> [X] -> [Y]
 *
 * @param fxy     pure function:: X -> Y
 * @param ys      existing list
 * @return        new list of transformed values
 */
private static final <X, Y> List<Y> _map(final Closure<Y> f, final List<X> xs) { xs.collect(f) }
public static final map = GListF.&_map
public static final mapC = C(map)

For most applications function map will suffice. However, in some advanced examples we will find we wish to call mapC with its single function parameter.

Observe how the behavior is first defined with the _map method. The closures map and mapC are defined from it. The &. operator is used to get a reference to a method, and from it create a reference to a closure. The aim of this scheme is to define the _map method with the generic type parameters X and Y. Including both the parameter type and result type fully documents the values involved in using this method. Groovy does not support generic type parameters when defining closures and so this approach circumvents this limitation. A Groovy static type checker would assist greatly.

All the methods/closures defined in GListF present the Groovy list with an immutable interface. As a consequence, some of the operations are not especially efficient. This issue is revisited in a later blog where we implement a more efficient immutable list. Notwithstanding this, we can begin presenting solutions that are more declarative than would be possible using Groovy lists directly.



Mapping

The GDK's collect iterator method is used to iterate through the elements of the receiving list object and transforms each element into a new value using the closure parameter as a transformer, returning a list of transformed values.

A pictorial representation of how map behaves is shown in Figure 5.1. On the left we show some function (closure) f which when applied to a single parameter (represented by the circle) delivers a value (the square). Now, if we have a list of values of type circle, then map f list delivers a list of values of type square produced by applying f to each of the originals.

Figure 5.1: Application of map


As we noted, the companion class GListF contains a rich selection of functions (static closure definitions) for processing lists. Example 5.1 revisits the previous example, this time using the mapC function from GListF.

Example 5.1: Curried map

import static com.adt.groovy.data.immutable.list.GListF.mapC

    // multiply:: Integer * Integer -> Integer
final multiply = { final int x, final int y -> x * y}
    // twice:: Integer -> Integer
final twice = multiply.curry(2)
    // quadruple:: Integer -> Integer
final quadruple = multiply.curry(4)

    // twiceAll:: [Integer] -> [Integer]
final twiceAll = mapC(twice)
    // quadrupleAll:: [Integer] -> [Integer]
final quadrupleAll = mapC(quadruple)

final List<Integer> numbers = [1, 2, 3, 4]
assert twiceAll(numbers) == [2, 4, 6, 8]
assert quadrupleAll(numbers) == [4, 8, 12, 16]

It is relatively easy to prove that map(f g, xs) = (map f map g)(xs), where ● represents function composition. Applying f g to every element of a list is the same as applying g to every member of the list and then applying f to every member of the resulting list. The equality demonstrates how we might refactor our code to make it more efficient. If our code includes (map f map g) then we optimize it with map(f g). The former makes one pass across the initial list then a second pass across the result list. The latter makes only a single pass across the initial list reducing the amount of work to be done. This equality is demonstrated in Example 5.2.

Example 5.2: Map composition equality

import static com.adt.groovy.data.immutable.list.GListF.mapC
import static com.adt.groovy.data.immutable.list.GListF.map

    // twice:: Integer -> Integer
final twice = { final int x -> 2 * x}
    // successor:: Integer -> Integer
final successor = { final int x -> 1 + x}

    // composeTwiceSuccessor:: Integer -> Integer
final composeTwiceSuccessor = twice >> successor

    // composeMapTwiceMapSuccessor:: [Integer] -> [Integer]
final composeMapTwiceMapSuccessor = mapC(twice) >> mapC(successor)

final List<Integer> numbers = [1, 2, 3, 4]
assert map(composeTwiceSuccessor, numbers) == composeMapTwiceMapSuccessor(numbers)

The function composeTwiceSuccessor transforms an Integer into another Integer. The left side of the assert applies this transformation to the list of numbers. On the right side of the assert the function composeMapTwiceMapSuccessor first doubles the values in the numbers list, then increments by one the values in the resulting list.



Filtering

It is also common to select all the elements from a list with some given property. For example we might wish to choose only those overdrawn bank accounts from a list of accounts. We usually describe this as filtering. Example 5.3 introduces the higher order function filter in which the selection criterion is specified by the function parameter. The selection function is sometimes known as a predicate ― a function returning a Boolean value which describes whether the criterion has been met.

Example 5.3: The filter function

import static com.adt.groovy.data.immutable.list.GListF.filter

    // isEven:: Integer -> Boolean
final isEven = { final int x -> (x % 2 == 0)}

final List<Integer> numbers = [11, 12, 13, 14]
assert filter(isEven, numbers) == [12, 14]
assert filter(isEven, []) == []

def bk1 = new Book(title: 'Groovy Programming', author: 'Barclay', price: 22.90, category: 'CompSci')
def bk2 = new Book(title: 'Groovy in Action', author: 'Konig', price: 32.00, category: 'CompSci')
def bk3 = new Book(title: 'Grails', author: 'Rocher', price: 29.00, category: 'CompSci')
def bk4 = new Book(title: 'Thinking in Java', author: 'Eckel', price: 24.50, category: 'CompSci')

def List<Book> books = [bk1, bk2, bk3, bk4]

assert filter({bk -> (bk.price < 25.00)}, books) == [bk1, bk4]

The GDK's iterator method findAll is used to implement the filter function.

Earlier we showed how proving equality of two mappings can lead to more efficient programs. We can also show that:

filter p ● map f = map f ● filter(p ● f)

The equation says that a map followed by a filter can be replaced by a filter followed by a map. The right side is potentially more efficient than the left since the map could be applied to a shorter list. This is demonstrated by Example 5.4.

Example 5.4: Maps and filters

import static com.adt.groovy.data.immutable.list.GListF.filterC
import static com.adt.groovy.data.immutable.list.GListF.mapC

    // isWrittenBy:: String -> Book -> Boolean
final isWrittenBy = { final String auth -> { final Book book -> (book.author == auth)}}

    // priceIncrease:: BigDecimal -> Book -> Book
final priceIncrease = { final BigDecimal amount -> { final Book book -> new Book(title: book.title, author: book.author, price: book.price + amount, category: book.category)}}

final mapThenFilter = mapC(priceIncrease(2.00)) >> filterC(isWrittenBy('Ken Barclay'))
final filterThenMap = filterC(priceIncrease(2.00) >> isWrittenBy('Ken Barclay')) >> mapC(priceIncrease(2.00))

final book1 = new Book(title: 'Groovy Programming', author: 'Ken Barclay', price: 20.45, category: 'CompSci')
final book2 = new Book(title: 'Groovy in Action', author: 'Dierk Konig', price: 32.00, category: 'CompSci')
final book3 = new Book(title: 'Object Oriented Design', author: 'Ken Barclay', price: 25.50, category: 'CompSci')
final book4 = new Book(title: 'Programming Groovy', author: 'Venkat Subramaniam', price: 29.50, category: 'CompSci')
final List<Book> books = [book1, book2, book3, book4]

assert mapThenFilter(books) == filterThenMap(books)

Figure 5.2 describes the filter closure. If p represents some predicate function (closure, returning a boolean value), then when applied to a triangle it delivers true and when applied to a square it produces false. Now filter p list delivers a new list containing only those triangles that satisfy the predicate.

Figure 5.2: Application of filter




Folding

It is also common to reduce a collection to a single value. For example we might wish to sum or multiply all the elements in a list. This is the basis for a particular kind of higher order function called folding. Folding is used to fold a function (such as a summation) into a list of values.

The fold function reduces a list to a single value by "folding" a binary operator between successive elements of the list. For example, consider we have the list [x1, x2, x3, ..., xn], an initial value e, and the binary operator □, then:

fold □ e [x1, x2, x3, ..., xn] = e □ x1 □ x2 □ x3 □ ... □ xn

The fold function takes a binary function □, together with an initial value e and a list. Notice that the version of fold above is left-associative. In other words, the result it produces is equivalent to doing:

( ... (((e □ x1) □ x2) □ x3) ... □ xn)

Example 5.5 introduces the function foldLeft implemented using the GDK's each iterator method. It has three parameters: the binary function to be folded in, the initial value when starting the folding, and the list of values. The type signature for the function given as:

/**
 * foldLeft is a higher-order function that folds a binary function into a
 *   list of values. Effectively:
 *
 *   foldLeft(f, e, [x1, x2, ..., xn]) = (...((e f x1) f x2) f...) f xn
 *
 * foldLeft:: (Y -> X -> Y) * Y * [X] -> Y
 * foldLeftC:: (Y -> X -> Y) -> Y -> [X] -> Y
 *
 * @param f     pure binary function:: Y -> X -> Y
 * @param e       initial value
 * @param xs      existing list
 * @return        folded result
 */
private static final <X, Y> Y _foldLeft(final Closure<Y> f, final Y e, final List<X> xs) {
  Y res = e
  xs.each{ x -> res = f(res)(x) }
  return res
}
public static final foldLeft = GListF.&_foldLeft
public static final foldLeftC = C3(foldLeft)

Note that the function parameter is expected in its curried form. The function parameter combines a Y and an X to produce a Y. The initial value for foldLeft is of type Y and is combined with the first member of the list. The result of type Y is then combined with the second member of the list, and so on. Ultimately, a Y value is delivered. The companion class GListF includes the two variants foldLeft and foldLeftC.

Example 5.5: The foldLeft function

import static com.adt.groovy.fp.FunctionF.C
import static com.adt.groovy.data.immutable.list.GListF.appendC
import static com.adt.groovy.data.immutable.list.GListF.foldLeft

    // add:: Integer * Integer -> Integer
final add = { final int x -> { final int y -> x + y }}

final List<Integer> numbers = [11, 12, 13, 14]
final int sum = foldLeft(add, 0, numbers)
assert sum == 50

    // logicalAnd:: Boolean -> Boolean -> Boolean
final logicalAnd = { final boolean x -> { final boolean y -> x && y }}

final List<Boolean> truthful = [true, true, true]
final boolean allTruthful = foldLeft(logicalAnd, true, truthful)
assert allTruthful == true

final List<Integer> flatten = foldLeft(appendC, [], [[1, 2, 3], [4, 5], [6]])
assert flatten == [1, 2, 3, 4, 5, 6]

Folding is a surprisingly fundamental operation. Its generality allows us to define many standard list processing operations. Example 5.6 defines the primitive operation reverse in terms of a left fold. The initial value given to foldLeft is the empty list [] into which the reversed list is assembled. The binary function required by foldLeft must then have the signature [X] -> X -> [X] where the second parameter is prepended on to the list first parameter. Of course consC does just this but has its parameter in the opposite order. Functions flip and flipC performs the necessary transformation, converting the function X -> Y -> Z to Y -> X -> Z.

Example 5.6: Folding and primitive operations

import static com.adt.groovy.fp.FunctionF.flipC
import static com.adt.groovy.data.immutable.list.GListF.consC
import static com.adt.groovy.data.immutable.list.GListF.foldLeft

    // reverse:: [A] -> [A]
final reverse = {xs -> foldLeft(flipC(consC), [], xs)}

assert reverse([1, 2, 3, 4]) == [4, 3, 2, 1]
assert reverse(['Ken', 'John', 'Jessie']) == ['Jessie', 'John', 'Ken']

These examples illustrate how the functional style gives us new and different building blocks. Some of the examples used the curried variants mapC and filterC to produce specialist processing functions. No new code had to be developed other than the closure parameters. The curried forms are our factories for functions.

The examples also show a reduced use of control flow such as for statements. This offers many advantages. First, the code is easier to read and write. That means it will be easier to reason about its correctness. Control flow statements are usually accompanied by state variables that are updated. In its place we see code in which we sequentially bind results to identifiers then finish the code with the resulting expression.

These extension methods are not, in some cases. especially efficient. For example, the concatenate method produces a new list by joining together two existing lists. Immutable data structures, the subject of the next blog, can offer a solution.