Lambda To The Rescue: Implementation Details

This is the sixth post in a series about using Functional Programming concepts to make your Object Oriented code more comprehensible. Start here if you want to read the whole thing.

🤔 How do you hide your internals?

In OOP, we're used to hiding the implementation details of our classes by using interfaces. We can just define a contract that the rest of the application needs to adhere to, and keep the knowledge of the internals completely separate from the rest of the system. Let's look at a simple interface:


namespace Dns;

interface Client
{
    public function resolve(Request $request): Response;
}

As a consumer of this interface, we know everything we need to know to start programming. We'll inject a Dns\Client instance into the class we're writing later; its actual implementation doesn't concern us right now. We have enough information to build our application, knowing that we can send the client a Request and get a Response back. The internals of the Dns\Client can be changed at will, without altering our program.

For me, this was the biggest breakthrough of learning an Object Oriented language, the moment when using interfaces as contracts clicked. That's why, when I started to dive into Functional Programming languages, I wasn't really happy with what I saw. Where were the interfaces at?

😅 Exporting from modules as interface

The first Functional Programming language I learnt was Scheme. It was great! When you start learning it, you start to see the recursion patterns, see that syntax is only as important as the language makes it, and countless other nice things (programming with continuations, anyone?). What bothered me was that lists are used as the main data structure everywhere! You'll sometimes find libraries that do something like this:

(define address
  (list "Toon Daelman" "FooBarStreet 42" "9000 Ghent" "Belgium"))

Yes, that's a workable data structure, but it's not ideal... To get a person's country from their address, you need to do something like this:

(list-ref address 3)

Which means: from the address list, take the element at index 3 (zero-based, so the fourth item). That's not very readable at all, and it doesn't hide any of the details of our data structure. If we want to change it, every function interacting with this data structure will need to change as well.

Luckily, the problem was me. I didn't look far enough, and most people working with Lisps have other ways of hiding their internals, for instance:

(module address (address country)
  (import scheme)

  (define address
    (lambda (name line1 line2 country)
      (list name line1 line2 country)))

  (define country
    (lambda (address)
      (list-ref address 3))))

This is a Scheme module that exports two functions, address and country. It imports the base library scheme. It defines a function address that acts like a constructor and returns a black-box object, which you can take apart using separate functions. In this case we only have a country function that takes an address and returns its country. Modules consuming this address module only need to know the constructor and the other exported functions, not that the underlying object is still a list!

And this lets us change the implementation as well!

(module address (address country)
  (import scheme)

  (define address
    (lambda (name line1 line2 country)
      (vector name line1 line2 country)))

  (define country
    (lambda (address)
      (vector-ref address 3))))

We're now using vectors as the datatype for address instead of lists, but the address constructor and the country getter function still have the same names and behave exactly the same. We could use Scheme's record features as well, still keeping the same public interface...

🎩 Types

The second Functional Programming language I started to look into was Haskell. It immediately blew my mind with its type system. Let's check out this piece of code:

type AddressLine = String
type Country = String

data Address = Address
  { name :: String
  , line1 :: AddressLine
  , line2 :: AddressLine
  , country :: Country
  } deriving (Show)

countryFrom :: Address -> Country
countryFrom = country

We define two type aliases, AddressLine and Country. Then we say that an Address consists of a name, a line1 which is an AddressLine, a line2 (also an AddressLine) and a country of the type Country. Then we define a function called countryFrom that takes an Address and returns a Country. It's implemented by just saying it's equal to country, the accessor function that Haskell automatically creates for that record field.
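To see the record in action, here's a minimal sketch that builds an Address and reads the country back out. The home value and the moveTo helper are my own additions for illustration, they're not part of the example above.

```haskell
type AddressLine = String
type Country = String

data Address = Address
  { name :: String
  , line1 :: AddressLine
  , line2 :: AddressLine
  , country :: Country
  } deriving (Show)

countryFrom :: Address -> Country
countryFrom = country

-- Hypothetical sample value, built with record syntax.
home :: Address
home = Address
  { name = "Toon Daelman"
  , line1 = "FooBarStreet 42"
  , line2 = "9000 Ghent"
  , country = "Belgium"
  }

-- Hypothetical helper: record update syntax copies the address
-- and replaces only the country field.
moveTo :: Country -> Address -> Address
moveTo newCountry address = address { country = newCountry }
```

In GHCi, countryFrom home gives "Belgium" and countryFrom (moveTo "France" home) gives "France". Callers that stick to countryFrom never have to care how the fields are laid out.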

In fact, I only created the countryFrom function to show off the type annotations in a situation analogous to the previous example in Scheme. The type system in Haskell not only allows us to write the contract of a function, it allows us to write abstractions as well! Check this out:

head :: [a] -> a

This is the type of the head function, which operates on lists. It takes a list of as and returns an a. The a is a type variable: it stands in for any type you can think of. This way, you can strictly type a function that can work with all sorts of types! Take for instance a list of Addresses! Call head on that list, and you'll get the first Address of the list.
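Here's a small sketch of writing such a function yourself. Note that safeHead is a hypothetical helper of mine, not something from the Prelude: it has almost the same type as head, but since head crashes on an empty list, this one wraps the result in a Maybe.

```haskell
-- Hypothetical helper: like head, but it returns Nothing for an
-- empty list instead of throwing an error.
safeHead :: [a] -> Maybe a
safeHead []      = Nothing
safeHead (x : _) = Just x

-- The same function works for any element type.
firstNumber :: Maybe Int
firstNumber = safeHead [1, 2, 3]

firstWord :: Maybe String
firstWord = safeHead ["lambda", "to", "the", "rescue"]

nothingHere :: Maybe Bool
nothingHere = safeHead []
```

The type variable a gets filled in by the compiler at every call site: Int for firstNumber, String for firstWord, Bool for nothingHere.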


😍 I love this, it's like magic! And it goes even further. Let's say you want to be able to declare the function == which takes two arguments and checks if they're equal to each other. You would think that that would be easy using type variables, doing something like this:

(==) :: a -> a -> Bool

Which means, a function == which takes two arguments of the same type (we use a for both arguments, which means we don't care which type it is, but it should be the same for both arguments) and returns a Bool. The problem is that there's no certainty that the type a has a concept of equality to it. It could be that we want to be able to define our own rules for equality on a type-by-type basis as well... That's why that signature a -> a -> Bool isn't enough.

Haskell actually has another abstraction over its types that allows us to put the a -> a -> Bool in a context:

class Eq a where
  (==) :: a -> a -> Bool
  x == y = not (x /= y)

  (/=) :: a -> a -> Bool
  x /= y = not (x == y)

This is a typeclass named Eq that defines equality for every type a that we say is part of the typeclass. It defines two functions, == and /=, which both take two arguments of the type a and return a Bool. What's also nice, is that they're both defined in terms of each other. == says that it's not (/=) and vice versa.
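To play with those mutual default definitions without clashing with the real Eq, here's a sketch using a copy of the class under a different name. MyEq, its operators === and =/=, and the Light type are all made up for this illustration.

```haskell
-- A hypothetical copy of Eq under another name, so it doesn't clash
-- with the real class from the Prelude.
class MyEq a where
  (===) :: a -> a -> Bool
  x === y = not (x =/= y)

  (=/=) :: a -> a -> Bool
  x =/= y = not (x === y)

-- Hypothetical type to hang an instance on.
data Light = Red | Green

-- We only write (===); (=/=) falls back to the default definition.
instance MyEq Light where
  Red   === Red   = True
  Green === Green = True
  _     === _     = False
```

With this instance, Red =/= Green is True even though we never wrote =/= for Light. One thing to watch out for: an instance that implements neither function still compiles, but the two defaults will then call each other forever.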

Now, if we want to make our Address type part of the typeclass Eq, we can just implement the == function, and we get the other one for free because it's defined in terms of ==. Let's look at an example:

instance Eq Address where
  x == y = sameAddressLines && sameCountry
    where sameAddressLines = (line1 x == line1 y) && (line2 x == line2 y)
          sameCountry = country x == country y

In this case, we say Addresses are the same if their AddressLines and Country are the same. We don't take name into consideration. Now we can use == on Addresses everywhere. If you don't need special rules for equality on a given type, the Haskell compiler can derive the instance for you with a deriving (Eq) clause.
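Here's a sketch that puts the custom instance next to a derived one. The Person type and the two sample addresses are hypothetical, they only exist to show the difference.

```haskell
type AddressLine = String
type Country = String

data Address = Address
  { name :: String
  , line1 :: AddressLine
  , line2 :: AddressLine
  , country :: Country
  } deriving (Show)

-- The custom rule from the text: the name is ignored.
instance Eq Address where
  x == y = sameAddressLines && sameCountry
    where sameAddressLines = (line1 x == line1 y) && (line2 x == line2 y)
          sameCountry = country x == country y

-- Hypothetical type with compiler-derived, field-by-field equality.
data Person = Person
  { personName :: String
  , age :: Int
  } deriving (Eq, Show)

-- Hypothetical sample values: same address, different name.
toon, neighbour :: Address
toon = Address "Toon Daelman" "FooBarStreet 42" "9000 Ghent" "Belgium"
neighbour = toon { name = "Someone Else" }
```

Here toon == neighbour is True, because only the address lines and the country are compared. The derived instance for Person looks at every field, so Person "Toon" 30 == Person "Toon" 31 is False.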

🤓 What can we learn from this?

We've seen two ways implementation details can be hidden in a functional programming environment. When we're in Object Oriented environments we do almost the same things, but we use interfaces for them. And we use interfaces to describe contracts for many other things we want to do. That's why I'm sometimes confused when I stumble upon an interface: what's the primary reason for it to be here? Is it there to define a contract for external systems? Is it meant to hide implementation details, or is it just part of a design pattern used in this package? Is it a marker interface used to indicate the type of the implementing class?

Not every class needs to implement an interface... It already has one! All public methods of a concrete class can be seen as its public interface. And I think in most cases like the first example in Scheme, where we want to hide the internals of a datatype, a concrete class can be enough (think ValueObjects).

I find that in Object Oriented Programming, thinking about types the way you do when writing e.g. Haskell tends to help when defining your interfaces effectively. One of the biggest differences is the manifestation of side effects: in Haskell, a function that performs side effects says so in its type (IO), while interfaces in Object Oriented languages mostly don't give you any insight into that (I've often thought about annotating side effects in the docblocks of my interfaces).
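A tiny sketch of what that looks like in Haskell. Both function names are hypothetical; the point is only the difference between the two type signatures.

```haskell
import Data.Char (toUpper)

-- Pure: the type promises that the result depends only on the argument.
shout :: String -> String
shout = map toUpper

-- Effectful: the IO in the type tells every caller that this function
-- talks to the outside world (here: it prints to the console).
shoutToConsole :: String -> IO ()
shoutToConsole message = putStrLn (shout message)
```

Someone reading only the signatures already knows that shout is safe to call anywhere, while shoutToConsole does something observable. That's the information a docblock annotation on an interface would try to approximate.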

While PHP doesn't allow you to implement your own Equality rules for your classes, some Object Oriented languages do. Compare these examples:

$foo == $bar;

$foo->equals($bar);

In the first example, we can't influence how PHP compares two instances of our class, while in the second one, we have complete control (even over the name of the method). Which one is best is a matter of taste, especially when your language allows you to overload == for your classes. Always try to make it as readable as possible!

Wow, we made it through! Hope to see you in the next episode! Happy programming y'aλλ! 🖖
