People receiving code reviews from me on a regular basis will know this by now, but if I see inheritance happening in new code, I get moody. I look the extends in the eye and prepare for battle. I pick up my SOLID axe and start examining the principle that will allow me to slay this sub-class beast. And then I cry a little because I'll have to explain again where this centuries-old grudge comes from.

Now, don't get me wrong. I do understand the usefulness of inheritance in very specific cases. But most of the time when I encounter a subclassed beast during a code review I go all out offensive on it. Let's explore why:

Single Responsibility Principle

If you've been Object Oriented Programming for some time and you've been looking around a bit, you've definitely encountered the SOLID principles. These are basically principles (not rules) that, when followed, will help you produce code that's easier to comprehend, change and test than when you were not following them. The first principle is the Single Responsibility Principle, which states that a class should have only one responsibility, or "one reason for change".

Let's try and use that to mitigate this case of inheritance:

interface Users
{
    public function getById(UserId $id): User;
}

class UsersDB implements Users
{
    protected $db;

    public function __construct(DB $db)
    {
        $this->db = $db;
    }

    public function getById(UserId $id): User
    {
        // ...
    }
}

class UsersCached extends UsersDB
{
    protected $cache;

    public function __construct(Cache $cache, DB $db)
    {
        parent::__construct($db);

        $this->cache = $cache;
    }

    public function getById(UserId $id): User
    {
        // get from cache?

        // if not in cache
        $user = parent::getById($id);

        // put in cache

        return $user;
    }
}

Apart from all other issues that we have here, let's try to find out if the Single Responsibility Principle holds for this implementation. Looking at the class UsersDB, we can clearly see that it has just one responsibility: it represents a bunch of Users that are persisted to our database. It has no other reasons to change than for the purpose of representing users persisted in our database.

Now let's look at the UsersCached implementation. It represents a bunch of Users as well, in this case it caches them. We could think that this class has only one responsibility, until we see the parent::getById() call. It basically calls a method that we inherited (so it's available in our class) from the parent class. The knowledge needed to query the database is actually in our class because of the inheritance!

You could argue that it's still ok since it's clearly a parent class and nothing's wrong there, but let's look at it through the lense of a class should have only one reason for change. What if we decide that the constructor of the UsersDB class will now take a PDO instance instead of our own DB class? Guess what! The implementation of our UsersCached needs to change as well, because it needs to pass the PDO instance on to its parent. So a decision about the database made us change the caching class. And it's obvious that changing the caching mechanism would also affect the caching class. Which gives us more than one reason for change of this class.

This is clearly a violation of the Single Responsibility Principle.

We could fix this e.g. by using the Decorator Pattern:

interface Users
{
    public function getById(UserId $id): User;
}

final class UsersDB implements Users
{
    private $db;

    public function __construct(DB $db)
    {
        $this->db = $db;
    }

    public function getById(UserId $id): User
    {
        // ...
    }
}

final class UsersCached implements Users
{
    private $users;
    private $cache;

    public function __construct(Users $users, Cache $cache)
    {
        $this->users = $users;
        $this->cache = $cache;
    }

    public function getById(UserId $id): User
    {
        // get from cache?

        // if not in cache, get from nested user object
        $user = $this->users->getById($id);

        // put in cache

        return $user;
    }
}

As you can see, it's only a very small change.

  • we added final keywords to our classes and private to our variables to indicate that we don't want inheritance to happen 😏
  • we made both classes implement the Users interface
  • we injected an implementation of the Users interface into the constructor of the UsersCached, which will be called when no user is found in the cache.

You could try and argue that this is no better because we still rely on the Users interface to stay stable. But if that interface changes, the responsibility "representing a bunch of users" changes, which is exactly the one responsibility that we have for our class. Changing the implementation of our UsersDB class will not affect the UsersCached class anymore, except when the contract of the interface is broken. Great!

Dependency Inversion Principle

Let's take a short second look at the same code examples from above, now while trying to look at it from another one of the SOLID principles, the Dependency Inversion Principle. The principle comes down to depending on abstractions instead of concretions. In the case of the first example, the UsersCached extends UsersDB, which it needs when there's nothing found in the cache. We depend on a concrete class, which is a violation of the principle. In the second example, we made UsersCached implement the interface directly instead of extending from UsersDB. We then inject another implementation of Users in the constructor of UsersCached. Since we depend on the interface here, we don't depend on a concrete class anymore, and any implementation of the Users interface should work! This solution follows the principle! An added benifit of this is that our UsersCached instance is now much easier to test: we just need to inject a fake Users interface and see if it gets called when the cache doesn't have the value that we look for.

You could do the same with the other SOLID principles really, take e.g. Liskov Substitution Principle. If a class extends from our class, it can easily overwrite a method to behave completely differently than what it was, so the principle doesn't hold. Again, it's easy to make a violation there.

Shared Dependencies

Now that we have a few SOLID reasons to say that inheritance is not a desirable solution for our problems, let's look at some other ways that I think it is misused. Consider this example:

abstract class Notification
{
    protected $id;

    public function __construct(Id $id)
    {
        $this->id = $id;
    }

    public function toUrl($relative = true): string
    {
        $path = "/notifications/{$this->id}";

        if ($relative === true) {
            return $path;
        } else {
            return "https://domain.of.application{$path}";
        }
    }
}

I see this kind abstract classes quite often. It's a form of deduplication (people think they shouldn't repeat code), but what's problematic is the fact that you won't always know what kinds of exceptions to the rule there are going to be. Consider a class AccountSpecificNotification which extends from Notification. It wants to overwrite the toUrl() method, because account specific notifications are at /account/{accountId}/notifications/{notificationId}. Trouble is, we have to again implement the absolute and relative paths rule, and the chance of making mistakes becomes higher. Also the absolute URL of our domain is now in two classes. The knowledge about converting a notification to a URL is also in two classes. If we have a few of these exceptions, stuff gets difficult to keep track of.

As a solution, you could just make this class an interface and specify the toUrl() method on Notification instances. This way you always know the implementation for a certain type of Notification will be in the class itself. Then you'll have to live with the duplication. This is personally how I often do it. I duplicate stuff when in doubt and refactor later, with the benefit of hindsight.

Another thing that you could do is look at the toUrl() method, and see that it's in fact a shared dependency between Notifications with another responsibility than "representing a notification". It could become another class. Let's write a separate class for it:

final class NotificationToUrl
{
    public function getRelativeUrlForNotification(Notification $notification): string
    {
        $url = '/notifications/' . $notification->getId();

        if ($notification instanceof AccountSpecificNotification) {
            $url = '/account/' . $notification->getAccountId();
            $url .= '/notifications/' . $notification->getId();
        }

        return $url;
    }

    public function getFullUrlForNotification(Notification $notification): string
    {
        $path = $this->getRelativeUrlForNotification($notification);

        return "https://domain.of.application{$path}";
    }
}

Of course this is not perfect either, but it's at least a bit better from some standpoints: we get all the knowledge of converting notifications to URLs in one place, and we don't need inheritance!

In short

If you're using inheritance to fix a problem, I propose that you take a second and ask yourself why. In this post I propose a few ways of looking at inheritance that could help you come to a better solution in some cases.

  • Are you violating one of the SOLID principles? Try and look for a solution where you can adhere to them more.
  • Are you trying to remove duplication? Try to postpone your decisions about the abstraction that you're going to use to remove the duplication.
  • Do you have a shared method that multiple classes need? Try extracting it to a separate class.

Since most of the time one of these methods produces a result that I find more desirable as explained in this post, I'll mostly oppose strongly against the use of inheritance when it's not strictly needed.

Hope that you'll be with me on slaying this beast! Until next time, happy programming! 🐲

Tags:

You've made it to this post thinking "Why do we still need to talk about Exceptions?". Well, they're used everywhere in OOP codebases, but sometimes they're used in a way that make debugging a bit difficult. Let's look at some ways to make debugging exceptions a bit more fun!

You know that feeling when you're trying to investigate an Exception that was thrown, but you can't seem to find the origin of it? You dig a bit deeper and you find that it's an exception that was caught and not rethrown, e.g.:

try {
    $this->doSomeImportantStuffWith($foo);
} catch (VeryDescriptiveException $e) {
    // do some stuff here

    throw new SomethingWentWrongException('oh no!');
}

Now, when you encounter this SomethingWentWrongException, you'll see that the trace takes you back to the 6th line of this code example. All information that was inside the VeryDescriptiveException, including its message, stack trace and other useful information is gone. Of course, debugging that error in doSomeImportantStuffWith() would be much easier if you had all that info.

Fatal error: Uncaught SomethingWentWrongException: oh no! in test.php:6
Stack trace:
#0 /Users/toon/Projects/devblog/test.php(34): Test->withoutPrevious()
#1 {main}
  thrown in /Users/toon/Projects/devblog/test.php on line 6

Prevent information loss by using $previous

The obvious answer to this simplified example would be to just rethrow the VeryDescriptiveException instead of throwing a more general SomethingWentWrongException... And that would be valid, but let's say we're implementing an Interface that prescribes that we only throw SomethingWentWrongExceptions. We can't let the VeryDescriptiveException through or we'll break the Liskov Substitution Principle. We want to throw that specific SomethingWentWrongException, while still somehow preserving the information of that previous exception that we caught. Let's check the docs:

PHP Docs for the Exception Constructor

That Throwable $previous = null is what we're looking for! I've almost never seen this being used in the wild, but it's great for our usecase:

try {
    $this->doSomeImportantStuffWith($foo);
} catch (VeryDescriptiveException $e) {
    // do some stuff here

    throw new SomethingWentWrongException('oh no!', 0, $e);
}

this results in this error:

Fatal error: Uncaught VeryDescriptiveException: hello there! in test.php:15
Stack trace:
#0 /Users/toon/Projects/devblog/test.php(23): Test->doSomeImportantStuffWith('test')
#1 /Users/toon/Projects/devblog/test.php(34): Test->withPrevious()
#2 {main}

Next SomethingWentWrongException: oh no! in test.php:6
Stack trace:
#0 /Users/toon/Projects/devblog/test.php(34): Test->withPrevious()
#1 {main}
  thrown in /Users/toon/Projects/devblog/test.php on line 6

As you can see, the stack trace of the original exception along with the one we wrapped it in are presented to us on error! We can see the message as well, and we can even add our own properties to the exception and be presented with them here if we just implement the __toString() method of the exception.

Custom properties in exceptions

Let's say we've been building an API client that does some HTTP requests to an endpoint. Something can go wrong during the HTTP request, and we want to thrown an ApiConnectionFailed exception whenever the HTTP response code is not 2xx or something else fails, so that the rest of our application has one single Exception it needs to prepare for. It would be very handy to debug if that exception contained our HTTP Request & Response objects, e.g.:

final class ApiConnectionFailed extends Exception
{
    private $request;
    private $response;

    public static function withHttpRequestAndResponse(
        $message,
        Request $request = null,
        Response $response = null,
        Throwable $previous = null
    ) {
        $exception = new static($message, 0, $previous);
        $exception->request = $request;
        $exception->response = $response;

        return $exception;
    }

    public function __toString()
    {
        $string = parent::__toString();
        $string .= "\n{$this->request}";
        $string .= "\n{$this->response}";

        return $string;
    }
}

As you can see, we used a named constructor withHttpRequestAndResponse to be able to keep the default constructor for Exceptions, but to also be able to construct it using a lot of extra relevant debugging information. This means that this Exception behaves like other exceptions and is very transparent to other developers.

If something fails, we'll construct the exception like this:

try {
    $response = $httpClient->execute($request);
} catch (HttpClientException $e) {
    throw ApiConnectionFailed::withHttpRequestAndResponse(
        'Http connection could not be established',
        $request,
        null,
        $e
    );
}

if ($response->getResponseCode() === 500) {
    throw ApiConnectionFailed::withHttpRequestAndResponse(
        'We got a 500 error from the host',
        $request,
        $response
    );
}

The exception will always contain all relevant information to be able to debug.

Recap

To quickly recap, you can make your life a lot easier by

  • providing exceptions with descriptive names
  • providing the $previous exception when rethrowing
  • providing all relevant data to an exception
  • providing named constructors for your exceptions to be able to add debugging info but keep the original constructor

Hope that helps! Until next time, have fun coding! 🖖

Tags:

This is the sixth post in a series about using Functional Programming concepts to make your Object Oriented code more comprehensible. Start here if you want to read the whole thing.

🤔 How do you hide your internals?

In OOP, we're used to hiding the implementation details of our classes by using interfaces. We can just define a contract that the rest of the application needs to adhere to, and keep the knowledge of the internals completely separate from the rest of the system. Let's look at a simple interface:

<?php

namespace Dns;

interface Client
{
    public function resolve(Request $request): Response;
}

As a consumer of this interface, we actually know everything we need to know to start programming. We'll inject the class we're making with a DNS\Client instance later, the actual implementation of it doesn't concern us right now. We have enough information to create our application knowing that we can send the client a Request and get a Response back. The internals of the Dns\Client can be changed at will, without altering our program.

For me, this was the biggest breakthrough of learning an Object Oriented language, the moment when using interfaces as contracts clicked. That's why, when I started to dive into Functional Programming languages, I wasn't really happy with what I saw. Where were the interfaces at?

😅 Exporting from modules as interface

The first Functional Programming language I learnt was Scheme. It was great! When you start learning it, you start to see the recursion patterns, see that syntax is only so important as the language makes it, and countless other nice things (programming with continuations, anyone?). What bothered me was that lists are used as the main data structure everywhere! You'll sometimes find libraries that do something like this:

(define address
  (list "Toon Daelman" "FooBarStreet 42" "9000 Ghent" "Belgium"))

Yes, that's a workable data structure, but it's not ideal... To get a person's country from their address, you need to do something like this:

(list-ref address 3)

Which means, from the address list, take the 3rd (zero-based) index. That's not very readable at all, and it doesn't hide any of the details of our data structure. If we want to change it, every function interacting with this data structure will need to change as well.

Luckily, the problem was me. I didn't look far enough, and most people working with lisps have other ways of hiding their internals, for instance:

(module address (address country)
  (import scheme)

  (define address
    (lambda (name line1 line2 country)
      (list name line1 line2 country)))

  (define country
    (lambda (address)
      (list-ref address 3)))
)

This is a scheme module, that exports two functions address and country. It uses the base library scheme. It defines a function address, that acts like a constructor and returns a black box object, that you can deconstruct using separate functions. In this case we only have a country function that takes an address and returns its country. Consuming modules of this address module need only now the constructor and the other functions, not that the underlying object is still a list!

And this lets us change the implementation as well!

(module address (address country)
  (import scheme)

  (define address
    (lambda (name line1 line2 country)
      (vector name line1 line2 country)))

  (define country
    (lambda (address)
      (vector-ref address 3)))
)

We're now using vectors as the datatype for address instead of lists, but the address constructor and the country getter function are still called the same and behave exactly the same. We could use the records features as well, still keeping the same public interface...

🎩 Types

The second Functional Programming language I started to look into was Haskell. It immediately blew my mind with its type system. Let's check out this piece of code:

type AddressLine = String
type Country = String

data Address = Address
  { name :: String
  , line1 :: AddressLine
  , line2 :: AddressLine
  , country :: Country
  } deriving (Show)

countryFrom :: Address -> Country
countryFrom = country

We define two type aliasses AddressLine and Country. Then we say that an Address consists of a name, a line1 which is an AddressLine, a line2 (also an AddressLine) and a country of the type Country. Then we define a function called countryFrom that takes an Address and returns a Country, which is implemented by just saying it's equal to country, the function that is automatically created to unwrap Records.

In fact, creating the countryFrom function was just done as an example to show off the type annotations in a situation analogous to the previous example in scheme. The type system in Haskell allows us to not only write the contract of a function, but it allows us to write abstractions as well! Check this out:

head :: [a] -> a

This is the type of the head function, which operates on lists. It takes a list of as and returns an a. The a is a type variable, it substitutes for every type you can think of. This way, you can strictly type a function that can work with all sorts of types! Take for instance a list of Addresses! Call head on that list, and you'll get the first Address of the list.

🎩➡️🐰

😍 I love this, it's like magic! And it goes even further. Let's say you want to be able to declare the function == which takes two arguments and checks if they're equal to each other. You would think that that would be easy using type variables, doing something like this:

(==) :: a -> a -> Bool

Which means, a function == which takes two arguments of the same type (we use a for both arguments, which means we don't care which type it is, but it should be the same for both arguments) and returns a Bool. The problem is that there's no certainty that the type a has a concept of equality to it. It could be that we want to be able to define our own rules for equality on a type-by-type basis as well... That's why that signature a -> a -> Bool isn't enough.

Haskell actually has another abstraction over their types to allow us to put the a -> a -> Bool in a context:

class Eq a where
  (==) :: a -> a -> Bool
  x == y = not (x /= y)

  (/=) :: a -> a -> Bool
  x /= y = not (x == y)

This is a typeclass named Eq that defines equality for every type a that we say is part of the typeclass. It defines two functions, == and /=, which both take two arguments of the type a and return a Bool. What's also nice, is that they're both defined in terms of each other. == says that it's not (/=) and vice versa.

Now, if we want to make our Address type part of the typeclass Eq, we can just implement the == function, and we get the other one for free because it's defined in terms of ==. Let's look at an example:

instance Eq Address where
  x == y = sameAddressLines && sameCountry
    where sameAddressLines = (line1 x == line1 y) && (line2 x == line2 y)
          sameCountry = country x == country y

In this case, we say Addresses are the same if their AddressLines and Country are the same. We don't take name into consideration. Now we can == on Addresses everywhere. If you don't need special rules for deriving equality for a given type, the Haskell compiler can derive it for your type using the deriving (Eq) statement.

🤓 What can we learn from this?

We've seen two ways of how implementation details can be hidden in a functional programming environment. When we're in Object Oriented environments we do almost the same things, but we use interfaces for them. And we use interfaces to describe contracts for many other things we want to do. That's why I'm sometimes confused when I stumble upon an interface: what's the primary reason for it to be here? Is it there to define a contract for external systems? Is it meant to hide implementation details or is it just part of a design pattern used in this package? Is it a marking interface used to indicate the type of the implementing class? Not every class needs to be implementing an interface... It already has one! All public methods of a concrete class can be seen as its public interface. And I think in most cases like the first example in scheme where we want to hide the internals of a datatype, a concrete class can be enough (think ValueObjects).

I find that in Object Oriented Programming, thinking about types as you do when writing e.g. Haskell, tends to help when defining your interfaces effectively. One of the biggest differences is the manifestation of side-effects: In Haskell, there's a type for functions with side effects, while interfaces in Object Oriented languages mostly don't give you any insights into that (I've often thought about annotating side-effects in the docblocks of my interfaces).

While PHP doesn't allow you to implement your own Equality rules for your classes, some Object Oriented languages do. Compare these examples:

$foo == $bar;
$foo->equals($bar);

In the first example, we can't influence how PHP compares two instances of our class, while in the second one, we have complete control (even over the name of the method). It depends on taste what's best, certainly when your language allows you to overload == for your classes. Always try to make it as readable as possible!

Wow, we made it through! Hope to see you in the next episode! Happy programming y'aλλ! 🖖

Tags: