Encapsulation

This is the second of the things requested by Jesper. To me, the software engineering term encapsulation is part of the bigger term modularisation. Modularisation is chopping a big lump of code into smaller parts or modules. It’s important to get the boundaries between parts in the right place. Once there are modules, they can have more or less encapsulation. Encapsulation means:

Each module combines code and data
Each module is divided into a part that is accessible to other modules (usually called public), and a part that is accessible only to code inside the module (usually called private).

In this article I will go into the division into public and private.

In the world outside code

I think it’s helpful to recognise where there’s this division into more and less accessible parts in non-coding bits of the world. A swan looks graceful above the surface of the water, but underneath there’s energetic paddling. In a theatre, an illusion is created on stage with help from things off stage such as changing rooms that would spoil that illusion. In stately homes like Downton Abbey or Audley End there’s a corresponding division into upstairs and downstairs.

Why use encapsulation?

Encapsulation is useful because it lets you manage coupling between modules. Some coupling between modules is expected and necessary; if there were no coupling then the modules would be independent. That means there would be N separate systems, rather than N modules co-operating to solve a bigger problem.

However, you want the coupling to be as low or loose as it can be while still providing a good overall solution, i.e. while still allowing the parts or modules to collaborate. You also want to know where and how this coupling happens. Both of these (reducing coupling, and knowing where and how it happens) make it easier to change things in the future.

Encapsulation also often makes code a bit clearer. In a given class, public methods are the summary and private methods are the details, so having them clearly marked makes it easier to identify the role a given method plays. You also immediately know that a private variable will never be accessed from outside the class, rather than having to track down all its uses.

Comparing two restaurants

I’d like to use two fictitious restaurants to illustrate the benefits of encapsulation. The first restaurant is what I would consider unremarkable, and I’m treating it as a baseline. A customer sat at a table gives their order to a member of the waiting staff, this order goes to the kitchen, and a while later the food for the order is delivered from the kitchen to the customer.

What goes on in the kitchen is hidden from the customer, and this is OK as long as the right food appears quickly enough. The restaurant owners are free to rearrange things inside the kitchen however they like, as long as the orders-to-food process still works OK. So, who does what work where and when are all free to be changed. (Note that the customer relies on people like food hygiene inspectors to guarantee at least a minimum standard of cleanliness and so on, because the inner workings of the kitchen are hidden from them.)

In the second restaurant, the kitchen isn’t hidden away from customers, so customers can see the chefs as they work. A regular customer at this restaurant always orders the same dish. This is partly because of how it tastes, smells and looks, but also because of how it is cooked. Part-way through the cooking, the chef tips the pan such that the contents of the pan are ignited by the gas flame that’s normally underneath the pan. This produces a big whoosh and a flame that shoots up, and the customer likes this bit of theatre.

One day they are in the restaurant and place their normal order. To their disappointment, there’s no big whoosh and flame. The food still arrives on time, and tastes, smells and looks just as good as before, but nonetheless they don’t enjoy the meal as much as they had expected. It might be that the restaurant has tried a new way of preparing the food, or the dramatic stage is done in bulk before service starts, or it’s been moved to happen out of sight of the customer.

This is an example of unintended and uncontrolled coupling through poor encapsulation. The restaurant thought it was providing only one public service (orders-to-food). However, because the customer was able to see the details of how this service was performed, they started depending on, i.e. being coupled to, a private and accidental part of the service (orders-to-theatre). When the restaurant decided to change how they provided this service while still honouring what they thought were their commitments, this unintended and uncontrolled coupling caused problems (an unhappy customer).

Other benefits of encapsulation

Encapsulation can also reduce bugs, or make it easier to avoid bugs. Imagine that there’s a task that has three stages:

Getting ready
Doing the main work
Tidying up

In this case it’s important that the getting ready and tidying up both happen. It could be that limited resources, such as files in the file system, are created while getting ready and need tidying up at the end. The task might be big enough that it makes sense to split the three stages into their own methods. If all three methods were accessible outside the class, other code could accidentally be written that called only the main work method and not the getting ready or tidying up methods. This is likely to lead to bugs. However, if the methods were private and called in the right order by a publicly accessible wrapper method, this kind of bug would be avoided. (This approach is in the spirit of the façade pattern: one simple public entry point hides the steps behind it.)
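As a rough illustration of the wrapper idea, here is a minimal TypeScript sketch. The class name ReportJob and its method names are made up for this example, and the "resource" is just a boolean standing in for something like a temporary file.

```typescript
// Hypothetical example: ReportJob and its method names are illustrative only.
class ReportJob {
  // Records which stages ran, so the ordering can be inspected afterwards.
  private stages: string[] = [];
  private resourceOpen = false;

  // The only public entry point. It guarantees the stages run in order,
  // and that tidying up happens even if the main work throws.
  run(): string[] {
    this.getReady();
    try {
      this.doMainWork();
    } finally {
      this.tidyUp();
    }
    return this.stages.slice(); // hand back a copy, not the private array
  }

  private getReady(): void {
    this.resourceOpen = true; // stands in for e.g. creating a temporary file
    this.stages.push("ready");
  }

  private doMainWork(): void {
    if (!this.resourceOpen) {
      throw new Error("main work started before getting ready");
    }
    this.stages.push("work");
  }

  private tidyUp(): void {
    this.resourceOpen = false; // stands in for deleting the temporary file
    this.stages.push("tidy");
  }
}

const reportJob = new ReportJob();
console.log(reportJob.run().join(" -> ")); // prints "ready -> work -> tidy"
```

Because doMainWork is private, code outside the class cannot call it on its own, so the compiler rejects the mistake described above before it can become a bug.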

Encapsulation can also encourage you to lift things to a more abstract level, i.e. closer to the world of people and their needs. If you’re writing an automated test for a website, you might create a page object for the login page. Its purpose is to take care of the details of the page, such that test code can concentrate on checking behaviour. The page object will need to contain selectors that say how to find the fields for username and password and the submit button, e.g. the ids of the elements in the DOM.

One approach would be to make these field selectors public, so that the test code can access them and then use them to fill in the relevant fields with a username and password and then click on the submit button. However, another approach would be to make the field selectors private but to provide a public login method. It would accept a username and password, and then fill in the corresponding fields with those values and click the submit button.

This makes the test code simpler and clearer, and would make it easier to adapt if the login page changed to e.g. hide the fields behind a button that would show / hide the login controls. In the case where the selectors were public, there would need to be an extra selector and all test code would need to change to show the login controls. In the case where the login method is the only public thing, only the details of this method need to change.
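The second approach might look something like the sketch below. The PageDriver interface, the RecordingPage stand-in and the selector values are all assumptions made up for this example; a real browser automation tool will have its own API, which is likely to differ in detail.

```typescript
// Hypothetical minimal driver interface; real tools differ in detail.
interface PageDriver {
  fill(selector: string, value: string): void;
  click(selector: string): void;
}

class LoginPage {
  // Selectors are private: test code never sees how the fields are found.
  // The ids here are invented for the example.
  private readonly usernameField = "#username";
  private readonly passwordField = "#password";
  private readonly submitButton = "#submit";
  private readonly driver: PageDriver;

  constructor(driver: PageDriver) {
    this.driver = driver;
  }

  // The one public operation, expressed in the user's terms.
  login(username: string, password: string): void {
    this.driver.fill(this.usernameField, username);
    this.driver.fill(this.passwordField, password);
    this.driver.click(this.submitButton);
  }
}

// An in-memory stand-in for a browser, only so the sketch runs on its own.
class RecordingPage implements PageDriver {
  readonly actions: string[] = [];

  fill(selector: string, value: string): void {
    this.actions.push(`fill ${selector}=${value}`);
  }

  click(selector: string): void {
    this.actions.push(`click ${selector}`);
  }
}

const fakePage = new RecordingPage();
new LoginPage(fakePage).login("alice", "s3cret");
console.log(fakePage.actions.join("; "));
```

If the login controls were later hidden behind a button, the extra selector and the extra click would go inside login, and every test that calls login would keep working unchanged.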

Encapsulation isn’t always needed

Encapsulation is yet another tool that should be used with a little care. While it’s good in many cases, it isn’t needed in every case. There are times when an object really is nothing more than a bag of data, otherwise known as a Data Transfer Object. It’s just to shift data from point A to point B, and usually has no behaviour in the form of methods. There’s no point in trying to add complication via a division into public and private.
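For comparison, a bag of data might be as small as the following sketch. The OrderDto name and its fields are invented for illustration, and the JSON round trip is just one possible way of getting data from A to B.

```typescript
// Hypothetical data transfer object: every field is public, there is no behaviour.
interface OrderDto {
  readonly orderId: number;
  readonly dish: string;
  readonly quantity: number;
}

const sentOrder: OrderDto = { orderId: 17, dish: "soup", quantity: 2 };

// Shifting the data from point A to point B, here via a JSON round trip.
const receivedOrder: OrderDto = JSON.parse(JSON.stringify(sentOrder));
console.log(receivedOrder.dish); // prints "soup"
```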

Sometimes you create a class where each method is shallow. It doesn’t have much work to do, and so doesn’t need help from other methods. In this case the public part of the class fills the class, and the private part doesn’t exist. If this happens and makes sense, then don’t worry about the lack of private stuff.

Encapsulation isn’t always binary

So far I’ve been saying that encapsulation is a simple binary split: code and data are either public or private. Depending on the language you’re writing in, there are likely to be other options too.

Protected means that the data or method is accessible to code in the same class and to code in classes derived from this class. This means that things that are shared between child classes can be gathered together into a parent class, while still being hidden from code outside the set of related classes.
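Here is a small TypeScript sketch of that idea. The Shape, Square and Circle classes are invented for this example. Note that TypeScript checks access modifiers when the code is compiled rather than when it runs, so other languages may enforce protected more strictly.

```typescript
// Hypothetical classes, for illustration only.
abstract class Shape {
  // Protected: usable by Shape and its subclasses, hidden from other code.
  protected describe(kind: string, area: number): string {
    return `${kind} with area ${area.toFixed(2)}`;
  }

  // Public: the only thing code outside the family of classes can call.
  abstract summary(): string;
}

class Square extends Shape {
  private readonly side: number;

  constructor(side: number) {
    super();
    this.side = side;
  }

  summary(): string {
    return this.describe("square", this.side * this.side);
  }
}

class Circle extends Shape {
  private readonly radius: number;

  constructor(radius: number) {
    super();
    this.radius = radius;
  }

  summary(): string {
    return this.describe("circle", Math.PI * this.radius * this.radius);
  }
}

console.log(new Square(2).summary()); // prints "square with area 4.00"
console.log(new Circle(1).summary()); // prints "circle with area 3.14"
```

The shared formatting lives once in the parent class, while code outside the set of related classes can only call summary.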

Internal, in the world of C#, means that the data or method is accessible to any code in the same assembly, e.g. a DLL. This is one of the loosest options that isn’t fully public.

Summary

Encapsulation is a useful way to strike a balance in the coupling between modules. It allows coupling to happen, but in controlled ways that make it easier to change things in the future. It’s not always needed, but if you find yourself hitting encapsulation-related problems it’s worth thinking before making something public. Is there something else missing, particularly something at a higher level of abstraction?