Making SpecFlow + Selenium testing easier with Page Objects

This article is part of a series about Selenium and SpecFlow:

  1. Introduction
  2. Why bother?
  3. Basic plumbing
  4. Page objects
  5. The engineering behind decent Gherkin files


Page Objects are things that shield the rest of your test from the details of the website under test.  At one end they understand all the text boxes, selects, ids, CSS selectors, XPath strings etc. that describe the parts of the page.  At the other end they present a business-level interface to the rest of the test in terms of methods and properties.  This translation is their only job.  Jobs that they don’t do (because something else is doing them) include:

  • Saying whether the test has passed or failed (the step definitions do this)
  • Choreographing the various parts of the test (the step definitions do this too)
  • Describing the parts of each test scenario (the feature files do this)

Another job that they don’t usually do, one that is more technical and lower-level than those above, is creating an IWebDriver instance.  The page object needs an IWebDriver to interact with the browser, so it is a crucial dependency.  However, because it’s a lower-level dependency, Dependency Injection strongly suggests it should be passed into the page object from outside.  The creation of the IWebDriver is explained in the article on basic plumbing.
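
As a minimal sketch of passing the driver in from outside, a page object might take it as a constructor argument.  The class name LoginPage and the null check are illustrative assumptions; the _driver field name matches the one used in the error-message example later in this article.

```csharp
using System;
using OpenQA.Selenium;

// Hypothetical page object: the class name is illustrative only.
public class LoginPage
{
    private readonly IWebDriver _driver;

    // The driver is created elsewhere (see the basic plumbing article) and injected here,
    // so the page object never decides which browser to start or how to configure it.
    public LoginPage(IWebDriver driver)
    {
        _driver = driver ?? throw new ArgumentNullException(nameof(driver));
    }
}
```

Because the page object only depends on the IWebDriver interface, the same class should work with whichever browser driver the plumbing code chooses to create.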

What parts a page object will need

Think about what services the page will provide to the rest of the test (ignoring the details of the page’s implementation for now).  What information will it provide, in what form?  What operations will it support?  For instance, if the page queries a database for a list of locations relevant to the user and turns that into a drop-down list (a select that pairs the name and id for each location) then this might lead to methods in the page object that do the following:

  • Get all location names in the list
  • Choose a location by its name
  • Choose the Nth location
  • Get the id of the selected location
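
The four operations above could be written down as an interface before any Selenium code exists.  This is only a sketch: the interface name, the method names and the choice of zero-based indexing are assumptions made for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical business-level interface for a page with a location drop-down.
// Nothing here mentions selects, ids or XPath: those stay inside the implementation.
public interface ILocationPage
{
    // All location names, in the order they appear in the list.
    IReadOnlyList<string> GetLocationNames();

    // Choose a location by the name the user sees.
    void ChooseLocationByName(string name);

    // Choose the Nth location; this sketch assumes n is zero-based.
    void ChooseNthLocation(int n);

    // The id (not the display name) of the currently selected location.
    string GetSelectedLocationId();
}
```

Writing the interface first makes it easier to check that the step definitions can say everything they need to without knowing how the page is built.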

It’s also worth thinking about how you would leave the page and go to another – which operations would lead to this?

I find it helpful for each page object to know its own URL, meaning the page-specific part rather than the part that is common to all pages on the site (such as the domain and any shared prefix).  This is private and set, for example, in the body of the constructor.

To anchor the page object to its page, you need member variables for each screen element that you care about.  I’ll explain the details below, but each element will have two things:

  1. A variable of type IWebElement
  2. An attribute on the variable that describes how to find the element on the screen, e.g. by its id.

An example is something like this:

[FindsBy(How = How.Id, Using = "UName")]
private IWebElement userName;

This defines a variable called userName, which is linked to something on the screen with the id UName.  This is likely to be something such as a text input box.

Granularity of the page object interface

I think it’s worth taking a moment to consider granularity.  Should a page object for a login page have a login(username, password) method, or these methods:

  1. setUsername(username)
  2. setPassword(password)
  3. submitCredentials()

My feeling is that you should start with the single login() method, and create the others only if you need to.  There is enough to worry about in your test even if you go for the higher-level version that has a single login() method, so adding complexity that you don’t need (the 3 broken-out methods) seems to be shooting yourself in the foot.  If you do have both sets of methods, then tests should use the higher-level one unless they absolutely need to control the fine detail via the lower-level methods.

Notice that the last method in the list of low-level methods is submitCredentials() rather than clickLogin().  This is deliberate.  If you change the label on the button or make some other change, then either the method name needs to change (and everywhere that calls it) or the method’s name and purpose start to diverge.

It’s worth keeping methods and member variables at as high a level of abstraction as you can.  It reduces the mental clutter, which will still leave plenty for you to get your teeth into.
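
One way to have both levels without exposing more than you need is to build the single high-level method out of private lower-level ones, and only make those public when a test genuinely requires it.  The sketch below uses C# naming (Login rather than login); the ids PWord and LoginBtn are invented for illustration, and the using line assumes the FindsBy attribute comes from the DotNetSeleniumExtras.PageObjects package (older Selenium releases had it in OpenQA.Selenium.Support.PageObjects).

```csharp
using OpenQA.Selenium;
using SeleniumExtras.PageObjects; // assumption: see the note above about older Selenium versions

// Hypothetical login page object. The fields are expected to be populated
// by PageFactory.InitElements before any method is called.
public class LoginPage
{
    [FindsBy(How = How.Id, Using = "UName")]
    private IWebElement userName;

    [FindsBy(How = How.Id, Using = "PWord")]     // hypothetical id
    private IWebElement password;

    [FindsBy(How = How.Id, Using = "LoginBtn")]  // hypothetical id
    private IWebElement submitButton;

    // Higher-level method: what most tests should call.
    public void Login(string username, string passwordText)
    {
        SetUsername(username);
        SetPassword(passwordText);
        SubmitCredentials();
    }

    // Lower-level methods, kept private until a test needs the fine detail.
    private void SetUsername(string username)
    {
        userName.Clear();
        userName.SendKeys(username);
    }

    private void SetPassword(string passwordText)
    {
        password.Clear();
        password.SendKeys(passwordText);
    }

    // Named for its purpose, not for the label on the button.
    private void SubmitCredentials()
    {
        submitButton.Click();
    }
}
```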


Scaffolding

To support the page objects, you will probably need some scaffolding.  You have two options:

  1. A class hierarchy
  2. A factory

The factory could be a static method in the base class, or could be a separate class.  What work gets done in a base class and what gets done in the factory is up to you.

The jobs that the scaffolding needs to do include:

  1. Create an instance of the correct class given an input (standard factory behaviour)
  2. Get the first part of the URL – the part that is constant across pages, such as the domain and any common prefix. If you want to be able to run the same tests against different environments you will want to read this from something like a config file.
  3. Call PageFactory.InitElements

PageFactory.InitElements is part of the Selenium library, and it is the code that initialises the IWebElements based on the attributes they are paired with.  It will hunt through the DOM of the page based on the id, CSS selector or XPath, and create a Selenium object (an IWebElement) that is linked to what it finds.  This lets the variable (e.g. userName above) act as a stand-in for that bit of the DOM, e.g. the input text box.
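
The three scaffolding jobs could be split between a base class and a factory roughly as follows.  This is a sketch under several assumptions: the names PageBase, Pages, CombineUrl and the path account/login are invented, the base URL is simply passed in (where it is read from, such as a config file, is left to you), and PageFactory is taken from the DotNetSeleniumExtras.PageObjects package, which is where it lives for recent Selenium versions (older versions had it in OpenQA.Selenium.Support.PageObjects).

```csharp
using System;
using OpenQA.Selenium;
using SeleniumExtras.PageObjects; // assumption: see the note above about older Selenium versions

public abstract class PageBase
{
    protected readonly IWebDriver _driver;
    private readonly string _url;

    // baseUrl is the part shared by every page; pagePath is the page-specific part.
    protected PageBase(IWebDriver driver, string baseUrl, string pagePath)
    {
        _driver = driver ?? throw new ArgumentNullException(nameof(driver));
        _url = CombineUrl(baseUrl, pagePath);
    }

    // Joins the two parts with exactly one slash between them.
    public static string CombineUrl(string baseUrl, string pagePath) =>
        baseUrl.TrimEnd('/') + "/" + pagePath.TrimStart('/');

    public void Open() => _driver.Navigate().GoToUrl(_url);
}

// Hypothetical page; the path is illustrative.
public class LoginPage : PageBase
{
    public LoginPage(IWebDriver driver, string baseUrl)
        : base(driver, baseUrl, "account/login") { }
}

public static class Pages
{
    // Creates the requested page object and wires up its [FindsBy] members.
    // Assumes every page class has a public (IWebDriver, string) constructor.
    public static TPage Create<TPage>(IWebDriver driver, string baseUrl) where TPage : PageBase
    {
        var page = (TPage)Activator.CreateInstance(typeof(TPage), driver, baseUrl);
        PageFactory.InitElements(driver, page);
        return page;
    }
}
```

A step definition would then ask for a page with something like Pages.Create<LoginPage>(driver, baseUrl) and never call a page constructor or InitElements directly.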

Responsiveness influences the design of scaffolding – more details below.

Acting on the page

There are quite a few methods that IWebElement provides that you can use to do things.  They are things like:

  • Click (for a button)
  • Clear / SendKeys (for a text input box)
  • SelectByIndex (for a select, via a SelectElement wrapper)

Usually you can use just the IWebElement, but for things like a select you will need to create a new SelectElement around the IWebElement.  For details of the operations and how to make a SelectElement, see the Selenium documentation.
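
As a rough illustration of these operations, the helpers below act on a text box and on a select.  The class and method names are invented for this sketch; Clear, SendKeys, SelectElement, SelectByIndex and SelectByText are the Selenium members being demonstrated.

```csharp
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

// Hypothetical helpers that a page object method might call.
public static class PageActions
{
    // Replace whatever is in a text box with new text.
    public static void ReplaceText(IWebElement textBox, string text)
    {
        textBox.Clear();
        textBox.SendKeys(text);
    }

    // A select needs wrapping in a SelectElement before options can be chosen.
    public static void ChooseByIndex(IWebElement select, int index)
    {
        new SelectElement(select).SelectByIndex(index);
    }

    // Choose the option whose visible text matches.
    public static void ChooseByText(IWebElement select, string visibleText)
    {
        new SelectElement(select).SelectByText(visibleText);
    }
}
```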

Reading from the page

This is also a bit of a mixture.  For selects you will probably need to create a SelectElement around the IWebElement, and then can read its Options property.  For other bits on the screen you could just get the text of the IWebElement.
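
The reads described above might look something like this.  The helper names are invented, and reading the id from each option’s value attribute is an assumption about how the page is written.

```csharp
using System.Collections.Generic;
using System.Linq;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

// Hypothetical helpers that a page object method might call.
public static class PageReads
{
    // The visible text of every option in a select, in page order.
    public static IList<string> OptionTexts(IWebElement select)
    {
        return new SelectElement(select).Options.Select(option => option.Text).ToList();
    }

    // The value attribute of the selected option (assumed here to hold the id).
    public static string SelectedValue(IWebElement select)
    {
        return new SelectElement(select).SelectedOption.GetAttribute("value");
    }

    // For most other elements, the Text property is enough.
    public static string VisibleText(IWebElement element)
    {
        return element.Text.Trim();
    }
}
```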

Sometimes you want to do a bit more than a simple read of information.  For instance, if you need to see if a given error message is present, you could do something like this.  It looks for a div with the id errorMsg and sees if it contains the supplied error message text:

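// Requires: using System; using System.Collections.Generic; using OpenQA.Selenium;
private bool HasErrorByMessageText(string messageText)
{
    // Match a div with the id errorMsg whose text contains the message.
    // Note: messageText must not contain a single quote, or the XPath will be invalid.
    string errorXpathTemplate = "//div[@id='errorMsg'][contains(text(),'{0}')]";
    string errorXpath = string.Format(errorXpathTemplate, messageText);

    // FindElements returns an empty collection (rather than throwing) when nothing matches.
    IReadOnlyCollection<IWebElement> errorMessages = _driver.FindElements(By.XPath(errorXpath));

    if (errorMessages.Count > 1)
        throw new Exception("More than one instance of an error message found for " + messageText);

    return errorMessages.Count == 1;
}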

Tedious implementation hassles to do with waiting

Probably the most annoying bit about page objects is waiting.  One consolation is that, by using page objects, you contain the hassles in one place rather than letting them spill out over the rest of the test.

The problem is that it will take a non-zero amount of time for a new page to load, or for an AJAX call to update part of the page.  Your test code will happily march on from the line that triggers a new page load to the next line that depends on the new page.  The second line of the test will do something like try to find and click on a button that simply isn’t there yet, and the test will fail.  To make matters worse, the time it will take for the new thing to be ready will vary in unpredictable ways, depending on the volume of network traffic and things like that.

It’s a bad idea to wait for a fixed amount of time, because you would have to make an unpleasant choice.  If you wait for a short time, then there’s the risk of a test breaking if there’s a lot of network traffic.  If you wait for a long time, then the test will always take ages and people will be less willing to run the tests.

It’s a better idea to wait until a specific condition is true, with a timeout to stop this wait continuing forever.  That way, if things are running quickly then your test takes no more time than it needs to, and if things are slow then the test won’t break.  What condition to wait for depends on the page.  If it’s a simple page, then you can wait for one of the screen elements in the new page to be present.  If the page uses JavaScript then life is a bit trickier because you need to wait until the JavaScript has finished – it might be creating or changing screen elements that you need for the test.

The best approach I have found is via this question about Selenium waiting on Stack Overflow.

I put a wait inside a page object method for each operation that will take a while.  Things like adding text into a text input box probably don’t need a wait, but clicking a submit button etc. probably does – if it will lead to some or all of the page being reloaded.  This way, the boring details to do with waiting are hidden away inside the relevant parts of the page object, and the rest of the test doesn’t need to worry about it.
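
To show the general shape of a wait hidden inside a page object method, here is a sketch using Selenium’s WebDriverWait.  It is not necessarily the approach from the Stack Overflow question mentioned above; the ten second timeout and the ids LoginBtn and welcomeBanner are invented, and the readyState check assumes the driver supports IJavaScriptExecutor.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

// Hypothetical page object showing a wait tucked inside the method that needs it.
public class LoginPage
{
    private static readonly TimeSpan Timeout = TimeSpan.FromSeconds(10); // assumed value
    private readonly IWebDriver _driver;

    public LoginPage(IWebDriver driver)
    {
        _driver = driver ?? throw new ArgumentNullException(nameof(driver));
    }

    // Callers just submit; they never see the waiting.
    public void SubmitCredentials()
    {
        _driver.FindElement(By.Id("LoginBtn")).Click();
        WaitForNextPage();
    }

    private void WaitForNextPage()
    {
        // Until polls the condition and throws WebDriverTimeoutException if it never becomes true.
        var wait = new WebDriverWait(_driver, Timeout);

        // Wait for an element that only exists on the next page...
        wait.Until(d => d.FindElements(By.Id("welcomeBanner")).Count > 0);

        // ...and for the browser to report that the document has finished loading.
        // This alone does not prove that later AJAX calls have completed.
        wait.Until(d => "complete".Equals(
            ((IJavaScriptExecutor)d).ExecuteScript("return document.readyState")));
    }
}
```

The element check comes first because, immediately after the click, the old page may still report itself as complete.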


Responsiveness

If your site is responsive (which I hope it is), then you might well have different versions of the page.  For instance, on a wider screen menu items will be spread out horizontally in nav bars, but on a narrower screen they will be hidden behind a menu button.  In terms of page objects, there will be some things that are different between different widths of the screen, and some things that are the same.

The differences are things like the ids to use for controls (switching between the narrow screen version and the wide screen version of the same control), and whether a control is immediately visible (and hence usable) or if you need to click on something else first (like a menu button) to reveal the controls.

The things that will be the same are the operations you can do on the screen and the information you can read from it, and also the URL.  So, the implementation details will be different, but the higher-level interface will be the same.

I deal with this by using a small class hierarchy – a base class that defines the higher-level interface and any common behaviour, and then a derived class for each different version of the screen.  So, the overall class hierarchy for all page objects has 2-3 levels:

[Figure: class hierarchy for page objects, showing a base class above all page objects and an intermediate parent class for pages that vary by screen width]

The dark blue classes show which you would use in the tests, via a factory.

You will need to accommodate all this in the scaffolding – what needs to happen in the derived classes’ constructor and what in the base class’s?  The factory will need to use the current screen width to know which of the derived classes to create.
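
A small sketch of this arrangement is below: one abstract class for the shared interface, one derived class per layout, and a factory that picks between them.  Everything specific is an assumption made for illustration, including the class names, the element ids and the 767 pixel breakpoint.  It also uses the browser window width, which is only an approximation of the viewport width that the site’s CSS responds to.

```csharp
using System;
using OpenQA.Selenium;

// The higher-level interface is the same for every screen width.
public abstract class HomePage
{
    protected readonly IWebDriver Driver;

    protected HomePage(IWebDriver driver)
    {
        Driver = driver ?? throw new ArgumentNullException(nameof(driver));
    }

    public abstract void GoToContactPage();
}

// Wide layout: the link is immediately visible in the nav bar.
public class WideHomePage : HomePage
{
    public WideHomePage(IWebDriver driver) : base(driver) { }

    public override void GoToContactPage()
    {
        Driver.FindElement(By.Id("navContact")).Click();      // hypothetical id
    }
}

// Narrow layout: the link is hidden behind a menu button.
public class NarrowHomePage : HomePage
{
    public NarrowHomePage(IWebDriver driver) : base(driver) { }

    public override void GoToContactPage()
    {
        Driver.FindElement(By.Id("menuButton")).Click();      // hypothetical id
        Driver.FindElement(By.Id("menuContact")).Click();     // hypothetical id
    }
}

public static class HomePageFactory
{
    public const int NarrowMaxWidth = 767; // assumed breakpoint, in pixels

    public static bool IsNarrow(int windowWidth) => windowWidth <= NarrowMaxWidth;

    // Tests receive a HomePage and never need to know which layout they got.
    public static HomePage Create(IWebDriver driver)
    {
        int width = driver.Manage().Window.Size.Width;
        return IsNarrow(width) ? (HomePage)new NarrowHomePage(driver) : new WideHomePage(driver);
    }
}
```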

Tedious implementation hassles to do with Angular and selects

Most of the time, the stuff above covers what you need (with the Selenium documentation giving you the details).  Occasionally you will hit problems that are a special case.

One such problem is if you use Angular and have a select on the page.  The problem is that if you do the normal things to get the id of the selected option you will just get the number of the selected option (1, 2, 3 etc.).  You won’t get the id, which is usually what you want.

The easiest solution I’ve found was suggested in a workaround in a bug report against Angular.

You need to change the web page, so that the selected option’s id is written back to a data attribute on the select.  This gives you something to find via XPath, which will be the proper id and not the number of the option.  To do this you have something like this in the web page:

<select ng-model="foo" data-current-value="{{foo}}" ng-options="...">

and then you read its value from the data attribute using IWebElement.GetAttribute(attrName).
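
On the test side, reading that attribute could be as small as the sketch below.  The class and method names are invented; the attribute name is the one from the markup above.

```csharp
using OpenQA.Selenium;

// Hypothetical helper for selects that use the data attribute workaround.
public static class AngularSelects
{
    private const string CurrentValueAttribute = "data-current-value";

    // Returns the model value that Angular wrote back to the select,
    // rather than the position of the selected option.
    public static string GetSelectedId(IWebElement select)
    {
        return select.GetAttribute(CurrentValueAttribute);
    }
}
```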
