Staging input data to improve testability in data pipelines
In a data pipeline (an ETL or ELT pipeline, to feed a data warehouse, data science model etc.) it is often a good idea to copy input data to storage that you control as soon as possible after you receive it. This can be known as copying the data to a staging table (or other …
Category: Testing
Improving testability and observability of look-ups in data pipelines
Often in data pipelines (ETL or ELT pipelines for feeding a data warehouse, data science model etc.) we need to look up reference data that relates to the main flow of data through the pipeline. If this isn't done carefully, it can be hard to check how the system is running. Before the system is …
Analogies and objectives for testing
I guess if I had to define my role at work it would be: programmer. However, I have learned a lot from people who wouldn't call themselves programmers, such as testers (Michael Bolton, Jerry Weinberg, the Ministry of Testing community etc.), user experience experts (Paul Boag, Jared Spool, Don Norman etc.), and data people of …
Testing a data pipeline
There are several approaches to testing a data pipeline - e.g. one built using an ETL tool such as SSIS or Azure Data Factory. In this article I will go through three, plus refer to another (unit testing components of the pipeline). For simplicity's sake I will refer to only database tables, but other forms …
Regular expressions
This is the first article in a short series on some classic bits of computer science, which are occasionally useful in professional programming: regular expressions; finite state machines; and comparing regular expressions and finite state machines.
A useful tool with a bad reputation
Regular expressions are a way to define a set of 0 or more text strings …
The skills that developers and testers share
The idea that programmers and testers are different kinds of people with different kinds of skills is sometimes helpful, but not always. It can help to match people to jobs or show where people have different strengths. But it can also lead to tribalism – you’re different from me so you’re worse than me. In …
Using tools in interesting ways (tool hacking)
This article is my response to the Ministry of Testing’s blogging challenge: How we hacked a tool to make it work for us. First, I’ll go into tools in general a bit, and then give two examples of how I have used tools in slightly non-standard ways. I've written a bit about tools already, but …
Good software and how to get it
A little while ago, I was asked “What makes software good?”, which was followed up by “How do you end up with good software?”. I thought that they were excellent questions, and I will give my answers below. I don't claim to have the answer, just an answer. I’ll try to limit esprit d’escalier / …
Different ways people add value in a software development team
There was a tweet about how tech companies measure people by the impact they make. I replied in the common terse Twitter way, and I want to expand on that here. I think that there are a few different ways in which someone can add value in a software development team, and they're not all equally …
Confusing user value with other things
Programmers look at software they’re working on from the inside, but users look at it from the outside. This difference in perspective can lead to different views about what’s important – too often programmers can be consumed by the technical detail and lose sight of value to the end user. In fact, they too often …