I was introduced to an interesting way of multiplying two numbers (integers greater than 0) recently, at a Tudor re-enactment at Kentwell Hall. It took me a while to realise what was going on behind the scenes, at least in terms of things I already understood. As it also made me think in a new … Continue reading Multiplying using halving, doubling and summing
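The excerpt stops before the method itself is described, so what follows is only a guess at it: a minimal Python sketch of the technique usually called Russian peasant (or Egyptian) multiplication, which fits the title's halving, doubling and summing. The function name `multiply_by_halving` is hypothetical and does not come from the article.

```python
def multiply_by_halving(a: int, b: int) -> int:
    """Multiply two integers greater than 0 by halving, doubling and summing.

    Assumed method (not confirmed by the excerpt): repeatedly halve `a`
    (discarding any remainder) and double `b`; add `b` to the total
    whenever `a` is odd.
    """
    if a <= 0 or b <= 0:
        raise ValueError("both numbers must be integers greater than 0")
    total = 0
    while a > 0:
        if a % 2 == 1:   # odd row: this doubled value contributes to the sum
            total += b
        a //= 2          # halve, dropping the remainder
        b *= 2           # double
    return total


print(multiply_by_halving(13, 7))
```

The odd rows correspond to the 1 bits in the binary form of the first number, which is one way to see why the sum comes out right.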
Feedback loops for quality
This is the second part of my response to the Ministry of Testing’s latest blog challenge: What three things have helped you in your testing career? As I’m not a tester, I’m choosing to re-word this as: What three things have helped you in the quality aspects of your career as a programmer? Culture and … Continue reading Feedback loops for quality
Culture and people for quality
This article and the next are my response to the Ministry of Testing’s latest blog challenge: What three things have helped you in your testing career? As I’m not a tester, I’m choosing to re-word this as: What three things have helped you in the quality aspects of your career as a programmer? It was … Continue reading Culture and people for quality
Computer science while doing the laundry 2: Bin sort
This is part of a short series of articles about computer science while doing the laundry: Merge sort, Bin sort. In the previous article I used doing a lot of laundry to illustrate merge sort, which is probably an impractical way of doing the laundry. In this article I will suggest a way that might actually … Continue reading Computer science while doing the laundry 2: Bin sort
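As a rough illustration of the general idea behind bin sort (also known as bucket sort), here is a small Python sketch. It is not taken from the article: the function `bin_sort`, its parameters, and the laundry data are all hypothetical. Each item is dropped into the bin for its key, and the bins are then emptied in a fixed order.

```python
def bin_sort(items, key_of, bin_order):
    """Group items into one bin per key, then read the bins out in order.

    `key_of` maps an item to its bin label; `bin_order` lists every label
    in the order the bins should be emptied. Both are illustrative names.
    """
    bins = {label: [] for label in bin_order}
    for item in items:
        bins[key_of(item)].append(item)   # one pass: drop each item in its bin
    return [item for label in bin_order for item in bins[label]]


# Hypothetical laundry: (owner, garment) pairs, sorted by owner.
laundry = [("Bo", "sock"), ("Al", "shirt"), ("Bo", "jumper"), ("Al", "sock")]
print(bin_sort(laundry, key_of=lambda item: item[0], bin_order=["Al", "Bo"]))
```

Items never get compared with each other; each one is looked at once, which is why this approach works well when the set of bins is small and known in advance.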
Computer science while doing the laundry 1: Merge sort
This is part of a short series of articles about computer science involving laundry: Merge sort, Bin sort. In this article I will explain merge sort, which is a way of sorting things when there are so many of them it’s awkward or impossible to use other approaches. I’ll use doing the laundry as a way of explaining … Continue reading Computer science while doing the laundry 1: Merge sort
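For readers who want to see merge sort in code rather than laundry, here is a minimal Python sketch of the standard algorithm. It is an illustration only, not the article's own code, and the function names `merge` and `merge_sort` are chosen here for convenience.

```python
def merge(left, right):
    """Combine two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Take the smaller front item; <= keeps equal items in original order.
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two has items left
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Split the list in half, sort each half, then merge the results."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    return merge(merge_sort(items[:mid]), merge_sort(items[mid:]))


print(merge_sort([5, 2, 9, 1, 5, 6]))
```

The merging step only ever looks at the front of each pile, which is what makes the approach usable when the whole collection is too big to handle at once.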
Introduction to Azure Data Factory
Azure Data Factory (ADF) is a tool from Microsoft that lets you move data from one place to another, optionally changing it too. This activity is sometimes described as data engineering, ETL (Extract, Transform, Load) or ELT. There’s an older tool from Microsoft that also does ETL, called SQL Server Integration Services (SSIS). They … Continue reading Introduction to Azure Data Factory
Visualising sauces in French cuisine
Classic French cuisine, as defined by e.g. Escoffier, has a set of base sauces such as velouté from which other sauces like normande can be derived. This article is an attempt at visualising the sauces and the relationship between them. The motivation behind it is someone I know who is studying catering, and as part … Continue reading Visualising sauces in French cuisine
Connecting Azure Data Factory code to an external database table
In this article I will talk about how to connect Azure Data Factory (ADF) to a database table. This can be surprisingly complex, so I will start with the simplest version and work towards more complex versions. I won't go into connecting ADF to other types of data store such as APIs, blob storage etc, … Continue reading Connecting Azure Data Factory code to an external database table
Staging input data to improve testability in data pipelines
In a data pipeline (an ETL or ELT pipeline, to feed a data warehouse, data science model etc.) it is often a good idea to copy input data to storage that you control as soon as possible after you receive it. This can be known as copying the data to a staging table (or other … Continue reading Staging input data to improve testability in data pipelines
Improving testability and observability of look-ups in data pipelines
Often in data pipelines (ETL or ELT pipelines for feeding a data warehouse, data science model etc.) we need to look up reference data that relates to the main flow of data through the pipeline. If this isn't done carefully, there can be problems for checking how the system is running. Before the system is … Continue reading Improving testability and observability of look-ups in data pipelines