Security and a voice-controlled internet-connected cooker

I have seen adverts for a NEFF cooker that you can control with your voice via Alexa.

This is spiffy, but I can also see potential security problems.  I’m not advocating attacking Alexa or a NEFF cooker – this article is a standard-issue discussion of security problems, to help people improve security.  I hope I’m missing out some important details that will mean it’s more secure than this article will suggest.  I also hope that no-one has their home attacked via Alexa or their NEFF cooker.

In this article I’m assuming turning your oven and hobs up high enough for long enough will cause harm.  At the least it could damage your oven, but it might also start a fire.  (Certainly, I wouldn’t want to assume that turning your oven and hobs up high won’t cause harm.)

As I’ve said before, I’m no security expert.  This is just an example of using imagination, guided by a framework (see STRIDE below), to try to see flaws in a plan.  This article uses the NEFF cooker as an example of more general things – security, and the potential problems of IoT, which might be overlooked in the rush towards shiny.

Architecture

Before I dive into the details, I think it would be useful to have a map of the system I’ll be talking about.  This is based on assumptions, so could be wrong.

[Figure: a diagram showing the interaction of the user, Alexa and the cooker]

Some key points that underpin this assumed architecture:

  • Speech recognition is something that the Alexa system does, so that the cookers, doorknobs etc. that rely on Alexa for voice control don’t have to do it themselves (i.e. separation of concerns).
  • Speech recognition is hard. All that the Alexa device can do is recognise its wake word.  After that it ships the audio it hears to the Alexa servers in the cloud so that the main speech recognition can happen there, where there’s a lot more computer power available.
  • The Alexa servers (running a NEFF skill or Alexa app) will recognise the command to do something to the oven e.g. turn it on. Whether the output is the string of words or some skill-specific token doesn’t really matter.  The important thing is this command must get from the Alexa server to the oven somehow.
    NEFF integrating with Alexa at its servers doesn’t make much sense because it would mean that NEFF would have to maintain its own (logical) network for sending commands from the Alexa servers to an arbitrary home’s cooker in who knows which country.  (Remember, NEFF isn’t the only manufacturer integrating with Alexa, and so each would need their own network back to an arbitrary user’s home.)  Instead, I assume the command comes back to the Alexa box, and is then broadcast over the home’s Wi-Fi to get to the cooker.

So the series of steps involved in controlling a cooker by voice are:

1 The user speaks the command
2-5 The Alexa device recognises its wake word and starts streaming audio to the Alexa server over Wi-Fi and then the internet
6-9 The Alexa server recognises the rest of the command, and returns the result to the Alexa device over the internet and then Wi-Fi
10 The Alexa device broadcasts the results of the speech recognition over the Wi-Fi
11 The Wi-Fi gateway attached to the NEFF cooker picks up the speech recognition output and turns that into control signals for the cooker
12 The control signals go from the gateway to the cooker to e.g. set its temperature
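
The steps above can be sketched as a chain of hand-offs.  This is only a toy model of the architecture I have assumed: the function names, the wake-word handling and the `ControlSignal` type are all hypothetical, and nothing here comes from Alexa or NEFF documentation.

```python
from dataclasses import dataclass
from typing import Optional

WAKE_WORD = "alexa"  # assumption: one fixed wake word


@dataclass
class ControlSignal:
    """Hypothetical low-level instruction for the cooker (step 12)."""
    target: str
    action: str


def device_hears(utterance: str) -> Optional[str]:
    """Steps 1-5: spot the wake word locally, forward everything after it."""
    words = utterance.lower().replace(",", "").split()
    if not words or words[0] != WAKE_WORD:
        return None  # without the wake word, nothing is sent to the cloud
    return " ".join(words[1:])


def cloud_recognise(audio: str) -> dict:
    """Steps 6-9: stand-in for the server-side speech recognition."""
    if "turn on" in audio and "oven" in audio:
        return {"intent": "oven_on"}
    return {"intent": "unknown"}


def gateway_translate(result: dict) -> Optional[ControlSignal]:
    """Steps 10-12: the gateway turns a recognised intent into a control signal."""
    if result.get("intent") == "oven_on":
        return ControlSignal(target="oven", action="on")
    return None


def handle(utterance: str) -> Optional[ControlSignal]:
    """Run one utterance through the whole assumed pipeline."""
    audio = device_hears(utterance)
    if audio is None:
        return None
    return gateway_translate(cloud_recognise(audio))
```

Each function boundary is a hand-off between components, and so a place where an attacker might try to listen in or inject something.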

STRIDE

One approach to thinking about security problems is STRIDE.  As far as I’m aware, it’s a set of prompts to help you think, rather than a multi-step procedure that you must follow to the letter.

Type of attack         | Thing under threat | Explanation
-----------------------|--------------------|------------
Spoofing               | Authentication     | Fooling the system into thinking you are someone or something you’re not
Tampering              | Integrity          | Changing code or data, in flight or at rest
Repudiation            | Non-repudiation    | Covering your tracks, so it appears you did / didn’t do the thing you didn’t / did do
Information Disclosure | Confidentiality    | Accessing information that you’re not supposed to be allowed to get to
Denial of Service      | Availability       | Stopping other people from using the system
Elevation of Privilege | Authorisation      | Doing something you’re not supposed to be allowed to do

Spoofing and elevation of privilege can often end with the same result – your being able to do something you’re not supposed to do.  One of them (spoofing) starts at the beginning of the process, where you impersonate someone who’s able to do the thing you want to do.  The other one (elevation of privilege) doesn’t bother trying to make you look like someone else; rather, it just attacks the protections that limit who can do what.  I have written an article about the difference between authentication and authorisation if you want more details.
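
A minimal sketch of the two checks that these attacks target.  The user names, secrets and permission table are invented for illustration; a real system would not store secrets in plain text like this.

```python
# Hypothetical credential and permission tables.
USERS = {"alice": "correct horse", "guest": "guest"}
PERMISSIONS = {"alice": {"oven_on"}, "guest": set()}


def authenticate(name: str, secret: str) -> bool:
    """Authentication: are you who you say you are?  (Spoofing attacks this.)"""
    return name in USERS and USERS[name] == secret


def authorise(name: str, action: str) -> bool:
    """Authorisation: may this identity do this thing?
    (Elevation of privilege attacks this.)"""
    return action in PERMISSIONS.get(name, set())


def request(name: str, secret: str, action: str) -> str:
    if not authenticate(name, secret):
        return "rejected: not authenticated"
    if not authorise(name, action):
        return "rejected: not authorised"
    return "ok"
```

A spoofer tries to get past the first check by presenting as `alice`.  Someone elevating privilege stays as `guest` and tries to make the second check say yes anyway.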

I will now use the STRIDE method to think of possible weaknesses with the cooker.

Spoofing

Alexa offers voice profiles, but as far as I can tell these are optional.  That means that speaker verification isn’t always available as a security check for the Alexa + cooker combination.  So, to impersonate you I just speak your language – I don’t need to impersonate you well enough to fool a person.

I might not even need to be inside your house to control your cooker.  I could shout in through your letterbox or shine a laser at Alexa.  I might not even need to be in the same country in order to control your cooker.  You might have an answer phone that plays what it’s recording over a speaker, so if you’re rushing to pick up the phone you can hear the bit of the conversation that happened before you got to the phone.  This means I just need to phone you when you’re out and speak loudly enough (assuming Alexa can hear your answer phone).

If I want to be elegant or defeat voice profiles / speaker verification, I could get recordings of you via some social engineering.  For instance, I could visit your house or phone you up while recording things, and steer the conversation to make sure you say the words I’m interested in.  For instance, I could pretend to be from NEFF after-sales care, and claim that customers who give feedback about their purchase will be entered into a prize draw.  Then using audio editing software (which might even be free) I can splice together the sentence I want (e.g. “Alexa turn on my cooker”), and then play that to your answer phone later.

Tampering

If I wanted to be subtle, I might not directly cause a fire but set things up such that you accidentally cause a fire in the future.  I could send malformed instructions to the cooker, e.g. ones that cause a buffer overflow.  This could allow me to reprogram the cooker control software.  (I’m assuming the cooker’s software can be upgraded, e.g. to fix bugs if they’re discovered in the field.  This means that the program isn’t physically burned into a chip, but instead is stored in something more temporary and over-writable.  Also, note that this is nothing to do with Alexa, but instead attacks the Wi-Fi interface to the cooker.  As an attacker I’m free to stick the knife in wherever I think you’re least protected, or where it will do most damage.)

Once I can reprogram the cooker (which is a phrase I wasn’t expecting to be able to say, just as I wasn’t expecting to be able to say things like “reboot the cooker” below) I can change what happens when you use the controls.  So when the cooker receives a signal that one of the knobs has been turned a quarter turn, instead of doing this:

  1. Turn on the front right hob to 100C

I could make the cooker do this:

  1. Turn on every hob to maximum;
  2. Turn on the oven to maximum;
  3. Stop listening to further signals.

My tampering didn’t directly start a fire, however it made the cooker into a fire waiting to happen when you next did the wrong thing.
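
As a toy simulation of that swap (not real cooker firmware): the `Cooker` class, the handler functions and the maximum temperatures below are all made up for illustration.

```python
MAX_HOB_TEMP = 400   # hypothetical maximum hob temperature
MAX_OVEN_TEMP = 300  # hypothetical maximum oven temperature


class Cooker:
    """Toy cooker whose knob handler can be replaced, like upgradable firmware."""

    def __init__(self, on_quarter_turn):
        self.hobs = {"front_left": 0, "front_right": 0,
                     "back_left": 0, "back_right": 0}
        self.oven = 0
        self.listening = True
        self.on_quarter_turn = on_quarter_turn

    def knob_turned(self):
        # Signals from the controls are ignored once listening is off.
        if self.listening:
            self.on_quarter_turn(self)


def original_handler(cooker):
    """What the manufacturer intended for a quarter turn."""
    cooker.hobs["front_right"] = 100


def tampered_handler(cooker):
    """What the attacker installed in its place."""
    for name in cooker.hobs:
        cooker.hobs[name] = MAX_HOB_TEMP
    cooker.oven = MAX_OVEN_TEMP
    cooker.listening = False  # stop listening to further signals
```

With the tampered handler in place, nothing happens until someone turns the knob.  After that, turning the knob back does nothing because the cooker is no longer listening.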

Disconnecting the Wi-Fi wouldn’t be enough.  You would have to reboot your cooker, or maybe even reset it to factory settings.  Switching it off at the wall will stop the cooker from being hot, even if it doesn’t clear the code changes I’ve made.

Repudiation

This is an area where there’s an interesting difference between this attack and attacks on a normal computer system e.g. in a business or government.  Unusually, my attack might partly clean up after itself if it’s successful.  If I manage to take control of your cooker and so turn it up high for a very long time, this could start a fire.  The fire could burn enough of your house to make it hard to examine your answer phone, your cooker, your Alexa box, your modem etc.  It would only be things in the cloud that I would have to worry about.

Denial of Service

This section is slightly out of order, as the remaining two sections will go together.

The cooker might have modes where it can’t cook, such as cleaning or running diagnostics.  The brute-force way to exploit this would be to send messages repeatedly to keep it in one of these modes.  You would have to disconnect your cooker from the Wi-Fi to fix this.

If I wanted to do something more permanent, I could re-use the buffer overflow attack from the Tampering section but with a different objective.  Instead of rewriting the knobs to turn everything on to maximum, the attack could set it into one of these non-cooking modes and then disconnect it from the controls.  You then wouldn’t be able to use your cooker to cook anything.

Information Disclosure and Elevation of Privilege

Assume I manage to record the traffic going over your Wi-Fi – depending on the strength of your Wi-Fi, maybe I don’t even need to be inside your home.  I also learn when your cooker turns on and off.  I could learn this either by old-school spying or by more modern techniques, such as analysing how the electricity consumption of your home varies over time.

Pairing up the packets that flow over the network with when your cooker turns on, I might be able to establish a relationship.  I might discover that packets A, B and C went over the network just before the cooker turned on.  I hope that there’s some security used in A, B and C so I can’t just read the plain text for e.g. “turn on the oven to 200C”.  I will assume that they’re just arbitrary bytes that are opaque to me.  However, traffic analysis has shown that they’re probably somehow linked to the cooker turning on, just by the coincidence of happening around the same time.
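
The pairing-up described above could look something like this sketch.  It assumes I have a log of `(timestamp, fingerprint)` pairs for observed packets, where the fingerprint is some hypothetical label for the opaque bytes (e.g. a hash), plus a list of times when the cooker turned on.  The two-second window is an arbitrary choice.

```python
from collections import Counter


def packets_before_events(packet_log, on_times, window=2.0):
    """For each packet fingerprint, count how many 'cooker on' events
    it was seen shortly before (within `window` seconds)."""
    counts = Counter()
    for t_on in on_times:
        seen = {fp for t, fp in packet_log if 0 <= t_on - t <= window}
        counts.update(seen)  # each fingerprint counted once per event
    return counts


def candidates(packet_log, on_times, window=2.0):
    """Fingerprints that preceded every single 'on' event."""
    counts = packets_before_events(packet_log, on_times, window)
    return sorted(fp for fp, n in counts.items() if n == len(on_times))
```

Usage, with invented data:

```python
log = [(9.5, "A"), (9.7, "B"), (9.9, "C"), (15.0, "Z"),
       (29.4, "A"), (29.6, "B"), (29.8, "C"), (40.0, "Z")]
print(candidates(log, on_times=[10.0, 30.0]))  # ['A', 'B', 'C']
```

This only shows correlation, which is all the traffic analysis gives me.  I still don’t know what the bytes mean.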

The details of the meaning of A, B and C matter.  They might mean e.g. turn the oven on to 200C.  On the other hand, they might mean e.g. turn the oven on to 200C if you receive this message before 13th January 2020 13:45:00 (or whatever e.g. five minutes into the future is).

If it’s the former, then I have got hold of the general-purpose command to set the oven’s temperature.  If it’s the latter, then I can set the oven’s temperature only before 13th January 2020 13:45:00.  While the latter is mildly irritating to the homeowner, the former means the system has disclosed valuable information.  It means that anyone who can send Wi-Fi signals into your home, and not just authorised users, can control the cooker.
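
Here is a minimal sketch of the second kind of message, one that carries its own expiry time.  It is an illustration under assumptions, not how Alexa or the cooker actually works: the shared key, the JSON layout and the five-minute lifetime are all invented.  It uses an HMAC to stop the expiry being edited, and does not encrypt the message at all.

```python
import hashlib
import hmac
import json

KEY = b"shared-secret-for-illustration"  # hypothetical key known to sender and cooker
TAG_LEN = hashlib.sha256().digest_size   # 32 bytes


def make_command(action: str, now: float, ttl: float = 300) -> bytes:
    """Build 'do <action> if received before now + ttl', with an HMAC tag appended."""
    body = json.dumps({"action": action, "expires": now + ttl},
                      sort_keys=True).encode()
    tag = hmac.new(KEY, body, hashlib.sha256).digest()
    return body + tag


def accept(message: bytes, now: float):
    """Return the action if the message is genuine and unexpired, else None."""
    body, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(KEY, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # altered, or not made with the key
    command = json.loads(body)
    if now >= command["expires"]:
        return None  # too late: a replayed copy is worthless after this
    return command["action"]
```

A recorded copy of the message works only until it expires.  Inside that window a replay would still be accepted, so the expiry shrinks the problem rather than removing it.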

This kind of weakness was used to attack cash machines in their early days.  A maintenance engineer attached a small computer to the wires inside the cash machine.  The messages going over the wires were encrypted far too strongly for the engineer to unravel.  However, they noticed that message X always happened just before the cash machine dispensed some money.  So, they simply replayed X several times and it was like a one-armed bandit hitting the jackpot.  The lack of an expiry date in the messages turned the message from a context-specific instruction into a general-purpose one.

Expiry dates aren’t the only answer – there’s a trade-off between security and robustness.  If the expiry date is set too close, a valid message that is delayed by e.g. normal network congestion will be discarded.
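
One commonly used complement to a tight expiry is for the receiver to remember which messages it has already acted on, so each one works only once.  The sketch below shows both the trade-off and that alternative.  The `Receiver` class and the idea of a per-message nonce are my additions for illustration, and it assumes the message has already been checked as genuine by some other means.

```python
class Receiver:
    """Toy receiver that rejects messages that are too old or already used."""

    def __init__(self, ttl: float):
        self.ttl = ttl     # how long after sending a message stays valid
        self.seen = set()  # nonces of messages already acted on

    def accept(self, nonce: str, sent_at: float, now: float) -> bool:
        if now - sent_at > self.ttl:
            return False   # expired (possibly just delayed by the network)
        if nonce in self.seen:
            return False   # replay of a message we already handled
        self.seen.add(nonce)
        return True
```

Usage:

```python
strict = Receiver(ttl=1)
print(strict.accept("n1", sent_at=0, now=3))   # False: a valid but delayed message is lost

lenient = Receiver(ttl=300)
print(lenient.accept("n1", sent_at=0, now=3))  # True: survives the delay
print(lenient.accept("n1", sent_at=0, now=4))  # False: the replay is refused
```

The cost moves elsewhere: the receiver now has to store nonces, at least until the matching messages would have expired anyway.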

It’s important to note here that the information being disclosed isn’t my end objective, which is attacking your home.  However, the disclosed information, even though less valuable to me than your home being on fire, helps me to achieve that objective.

Another example of this is attacking websites.  An attacker might deliberately request a page that they think won’t exist in a website.  This is to trigger a 404 error.  The web server behind the site might be poorly configured, such that it returns a default 404 response.  This default 404 response might include the name and version number of the software powering the web server.  This would then let the attacker narrow down the options for their next step, using a list of known vulnerabilities for that web server software.  This would then let them more easily reach their end objective, which might be stealing credit card numbers.
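
To show how little effort that narrowing-down takes, here is a sketch that pulls a product name and version out of an HTTP `Server` response header.  The function name and the sample values are mine; the `Product/version` layout is a common convention for that header, but servers are free to send something else or nothing at all.

```python
import re
from typing import Optional, Tuple

BANNER = re.compile(r"^(?P<name>[^/\s]+)(?:/(?P<version>[\w.]+))?")


def parse_server_banner(headers: dict) -> Optional[Tuple[str, Optional[str]]]:
    """Return (product, version) from a response's Server header, if present."""
    value = headers.get("Server")
    if not value:
        return None
    match = BANNER.match(value)
    if not match:
        return None
    return match.group("name"), match.group("version")
```

Usage:

```python
print(parse_server_banner({"Server": "Apache/2.4.41 (Ubuntu)"}))  # ('Apache', '2.4.41')
print(parse_server_banner({"Server": "nginx"}))                   # ('nginx', None)
print(parse_server_banner({}))                                    # None
```

The less the server volunteers, the less this gives away, which is why many hardening guides suggest trimming or removing the banner.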

Brief diversion to accessibility

Speech and computers can make the world a better place for people with disabilities, illnesses and other problems.  For instance, phone banking can give people back the dignity of being in control of their money.  Speech synthesis helped Prof Stephen Hawking regain the ability to express himself in speech.  You might think this speech-controlled internet-connected oven is like that.

However, it’s important not to confuse shiny with what would be an effective help for real people.  It could be that physical, and not computer-based, changes would make a bigger difference to a blind person.

As in so many things to do with people using technology, it’s important to talk to actual users and find out what’s important to them and how they can best be helped.

Summary

As I said in the introduction, I really hope that no-one with a NEFF cooker has it attacked, and I hope this article doesn’t help anyone who’s trying to do that.  However, security rarely makes for good marketing copy, particularly when compared to better features than the competition’s.  Also, security is an expensive thing to think of when it’s too late.  I hope that this article will act as a reminder to think of security before it’s too late, and give you some pointers as to how to do that in an effective way.
