SECURE AND RESILIENT ROLLOUT OF SOFTWARE SERVICES in the Smart Grid

SLIDE 1

SECURE AND RESILIENT ROLLOUT OF SOFTWARE SERVICES

in the Smart Grid

E. Piatkowska¹, D. Umsonst², M. Chong², C. Gavriluta¹ and P. Smith¹

{firstname.lastname}@ait.ac.at; umsonst@kth.se; mchong@kth.se

¹ AIT Austrian Institute of Technology
² KTH Royal Institute of Technology

This research has received funding in the framework of the joint programming initiative ERA-Net Smart Grids Plus, with support from the European Union’s Horizon 2020 research and innovation programme.

SLIDE 2

PRESENTATION OUTLINE

  • 1. Large-scale rollout of software in the smart grid
  • 2. Example scenario in a medium-to-low voltage distribution network
  • 3. Motivating the need for an adaptive approach to rolling out software in the Smart Grid
  • 4. Adaptive rollout approaches
  • 5. Detecting failures and reasoning about their root causes

SLIDE 3

MOTIVATION: LARGE-SCALE SOFTWARE ROLLOUTS IN THE SMART GRID

  • Energy distribution systems are undergoing a transition into so-called Smart Grids, which involves the increased use of software systems
  • In many cases software-based services are used to support grid control
      • Voltage control in substations
      • Active and reactive power management of inverters
      • Implementation of energy services (e.g., demand-response schemes)
      • Electric vehicle charging
  • Consequently, there is a coupling between the state (correctness) of software-based systems and power system behaviour
  • For several reasons, the software and its configuration in the smart grid will require updating
      • (Security) patches, adaptation to grid behaviour, new services, …
  • The LarGo! project is concerned with the secure and resilient large-scale rollout of software services in the Smart Grid

SLIDE 4

EXAMPLE SOFTWARE ROLLOUT SCENARIO

  • Medium-to-low voltage network based on the CIGRE benchmark network with eighteen loads
  • Software components that control voltage levels may be subject to updates

[Figure: network diagram with the labels Control Centre, Secondary Substation, Household 1 … Household n, DMS, GWD, CTRL, EMS, On-Load Tap Changer and Communication Network, the voltages V0, V1, …, Vn, and the High, Medium and Low Voltage levels]

Legend: DMS: Distribution Management System; GWD: Grid Watchdog; CTRL: Substation Controller; BEMS: Energy Management System

A patch may involve new settings of 𝜕max|min|𝑛|𝑜 and 𝑅max|min parameters

[Figure: Grid Connected Inverter, with Max reactive injection and Min reactive injection marked]

SLIDE 5

SOFTWARE ROLLOUT SCHEDULER

  • An offline process identifies a safe rollout strategy to update the EMSs, including the droop law settings; this strategy is executed using a rollout scheduler

Rollout Strategy

  • S1. V11
  • S2. V15, V16
  • S3. V17
  • S4. V18

[Figure: Software Rollout Scheduler, Software Repository, Software Configuration]
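A minimal sketch of how a scheduler might execute the staged strategy S1 to S4. The function `run_rollout`, the `apply_update` callback and the stop-on-failure behaviour are assumptions made for illustration; the slides do not describe the scheduler's interface.

```python
from typing import Callable, Dict, List

# Rollout strategy from the slide: each inner list is one step (S1..S4),
# and nodes within a step are updated together.
ROLLOUT_STRATEGY: List[List[str]] = [["V11"], ["V15", "V16"], ["V17"], ["V18"]]


def run_rollout(
    strategy: List[List[str]],
    apply_update: Callable[[str], bool],
) -> Dict[str, bool]:
    """Apply updates step by step and record the outcome per node.

    apply_update is a hypothetical callback that patches one node and
    returns True on success. Stopping at the first failed step is an
    assumption of this sketch, not behaviour stated in the slides.
    """
    results: Dict[str, bool] = {}
    for nodes in strategy:
        for node in nodes:
            results[node] = apply_update(node)
        if not all(results[node] for node in nodes):
            break  # do not start later steps after a failed one
    return results


# Usage: simulate a failed update at V17.
outcome = run_rollout(ROLLOUT_STRATEGY, lambda node: node != "V17")
```

In this usage example V18 is never attempted, because the step containing V17 fails first.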

SLIDE 6

AN EXAMPLE SOFTWARE ROLLOUT FAILURE

Scenario

  • Patch software in EMSs, including update of droop law
  • Failed update – flipped droop law configuration
  • Inverters inject rather than draw power as voltages increase; problem not corrected during rollout
  • Updates at 5s, 20s, 35s, and 50s
  • Update order is V11, (V15, V16), V17, V18
  • Eventually voltages exceed the safety threshold at several locations

[Figure: voltage plot with the safety limit and the software updates marked]

Voltage at substation: 251 V; max reactive power for an inverter: 2500 var

We need to be able to detect these failures and respond accordingly
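The effect of a flipped droop law can be illustrated with a small sketch. It assumes a linear reactive-power droop with saturation; the function name, the nominal voltage of 230 V and the gain of 250 var/V are hypothetical values chosen for illustration, and only the 2500 limit comes from the slide.

```python
def droop_reactive_power(
    voltage: float,
    v_nominal: float,
    gain: float,
    q_min: float,
    q_max: float,
    flipped: bool = False,
) -> float:
    """Reactive power set-point of an inverter (positive = injection).

    A correct droop law draws reactive power (negative set-point) when
    the voltage is above nominal. With flipped=True the sign error of
    the failed update is reproduced: the inverter injects instead.
    """
    deviation = voltage - v_nominal
    q = gain * deviation if flipped else -gain * deviation
    # Saturate at the inverter limits.
    return max(q_min, min(q_max, q))


# Hypothetical operating point: 2 V above a 230 V nominal voltage.
correct = droop_reactive_power(232.0, 230.0, 250.0, -2500.0, 2500.0)
faulty = droop_reactive_power(232.0, 230.0, 250.0, -2500.0, 2500.0, flipped=True)
```

With the sign flipped, a rising voltage leads to more injection, which raises the voltage further; this is consistent with the voltages drifting past the safety limit as more nodes are updated.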

SLIDE 7

ADAPTIVE ROLLOUT STRATEGIES

  • Based on the root cause of a failure, different responses to failures may be desirable, e.g., to expedite a large-scale rollout

Strategy: Example Root Causes of Failure

  • Skip and Continue (Local and Persistent): device misconfigurations; mismatch between expected and actual target system state for one device
  • Retry and Continue (Local and Transient): power system perturbations; transient local communication issues
  • Halt (Global and Persistent): cyber-attacks; misconfigurations (cf. droop law); system state mismatches (asset mgmt.)
  • Rollback: optionally, it could be desirable to roll back to a previous known-good state, although this may not be desirable or possible
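The mapping from failure class to response could be encoded as a simple lookup, sketched below. The names `Strategy` and `choose_strategy` are hypothetical, and falling back to Halt for combinations the slide does not list (e.g., global and transient) is a conservative assumption of this sketch.

```python
from enum import Enum


class Strategy(Enum):
    SKIP_AND_CONTINUE = "skip and continue"
    RETRY_AND_CONTINUE = "retry and continue"
    HALT = "halt"


# (scope, duration) -> response, as listed on the slide.
STRATEGY_TABLE = {
    ("local", "persistent"): Strategy.SKIP_AND_CONTINUE,
    ("local", "transient"): Strategy.RETRY_AND_CONTINUE,
    ("global", "persistent"): Strategy.HALT,
}


def choose_strategy(scope: str, duration: str) -> Strategy:
    """Pick a rollout response from the classified root cause.

    Unlisted combinations fall back to HALT; this default is an
    assumption, not something the slides specify.
    """
    return STRATEGY_TABLE.get((scope.lower(), duration.lower()), Strategy.HALT)


# Usage: a flipped droop law is a global, persistent misconfiguration.
response = choose_strategy("global", "persistent")
```

Rollback is left out of the table because the slide presents it as an optional follow-up rather than a response tied to one failure class.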
SLIDE 8

DETECTING SOFTWARE ROLLOUT FAILURES

  • To determine the root cause of a software rollout failure, distributed “sensors” are required, located in the substation and EMSs

Sensor: Description

  • AV (Anti-virus software): a host-based antivirus system running on the EMSs
  • HIDS (Host-based IDS): a host-based IDS running on the EMSs (e.g., OSSEC)
  • NIDS (Network-based IDS): a network-based IDS running on the EMSs and in the substation (SSN) (e.g., Snort)
  • AD (Anomaly Detection): an anomaly detection system that identifies unusual voltage measurements at the EMSs, e.g., based on residuals
  • SWM (Software Manager): software located at the EMS that checks whether a software update has completed successfully
  • STPA (Hazardous Control Detection): a system that checks whether control actions that are carried out in the substation could cause hazards, based on results from an STPA analysis
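As an illustration of the AD sensor, the sketch below flags a voltage anomaly when the residual (measured minus expected voltage) stays above a threshold for several consecutive samples. The slides only state that detection is "based on residuals"; the function name, the persistence rule and all numeric values are assumptions.

```python
from typing import Sequence


def detect_voltage_anomaly(
    measured: Sequence[float],
    expected: Sequence[float],
    threshold: float,
    min_consecutive: int,
) -> bool:
    """Return True if |measured - expected| exceeds the threshold for at
    least min_consecutive consecutive samples.

    How the expected voltages are obtained (e.g., from a grid model) is
    outside this sketch.
    """
    run_length = 0
    for m, e in zip(measured, expected):
        if abs(m - e) > threshold:
            run_length += 1
            if run_length >= min_consecutive:
                return True
        else:
            run_length = 0  # a short perturbation resets the count
    return False


# Hypothetical samples in volts.
expected = [230.0] * 6
short_spike = [230.0, 236.0, 230.0, 231.0, 229.0, 230.0]
drift = [230.0, 236.0, 237.0, 238.0, 239.0, 240.0]

spike_alarm = detect_voltage_anomaly(short_spike, expected, 5.0, 3)
drift_alarm = detect_voltage_anomaly(drift, expected, 5.0, 3)
```

Requiring persistence means a single short spike does not raise an alarm, while a sustained drift does.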

SLIDE 9

DEPLOYMENT OF DISTRIBUTED SENSORS

[Figure: the network diagram from the example scenario (Control Centre, Secondary Substation, Household 1 … Household n, DMS, GWD, CTRL, EMS, voltages V0, V1, …, Vn, and the High, Medium and Low Voltage levels), annotated with sensor placements. Annotations: AV, SWM, HIDS, AD; AV, SWM, HIDS, AD; NIDS; SWM; STPA, HIDS; Root Cause Analysis]

Legend: DMS: Distribution Management System; GWD: Grid Watchdog; CTRL: Substation Controller; BEMS: Energy Management System; AV: Anti-virus software; HIDS: Host-based IDS; NIDS: Network-based IDS; AD: Anomaly Detection; STPA: Hazardous Control Detection

SLIDE 10

ROOT CAUSE ANALYSIS WITH EVIDENTIAL NETWORKS

  • An evidential network is a graph structure for knowledge representation and inference
  • Nodes in the graph represent variables, e.g.:
      • Control system state
      • HIDS and NIDS alarms
  • Variables have a frame that defines their mutually exclusive values
  • Relations between variables are given as mass functions that describe beliefs
  • Dempster-Shafer (DS) theory allows relations to be expressed as implication rules with uncertainty measures
  • Inference within the evidential network is achieved by two operators, called combination and marginalisation
  • P. P. Shenoy, A valuation-based language for expert systems, International Journal of Approximate Reasoning, Vol. 3, pp. 383–411, 1989.
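The two operators can be sketched on a tiny two-variable example. This is a brute-force illustration over the full joint frame, not the local computation scheme of the cited valuation-based approach; the rule "AV alarm implies malicious state" and its mass of 0.9 are hypothetical.

```python
from itertools import product

STATES = ("normal", "erroneous", "malicious")  # frame of a system-state variable
ALARM = (True, False)                          # frame of an alarm variable
JOINT = frozenset(product(ALARM, STATES))      # all (alarm, state) pairs


def combine(m1, m2):
    """Dempster's rule: intersect focal sets, multiply masses, renormalise."""
    joint, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            joint[common] = joint.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to contradictory evidence
    if conflict >= 1.0 - 1e-12:
        raise ValueError("evidence is totally conflicting")
    return {s: w / (1.0 - conflict) for s, w in joint.items()}


def marginalise(m, index):
    """Project every focal set of (alarm, state) pairs onto one variable."""
    out = {}
    for focal, w in m.items():
        projected = frozenset(cfg[index] for cfg in focal)
        out[projected] = out.get(projected, 0.0) + w
    return out


# Hypothetical rule "alarm => malicious", held with mass 0.9; the
# remaining 0.1 stays on the whole joint frame (ignorance).
rule_set = frozenset((a, s) for a, s in JOINT if (not a) or s == "malicious")
rule = {rule_set: 0.9, JOINT: 0.1}

# Observation: the alarm was raised, extended to the joint frame.
observation = {frozenset((True, s) for s in STATES): 1.0}

# Belief about the system state after combining rule and observation.
belief = marginalise(combine(rule, observation), index=1)
```

After the alarm is observed, mass 0.9 lands on {malicious} and the remaining 0.1 stays on the whole state frame, which expresses residual ignorance rather than belief in any particular state.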

SLIDE 11

ROLLOUT SCENARIO EVIDENTIAL NETWORK

[Figure: evidential network for the rollout scenario. Variables, their frames and the associated sensors, as far as they can be recovered from the figure labels:]

  • Rollout State: {normal, erroneous, malicious}
  • Substation (SSN)
      • System State @ SSN: {normal, erroneous, malicious}
      • Network State (NIDS): {normal, manipulated}
      • Hazardous Command Sent (STPA): {true, false}
      • Suspicious Activity (HIDS): {true, false}
  • EMS
      • System State @ EMS: {normal, erroneous, malicious}
      • Suspicious Activity (HIDS): {true, false}
      • Malware Detected (AV): {true, false}
      • Voltage Anomaly (AD): {true, false}
      • Software Updated (SWM): {true, false}

SLIDE 12

SCENARIO 1: NORMAL BEHAVIOUR DURING A ROLLOUT

  • BEMS report voltage anomalies – small perturbations within a limited time frame are considered normal

SLIDE 13

SCENARIO 2: MISCONFIGURATION OF DROOP LAW

  • Failed rollout results in persistent disruptions in grid operation
  • Sign error introduced to PV inverter controllers at nodes R11, R15, and R16

SLIDE 14

SCENARIO 3: MALWARE ON THE EMSs

  • Compromised rollout results in malware being installed on nodes (BEMS)

SLIDE 15

SCENARIO 4: MAN-IN-THE-MIDDLE ATTACK

  • Man-in-the-middle attack performed during the rollout to compromise communication between voltage sensors and voltage control at the substation
  • Integrity attack performed to trigger unnecessary or unsafe control actions

SLIDE 16

CAUSAL ANALYSIS DEPLOYMENT ARCHITECTURE

  • Event-driven architecture using microservices
  • Communication between components with MQTT – an MQTT broker serves as an event bus
  • Independent from testbed and implementation of components; intended to be scalable and easy to extend
  • Main components:
      • Distributed sensors
      • Algorithms for complex event processing
      • Web-based graphical user interface
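The event-bus pattern can be sketched without a broker. The class below is an in-memory stand-in that mimics MQTT-style topic filters (`+` for one level, `#` for all remaining levels); it is not an MQTT client, and the topic names and payload fields are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Simplified MQTT-style matching: '+' = one level, '#' = the rest."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class InMemoryEventBus:
    """Stand-in for a broker: routes published events to subscribers."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[str, dict], None]]] = defaultdict(list)

    def subscribe(self, topic_filter: str, handler: Callable[[str, dict], None]) -> None:
        self._subscribers[topic_filter].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        for topic_filter, handlers in self._subscribers.items():
            if topic_matches(topic_filter, topic):
                for handler in handlers:
                    handler(topic, event)


# Usage: a root-cause-analysis component listens to all sensor alarms.
bus = InMemoryEventBus()
received: List[Tuple[str, dict]] = []
bus.subscribe("sensors/+/alarm", lambda topic, event: received.append((topic, event)))
bus.publish("sensors/hids/alarm", {"node": "V11", "suspicious_activity": True})
bus.publish("rollout/status", {"step": "S1"})  # not delivered to the subscriber
```

Because components only share topic names, a sensor or analysis service can be added or replaced without changing the others, which is the decoupling the slide attributes to the event-driven design.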

SLIDE 17

CONCLUSION AND OUTLOOK

  • The Smart Grid contains large amounts of software that is used to support critical control applications
  • Software in the Smart Grid will need to be updated
      • (Security) patches, adaptation to grid behaviour, new services, …
  • Failures in the software rollout process can result in power system consequences
  • For large-scale software rollouts, it is desirable to automate the process and adapt the behaviour of the process based on the cause of failures
  • Proposed an approach to analysing the root cause of deployment failures based on events generated by distributed sensors
  • Future work will involve evaluating the approach in a lab-based environment and large-scale simulations

SLIDE 18

THANK YOU!

Paul Smith {firstname.lastname}@ait.ac.at