WECC Human Performance Work Group Event Analysis



slide-1
SLIDE 1

WECC Human Performance Work Group Event Analysis

Norm Szczepanski, SMUD Shawn Halverson, BPA

Western Electricity Coordinating Council

slide-2
SLIDE 2

Agenda

  • WECC HPWG
  • Event
  • Perspective from operations
  • Perspective from the field
  • Corrective actions
  • Lessons learned


slide-3
SLIDE 3

WECC Human Performance Work Group Event Analysis

Purpose: The Human Performance Work Group (HPWG) provides common vocabularies, tools, techniques, and training materials to assist Bulk Power System (BPS) Operations and Field personnel in order to promote the sustainability of Human Performance Improvement practices.


slide-4
SLIDE 4

WECC Human Performance Work Group Event Analysis

Process: The Human Performance Work Group attends monthly review sessions to analyze operating events from a Human Performance perspective and share any Human Performance lessons learned.


slide-5
SLIDE 5

WECC Human Performance Work Group Event Analysis

Goal: By sharing Human Performance Lessons Learned from these operating events, information is passed along that will help others avoid the same or similar situations.


slide-6
SLIDE 6

How much of the time do we err?

To Err Is 90 Percent Human

Why We Make Mistakes Joseph T. Hallinan


slide-7
SLIDE 7

How much of the time do we err?


slide-8
SLIDE 8

WECC Human Performance Work Group Event Analysis Lesson Learned


Loss of communication to multiple SCADA RTUs at a Switching Center

slide-9
SLIDE 9

WECC Human Performance Work Group Event Analysis

Why is this event a good Human Performance Lesson Learned? This example shows how human error created a system condition that lay undetected until specific circumstances arose. The event that transpired had wide-reaching impacts on both Control Center Operations and Field personnel.


slide-10
SLIDE 10

Event Description

Grid Operations lost communications with multiple substation Remote Terminal Units (RTUs) that were routed through a Switching Center Energy Management System (EMS) platform.


slide-11
SLIDE 11

Event Description

A total of 87 RTUs were impacted, including 34 Bulk Electric System (BES) RTUs. The affected substations have operating voltages ranging from 4 kV to 500 kV. This resulted in the loss of Supervisory Control and Data Acquisition (SCADA) functionality. The total event duration was 78 minutes.


slide-12
SLIDE 12

Context behind this Event

  • The Switching Center that contained the EMS platform for the RTU communication had been relocated to a newly constructed control room approximately 2 months prior to this event.
  • The power for this EMS platform was routed through an uninterruptible power supply (UPS) that was a new model, different from those at other Switching Center facilities.


slide-13
SLIDE 13

Context behind this Event

  • Transformer #3 was taken out of service. Its tertiary winding supplied the primary station service power source.
  • This created a “UPS General Alarm” that was acknowledged by both the Switching Center System Operator and the Transmission Dispatcher.


slide-14
SLIDE 14

Context behind this Event

  • The Switching Center System Operator determined that a Substation Operator needed to be called out to investigate the alarm.
  • However, due to other switching taking place, the Dispatch request was never issued and the cause of the UPS General Alarm was not investigated.


slide-15
SLIDE 15

Context behind this Event

  • About 5 ½ hours later, communications with the 87 RTUs were lost: the UPS system had been operating on its backup battery and finally ran out of power to run the communications equipment.


slide-16
SLIDE 16

Context behind this Event

  • A Technician was sent to determine the cause of the power failure and found the main circuit breaker to the UPS tripped.
  • This main circuit breaker was reset and closed by the Technician, restoring power to the RTU communications equipment.


slide-17
SLIDE 17

EMS UPS Cabinet


slide-18
SLIDE 18

EMS UPS Main CB (inside UPS cabinet)


slide-19
SLIDE 19

Upon Further Review

  • An inspection determined that the system Auto/Manual Restart switch was set to the “Manual” position (a factory default setting that the vendor was unaware of and did not correct when placing the new system in service).
  • In this configuration, the UPS system is designed to trip the main CB for a momentary loss of AC power, as experienced during an automatic transfer of station service.


slide-20
SLIDE 20

Upon Further Review

  • The vendor set the switch to the “Auto” position, which ensures that the main circuit breaker remains closed and the UPS transfers to the alternate AC power supply.
  • Subsequent local testing verified that the UPS system automatic transfer switch was functional and set appropriately.
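
The difference between the two switch positions can be summarized in a small model. This is only an illustrative sketch of the behavior described on these slides, not the vendor's logic; the names `RestartMode`, `UpsState`, and `on_momentary_ac_loss` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class RestartMode(Enum):
    MANUAL = "Manual"  # factory default found at this facility
    AUTO = "Auto"      # position set after the event


@dataclass
class UpsState:
    main_breaker_closed: bool
    source: str  # "alternate AC" or "battery" in this simplified model


def on_momentary_ac_loss(mode: RestartMode) -> UpsState:
    """Return the simplified UPS state after a momentary loss of AC power."""
    if mode is RestartMode.MANUAL:
        # The main CB trips, so the load runs on battery until the
        # breaker is reset locally or the battery is depleted.
        return UpsState(main_breaker_closed=False, source="battery")
    # The main CB stays closed and the UPS transfers to the alternate AC supply.
    return UpsState(main_breaker_closed=True, source="alternate AC")


print(on_momentary_ac_loss(RestartMode.MANUAL))
print(on_momentary_ac_loss(RestartMode.AUTO))
```

In the “Manual” case nothing restores AC input automatically, which matches the sequence in this event: the load stayed on battery until the Technician reset the breaker.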


slide-21
SLIDE 21

HP Perspective

“Testing and Energization of new equipment”

  • The period during which new equipment is being installed provides an opportunity for local Technicians and Substation Operators to work with vendors and receive training on the new equipment.


slide-22
SLIDE 22

HP Perspective

“Testing and Energization of new equipment”

  • The “As Left” condition of newly installed equipment needs to be understood, and it must be verified that the equipment will operate properly when called upon to perform its required function.
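
One way to make an “As Left” verification concrete is to compare the recorded settings against an expected list before releasing equipment for service. The sketch below is a hypothetical illustration only; the setting name `auto_manual_restart` and the function `as_left_discrepancies` are assumptions, not part of any utility's or vendor's procedure.

```python
from typing import Dict, Optional, Tuple


def as_left_discrepancies(
    expected: Dict[str, str], as_left: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each mismatched or unrecorded setting to (expected, found)."""
    return {
        name: (want, as_left.get(name))
        for name, want in expected.items()
        if as_left.get(name) != want
    }


# Hypothetical commissioning record for the UPS in this event.
expected = {"auto_manual_restart": "Auto"}
as_left = {"auto_manual_restart": "Manual"}

print(as_left_discrepancies(expected, as_left))
# {'auto_manual_restart': ('Auto', 'Manual')}
```

A setting that was never recorded is reported with `None` as the found value, so an unchecked switch is flagged in the same way as a wrong one.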


slide-23
SLIDE 23

HP Perspective

“Equipment has become more advanced and complicated”

  • Many times, physical control switches have been replaced with logic buried in menus on a display screen. This can result in unwanted factory default settings being overlooked and equipment not operating as expected.


slide-24
SLIDE 24

HP Perspective

“Equipment has become more advanced and complicated”

  • Control / Selector switches may be located in areas that are not inspected on a routine basis. This can lead to a condition that is not readily visible to the Technician or Substation Operator.


slide-25
SLIDE 25

HP Perspective

“Understanding the meaning of Alarms”

  • There are several circumstances where alarms are ganged together to produce one alarm point, requiring in-depth local troubleshooting to determine the actual problem.
  • Alarm nomenclature can be misleading as to the actual problem or severity of the condition. In this case a “UPS General Alarm” came in after the loss of primary AC power to the UPS; however, there was no apparent power system trouble.
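
The effect of ganging can be shown with a short sketch: several distinct conditions collapse into one point, so the operator sees the same alarm whether the cause is minor or the UPS is running on battery. The input names below are hypothetical examples, not the actual points wired to the “UPS General Alarm” at this facility.

```python
from typing import Dict, List

# Hypothetical discrete conditions feeding one ganged alarm point.
GANGED_INPUTS = ("on_battery", "input_breaker_open", "over_temperature", "fan_failure")


def ganged_alarm(conditions: Dict[str, bool]) -> bool:
    """A ganged point is the logical OR of its inputs."""
    return any(conditions.get(name, False) for name in GANGED_INPUTS)


def active_conditions(conditions: Dict[str, bool]) -> List[str]:
    """What local troubleshooting (or discrete alarm points) would reveal."""
    return [name for name in GANGED_INPUTS if conditions.get(name, False)]


minor = {"fan_failure": True}
serious = {"on_battery": True, "input_breaker_open": True}

# Both scenarios look identical at the control center...
print(ganged_alarm(minor), ganged_alarm(serious))
# ...but differ once the individual inputs are examined.
print(active_conditions(minor), active_conditions(serious))
```

Under these assumptions, only the second function distinguishes a nuisance condition from a UPS that is discharging its battery.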


slide-26
SLIDE 26

HP Perspective

“Understanding the meaning of Alarms”

  • It was not clear that the communication equipment was running on backup battery power until the RTU communication failed.


slide-27
SLIDE 27

HP Perspective

“Is the training provided adequate?”

Even though equipment may perform the same function as existing equipment at other locations on the system, new equipment may have features, operating modes, and alarm conditions unlike those of similar equipment. Without becoming familiar with the operation of this new equipment, people can be led to expect it to respond like all the others on the system, especially if they do not know that it is different.


slide-28
SLIDE 28

HP Perspective

“Is the training provided adequate?”

This was the case here: the UPS system was a different make and model from those at other locations on the system and had been operational for only about 2 months.


slide-29
SLIDE 29

Corrective Actions

  • The UPS system Auto/Manual Restart switch at this facility was placed in the Auto position.
  • Future installations of similar UPS systems have been flagged to ensure correct Auto/Manual Restart switch position selection prior to releasing equipment for regular service.
  • Operating personnel have been counselled regarding deficiencies in the response to the UPS General Alarm received on the day of the event.


slide-30
SLIDE 30

Corrective Actions

  • Expected Operator response to alarms of this nature has been reinforced, including communication requirements.
  • A review was conducted with Operations personnel about the importance of the on-site UPS facilities and their impact on area RTUs.
  • The locations of all on-site UPS facilities and their local alarm panels were reviewed.
  • The appropriate contacts for when UPS facility issues are encountered were identified.


slide-31
SLIDE 31

A few “Take-Aways”

What’s-in-it-for-me (WIIFM)

Sometimes alarm conditions have no immediate undesired consequence. This may lead to a false sense of security if the alarm does not adequately describe the equipment or system operating status. As seen in this case, the RTU equipment continued to operate normally until the backup battery ran out of power.


slide-32
SLIDE 32

A few “Take-Aways”

What’s-in-it-for-me (WIIFM)

Is it possible that, if the alarm had actually stated that the UPS was operating on backup battery power, different actions would have been taken? Either way, the UPS system operating condition would have been made clear.


slide-33
SLIDE 33

A few “Take-Aways”

What’s-in-it-for-me (WIIFM)

When an operating event is assessed and corrected, keep looking at the extent of the condition. Are there other pieces of equipment or systems that are new and may have a similar condition / trap in place? Look at your process for integrating new equipment into the system. Is there a common understanding between the Field and Dispatch personnel about the operation and meaning of alarm conditions for this equipment?


slide-34
SLIDE 34

What to do next?

  • Lessons Learned
  • HPI in the Control Room


slide-35
SLIDE 35

Feedback


slide-36
SLIDE 36

Human Performance Lesson Learned…

We learn so little from experience because we often blame the wrong cause.

Why We Make Mistakes Joseph T. Hallinan
