SHOULD A MODEL 'KNOW' ITS OWN ID? - PowerPoint PPT Presentation



SLIDE 1

SHOULD A MODEL ‘KNOW’ ITS OWN ID?

7TH ANNUAL RISK AMERICAS CONFERENCE

MAY 17-18, 2018 NEW YORK CITY

PRESENTED BY JON HILL, PH. D.

FORMER MANAGING DIRECTOR, GLOBAL MODEL RISK GOVERNANCE, CREDIT SUISSE, NEW YORK

JONHILL@OPTONLINE.NET

Some Thoughts About Mitigating Inventory Risk By Accurately Tracking Model Usage

SLIDE 2

All of the ideas, opinions, suggestions, notions or asides offered in this presentation are entirely the opinions of the speaker and should not be construed to represent in any way those of Credit Suisse, Morgan Stanley, Citigroup or any other previous employers. Furthermore, any anecdotes, cautionary tales or war stories that may insinuate themselves into this presentation shall be understood to have occurred at a mythical institution that will only be identified as "Retro Bank", unless otherwise specifically identified by the speaker.

Disclaimer

SLIDE 3

Contents

  • What is inventory risk?
  • What types of questions about model usage and inventory are difficult to answer with today's database inventories?
  • How could these questions be answered accurately?
  • Description of the necessary functionality to support a Model Transponder Function.
  • A two-step phased implementation approach can minimize disruption and production overhead.
  • Summary: Pros and Cons
SLIDE 4

SOME THOUGHTS ON MITIGATING INVENTORY RISK BY IMPROVING MODEL USAGE TRANSPARENCY

My Definition of Inventory Risk (adapted from SR11-7): Inventory risk is the risk resulting from incomplete or inaccurate quantitative model inventories, the use of models that have been retired or remain unvalidated, or the use of models that have never been entered into inventory.

SLIDE 5
  • One of the more daunting challenges facing model risk managers at major financial firms is the task of ascertaining that the model inventory, however it is implemented and maintained, is complete and accurate.1

  • At almost every firm this is accomplished through a manual process called attestation: model managers or functional heads for every asset class and business unit are asked to sign off on the complete set of models that fall within their domain of ownership and responsibility.

  • Such attestations are typically done via email, requiring a model risk manager to send each model supervisor or functional head the list of models that inventory indicates they own and maintain, and to obtain a confirmation by return email. Such a process can be both clumsy and error-prone: some models may simply be overlooked in the process (the technical term is "falling through the cracks"); some may be 'orphans', models mis-assigned due to staff turnover or re-allocation of responsibilities and therefore without owners; some orphans may no longer be in use.

Model Inventory Attestation Is Still Primarily a Manual Process! Why Is That?

Introduction

1 None of the firms I have worked at (Salomon Smith-Barney, Citigroup, Morgan Stanley, Credit Suisse) or with as a consultant have any accurate quantitative way of answering these types of questions other than to query model owners/developers or their downstream users and receive qualitative estimates. It is also an uncomfortable fact that model supervisors/owners/developers do not always know who all of their downstream users are.

SLIDE 6
  • Resolving these discrepancies can require numerous iterations of the attestation process to determine the current correct ownership of orphan models. And of course, it is not an uncommon experience that some models have no owner assigned at all due to staff turnover.

  • Particularly problematic are upstream and downstream dependencies between models. While model owners should be aware of upstream dependencies (these can be traced by following all model inputs back to their source2), very often they will not have complete knowledge of all downstream models, models that receive other models' output as their input. These would be best known to the downstream model users.

Model Inventory Attestation Is Still Primarily a Manual Process! Why Is That?

Introduction

1 Note: the role of model risk manager is relatively new and complements the role of model validator in mitigating model risk.
2 If they cannot, there are larger problems in model development management.

SLIDE 7

In a Bank Exam, Could Your Firm Give Accurate Answers to the Following Seven Basic Questions Involving Model Inventory? 1

1) What is the exact number of different models that have been used over the last year?
2) How often has each model been executed, by day, by month, by year?
3) Where are the firm's models being used? Business unit, legal entity, geographic regions?
4) Are there any models in your inventory that were not executed during the last year?
5) Are there any models that were executed on any of your firm's computers that do not appear in inventory? Please provide a full listing.
6) Are you able to provide a full list of the IDs of models that exhibit significant seasonality? If so, what are the peaks and troughs of seasonal model usage?
7) Were there any instances of a retired model still being executed during the last year?

1 There are likely other types of questions regarding model inventory that are difficult to answer accurately. These seven are the most important questions I can think of. Perhaps you can think of some others.
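Most of these questions reduce to simple aggregate queries once per-execution usage records exist in a central store. As a minimal sketch (the table layout, column names and sample rows below are my own assumptions, not anything prescribed in this deck), questions 1 and 2 might look like this against a SQLite usage log:

```python
import sqlite3

# Hypothetical central usage log; schema and sample rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model_usage (model_id INTEGER, exec_date TEXT)")
conn.executemany(
    "INSERT INTO model_usage VALUES (?, ?)",
    [(1001, "2017-06-01"), (1001, "2017-07-15"), (1002, "2017-06-20")])

# Question 1: exact number of different models executed over the period
distinct_models = conn.execute(
    "SELECT COUNT(DISTINCT model_id) FROM model_usage").fetchone()[0]

# Question 2: execution counts per model per month
per_month = conn.execute(
    "SELECT model_id, substr(exec_date, 1, 7) AS month, COUNT(*) "
    "FROM model_usage GROUP BY model_id, month "
    "ORDER BY model_id, month").fetchall()

print(distinct_models)  # 2
print(per_month)        # [(1001, '2017-06', 1), (1001, '2017-07', 1), (1002, '2017-06', 1)]
```

The remaining questions (dormant models, unregistered models, seasonality, retired-model executions) are similar aggregates or anti-joins against the inventory table.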

SLIDE 8

Inability to answer the previous questions regarding model usage is indicative of a form of model risk that is not often identified or analyzed in its totality, because it belongs to a class of seldom recognized risks that reside outside of and between models.1

Let’s Call it “Inventory Risk”

What are the sorts of liabilities that may arise from model inventory risk? Here are a few ….

  • Regulatory risk arising from incomplete or inaccurate model inventories (i.e. CCAR bank exams)
  • Financial and regulatory risks arising from the use of unvalidated or retired models
  • Difficulty in identifying models still in inventory but no longer in use
  • Inability to enforce model risk management practices uniformly across all models, asset classes, regions and legal entities.

  • Manual inventory attestation processes are error-prone and invariably result in errors of omission
  • Incomplete understanding of upstream and downstream model dependencies
  • Lack of transparency into firmwide model usage, regionality, seasonality, etc.

In an age of automation, machine learning and big data, we really should ask ourselves whether we cannot find better ways to make firmwide model usage more transparent and, in doing so, help to automate the model attestation process.

1 A tip of the hat to Martin Goldberg for his seminal 2017 paper entitled "Much of Model Risk Does Not Come From Any Model", The Journal of Structured Finance, Spring 2017, pp. 32-37. Although not described in this paper, inventory risk is clearly from the class of less well-recognized model risks that are external to models. Martin is currently working at Bloomberg on credit risk models.

SLIDE 9

THE HEART OF THE MATTER

This section will attempt to identify a single underlying reason why few if any firms can answer the seven questions on the previous slide with a high degree of confidence. This is the true source of most model usage opacity and inventory risk.

SLIDE 10

The root cause of model usage opacity may be traced to this single surprising blind spot in most firms’ model risk management framework.1 Let’s try to put this into perspective by comparing to some other familiar technologies:

  • My smart phone 'knows' its unique serial number (it's embedded in the permanent onboard memory that stays with the phone for life).
  • My washing machine knows its own serial number too; so does my automobile. These are embedded in the onboard electronics that control these devices.
  • Even before electronics, serial numbers were stamped on the frames of every automobile that Henry Ford produced, and somewhere on almost all manufactured products of any significance. Today, Tesla can track every car they've ever made: its location, travel speed, level of charge, etc.

Most important financial models are assigned Model IDs as a convenient lookup index into the automated model databases that almost all firms must maintain today. These databases typically house all of the relevant documentation for each model, such as development and validation documents, and in some rare cases even source code. Yet the models themselves do not 'know' their own IDs in the sense that the ID number is embedded in the model's source code. In the next section I will introduce the concept of a Transponder Function which, if added to every model in a firm's inventory, can go a long way towards improving the transparency of model usage and mitigating many of the risks listed in slide #8.

The Heart of the Matter Is This: Models Do Not 'Know' Their Own IDs

1 At first blush this may not seem to be a true root cause. This presentation will endeavor to convince any doubters that this is indeed the case.

SLIDE 11

TWO STEPS TO MITIGATING MODEL INVENTORY RISK

Creating models that 'know' their own IDs by embedding them in their source code is a simple yet necessary first step. But this alone is not sufficient. The real heart of the matter is what we might do with that embedded information.

SLIDE 12

This can be implemented by developers once a new model is assigned an ID, by adding a single trivial assignment statement as the first executable line in the model's main routine:

    Main()
    {
        Global Int Model_ID = 1234567;   /* Embed the assigned Model ID */
        ModelCode();
        Exit;
    }

Embedding Model IDs into Source Code is a Trivially Simple But Necessary First Step

Step 1: Embed Model IDs in the Model Source Code

An Aside: An obvious question is why embedded IDs weren't standard practice from the first models. The answer is probably that model IDs only became standard with the introduction of centralized model databases sometime in the new millennium. Apparently, no one in the industry saw an incentive to retrofit thousands of models with embedded IDs.1 This presentation will attempt to provide that incentive. This first step is very simple, requires almost no effort and will not impact performance. Once done, the ID is hard-coded (like a serial number stamped onto an automobile frame) for the life of the model, so long as model IDs are uniquely assigned and not re-used.

1 I suspect a more accurate answer to this question may well be "pure sloth". Quantitative models for use in finance go back at least 50 years, long before model inventory databases became a regulatory requirement sometime in the new millennium. Only then did it become necessary to assign unique IDs to models as a clean way to index them into the databases.
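A minimal transliteration of Step 1 into Python (the deck's pseudo code is C-like; the ID value and function names below are hypothetical): the ID is assigned once at registration, hard-coded as a constant, and simply travels with the source for the life of the model.

```python
MODEL_ID = 1234567  # hypothetical ID, assigned once when the model enters inventory

def model_code():
    """Placeholder for the model's actual computation."""
    return "model output"

def main():
    # The embedded ID is available to any instrumentation (such as the
    # Transponder Function introduced later) without a database lookup.
    result = model_code()
    return MODEL_ID, result

print(main()[0])  # 1234567
```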

SLIDE 13
  • Assuming models have their IDs embedded in their source code, the next obvious question is what could we do with that information? The answer is: with a little thought, quite a lot.
  • The second step will require the creation of a Universal Model Transponder Function, inspired by the radio transponders that air traffic controllers rely on to track civilian and commercial aircraft.
  • What sorts of data would we want a model transponder to 'broadcast' to a centralized model usage database via the firm's intranet? Here are a few important indicative data fields to start with:

1) Model ID
2) Name of the model (as a text string) – model names may not be unique, so cannot serve as an index
3) Timestamp at execution – date, hour, minute granularity
4) Type of model – pricing, risk, credit, forecasting, finance, HR, etc.
5) Implementation – production code (C++, JAVA, etc.) or EUC model
6) A MAC address1 – uniquely identifies the processor executing the model
7) Vector of upstream model IDs – this information would be invaluable if the model ID is also embedded in any results produced by the model. If deployed comprehensively across the firm, this information could capture all upstream and downstream dependencies.

Embedding Model IDs into Source Code is a Necessary First Step But The Second and Final Step Will Require More Investment

Step 2: Create A Universal Model Transponder Function

1 The Media Access Control, or MAC, address is the hardware equivalent of an IP address. It is a unique identifier embedded in every computer's network interface card and can be used to identify not only the actual computer executing the software but, through a lookup function, its physical location. A computer's unique MAC address can be obtained via a function call to the computer's Operating System.
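The indicative fields above can be sketched as a plain record type. This is an illustrative assumption (the field and function names are mine, not the deck's); it uses Python's `uuid.getnode()` to obtain the executing machine's MAC address as a 48-bit integer (which may fall back to a random value on systems where no hardware address is readable).

```python
import time
import uuid
from dataclasses import dataclass, field

@dataclass
class TransponderRecord:
    """One usage record, mirroring the indicative data fields listed above."""
    model_id: int
    name: str                 # model names may not be unique, so not an index
    timestamp: float          # epoch seconds at execution
    model_type: str           # pricing, risk, credit, forecasting, ...
    implementation: str       # production code (C++, JAVA, ...) or EUC
    mac_address: str          # identifies the processor executing the model
    upstream_ids: list = field(default_factory=list)

def make_record(model_id, name, model_type, implementation, upstream_ids=()):
    mac = format(uuid.getnode(), "012x")  # MAC as 12 hex digits
    return TransponderRecord(model_id, name, time.time(), model_type,
                             implementation, mac, list(upstream_ids))

rec = make_record(1234567, "IR Swap Pricer", "pricing", "C++", [1111, 2222])
print(rec.model_id, rec.upstream_ids)
```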

SLIDE 14

In order to use embedded Model IDs to track model usage globally, a firm’s developers would need to add a basic new functionality to each model in the form of a Transponder Function:

1) The Transponder Function would be called once each time the model code is executed.
2) The Transponder should have the ability to transmit indicative data about the model via the Firm's intranet to a central database. (These data fields are listed in the previous slide.)
3) Transmission permission must be strictly one-way, from model to database, in order to avoid opening a back door into the model.
4) As an option to #3, to avoid the risk of jamming the firm's intranet, Transponder output could be written into local temporary file systems (or databases).
5) Since the usage data is not timely, a sweep of all temp files into a central database could be made on a regular basis during off-peak intranet hours (e.g. weekly at 3 AM on Sundays).
6) At the end of a year's worth of data collection, a treasure trove of information about model usage would be available in the central database.1

Embedding Model IDs into Source Code is a Necessary First Step But The Second and Final Step Will Require More Investment

How Would A Model Transponder Function Operate?

1 The resulting trove would constitute a voluminous audit trail of information about model usage, amenable to analysis using data mining and Machine Learning algorithms to find patterns of model usage not readily detectable by human inspection and analysis of the usage data.
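Points 4 and 5 above (local temp-file output plus an off-peak sweep into the central store) can be sketched in a few lines. The file layout and function names here are my own assumptions, chosen only to illustrate the one-way spool-then-sweep flow:

```python
import json
import os
import tempfile

def transmit(record, spool_dir):
    """One-way 'transmission': append the usage record to a local spool
    file (the temp-file option in point 4) instead of the intranet."""
    with open(os.path.join(spool_dir, "model_usage.jsonl"), "a") as fh:
        fh.write(json.dumps(record) + "\n")

def sweep(spool_dir, central_db):
    """Off-peak sweep (point 5): drain the spool into the central store."""
    path = os.path.join(spool_dir, "model_usage.jsonl")
    if not os.path.exists(path):
        return 0
    with open(path) as fh:
        rows = [json.loads(line) for line in fh]
    central_db.extend(rows)
    os.remove(path)  # spool is emptied once swept
    return len(rows)

spool = tempfile.mkdtemp()
central = []  # stand-in for the centralized database
transmit({"model_id": 1234567, "ts": "2018-05-13T03:00"}, spool)
transmit({"model_id": 7654321, "ts": "2018-05-13T03:01"}, spool)
print(sweep(spool, central))  # 2
```

Because the spool is append-only and the sweep is pull-based, the model process never opens an inbound channel, preserving the strictly one-way property in point 3.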

SLIDE 15

A proof of concept could be demonstrated via a simulation that doesn't require modifying any production models and very little time or IT resources:

1) Create a set of hundreds of 'dummy' skeleton models that contain only an embedded test ID and a prototype Model Transponder Function.1
2) Develop a script that will call all of the dummy models with randomly assigned frequencies – some very frequent, some infrequent, and one or two not at all.
3) Use the script to simulate seasonality and regionality for a subset of the models.
4) Simulate a full year's worth of model usage.
5) Mine the resulting database information to create various types of analyses (frequency histograms, seasonality charts, distribution by regions, usage spikes, dead periods, etc.) and to identify patterns of usage.
6) Use the simulation to identify flaws in the Transponder Function, communication pipelines and the centralized database. This can help to identify problems and refine the method before production.
7) Present results to management to make the case for authorizing formal production.

A practical way to establish the value added by embedding IDs and installing a Transponder Function using 'dummy' models

As a First Step, Create a Simple Proof of Concept Simulation

1 Note that the source code for the transponder does not have to be included in the model's source code; in fact it probably should not be. Rather, the Transponder Function code should be maintained separately from any model and compiled into a Dynamically Linked Library (DLL) that can be joined with the compiled model code during the build process. This will allow the Transponder Function to be modified without modifying the model codes.
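Steps 1, 2, 4 and 5 of the proof of concept can be mocked in a few lines. The rates, seed and model count below are arbitrary assumptions, chosen so that exactly one dummy model is deliberately never executed and therefore surfaces as "in inventory but unused" when the log is mined:

```python
import random
from collections import Counter

def simulate_usage(model_ids, days=365, seed=7):
    """Each dummy model fires with a randomly assigned daily probability;
    the last model's rate is forced to zero so it shows up later as
    'in inventory but unused'."""
    rng = random.Random(seed)
    rates = {mid: rng.uniform(0.05, 0.9) for mid in model_ids}
    rates[model_ids[-1]] = 0.0  # the deliberately dormant model
    return [(mid, day) for day in range(days)
            for mid, rate in rates.items() if rng.random() < rate]

ids = list(range(1001, 1011))               # ten dummy model IDs
log = simulate_usage(ids)                   # a year of simulated usage
histogram = Counter(mid for mid, _ in log)  # frequency histogram (step 5)
dormant = [mid for mid in ids if histogram[mid] == 0]
print(dormant)  # the deliberately dormant ID
```

Scaling the same script to hundreds of dummy models and adding seasonal rate profiles covers step 3 without touching any production code.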

SLIDE 16

Conceptually, It’s Really Rather Simple

[Diagram: a dummy model with an embedded ID calls its Transponder Function, which sends model usage indicative data via the intranet or a temp file to a centralized database.] But the devil may be hiding in the details …

Note: It may not be necessary for the Transponder to send data to a centralized database via the Firm's intranet – this is really a placeholder for any type of communication pipe that a Firm's IT staff choose. For the purposes of this presentation it is not particularly important to specify how the communication is to be implemented, only that the final destination is a central database with a log of the model usage statistics from slide #12, indexed by model ID and collected over a significant length of time, e.g. at least one year.

SLIDE 17

This is what a dummy model might look like in pseudo code:

    Main()
    {
        Global Int Model_ID = 1234567;            /* Embed the Model ID */
        Int Time = SystemClock();                 /* Get the current date and time */
        Char *Name = GetModelName();              /* Get the model's name as a text string */
        Char *Type = "Pricing";                   /* Indicative type of model */
        Char *Implementation = "EUC";             /* Production code or EUC */
        Int Mac_Address = GetMac(this_computer);  /* Get the MAC address from the operating system */
        Int *Upstream_Array = Get_Upstream_IDs(); /* Get an array of one or more upstream model IDs */
        Char *DB_Name = GetDbName();              /* Get the name of the centralized destination database */
        Int ModelReturnCode = ModelCode();        /* Execute the dummy model */
        Int TransponderReturnCode = Transponder(Model_ID, Time, Name, Type, Implementation,
                                                Mac_Address, Upstream_Array, DB_Name);
                                                  /* Call the transponder and pass the indicative data to it */
        Exit();
    }

A practical way to establish the value added by embedding IDs and installing a Transponder Function using 'dummy' models

Pseudo Dummy Model Code for a Proof of Concept

SLIDE 18

Pros:

  • A major advantage of the proposed innovation is that it can be implemented incrementally over time, beginning with limited sets of models such as those used for CCAR/DFAST stress testing or the set of pricing models in the high-risk tier. Changes could be included in the regular release cycles.
  • The Model Transponder approach places the usage-tracking software inside each model rather than relying on an external execution platform to track and store usage statistics.1
  • Because it is platform independent, it is a global solution that will operate on any Firm computer that has access to the firm's intranet (or that can write results to a temporary file).

  • Offers a direct means for comprehensively identifying upstream and downstream dependencies.

Cons:

  • Requires some minor modifications to the source code of each model.
  • High bandwidth from heavily used models could bottleneck the Firm's intranet.
  • Vendor models present a special challenge – it is doubtful vendors would agree to install Transponders in their models. But there may be workarounds through the in-house execution scripts or host programs that Firms use to interface between the vendor code and the Firm's computers.
  • Spreadsheet models could present challenges as well, but not insurmountable ones.

Pros and Cons

Summary

This presentation has described an innovative method for improving the transparency of model usage across an entire Firm, but not without cost. Here are the pros and cons of this approach:

1 Most production models at banks are managed by host execution platforms, although most EUC models are not. It is possible for execution platforms to be designed or modified to track usage statistics, but large firms may have hundreds of different platforms and each would have to be customized to provide similar data. Any changes would have to be made to all such platforms.

SLIDE 19

Have you ever wondered why, while browsing, say, the NYTimes online, ads pop up for items similar to those you have recently purchased online (shower curtains or bedsheets, for example) from different websites? That is because when you made the transaction, an embedded 'transponder' sent indicative data about your activities and interests to a centralized database maintained by the vendor. This information is useful to vendors to target their online ads to potential customers, to monitor consumer interests and to build profiles of each of millions or hundreds of millions of clients. This is why we see those popup ads mysteriously tailored to our individual purchasing patterns. One-way transponder functions have been used for years to track external clients' behavior patterns.

Final Thoughts: Many Tech Industries Have Had this Functionality for Years!

There is nothing new about the concept of embedded one-way 'transponder' functions. Google, Amazon, eBay and Tesla have had their equivalents in place for a decade or more. Now consider Tesla. Somewhere at Tesla Central Command is a large screen that can display the location of every Tesla vehicle ever sold, along with its current speed, direction, time since last charge, driving patterns of the owner, and a host of tracking data that help Tesla to understand usage, geographic concentrations, charging stations used, etc., so they can improve and expand their services and market share optimally. They can do this because every Tesla vehicle has the equivalent of a Transponder Function embedded in its onboard computers. So why is it that financial firms are so far behind and cannot manage to collect similar patterns of behavior for their internal clients (i.e. their quantitative models)? Are we not as good as the Techs?

SLIDE 20

And Finally ….

Can anyone hearing this presentation think of any reason why financial models should not have their IDs embedded in their source code? I certainly cannot! (Hint: ‘no’ is the right answer) So that’s my story and I’m sticking to it!

Note: nothing in this presentation addresses the problems associated with ‘near models’ or ‘calculator tools’ whose owners refuse to acknowledge whether they function as models that must be assigned IDs, entered into inventory and submitted for validation. The distinction between model and tool or near-model is a governance issue. Treatment of these grey area quasi-models should fall under the Firm’s model governance policies and procedures.

SLIDE 21

END OF PRESENTATION

“Should a Model ‘Know’ Its Own ID?”