Measuring the economic impact of Covid-19 in the UK with business website data
ESCoE COVID-19 Economic Measurement Webinars, starting at 11.30AM
nesta.org.uk @nesta_uk
Measuring the Economic Impact of Covid-19 with business website data
Alex Bishop [@AlexJBishop] Juan Mateos-Garcia [@JMateosGarcia]
ESCoE Covid-19 Economic Measurement Seminar 23 July 2020
Introduction Data and methods Findings Conclusion
Summary
➢ We create a novel data pipeline to analyse sectoral and geographical exposure to Covid-19 in the UK. Our analysis shows that:
○ Product search data tracks changes in consumer interest in products and services (and, indirectly, demand for industries) linked to Covid-19.
○ Local economies with a larger proportion of the workforce in sectors negatively exposed to Covid-19 tend to have higher claimant count rates and faster growth in claimant count rates compared to pre-Covid-19 months.
■ This link is stronger for locations with high shares of employment in exposed sectors that have few options to diversify into less exposed sectors.
○ Our semantic analysis of Covid-19 notices in business websites shows that they track the evolution of the pandemic and reveals heterogeneity in company responses to it.
➢ Our results suggest that novel data sources can help improve the evidence base about the economic impacts of Covid-19.
Realising their value will require integration with other data sources (official and surveys) and innovation in how this information is disseminated.
Click icons for interactive versions of the figures
Introduction
Measuring the economic impact of Covid-19
[Diagram labels: Covid-19; Supply; Demand; Adaptation; Impact; Lockdown; Social distancing; Uncertainty; Scale; Scope; Change in process; Innovation]
What firms, sectors and places are most exposed to this shock? (Inform policies to mitigate)
How are businesses reacting to this shock? (Inform policies to adapt)
We need relevant, inclusive, timely and trusted indicators about these processes
State of play
Data source / Topic (see examples in the annex) / Strengths and weaknesses

Labour market data
Topic:
- Measure sectoral exposure to Covid-19 via scope to WFH in different occupations
Strengths and weaknesses:
+ Representativity
+ Comparability
- Lags
- Explainability
- Assumption of homogeneity

Business panels
Topic:
- Measure business adaptation and impacts via changes adopted / expected and survival prospects
Strengths and weaknesses:
+ Relevance
+ Comparability
+ Timeliness
- Response biases
- Small sample frames
- Assumption of homogeneity

“Big” data
Topic:
- Measure exposure to Covid-19 through mentions in earnings calls and their link with market value
- Measure changes in demand with transaction data
- Measure sectoral exposure through changes in labour demand proxied via job ads
- Measure sectoral and geographical exposure through analysis of Covid-19 notices in websites
Strengths and weaknesses:
+ Timeliness
+ Granularity
? Representativity
? Comparability
? Reproducibility
Data and methods
Our setup
Research questions: What firms, sectors and places are most exposed to the Covid-19 shock? What are their diversification options and what are they doing about them?
We combine web sources with open and official sources to generate indicators of exposure to Covid-19 that are granular (at the firm, sector and geographical level), timely, comprehensive and comparable.
Business websites (Glass)
➢ 1.8 million websites in the UK collected in June 2020 (90% coverage of UK business websites according to Glass)
➢ Obtained via web domain registries and enriched using machine learning
➢ Contains business descriptions, predicted sectors, postcodes and text of Covid-19 notices
➢ Lacks direct information about exposure to Covid-19 and employment
➢ Potential biases in coverage and noise in some information (eg registered addresses != trading addresses)
Our coverage analysis (annex) shows strong correlation (r > 0.9) with sectoral and geographical distributions in Companies House (CH) and the Inter-Departmental Business Register (IDBR).
Data pipeline
Work in progress! We look forward to your feedback.
We enrich Glass with other open and web sources to increase its relevance for our research questions.
Data sources: Business websites (Glass); Companies House; Google Search Trends; Nomis
Steps: Generate terms to query...; Enrich with SIC & location data; Normalise and scale; Analyse sector mix in businesses; Analyse Covid-19 notices
Outputs: Measure sectoral and local exposure to Covid-19; Measure sectoral diversification options; Measure adaptation to Covid-19
Annotations in the diagram:
- Granular descriptions of the many things that businesses do
- Timely measure of exposure to Covid-19 in terms of consumer interest
- Gives us labels to integrate web data with official taxonomies
- Generate local estimates and validate with measures of impact
Google Search Trends
Blackfriars Scenery has unique experience in the area of live events including awards ceremonies, theatre and performance staging. We are able to meet a wide range of needs by supplying sets from our extensive stock of staging and flattage or by producing settings tailored to your requirements.
artistic_director arts_centre arts_council theatre orchestra opera performing_arts comedy dancers dance concerts arts jazz festival artists
Established body of literature using Google Search data to proxy consumer / user interest in various subjects (not always successfully cf. Google Flu Trends) (see annex)
Our pipeline:
1. We aggregate company descriptions over sectors and extract salient terms (high frequency in a division compared to the corpus), focusing on the 73 divisions with more activity.
2. We stem terms to remove duplicates.
3. We query the top 15 terms against Google Search Trends.
4. We remove terms with very low search frequency.
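Step 1 of the pipeline can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes salience is the ratio between a term's relative frequency in a division and its relative frequency in the whole corpus, and the function name, the `min_count` threshold and the toy tokens are all hypothetical.

```python
from collections import Counter


def salient_terms(division_tokens, top_n=15, min_count=2):
    """Rank terms by how over-represented they are in each division.

    division_tokens maps a division label to the list of tokens found in the
    descriptions of its companies. Salience (an assumption of this sketch) is
    the term's relative frequency in the division divided by its relative
    frequency in the whole corpus.
    """
    corpus_counts = Counter()
    for tokens in division_tokens.values():
        corpus_counts.update(tokens)
    corpus_total = sum(corpus_counts.values())

    ranked = {}
    for division, tokens in division_tokens.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores = {
            term: (n / total) / (corpus_counts[term] / corpus_total)
            for term, n in counts.items()
            if n >= min_count  # drop very rare terms (hypothetical threshold)
        }
        # Highest salience first; ties broken alphabetically for determinism.
        ranked[division] = sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
    return ranked


# Toy data: "events" appears in both divisions, so it is less salient.
toy = {
    "90": ["theatre", "theatre", "dance", "dance", "events", "events"],
    "62": ["software", "software", "cloud", "cloud", "events", "events"],
}
top_terms = salient_terms(toy, top_n=2)
```

Terms shared across divisions score close to 1 and drop out of the top of the ranking, which is the behaviour the "high frequency in division compared to corpus" criterion is after.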
Observations
- Results will be dominated by the most popular sub-sectors in Glass
- Noise in SIC codes is a limiting factor (“Other Manufacturing Activities”)
- “I don't think that word means what you think it means” - see “dance” in search chart.
Google Search Trends samples
01: Crop and animal production, hunting and related service activities
breeding crops cattle seed growers farm acres horses farmers plants animals agricultural dogs estate land nursery growing garden produce grown
11: Manufacture of beverages
distillery brewery brewing cider ales gin beer drinks produce first great using local range new best quality one time years
25: Manufacture of fabricated metal products, except machinery and equipment
metal_fabrication sheet_metal mild_steel steel_fabrication metalwork precision_engineering powder_coating plating laser_cutting steelwork fabrication stainless_steel welding cnc aluminium metal steel wire tooling lead_times
62: Computer programming, consultancy and related activities
sql_server sharepoint custom_software magento microsoft_dynamics web_applications software_development open_source microsoft erp software_applications web_mobile mobile_apps salesforce ibm web_development cloud_computing linux implementations oracle
86: Human health activities
chiropractic_clinic dental_care orthodontics chiropractic cosmetic_dentistry dentures dental_treatment chiropractors dental_practice dental_implants osteopaths oral_health sports_injuries physiotherapy dentistry osteopathy physio physiotherapists treatment_plan treatment_options
96: Other personal service activities
independent_funeral hair_salon barbers funeral_directors hairdressing beauty_treatments salon funeral beauty_salon waxing hair_extensions dry_cleaning hair_beauty stylists hair nails tanning tattoo grooming laundry
Findings
Search trends (products) We measure changes in average consumer interest in 609 terms (~products and services) between February and April () and February and June (Δ).
Social consumption activities: theatre, travel, gym...
Home consumption and production activities: gardening, baking, DIY
Golf clubs before and after the lockdown
Search trends (sectors) When we aggregate category search trends over SIC divisions we also find intuitive results.
Decline in Real Estate, Accommodation and Food services and Transportation (but note eg spikes in postal services and couriers), with recovery after the end of lockdown.
Persistent decline in Arts, Entertainment and Recreation (excluding sports, which rebound) and Travelling.
Sector changes We normalise search interest for categories in divisions in April / June by interest pre-Covid-19. The sector ranking suggests that:
Industries involved in home production experience more search interest
Industries involved in social consumption / construction experience less search interest
Transport, accommodation, construction, personal health, creative arts...
Gardening and landscaping services, manufacture of food, manufacture of wood, publishing, textiles, beverages. Retail of automotives
Activities of membership (eg religious) organisations
Sector exposure We will take the ratio of consumer interest between February and a post-pandemic month as a proxy for sector exposure to Covid. We rank sectors based on their position (in quintiles) in the distribution.
Eg Sector in position 0 in April = lowest quintile of search interest change between February and April: strong negative exposure.
We will focus most of the analysis on this indicator but we can reproduce the analysis for April, June or July.
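The exposure ranking described above can be sketched as below. This is an illustrative sketch only: the function name, the tie-breaking rule and the toy interest values are assumptions, and the equal-count binning may differ from how the quintiles were actually computed.

```python
def exposure_quintiles(feb_interest, post_interest):
    """Return {sector: quintile}, where 0 is the lowest post / February ratio.

    Both arguments map a sector code to its average search interest.
    Sectors are ranked by the ratio and split into five equal-count bins.
    """
    ratios = {s: post_interest[s] / feb_interest[s] for s in feb_interest}
    # Sort by ratio; ties are broken by sector code (an arbitrary choice).
    ordered = sorted(ratios, key=lambda s: (ratios[s], s))
    n = len(ordered)
    return {s: min(4, rank * 5 // n) for rank, s in enumerate(ordered)}


# Hypothetical search interest values for five SIC divisions.
feb = {"55": 100, "90": 80, "01": 50, "62": 60, "81": 40}
april = {"55": 20, "90": 24, "01": 60, "62": 57, "81": 60}
quintiles = exposure_quintiles(feb, april)
```

In this toy example division 55 (accommodation) lands in position 0, the strong negative exposure group, because its April interest is the smallest fraction of its February level.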
Geography of exposure (BRES) We have estimated the share of BRES employment in sectors with different levels of exposure based on our experimental search measures.
We do this at the Travel to Work Area (TTWA) level; Northern Ireland is missing.
Our analysis reveals heterogeneity in local economy exposure to Covid-19, reflecting the broad-based nature of Covid-19 impacts and the heterogeneity of local economies.
This becomes clearer when we consider the situation in different TTWAs
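A minimal sketch of the local exposure measure, assuming employment counts by TTWA and sector are available as nested dictionaries. The function name, the data layout and the toy numbers are hypothetical and only illustrate the share calculation described above.

```python
def exposed_employment_share(employment, sector_quintile, quintile=0):
    """Share of each area's employment in sectors with a given exposure quintile.

    employment: {ttwa: {sector: jobs}}
    sector_quintile: {sector: exposure quintile (0 = most negatively exposed)}
    """
    shares = {}
    for ttwa, jobs in employment.items():
        total = sum(jobs.values())
        exposed = sum(
            n for sector, n in jobs.items()
            if sector_quintile.get(sector) == quintile
        )
        shares[ttwa] = exposed / total if total else 0.0
    return shares


# Hypothetical areas and job counts.
employment = {
    "Area A": {"55": 300, "62": 700},
    "Area B": {"55": 50, "62": 950},
}
sector_quintile = {"55": 0, "62": 3}
shares = exposed_employment_share(employment, sector_quintile)
```

Here Area A has 30% of its jobs in the most negatively exposed quintile and Area B has 5%, so Area A would show as more exposed on the map.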
Geography of high negative exposure
Geography of high positive exposure
Geography of exposure and labour market impacts Are our measures of search interest informative about the effect of Covid-19 in local economies? We use the claimant count rates (CCR) and their evolution as a proxy for the impact of Covid-19 in local labour markets.
Recent analysis suggests that claimant counts overstate the impact of Covid-19 on employment for administrative reasons (Brewer et al, 2020), but they are one of the few indicators available at our level of geographical resolution (TTWAs).
We take them (with caution) as a proxy for local labour market exposure to Covid-19, whether perceived or realised.
Geography of exposure and labour market impacts (2)
We regress claimant count rates / changes on share of employment in high exposure sectors.

                                              CC April / CC Feb   CC May / CC Feb   CC June / CC Feb
Z score (share establishments with lowest
search interest in April)                     0.094               0.106             0.104
                                              0.034               0.049             0.046
                                              0.0054              0.0284            0.0241
Controls                                      Region              Region            Region
R2                                            0.143               0.159             0.142
Number of observations                        218                 218               218

Our results are consistent with the idea that TTWAs with high levels of negative exposure to Covid-19 have experienced a higher claimant count rate and a higher growth in claimant rates with respect to February.
The effects are the opposite when we regress on share of activity in sectors with high levels of positive exposure.
The results are consistent across measures of activity (BRES employment and IDBR establishments).
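The shape of this regression can be sketched with a simplified example. It standardises the exposure share and fits a one-variable least-squares line; the region controls used in the models above are omitted, the use of the population standard deviation is an assumption, and the data are invented for illustration.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardise values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Invented data: share of activity in negatively exposed sectors per TTWA,
# and the ratio of April to February claimant counts (exactly linear here).
shares = [0.10, 0.15, 0.20, 0.25, 0.30]
cc_ratio = [1.5, 1.6, 1.7, 1.8, 1.9]

slope, intercept = ols_slope_intercept(zscore(shares), cc_ratio)
```

The slope is read as the change in the claimant count ratio associated with a one standard deviation increase in the exposure share, which is how the coefficients in the table are scaled.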
Modelled sectors
So far we assumed that every company belongs to a single industry (MECE industrial taxonomy).
But a firm, industry or location’s resilience to Covid-19 and other shocks may hinge on its ability to pivot and diversify away from growth bottlenecks.
Can we label companies with multiple industries to capture these opportunities?
We have treated our Glass-CH matched dataset as a labelled dataset to train a machine learning model that predicts sectors based on business descriptions.
○ The model includes the actual label in the top five predictions 72% of the time, and in the top ten 84% of the time (see annex for extra detail). This isn’t bad given the noise in company descriptions and SIC labels but there is much room for improvement.
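The top-k figures quoted above correspond to a metric like the one sketched here. The function name and the toy predictions are hypothetical; the sketch only shows how such a score is computed from ranked sector predictions.

```python
def top_k_accuracy(true_labels, ranked_predictions, k):
    """Fraction of companies whose true sector is among the k top-ranked predictions.

    ranked_predictions holds, for each company, sector codes ordered from the
    most to the least probable according to the model.
    """
    hits = sum(
        1 for label, ranked in zip(true_labels, ranked_predictions)
        if label in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical true SIC divisions and ranked model predictions.
true_labels = ["62", "90", "47", "55"]
ranked_predictions = [
    ["62", "63", "61"],
    ["59", "90", "93"],
    ["46", "45", "47"],
    ["56", "79", "68"],
]
top1 = top_k_accuracy(true_labels, ranked_predictions, k=1)
top3 = top_k_accuracy(true_labels, ranked_predictions, k=3)
```

The same ranked predictions are what allow a company to be labelled with several sectors rather than a single one.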
We build a ‘sector space’ based on sector co-occurrences in companies. Sectors that tend to co-occur in the same companies are closer in this space, potentially reflecting stronger opportunities for diversification between them.
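One way to turn co-occurrences into distances is sketched below. The slides do not state which proximity measure is used, so the Jaccard distance here is an assumption chosen for illustration, and the company sector sets are invented.

```python
from collections import Counter
from itertools import combinations


def sector_distances(company_sectors):
    """Jaccard distance between sectors based on co-occurrence in companies.

    company_sectors: list of sets of sector codes, one set per company.
    Returns {(sector_a, sector_b): distance} with sector_a < sector_b.
    """
    occurrences = Counter()
    co_occurrences = Counter()
    for sectors in company_sectors:
        occurrences.update(sectors)
        co_occurrences.update(combinations(sorted(sectors), 2))

    distances = {}
    for a, b in combinations(sorted(occurrences), 2):
        both = co_occurrences[(a, b)]
        either = occurrences[a] + occurrences[b] - both
        distances[(a, b)] = 1 - both / either
    return distances


# Invented multi-label sector predictions for five companies.
companies = [{"62", "63"}, {"62", "63"}, {"62", "61"}, {"55"}, {"90", "59"}]
distances = sector_distances(companies)
```

Sectors that never share a company, such as 55 and 62 in this toy data, sit at the maximum distance of 1.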
OurKidBrother is an Entertainment Production Company. We produce live & broadcast events and create strategic, entertainment focused campaigns for brands, agencies, promoters and cultural institutions. Our industry experience spans two decades, delivering festivals, concerts, sporting events, tours and brand activations. We are London based with a Global reach, producing events in the UK, Europe, USA, South America, Russia, Nigeria and Malawi.
The Covid-19 sector space
Computer programming (62) can diversify into telecommunications (61) or information service activities (63)
Retailers (47) can move into wholesale (46)
Arts and creative (90) can move into video (59)
The options for accommodation (55) are more limited
Geography of exposure (considering diversification)
We measure the diversification options for negatively exposed sectors as their mean distance in the sector space to all other non-negatively exposed sectors.
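The measure just described can be sketched as follows, assuming pairwise sector distances are already available. The function name, the `frozenset` keys and the toy distances are hypothetical.

```python
from statistics import mean


def diversification_distance(distance, negatively_exposed, all_sectors):
    """Mean distance from each negatively exposed sector to all other sectors.

    distance: {frozenset({a, b}): distance between sectors a and b}
    A larger value means the sector is further from non-negatively exposed
    sectors, i.e. it has fewer diversification options.
    """
    targets = [s for s in all_sectors if s not in negatively_exposed]
    return {
        s: mean(distance[frozenset((s, t))] for t in targets)
        for s in negatively_exposed
    }


# Invented distances between two exposed sectors (55, 90) and two others.
distance = {
    frozenset(("55", "59")): 0.9,
    frozenset(("55", "62")): 1.0,
    frozenset(("90", "59")): 0.4,
    frozenset(("90", "62")): 0.8,
}
options = diversification_distance(distance, {"55", "90"}, ["55", "90", "59", "62"])
```

In this toy example accommodation (55) ends up further from the non-exposed sectors than creative arts (90), matching the qualitative pattern reported on the sector space slide.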
This shows that sectors like accommodation, transport or libraries and museums have lower diversification opportunities than knowledge intensive sectors.
The geography of exposure to Covid-19 (in shares of BRES employment) changes when we consider diversification options
Geography of exposure (with diversification)
                                              CCR April / CC Feb   CCR May / CC Feb   CCR June / CC Feb
Z-score (share employment in sectors with
low diversification options)                  0.232                0.262              0.243
                                              0.043                0.061              0.06
                                              6.74E-08             1.81E-05           4.82E-05
Controls                                      Region               Region             Region
R2                                            0.38                 0.321              0.298
Number of observations                        218                  218                218

➢ We fit the same models as before but using share of employment in negatively exposed sectors with low diversification options.
Areas with high shares of the population in negatively exposed sectors with low diversification options see a higher jump in claimant rates.
The size of the coefficients & R2 are ~ 2x what we get when we do not consider diversification options.
The results are robust to changes in measure of activity (ie establishments via IDBR).
Covid-19 notices
'Well, we should have hosted Company of Legends this weekend, but Covid-19 had other plans. So here are the lists you guys planned to bring. Loyalists Traitors We will see you in October guys, meanwhile'

'Through closely monitoring the progress of COVID-19 in the UK, we are adapting our own business practices to follow all Government advice and mandatory regulations. To ensure the health and well-being of our team, and thus ensure that we can continue to offer HR support at this difficult and frightening time, our team will avoid any non-essential in person contact and be working from home over the coming weeks. We are experienced in agile and remote working and are equipped to do so effectively, therefore this will not impact our service in any way.'

'CORONAVIRUS/COVID 19 Swansea Valley Physiotherapy Practice has had to close temporarily due to the coronavirus crisis. As always, the health and safety of all our clients is paramount and we will re-open as soon as it is safe to do so. Once we re-open we wish to reassure our clients that the appropriate levels of personal protective equipment (PPE) will be used by all staff.'
We have ~640k Covid-19 notices for companies in our dataset for May, June and July. What do they tell us about sectoral and geographical exposure to Covid-19, and firm responses? This analysis is very much in progress!
Covid-19 notices content
Our exploration of the sectoral distribution of notices suggests that the sectors we identified as negatively exposed are overrepresented in terms of notices. But not all notices are the same!
We trained a CorEx topic model on the Covid-19 notice text, exploring the temporal trends and the sectoral distribution across these topics.
Each topic is a collection of words that tend to co-occur in documents, and each document has weights for different topics.
Covid-19 notices seem to track the evolution of the pandemic response:
Topic 6: lockdown
Topic 2: socially distanced reopening
Sector and firm heterogeneity We also find heterogeneity in the semantic content of Covid-19 notices inside sectors.
☐ We calculate this by estimating the Shannon entropy of the topic distribution in each Glass sector. ☐ More knowledge intensive companies tend to show more entropy (diversity) in their topic mixes.
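The entropy calculation can be sketched as below. This is an illustration under stated assumptions: topic weights are summed over all notices in a sector before normalising, entropy is measured in bits (base 2), and the function names and toy weights are hypothetical.

```python
from math import log2


def shannon_entropy(weights):
    """Shannon entropy (in bits) of non-negative weights, normalised to sum to 1."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * log2(p) for p in probs)


def sector_topic_entropy(notice_topics):
    """Entropy of the aggregate topic distribution in each sector.

    notice_topics: {sector: [topic-weight vector for each notice]}
    """
    result = {}
    for sector, vectors in notice_topics.items():
        summed = [sum(column) for column in zip(*vectors)]
        result[sector] = shannon_entropy(summed)
    return result


# Invented topic weights over four topics.
notice_topics = {
    "62": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    "55": [[1, 0, 0, 0], [1, 0, 0, 0]],
}
entropy = sector_topic_entropy(notice_topics)
```

A sector whose notices spread evenly over four topics scores 2 bits, while a sector whose notices all load on one topic scores 0, so higher values indicate a more diverse mix of responses.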
This suggests that some sectors have more options to respond to Covid-19, and that there is variation in firm responses that could have implications for firm, sector and local outcomes. We will continue analysing these questions in follow-up work.
Conclusion
Discussion
➢ Our analysis highlights the potential of novel data sources to monitor the economic impact of Covid-19 with more geographical and sectoral granularity and timeliness.
We note that novel data is a complement to (rather than a substitute for) official data at all stages of our analysis. Thank you ONS!
➢ Our findings are consistent with the idea of a recovery in consumer interest in (most) products and services.
➢ They suggest that the fortune of local economies will hinge on their ability to diversify away from Covid-19 impacted sectors.
More complex and knowledge intensive economies may be able to diversify faster. Could this widen regional inequalities in the UK?
➢ High resolution, timely maps of activity, exposure and capabilities will be required to inform policies to support this economic ‘building back’ process.
Next steps
➢ Tuning our data pipeline and model specification
We need to tune our models at all stages of the pipeline: match with CH, extracting salient words, querying Google Search Trends, building sectoral trends, predicting sector labels, topic modelling and sentiment analysis.
We need to model other relevant features of a local economy and consider spatial effects.
How are our variables linked to measures of Covid-19 incidence?
➢ Going beyond SIC codes
Coarse and ambiguous SIC taxonomies introduce noise into our analysis. Could we replace/complement them with bottom up industrial taxonomies? And how do we maintain comparability if we do this?
➢ Expanding our definition of exposure
Search seems to capture demand shocks better. Can we build on other studies to generate sector measures of supply shock exposure?
➢ Measuring outcomes
We are currently relying on imperfect indicators of Covid-19 outcomes. We will complement them with other local economic indicators about productive and labour market performance.
We will also explore how to measure firm-level outcomes via Companies House and ONS micro-data. Could our firm-level dataset be matched with business panels to nowcast Covid-19 impacts?
➢ Reproducibility and dissemination
Share as much of our data and code as possible in GitHub.
Explore options to make our analysis explorable through regularly updated data visualisations and dashboards.
juan.mateos-garcia@nesta.org.uk @JMateosGarcia alex.bishop@nesta.org.uk @AlexJBishop
Annex
Covid-19 measurement (examples)
Labour market data
- Brynjolfsson, E., Horton, J. J., Ozimek, A., Rock, D., Sharma, G., & TuYe, H. Y. (2020). Covid-19 and remote work: An early look at US data (No. w27344). National Bureau of Economic Research. Link
- Cortes, G. M. (2020). Heterogeneous Labor Market Impacts During the Early Stages of the Covid-19 Pandemic (No. 20-13). Link
- del Rio-Chanona, R. M., Mealy, P., Pichler, A., Lafond, F., & Farmer, D. (2020). Supply and demand shocks in the COVID-19 pandemic: An industry and occupation perspective. arXiv preprint arXiv:2004.06759. Link
Business panels and surveys
- Bank of England (2020). Latest results from the Decision Maker Panel survey - 2020 Q1. Link
- Bartik, A. W., Bertrand, M., Cullen, Z. B., Glaeser, E. L., Luca, M., & Stanton, C. T. (2020). How are small businesses adjusting to covid-19? early evidence from a survey (No. w26989). National Bureau of Economic Research. Link
- Buchheim, L., Dovern, J., Krolage, C., & Link, S. (2020). Firm-level Expectations and Behavior in Response to the COVID-19 Crisis. Link
Novel sources
- Bai, J. J., Jin, W., Steffen, S., & Wan, C. (2020). The Future of Work: Work from Home Preparedness and Firm Resilience During the COVID-19 Pandemic (June 2, 2020). Link
- Carvalho, V. M., Hansen, S., Ortiz, A., Garcia, J. R., Rodrigo, T., Rodriguez Mora, S., & Ruiz de Aguirre, P. (2020). Tracking the Covid-19 crisis with high-resolution transaction data. Link
- Haldane, A. (2020). The Second Quarter. Link
- Hassan, T. A., Hollander, S., van Lent, L., & Tahoun, A. (2020). Firm-level exposure to epidemic diseases: Covid-19, SARS, and H1N1 (No. w26971). National Bureau of Economic Research. Link
- Kinne, J., Krüger, M., Lenz, D., Licht, G., & Winker, P. (2020). Corona pandemic affects companies differently. Link
Google Search trends literature
- McLaren, N., & Shanbhogue, R. (2011). Using internet search data as economic indicators. Bank of England Quarterly Bulletin, 2011 Q2. Link
- Choi, H., & Varian, H. (2012). Predicting the present with Google Trends. Economic Record, 88, 2-9. Link
- Preis, T., Moat, H. S., & Stanley, H. E. (2013). Quantifying trading behavior in financial markets using Google Trends. Scientific Reports, 3, 1684. Link
- Ross, A. (2013). Nowcasting with Google Trends: a keyword selection method. Fraser of Allander Economic Commentary, 37(2), 54-64. Link
- Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). The parable of Google Flu: traps in big data analysis. Science, 343(6176), 1203-1205. Link
- Stephens-Davidowitz, S. (2014). The cost of racial animus on a black candidate: Evidence using Google search data. Journal of Public Economics, 118, 26-40. Link
- Youn, S., & Cho, H. C. (2016). Nowcast of TV market using Google Trend data. Journal of Electrical Engineering & Technology, 11(1), 227-233. Link
Glass validation
We find a strong correlation between the sectoral distribution (SIC4) of firms in Glass and Companies House (r = 0.91). Glass tends to over-represent manufacturing, specialised retail, information, professional services and education, and to under-represent construction, non-specialised retail, real estate, hairdressers and primary sectors.
There is also a strong correlation between the geographical (TTWA) distribution of Glass businesses and Companies House businesses (r = 0.996). Larger cities (especially London) are slightly underrepresented (probably due to under-representation of sectors like non-specialised retail, construction etc).
We also find a strong correlation between the sectoral distribution of Glass and CH data at the TTWA level (average r = 0.85). We identify weaker association between Glass and CH sectoral distributions in rural areas with smaller economies and stronger reliance on the primary sector.
Predictive analysis
- We use grid-search to select the model best able to predict a company's SIC division based on its website description.
○ We train the model on business descriptions with more than 300 words using a One vs Rest framework
○ The best performing model is a Logistic Regression with L1 regularisation and balanced classes.
- The model tends to perform better for services companies.
- The quality of the model degrades for smaller sectors and “Other activities not elsewhere classified” style sectors.