Using Python for Record Linkage: Entrepreneurship, Research and Development, and Lobbying in the Unmanned Aerial Vehicle Industry



SLIDE 1

Using Python for Record Linkage: Entrepreneurship, Research and Development, and Lobbying in the Unmanned Aerial Vehicle Industry

Russell J. Funk funk@umich.edu May 20, 2013

SLIDE 2

Session goals

We’ll learn how to. . .

  1. pull data from disparate online sources
  2. link messy data with Python using fuzzy matching
  3. use Python to build a data set ready for analysis
SLIDE 3
  • Motivation. . .
SLIDE 4
  • Motivation. . .
SLIDE 5

Why study the unmanned aerial vehicle industry?

SLIDE 6

A research question. . .

What is the correlation between lobbying expenditures and the research and development contracts that small businesses receive from the Department of Defense?

Quick background. . .

  • Small Business Innovation Research Program (SBIR): requires that all federal agencies with extramural research budgets in excess of $100 million reserve 2.5% for contracts or grants to small businesses
  • Small Business Technology Transfer Program (STTR): similar to SBIR, but smaller, and emphasizes funding partnerships between small businesses and nonprofit organizations

SLIDE 7

How can we find data?

Check README.md in the /data folder for instructions.

SLIDE 8

The challenge. . .

How can we link data records across sources without common unique identifiers?
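One common answer is fuzzy matching on organization names. The sketch below is a minimal illustration using only the Python standard library; it is not necessarily the approach taken in the session. The suffix list, the 0.85 threshold, the helper names (`normalize`, `similarity`, `best_match`), and the firm names in the usage example are all hypothetical.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal suffixes to ignore when comparing firm names.
SUFFIXES = {"inc", "incorporated", "corp", "corporation",
            "llc", "co", "company", "ltd"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, and drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in SUFFIXES)


def similarity(a, b):
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def best_match(name, candidates, threshold=0.85):
    """Return (candidate, score) for the most similar candidate,
    or None if nothing reaches the (assumed) threshold."""
    if not candidates:
        return None
    score, match = max((similarity(name, c), c) for c in candidates)
    return (match, score) if score >= threshold else None


if __name__ == "__main__":
    # Made-up names standing in for records from two different sources.
    contractors = ["SKYLINE ROBOTICS INCORPORATED", "Harbor Avionics Corp."]
    for lobbying_client in ["Skyline Robotics, Inc.", "Skyline Robotcs LLC",
                            "Unrelated Bakery Co."]:
        print(lobbying_client, "->", best_match(lobbying_client, contractors))
```

Normalizing before comparing matters as much as the similarity measure: most mismatches between sources come from punctuation, capitalization, and legal suffixes rather than from genuinely different names. The threshold trades false matches against missed matches and would need checking against a hand-verified sample.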

SLIDE 9

Overview of the project directory. . .

SLIDE 10

Finding (A). . . but what does it mean?

[Figure: Lobbying and contracts (r = 0.21). Axes: lobbying expenditures (dollars) and DOD contracts.]

SLIDE 11

Finding (B). . . but what does it mean?

[Figure: Lobbying and awards (r = 0.25). Axes: lobbying expenditures (dollars) and DOD awards (dollars).]
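The slides do not say how r was computed. Assuming it is the Pearson correlation coefficient, a minimal sketch of the calculation for two paired columns (for example, lobbying dollars and award dollars per firm) might look like this. The function name `pearson_r` is hypothetical, and the input values in the usage example are made up for illustration, not taken from the session data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Made-up per-firm values, for illustration only.
    lobbying = [0, 50_000, 120_000, 400_000, 1_000_000]
    awards = [750_000, 100_000, 900_000, 300_000, 2_000_000]
    print(round(pearson_r(lobbying, awards), 2))
```

A coefficient near 0.2 indicates a weak positive linear association. It says nothing by itself about direction of causation, which is one reading of the "but what does it mean?" question on the findings slides.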