Breaking your proprietary software habit Best practices for data - - PowerPoint PPT Presentation


slide-1
SLIDE 1

Young-Jin Kim, Eileen McNaughton, Micah Lee

Breaking your proprietary software habit

Best practices for data import into CiviCRM

slide-2
SLIDE 2

7 deadly sins of data migration

  • 1. Wrath - Feeling you'll get if you don't plan!
  • 2. Gluttony - Failure to restrict import scope
  • 3. Greed - Failure to get rid of data
  • 4. Sloth - Failure to iterate quickly, work cleanly
  • 5. Pride - Failure to validate the import
  • 6. Lust - Failure to dedupe
  • 7. Envy - Failure to leave behind old ways
slide-3
SLIDE 3

liberate your data, set it free

slide-4
SLIDE 4

Best practices for data migrations

  • 1. Use a dedicated environment for data imports
  • 2. Automate scripts for the full import early on! Use APIs!
  • 3. Judiciously, with client input, limit data import scope
  • 4. Data import is an iterative process: iterate, iterate!
  • 5. Think about the current workflow and the future workflow as they impact data mapping into CiviCRM
  • 6. If you can, draw up a time horizon that will demarcate stale data from current data, e.g. 3 years in the past
  • 7. Don't reinvent the wheel: make use of free tools, e.g. migrate, civimigrate, ETL tools, Google Refine, APIs
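
A minimal sketch of the time-horizon idea from point 6, assuming each legacy record carries a date of last activity. The field name `last_activity`, the helper `split_by_horizon`, and the sample donors are hypothetical, for illustration only.

```python
from datetime import date


def split_by_horizon(records, today, years=3, date_field="last_activity"):
    """Split records into (current, stale) around a cutoff `years` before `today`."""
    try:
        cutoff = today.replace(year=today.year - years)
    except ValueError:
        # `today` is Feb 29 and the target year is not a leap year.
        cutoff = today.replace(year=today.year - years, day=28)
    current, stale = [], []
    for rec in records:
        # Records active on or after the cutoff are kept in import scope.
        (current if rec[date_field] >= cutoff else stale).append(rec)
    return current, stale


# Hypothetical sample data.
donors = [
    {"name": "Ada", "last_activity": date(2012, 5, 1)},
    {"name": "Bob", "last_activity": date(2006, 1, 15)},
]
current, stale = split_by_horizon(donors, today=date(2013, 1, 1))
print([d["name"] for d in current])
print([d["name"] for d in stale])
```

Keeping the stale list around (rather than discarding it) makes it easy to review the cut with the client before anything is left behind.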

slide-5
SLIDE 5

Two possible migration workflows

[Diagram: two pipelines from the Legacy DB to the CiviCRM DB.
Workflow 1 (Google Refine, Pentaho Kettle): Export, Cleanse, Transform, Import.
Workflow 2 (Export DB, Civimigrate Module): Export, Transform, Import.]

slide-6
SLIDE 6

Google Refine

  • Free open source data cleaning tool written in Java, running on a local web server
  • Uber-spreadsheet "on steroids" with a GUI
  • Reads in many file types and data formats, and also Google Docs spreadsheets
  • Many built-in data transformations for merging, clustering, matching, faceting
  • Ability to extend capabilities by writing custom transforms in GREL, Python (Jython) or Clojure
  • Cleaning procedure can be saved as JSON and replayed easily
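
To give a feel for what clustering does, here is a minimal Python sketch in the spirit of a key-collision "fingerprint" method: values that normalize to the same key are grouped as likely duplicates. This is a simplified illustration, not Refine's own implementation (it skips steps such as accent folding), and the function names and sample names are assumptions.

```python
import string
from collections import defaultdict


def fingerprint(value):
    """Normalize a value so that near-duplicates collapse to the same key."""
    cleaned = value.strip().lower()
    # Drop ASCII punctuation, then sort and de-duplicate the remaining tokens.
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values sharing a fingerprint; return only groups of 2 or more."""
    groups = defaultdict(list)
    for v in values:
        groups[fingerprint(v)].append(v)
    return [g for g in groups.values() if len(g) > 1]


# Hypothetical sample data.
names = [
    "McNaughton, Eileen",
    "Eileen McNaughton",
    "eileen  mcnaughton.",
    "Micah Lee",
]
print(cluster(names))
```

Each cluster is a candidate for merging; a human should still confirm the merge, since two different people can share a fingerprint.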

slide-7
SLIDE 7

Pentaho Data Integration

  • Free open source Extract-Transform-Load (ETL) tool written in Java on the Eclipse framework
  • Visual programming interface (GUI) for pipelining data and inspecting data streams
  • Comes with connectors to many existing data(base) formats for input and output
  • Write custom JavaScript and Java steps
  • A data stream is routed using transformation steps; transformations can be chained in a job
  • Transformations and jobs are stored as XML
  • Replay the XML files from the command line
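
The idea of chaining transformations into a job can be sketched in plain Python. This is only an analogy, not Kettle's API: each generator function stands in for one transformation, `run_job` stands in for a job, and all names and sample rows are hypothetical.

```python
def read_rows(rows):
    """Source step: emit a copy of each legacy row."""
    for row in rows:
        yield dict(row)


def trim_fields(stream):
    """Strip surrounding whitespace from every string field."""
    for row in stream:
        yield {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def drop_missing_email(stream):
    """Filter out rows that have no email address."""
    for row in stream:
        if row.get("email"):
            yield row


def run_job(rows, steps):
    """Chain the steps in order and collect the resulting rows."""
    stream = read_rows(rows)
    for step in steps:
        stream = step(stream)
    return list(stream)


# Hypothetical sample data.
legacy = [
    {"name": " Ada ", "email": "ada@example.org "},
    {"name": "Bob", "email": ""},
]
result = run_job(legacy, [trim_fields, drop_missing_email])
print(result)
```

Because the step list is just data, the same job can be replayed unchanged on every iteration of the import.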
slide-8
SLIDE 8

What is Civimigrate?

It's a band-aid between the Migrate module and the CiviCRM API. More technically, it exposes the API as a Migrate destination.

slide-9
SLIDE 9
What does Migrate do?

  • Maps source data (CSV, Oracle, XML, MySQL, JSON ....) to Migrate destinations
  • Supplies a framework to do trial imports, rollbacks, and updates, from Drush or the GUI
  • Map tables maintain relationships between source data and the resulting CiviCRM entities
  • Allows you to use hooks to manipulate data during the migration (prepareRow + callbacks, e.g. to sanitize data)
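
The map-table and prepareRow ideas on this slide can be illustrated with a small Python sketch. This is not the Migrate module's API (which is PHP); the `Migration` class, the `prepare_row` function, and the sample rows are hypothetical stand-ins showing why a source-to-destination map makes updates and rollbacks possible.

```python
import itertools


def prepare_row(row):
    """Sanitize one source row; return None to skip it."""
    email = row.get("email", "").strip().lower()
    if "@" not in email:
        return None
    return {"name": row["name"].strip(), "email": email}


class Migration:
    def __init__(self):
        self.destination = {}  # dest_id -> contact; stands in for CiviCRM
        self.id_map = {}       # source id -> dest_id; the "map table"
        self._next_id = itertools.count(1)

    def run(self, source_rows):
        for row in source_rows:
            contact = prepare_row(row)
            if contact is None:
                continue
            dest_id = self.id_map.get(row["id"])
            if dest_id is None:
                # First time this source row is seen: create and remember it.
                dest_id = next(self._next_id)
                self.id_map[row["id"]] = dest_id
            # Known source rows overwrite their existing destination record.
            self.destination[dest_id] = contact

    def rollback(self):
        """Remove only the records this migration created."""
        for dest_id in self.id_map.values():
            self.destination.pop(dest_id, None)
        self.id_map.clear()


# Hypothetical sample data.
m = Migration()
m.run([
    {"id": "L-7", "name": " Ada ", "email": "ADA@Example.org"},
    {"id": "L-8", "name": "Bob", "email": "n/a"},
])
m.run([{"id": "L-7", "name": "Ada L.", "email": "ada@example.org"}])
print(m.destination)
```

Re-running the import updates the existing contact instead of creating a duplicate, and a rollback touches nothing that was not created by the migration.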

slide-10
SLIDE 10

You've migrated your data, but what about your donors?

EFF had ~1,000 recurring donors in Convio, bringing in ~$20,000 per month. We spent a long, long time saving them, but in the end succeeded. Probably worth it.

Ways to save your recurring donors:

  • Call them on the phone, ask them to re-donate (recommended)
  • Get credit card numbers, carefully babysit a Selenium script
  • Keep the old payment processor around until all cards expire, write CiviCRM integration code