Automating Drupal Migrations: How to go from an Estimated One Week to Two Minutes Down Time
About Dan Harris
- Founder Webdrips.com
○ Drupal-based web design and development shop
○ Founded in July 2011
○ Nine years Drupal experience
○ 21 years professional experience
- Twitter @webdrips
- Email dan@webdrips.com
Note About the Migration Process
Although this presentation covers a Drupal 6 to 7 migration, most if not all of the ideas presented here should work for any Drupal-to-Drupal migration.
Overview: Initial Plan/Estimates
- Initial estimate: one week of downtime
- SQL queries would be used to export/import when coverage was limited with Drupal Migrate
- Only automation provided by Migrate Modules
- Existing Drupal 7 Architecture
Overview: Updated Plan
- Virtually zero downtime
○ Intermediate goal: ask for one day of downtime or less
- Complete migration in one business day
- Over 99% automated
- D7 site to be built from scratch during migration
About the Drupal 6 Site
- Architecturally, was a mess (Frankensite)
○ Migration provided chance to clean up architecture and code
- Six custom themes (1 custom/5 subthemes)
- 35 custom modules
- 151 contributed modules
About the Drupal 6 Site
- 1000 privileged users
- About 400k non-privileged users
- 25 Content Types, including Webforms
- Over 2,500 pages
About the Drupal 7 Site
- 106 Modules
- Bootstrap Primary Theme
- One Bootstrap subtheme, four sub-subthemes
- Six content types only
- 11 Features provided architecture
Automated Migration Process
Requirements
- Migrate modules: migrate, migrate_extras, migrate_d2d, migrate_webform
- Import modules: menu_import, path_redirect_import
- Four custom modules
- Scripts for migration and deployment
- Fast server with SSD
Migration Script Overview
Requirements:
- Create new Drupal D7 site
- Build out site architecture with features
- Enable Modules
- Migrate D6 to D7
- Import items that couldn’t be migrated
This provided a repeatable, reliable process
Migration Script Highlights (Review)
Build the site:
drush site-install
Enable features and modules:
drush en feature_name -y
Migrate each entity:
drush mi entity
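The three steps above can be chained into one repeatable script. The following is a minimal sketch, not the script used in the project: the function name `build_migration_commands()` and the migration names `Role` and `User` are hypothetical, and only the drush commands shown on the slide (`site-install`, `en`, `mi`) plus drush's global `-y` option are used.

```php
<?php
/**
 * Builds the ordered list of drush commands for one rebuild-and-migrate run.
 *
 * Hypothetical helper for illustration only.
 *
 * @param string[] $features   Feature/module machine names to enable, in order.
 * @param string[] $migrations Migration machine names to import, in order.
 *
 * @return string[] Shell commands in the order they should run.
 */
function build_migration_commands(array $features, array $migrations): array {
  // Step 1: build the empty Drupal 7 site.
  $commands = array('drush site-install -y');

  // Step 2: enable features and modules so fields exist before migrating.
  foreach ($features as $feature) {
    $commands[] = 'drush en ' . escapeshellarg($feature) . ' -y';
  }

  // Step 3: import each migration in the given order.
  foreach ($migrations as $migration) {
    $commands[] = 'drush mi ' . escapeshellarg($migration);
  }

  return $commands;
}

// Example run with placeholder names; print the plan instead of executing it.
$commands = build_migration_commands(array('feature_name'), array('Role', 'User'));
foreach ($commands as $command) {
  echo $command, PHP_EOL;
}
```

Printing the plan first makes it easy to review the order before handing each command to the shell.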
Custom Migration Modules
- 1. Disable “edits” to the D6 site
- a. Basically redirect webform pages, admin pages, and paths like node/add, node/edit, etc.
- 2. Views (implemented with features) only for migration status and post-processing
- 3. Migrate_d2d module
- 4. CSV-based Migration
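Item 1 in the list above comes down to matching the requested path against a list of "edit" patterns and redirecting on a hit. The sketch below shows only the matching part in plain PHP. The function name `is_edit_path()` and the pattern list are assumptions for illustration; the original module's actual patterns (for example, the webform paths) are not shown in the slides.

```php
<?php
/**
 * Returns TRUE when a Drupal path matches one of the wildcard patterns.
 *
 * Hypothetical helper; "*" in a pattern matches any run of characters.
 */
function is_edit_path(string $path, array $patterns): bool {
  foreach ($patterns as $pattern) {
    // Quote the pattern for use in a regex, then turn "\*" back into ".*".
    $regex = '#^' . str_replace('\*', '.*', preg_quote($pattern, '#')) . '$#';
    if (preg_match($regex, $path) === 1) {
      return true;
    }
  }
  return false;
}

// Assumed pattern list: admin pages plus node add and edit forms.
$locked_patterns = array('admin', 'admin/*', 'node/add', 'node/add/*', 'node/*/edit');

foreach (array('node/42/edit', 'node/42', 'admin/content/node') as $path) {
  echo $path, ': ', is_edit_path($path, $locked_patterns) ? 'locked' : 'open', PHP_EOL;
}
```

In a real D6 module the check would run early in the request and redirect locked paths to a notice page; that wiring is left out here.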
Drupal Migrate/D2D/Extras
- Handled most of the heavy lifting
○ Everything except menu links, path redirects, and slide shows
- Extensive drush support
- Plenty of methods available to massage data
- D2D: simplifies migration code
Migrating Users
Challenges
- Nearly 400K unprivileged users
- Needed to assign users to organic groups
○ Based on how webform questions were answered
- Had to fix user passwords
○ Fixed by writing directly to the user table inside the migration
Migrate Users Code
Unprivileged vs. Privileged was a simple query:
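class NvidiaPrivilegedUserMigration extends NvidiaUserMigration {
  protected function query() {
    $query = parent::query();
    // Privileged users are matched with 'LIKE'; the unprivileged
    // migration uses 'NOT LIKE' on the same pattern.
    $query->condition('u.mail', '%nvidia.com', 'LIKE');
    return $query;
  }
}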
Migrate Users Code
Fix the password:
public function complete($account, $row) {
  parent::complete($account, $row);
  // Write the source password value directly to the users table.
  $account->pass = $row->pass;
  db_update('users')
    ->fields(array('pass' => $account->pass))
    ->condition('uid', $account->uid)
    ->execute();
  // Assign the user to organic groups.
  $this->nvidia_memberships($row);
}
Assign Users to Groups (Review)
public function nvidia_memberships($row) {
  $membership_query = Database::getConnection('default', 'd6source')
    ->select('webform_submissions', 'ws');
  $membership_query->join('webform_submitted_data', 'wd', 'wd.sid = ws.sid');
  $membership_query->fields('wd', array('cid'));
  $membership_query->fields('ws', array('nid'));
  $membership_query->addExpression('group_concat(data)', 'data');
  $membership_query->groupBy('ws.sid');
  $membership_query->groupBy('cid');
  $membership_query->condition('ws.uid', $row->uid);
  $membership_query->condition('ws.nid', array(1234567, 2345678, 3456789, 4567890, 5678901), 'IN');
  $membership_id = nvidia_og_membership_associate_user_with_program();
  // ... remainder of the method not shown on the slide
Node Migration Challenges
- Body images & links with absolute paths
- Empty fields sometimes caused display issues
- Had to deal with “interesting” architecture decisions on the D6 site
- Moved larger files to the cloud
- Reduced the number of content types
Node Migration Code
Dealing with textarea images:
- Needed to use Simple HTML DOM Parser
- Code Review
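The code reviewed on this slide is not reproduced in the deck. As a rough illustration of the absolute-path problem, the sketch below rewrites `img` and `a` URLs that point at the old host into root-relative paths. Note the substitution: it uses PHP's built-in `DOMDocument` rather than Simple HTML DOM Parser, and the function name `make_urls_relative()` and the host `old.example.com` are hypothetical.

```php
<?php
/**
 * Rewrites absolute img/src and a/href URLs on $old_host to relative paths.
 *
 * Hypothetical helper; uses DOMDocument instead of Simple HTML DOM Parser.
 */
function make_urls_relative(string $html, string $old_host): string {
  // Body fields are fragments, so wrap them in a <div> and silence
  // libxml warnings about imperfect markup while parsing.
  $previous = libxml_use_internal_errors(true);
  $doc = new DOMDocument();
  $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  $targets = array('img' => 'src', 'a' => 'href');
  foreach ($targets as $tag => $attribute) {
    foreach ($doc->getElementsByTagName($tag) as $element) {
      $parts = parse_url($element->getAttribute($attribute));
      // Only touch URLs whose host is the old site.
      if (!is_array($parts) || !isset($parts['host'])
          || strcasecmp($parts['host'], $old_host) !== 0) {
        continue;
      }
      $relative = isset($parts['path']) ? $parts['path'] : '/';
      if (isset($parts['query'])) {
        $relative .= '?' . $parts['query'];
      }
      if (isset($parts['fragment'])) {
        $relative .= '#' . $parts['fragment'];
      }
      $element->setAttribute($attribute, $relative);
    }
  }

  // Serialize only the children of the wrapper <div>.
  $wrapper = $doc->getElementsByTagName('div')->item(0);
  $output = '';
  foreach ($wrapper->childNodes as $child) {
    $output .= $doc->saveHTML($child);
  }
  return $output;
}

$body = '<p><img src="http://old.example.com/sites/default/files/a.png">'
  . ' <a href="http://other.example.org/x">external</a></p>';
echo make_urls_relative($body, 'old.example.com'), PHP_EOL;
```

In a Migrate class, a helper like this would typically be called on the body value in `prepareRow()` before the row is saved; links to other hosts are left untouched.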
How a Strange Dev. Decision can Affect a Migration
The D6 product page and the DB variables table (review) led to the following code
$variable_name = 'nvidia_product_disable_product_image_' . $row->nid;
// drush_print_r($variable_name);
$query = Database::getConnection('default', 'd6source')
  ->select('variable', 'v')
  ->fields('v', array('name', 'value'))
  ->condition('v.name', $variable_name, '=')
  ->execute()
  ->fetchAll();
// The variable may not exist for every product node.
$product_image_disabled = !empty($query) ? $query[0]->value : NULL;
// Variables are stored serialized; 'i:1;' is the integer 1.
if ($product_image_disabled == 'i:1;') {
  $row->field_inline_image = NULL;
}
Remove Empty Textarea Fields
public function prepare($entity, stdClass $row) {
  foreach ($row as $key => $value) {
    // isset() is FALSE for both missing and NULL values.
    if (!isset($row->$key)) {
      $entity->$key = NULL;
    }
  }
}
“Non-Standard” Entity Migrations (Review)
- D2D handles established Drupal entities well
○ nodes, users, taxonomy, etc.
- But what if you want to migrate block content to an entity?
○ CSV Migration to the rescue
Challenges
- Biggest challenge was reducing the migration time
○ The original estimate just for migrating users was over 40h
○ Eventually that time was reduced to ~3 hours
○ We tweaked my.cnf, php.ini, drush.ini
○ Got a really fast server with Intel Xeon processors, fast RAM, and an SSD
Challenges
- Installation of modules in order
○ circular dependencies
○ features that add fields need to be installed before migration
- Relationships between content
○ Both nodes need to exist before creating a relationship
○ “Parent” content that did not exist in original site
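Both ordering problems above (modules before the migrations that need their fields, referenced content before the content that references it) are dependency-ordering problems. The sketch below is a generic topological sort (Kahn's algorithm) that also reports circular dependencies. It is an illustration only, not code from the project; the function name `order_by_dependencies()` and the item names are hypothetical.

```php
<?php
/**
 * Orders items so that every dependency comes before its dependents.
 *
 * @param array $dependencies Map of item name => list of names it depends on.
 *
 * @return string[] Item names in a valid run order.
 *
 * @throws RuntimeException When the dependencies contain a cycle.
 */
function order_by_dependencies(array $dependencies): array {
  // Register every name, including ones that only appear as a dependency.
  $pending = array();
  foreach ($dependencies as $item => $requires) {
    $pending += array($item => 0);
    foreach ($requires as $required) {
      $pending += array($required => 0);
    }
  }

  // Count unmet dependencies and remember who is waiting on whom.
  $dependents = array();
  foreach ($dependencies as $item => $requires) {
    foreach ($requires as $required) {
      $pending[$item]++;
      $dependents[$required][] = $item;
    }
  }

  // Start with items that depend on nothing.
  $ready = array_keys($pending, 0, true);
  $ordered = array();
  while ($ready) {
    $item = array_shift($ready);
    $ordered[] = $item;
    $waiting = isset($dependents[$item]) ? $dependents[$item] : array();
    foreach ($waiting as $dependent) {
      if (--$pending[$dependent] === 0) {
        $ready[] = $dependent;
      }
    }
  }

  // Anything left unordered is part of a cycle.
  if (count($ordered) !== count($pending)) {
    throw new RuntimeException('Circular dependency detected');
  }
  return $ordered;
}

// Placeholder migration names.
$order = order_by_dependencies(array(
  'Node' => array('User', 'Term'),
  'User' => array('Role'),
  'Role' => array(),
  'Term' => array(),
));
echo implode(', ', $order), PHP_EOL;
```

When a cycle is reported, one common way out is a second pass: create both items first without the reference, then fill in the relationship once both exist.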
Migration timeline
- -7 days to release: Content freeze
- -2 days: Automated rebuild, content migration, and editorial approval
- -8h: Registration lockdown and migration start
- -2h: Batch processing of content by editors and final tests
Accelerating migration
- Use Drush
- Single pass for each item
○ Migration objects are big and slow
○ Don’t load an object from DB twice
- Multithreading
○ https://www.deeson.co.uk/labs/multi-processing-part-2-how-make-migrate-move
Add multithreading to a working migration class
- Not very portable
○ needs a Drush extension
○ needs to run on the ‘fast’ server
- Very effective
Add multithreading to a working migration class
- Sub-class the migration
- Make all the sub-migrations use the same index
- Make the sub-migration work on a small ‘chunk’ of the index
- Break the migration in parts and send chunks of it to multiple threads
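The chunking step above can be sketched in plain PHP: split the source rows into (offset, limit) pairs, then hand the pairs out to a fixed number of workers. This is a simplified illustration, not the Drush extension used in the project; the function names `build_chunks()` and `assign_chunks()` are hypothetical, and the round-robin assignment stands in for whatever scheduling the real extension does.

```php
<?php
/**
 * Splits $total_rows source rows into offset/limit chunks.
 *
 * @return array[] List of array('offset' => int, 'limit' => int).
 */
function build_chunks(int $total_rows, int $chunk_size): array {
  if ($chunk_size < 1) {
    throw new InvalidArgumentException('Chunk size must be at least 1');
  }
  $chunks = array();
  for ($offset = 0; $offset < $total_rows; $offset += $chunk_size) {
    $chunks[] = array(
      'offset' => $offset,
      // The last chunk may be shorter than the rest.
      'limit' => min($chunk_size, $total_rows - $offset),
    );
  }
  return $chunks;
}

/**
 * Distributes chunks over $workers queues in round-robin order.
 *
 * @return array[] One list of chunks per worker.
 */
function assign_chunks(array $chunks, int $workers): array {
  $queues = array_fill(0, $workers, array());
  foreach ($chunks as $index => $chunk) {
    $queues[$index % $workers][] = $chunk;
  }
  return $queues;
}

// 1,000 rows in chunks of 100, spread over 3 workers.
$queues = assign_chunks(build_chunks(1000, 100), 3);
foreach ($queues as $worker => $queue) {
  echo 'worker ', $worker, ': ', count($queue), ' chunks', PHP_EOL;
}
```

Each offset/limit pair would then be passed to one sub-migration as its `offset` and `limit` arguments. Offset-based chunks only stay disjoint if the source query returns rows in a stable order, so an explicit sort on the key column is a sensible assumption to check.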
Add multithreading to a working migration class
<?php
class NVMultiThread extends NvidiaUnprivilegedUserMigration {
  public function __construct($args) {
    $args += array(
      'source_connection' => NVIDIA_MIGRATE_SOURCE_DATABASE,
      'source_version' => 6,
      'format_mappings' => array(
        '1' => 'filtered_html',
        '2' => 'full_html',
        '3' => 'plain_text',
        '4' => 'full_html',
      ),
      'description' => t('Multithreaded Migration of users from Drupal 6'),
      'role_migration' => 'Role',
    );
This is boilerplate needed by D2D
Add multithreading to a working migration class
    parent::__construct($args);
    $this->limit = empty($args['limit']) ? 100 : $args['limit'];
    $this->offset = empty($args['offset']) ? 0 : $args['offset'];
    $this->map = new MigrateSQLMap('nvidiaunprivilegeduser',
      array(
        'uid' => array(
          'type' => 'int',
          'unsigned' => TRUE,
          'not null' => TRUE,
          'description' => 'User migration reference',
        ),
      ),
      MigrateDestinationUser::getKeySchema()
    );
  }
map/index table
index definition
Add multithreading to a working migration class
  protected function query() {
    $query = parent::query();
    $query->range($this->arguments['offset'], $this->arguments['limit']);
    return $query;
  }
}
Modify original query to limit the number of items to work on
Measuring the improvement
- Same server
- Restore destination DB from backup after
each run
- Same source DB
- Both DBs in the same server
- MySQL optimizations for concurrency issues
Measuring the improvement
1000 rows, 100 per thread
Threads  Time (s)  Speed (rows/min)
1        71        845
2        60        1000
3        54        1111
Measuring the improvement
10,000 rows, 1000 per thread
Threads  Time (s)  Speed (rows/min)
1        707       848
2        303       1980
3        300       2000
4        291       2061
5        351       1709
Measuring the improvement
50,000 rows, 5000 per thread
Threads  Time (s)  Speed (rows/min)
3        1990      1507
4        1562      1920
5        1303      2302
6        1637      1832
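The speed column in these tables is rows divided by elapsed time, scaled to one minute and rounded down. The short sketch below recomputes it for the 10,000-row runs and adds the speedup relative to the single-thread run. The function names `rows_per_minute()` and `speedup()` are hypothetical; the timings are the ones from the slides.

```php
<?php
/** Throughput in rows per minute for a run of $rows rows in $seconds. */
function rows_per_minute(int $rows, float $seconds): float {
  return $rows * 60 / $seconds;
}

/** How many times faster a run is than the baseline run. */
function speedup(float $baseline_seconds, float $seconds): float {
  return $baseline_seconds / $seconds;
}

// Timings in seconds for the 10,000-row runs, keyed by thread count.
$runs = array(1 => 707, 2 => 303, 3 => 300, 4 => 291, 5 => 351);

foreach ($runs as $threads => $seconds) {
  printf(
    "%d thread(s): %d rows/min, %.2fx vs. one thread\n",
    $threads,
    floor(rows_per_minute(10000, $seconds)),
    speedup($runs[1], $seconds)
  );
}
```

Read this way, the 10,000-row runs peak at four threads and get slower at five, and the 50,000-row runs peak at five and get slower at six.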
Conclusion
- Drop DNS TTL to 1 minute days before launch
- Repeatability is key
- Migration is very powerful but can be slow
- Automation helps drop downtime close to zero
Conclusion
- Ask for help
- There are many ways to use Migration; if one way is not working, drop it and try a different one
○ CSV vs direct read from DB
- Weird things happen with orphaned fields