Drupal Data Migration

A content migration can become an overwhelming test of patience if it is attempted without a detailed and comprehensive plan. A well-formulated content migration strategy is an absolute necessity to ensure the success of your switch to Drupal, and should be one of the first steps in any major move from a legacy system. But be wary of claims of easy solutions and magic tools! Put your trust in the detailed expert analysis and enterprise-tested processes that the Drupal Connect team can provide.

Your legacy system probably has a few quirks; that’s why you’re building a state-of-the-art Drupal website. You want your legacy content and data to move to the new site, so you can’t ignore that old system yet. But you don’t want those old problems to reappear on your new site. A transitional state is required – one that includes an understanding of how your legacy system works and how your new Drupal site will operate. Conceptualizing your content migration can be confusing, even in the absence of complicating factors.

Trust Drupal Connect’s experts to handle your content migration with confidence, and avoid headaches and sleepless nights!

 

Contact us about Data Migration

Featured Projects

Stanford University

Stanford University’s School of Engineering contacted Drupal Connect to migrate their legacy site into Drupal. Stanford is one of the most well-respected private universities in the country.

Challenges

HTML Migration: Stanford School of Engineering’s legacy site consisted of over 1,300 static HTML pages that had to be converted to Drupal nodes. The markup of these pages was not always consistent, and some pages differed considerably from the norm in look and feel, making automated content parsing more difficult. Migrated pages needed to retain all embedded images and links, as well as relevant copy.

MSSQL Migration: The legacy site also contained data supplied by an MSSQL database, including 1,000+ faculty profiles, 5,000+ publications, and hundreds of labs and organizations, each with many specific data fields. These also had to be migrated to Drupal with no data loss and a clean, reusable implementation.

Performance: The website is highly trafficked and as such, maintaining performance and reliability was a high priority.

Solutions

Automated HTML retrieval, parsing, cleanup, and Drupal node creation: We built a tool that crawled each page of the legacy site, fetching and cleaning/sanitizing relevant content and creating Drupal page nodes. This tool was also responsible for bringing over images as Drupal files, correcting all local links to point to new page locations, and updating the site nav to point to the new Drupal paths. Using this tool, the process was completely portable, and we were able to run it repeatedly as the legacy site’s content was updated while development was in progress.
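The cleanup and link-correction steps described above can be illustrated with a short Python sketch. This is not the tool we built for Stanford; it is a minimal example using only the standard library, and the hostname, the path mapping, and the names `PATH_MAP`, `LinkRewriter`, and `clean_page` are all assumptions made for illustration.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Hypothetical values: a real migration would generate this map from the crawl.
LEGACY_HOST = "legacy.example.edu"
PATH_MAP = {
    "/about/index.html": "/about",
    "/faculty/list.html": "/faculty",
}


class LinkRewriter(HTMLParser):
    """Re-emit HTML, rewriting local links and dropping <script>/<style>."""

    SKIP = {"script", "style"}

    def __init__(self, page_url):
        super().__init__(convert_charrefs=True)
        self.page_url = page_url
        self.out = []
        self.skip_depth = 0

    def _rewrite(self, href):
        # Resolve relative links against the page URL, then look up the new path.
        parts = urlparse(urljoin(self.page_url, href))
        if parts.netloc == LEGACY_HOST and parts.path in PATH_MAP:
            return PATH_MAP[parts.path]
        return href  # external or unmapped links are left untouched

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        rendered = []
        for name, value in attrs:
            if value is None:
                rendered.append(f" {name}")
                continue
            if tag == "a" and name == "href":
                value = self._rewrite(value)
            rendered.append(f' {name}="{escape(value)}"')
        self.out.append(f"<{tag}{''.join(rendered)}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are emitted once, without an end tag.
        if tag not in self.SKIP:
            self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def clean_page(html_text, page_url):
    """Return sanitized HTML with legacy links pointed at their new paths."""
    parser = LinkRewriter(page_url)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Because the function is pure (HTML in, HTML out), it can be re-run against a fresh crawl whenever the legacy content changes, which is the property that made the original process repeatable.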

Automated MSSQL to MySQL to Drupal importing: We built a set of scripts which, given an MSSQL dump of the legacy site’s data, converted it to a MySQL dump, imported it into the Drupal database as custom tables, and mapped the relevant fields to CCK fields in custom content types to create nodes. This was also a repeatable tool, and was used and reused as MSSQL content was added or updated.
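The field-mapping step can be sketched in a few lines of Python. This is an illustration of the idea only, not the scripts used on the project: the column names, the `field_*` names, and the functions `row_to_node` and `import_rows` are hypothetical, and a plain dictionary stands in for the Drupal database.

```python
# Hypothetical mapping from legacy column names to Drupal field names.
FACULTY_FIELD_MAP = {
    "FullName": "title",
    "Dept": "field_department",
    "Email": "field_email",
    "OfficePhone": "field_phone",
}


def row_to_node(row, field_map, node_type):
    """Map one legacy row to a node-like dict.

    Also returns the columns that had no mapping, so that silently
    dropped data can be detected and reviewed.
    """
    node = {"type": node_type}
    unmapped = []
    for column, value in row.items():
        target = field_map.get(column)
        if target is None:
            unmapped.append(column)
            continue
        if isinstance(value, str):
            value = value.strip()
        node[target] = value
    return node, unmapped


def import_rows(rows, field_map, node_type, key_column, store):
    """Insert or update nodes keyed by the legacy ID.

    Keying on the legacy ID means a re-run against a newer dump
    updates existing nodes instead of creating duplicates.
    """
    for row in rows:
        node, _ = row_to_node(row, field_map, node_type)
        store[row[key_column]] = node
    return store
```

Reporting unmapped columns is one simple way to check the "no data loss" requirement, and the keyed upsert is what lets the import be repeated safely.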

Pressflow and Varnish: To solve the performance issues, we built the site on top of Pressflow and put it behind Varnish, a reverse proxy and HTTP accelerator that is known to serve around 3,000 requests per second.

Maintain Stanford best practices: Stanford has a custom Drupal theme as well as a custom Drupal module that integrates with their internal authentication/log-in system. We integrated both into the site so that it maintained a cohesive look and feel with other Stanford sites, and anyone with an internal Stanford log-in (i.e., all students, faculty, and staff) could log in to the Drupal site using that information.

View Website
Stevens Institute of Technology Launches onto Drupal

SIT Launches 8 Bleeding-edge Drupal Sites

This summer, Drupal Connect finalized a nine-month engagement with Stevens Institute of Technology that represents evolutionary work on multiple fronts. The new websites modernize Stevens' web presence, improve mobile and tablet reach, standardize the technology stack, and enrich the user experience. Drupal Connect's efforts also significantly improve the speed and efficiency of web administration, while reducing duplication and costs.

Responsive Design and an Increased Reach

Given Stevens' significant international appeal and the prevalence of mobile phones and tablets as primary browsing devices, Drupal Connect created an HTML5 Responsive Design interface that adapts to wide, standard, narrow, tablet (portrait and landscape), and iPhone/Android (portrait and landscape) screens. For example, the theme adjusts elements in the News Events slider and gracefully resizes videos and typography. The new theme means more alumni, donors, and prospective students can view the sites in meaningful and full-featured ways.

APIs and Content Sharing Improve Browsing Experience and Reduce Costs

Stevens had a large amount of back-end data (courses, staff bios, publications, patents, etc.) that had traditionally been manually entered and maintained on the websites. This led not only to duplication of effort, but also to out-of-date information, minor inaccuracies, and typos. Drupal Connect collaborated with Stevens' in-house IT team to set up secure APIs exposing the back-end data to the Drupal sites.

The central "Hub" site consumes the API data (JSON) and, in turn, shares it with the rest of the sites. In addition to the back-end systems data, Drupal Connect also built in the capability for the Hub to share its own content as Drupal nodes, JSON, or HTML endpoints. The other sites act as clients, flexibly displaying content as appropriate. For example, Stevens needed consistency across the News Events sliders, so these are actually shared as HTML and consumed by a custom Panels plugin. Other data have looser restrictions and are shared as Drupal nodes, enabling client sites to use the nodes in Views, Panels, or Display Suite as needed. Drupal Connect also created a system of roles, permissions, and workflows that lets Stevens manage certain content centrally while still ensuring School and Department autonomy.
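The Hub pattern (consume JSON once, then re-share it in whichever format each client needs) can be sketched as follows. This is a simplified Python illustration rather than the Drupal implementation; the record fields (`title`, `date`, `url`) and the function names are assumptions.

```python
import json
from html import escape


def normalize_event(record):
    """Reduce a back-end record to the fields the client sites display."""
    return {
        "title": record["title"].strip(),
        "date": record["date"],
        "url": record["url"],
    }


def render(items, fmt):
    """Share the same items either as JSON or as pre-rendered HTML.

    Pre-rendered HTML gives every client identical markup, while JSON
    leaves presentation decisions to the client.
    """
    if fmt == "json":
        return json.dumps(items, sort_keys=True)
    if fmt == "html":
        rows = "".join(
            f'<li><a href="{escape(i["url"])}">{escape(i["title"])}</a>'
            f' ({escape(i["date"])})</li>'
            for i in items
        )
        return f"<ul>{rows}</ul>"
    raise ValueError(f"unsupported format: {fmt}")


def hub_share(api_payload, fmt):
    """Parse a JSON payload from the back-end API and re-share it."""
    items = [normalize_event(r) for r in json.loads(api_payload)]
    return render(items, fmt)
```

The trade-off mirrors the one described above: content that must look the same everywhere is shared as HTML, and content with looser requirements is shared as structured data.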

The APIs and Content Sharing removed a significant maintenance burden and opened up new possibilities for sharing news and events across the sites. The benefits are easiest to see in the cross-listed courses and faculty, but are interwoven throughout the sites.

Migrating onto Drupal

Drupal Connect migrated hundreds of legacy pages from multiple technology platforms using our Flatfish library. By using Flatfish, we were able to avoid messy 64-bit Oracle database support issues and start the migration work almost immediately.

By intelligently migrating all of the old content into the new Drupal sites, Stevens was able to review and curate the content over the course of several weeks — something that would have been an impossibility in their old systems.

View Website
Stanford University

This July, the Stanford School of Engineering (SoE) relaunched their website, finalizing a nine-month project with Drupal Connect.

The first Phase of the project (completed in January) focused on migrating and combining content from a legacy HTML site and a custom PHP/MS SQL application. The following link provides more details on the Drupal Migration efforts.

The second Phase of this project:

  • Built the infrastructure for content sharing between the SoE site and the Mechanical Engineering site, as well as future department sites
  • Completely reworked the IA
  • Developed a new Twitter Bootstrap/Stanford Framework-based theme, now available across Stanford
  • Included support for Nginx, Varnish, and Apache Solr
  • Migrated an earlier in-house SoE Drupal site (engineering-info.stanford.edu)
  • Initiated a new rich Content Strategy
  • Saw the release of several Kit-compliant Features on Stanford's new F-server

Content-Sharing Infrastructure

Stanford needed a lightweight way to share faculty bios, news/press releases, and similar content across sites. Drupal Connect created an XML/RSS Views/Feeds solution that provides simple, one-way sharing from SoE. This will allow department sites, like Mechanical Engineering (relaunching August 2012), to enrich their users' visits and stay up-to-date without the extra cost and duplication of manual effort.
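One-way sharing over RSS can be sketched in Python with the standard library. The production solution used Drupal's Views and Feeds modules, so this is only an analogy: `build_feed` plays the role of the publishing site, `read_feed` plays the role of a department site, and both names and the item fields are assumptions.

```python
import xml.etree.ElementTree as ET

ITEM_TAGS = ("title", "link", "guid", "description")


def build_feed(title, link, description, items):
    """Publisher side: serialize items as an RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        for tag in ITEM_TAGS:
            ET.SubElement(node, tag).text = item[tag]
    return ET.tostring(rss, encoding="unicode")


def read_feed(xml_text, seen_guids):
    """Consumer side: return only items not imported before.

    `seen_guids` is a set that the consumer keeps between runs, so
    polling the feed repeatedly never creates duplicate content.
    """
    root = ET.fromstring(xml_text)
    new_items = []
    for node in root.iter("item"):
        guid = node.findtext("guid")
        if guid in seen_guids:
            continue
        seen_guids.add(guid)
        new_items.append({child.tag: child.text for child in node})
    return new_items
```

The consumer only ever reads, which keeps the sharing one-way: the publishing site remains the single place where the content is edited.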

The success of, and need for, content sharing at SoE reconfirmed our decision and inspired us to redouble our efforts to build our next-generation Drupal distribution for universities, Trekk.

A New IA

As part of Stanford's new rich experience strategy, Drupal Connect created an information architecture that makes it easy for students, alumni, and faculty/staff to find the information they are looking for and to discover the amazing work that the university is undertaking every day. The new IA was integral to Stanford's updated Content Strategy and will continue to serve the site as the content and experiences evolve.

SoE Theme

The new theme is image-rich, but the code is lightweight, using a base theme built on top of Twitter's Bootstrap. The new theme also incorporates custom design accents from North Studio that truly make the site pop while maintaining Stanford branding guidelines.

Varnish, Nginx, and Apache Solr

As part of our efforts to improve page delivery speed, reduce costs, and provide scalability, Drupal Connect installed and configured Varnish. Stanford's unique infrastructure also required SSL for all logins and integration with a custom Apache module, WebAuth, which ties into Stanford's Kerberos system. In order to keep the setup simple, Drupal Connect also installed and configured Nginx as a lightweight SSL endpoint that proxies requests over SSL to Apache and caches static assets for improved performance.

In coordination with the new Content Strategy, Drupal Connect installed and configured Apache Solr with several custom search pages. By using Solr, SoE visitors will find more relevant results faster.

The use of these new technologies and communication with Stanford's Web Services have established best practices for Stanford at large.

Conclusion

Altogether, the Phase 2 implementation:

  • Gives SoE visitors a better experience
  • Extends the reach of External Relations, enabling users to discover exciting news
  • Ensures up-to-date content
  • Reduces duplication
  • Saves costs
  • Establishes best practices and reusable components across Stanford's campus

View Website
One Tribe

Project Features:

  • e-Commerce
  • Blog
View Website

Contact Us

Request a quote or get more info

We offer free online quotes. Get yours today! If you’d like more information, please contact us. We’re looking forward to hearing from you.
