Data Warehouse Migration: How to Make This Strategic Move

Ever moved house?


It’s time-consuming, labor-intensive and psychologically stressful.


How about moving an Amazon fulfillment center? One of Amazon’s biggies: the area of 28 football fields, with tens of millions of products in it, and hundreds of robots moving in a stunning synchronized dance with hundreds of human employees to get out tens of thousands of packages a day.


Oh, and while you’re packing all that up and reestablishing it a few states over, you do need to keep fulfilling orders. After all, Amazon customers are counting on you. Significant downtime is just not an option.


Ready to take on the job?


Wait – where are you going?



Migrating a data fulfillment center (i.e. warehouse)

Your data warehouse is not too different from an Amazon fulfillment center. It too holds thousands or millions of assets. It too has a slew of automated, synchronized pipelines that move assets from one place to another, unpacking and repackaging them as needed. It too is called upon to ship hundreds or thousands of responses to queries daily.


No one wants to disrupt this level of complexity in order to recreate it elsewhere. And yet, sometimes you have to. 


Your old data warehouse has become deprecated. Or its performance no longer meets your needs. Or you predict significant cost and efficiency benefits from transferring to a different data warehousing platform. 


It’s time to commence a data warehouse migration project. But where do you start? Where do you end? And how can you get from start to finish smoothly and successfully?


Developing your data warehouse migration strategy

Step 1: the big picture plan

Why are you considering a migration? What is important to you in the new platform? What was missing from your old platform?


For your new platform, is an on-premises or a cloud data warehouse better for your purposes? On-prem data warehouses offer complete control of your tech stack but, as the other side of the coin, demand complete responsibility for your tech stack. Compliance regulations might be easier to meet when you know exactly where your data is and who has access to it, but cloud data warehouse providers often have the resources to invest more heavily in your security than you could on your own.


Time invested in drawing the big picture will pay dividends when it comes to selecting the right data warehouse and running an efficient migration. If you need to get executive buy-in for your data warehouse migration, a big-picture plan and big-picture goals are doubly important. You will need to show tangibly why your old platform is a liability (e.g. security vulnerabilities, downtime, and client dissatisfaction) and how a new platform would pay off (e.g. savings in direct infrastructure costs and in the indirect costs of supporting, repairing, and compensating for an inefficient system).


Step 2: clean up time!

If you’re moving an Amazon fulfillment center and you come across a load of boxes that have been sitting there for two years – hooray! You get to charge the owner of those boxes a year’s worth of long-term storage fees. 



If you’re moving a data warehouse and you come across a load of data assets that have been sitting around for two years with no one using them – hooray! You can dump them and likely save yourself storage and processing fees. Many cloud data warehouse providers charge separately for storage and for compute time. Extraneous data will cost you more on both counts: you’re taking up storage with unhelpful data, and you may end up including it in your queries, lengthening your compute time.
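To see how the two charges add up, here is a back-of-the-envelope sketch in Python. Every number in it is a made-up placeholder, not any provider's actual rate, and the function name `monthly_cost` is invented for this example. Plug in the figures from your own bill.

```python
def monthly_cost(storage_tb, compute_hours, price_per_tb, price_per_hour):
    """Monthly bill when storage and compute are charged separately."""
    return storage_tb * price_per_tb + compute_hours * price_per_hour


# Hypothetical rates, for illustration only (not real provider pricing).
PRICE_PER_TB = 20.0    # currency units per TB stored per month
PRICE_PER_HOUR = 3.0   # currency units per hour of compute

# Before cleanup: 50 TB stored, 400 compute hours a month (assumed figures).
before = monthly_cost(50, 400, PRICE_PER_TB, PRICE_PER_HOUR)

# After dumping 15 TB of unused data, queries scan less, so we assume
# compute drops to 340 hours. The size of that drop is a guess.
after = monthly_cost(35, 340, PRICE_PER_TB, PRICE_PER_HOUR)

savings = before - after
print(f"before={before}, after={after}, savings={savings}")
```

With these placeholder inputs, the storage line and the compute line both shrink, which is the "both counts" effect described above.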


How do you figure out if a given data asset is dump-able? Enter automated data lineage. This key tool quickly and accurately maps the paths of all data assets throughout your data environment. You’ll be able to see at a glance which data assets are entirely unused (dump them) or redundant (establish which asset will be the canonical one, use it to replace any duplicates in active data pipelines, then dump the duplicates). 
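If you're curious what "entirely unused" means in lineage terms, the idea can be sketched in a few lines of Python. This is a toy illustration, not how any particular lineage tool works: the asset names, the `LINEAGE` map, and the `CONSUMED` set are all hypothetical. An asset counts as unused here when nothing that people actually consume (a report, a dashboard) can be reached from it.

```python
from collections import deque

# Hypothetical lineage: each asset maps to the assets that read from it.
LINEAGE = {
    "raw_orders": ["stg_orders"],
    "stg_orders": ["fct_sales"],
    "fct_sales": ["sales_dashboard"],
    "sales_dashboard": [],
    "raw_clicks_2019": ["stg_clicks_2019"],
    "stg_clicks_2019": [],
}

# Hypothetical set of assets someone actually uses.
CONSUMED = {"sales_dashboard"}


def find_unused(lineage, consumed):
    """Return assets from which no consumed asset can be reached."""
    # Reverse the edges so we can walk upstream from each consumed asset.
    upstream = {}
    for source, readers in lineage.items():
        upstream.setdefault(source, [])
        for reader in readers:
            upstream.setdefault(reader, []).append(source)

    # Breadth-first walk: everything upstream of a consumed asset is needed.
    needed = set(consumed)
    queue = deque(consumed)
    while queue:
        asset = queue.popleft()
        for parent in upstream.get(asset, []):
            if parent not in needed:
                needed.add(parent)
                queue.append(parent)

    return sorted(set(upstream) - needed)


unused = find_unused(LINEAGE, CONSUMED)
print(unused)
```

In this made-up environment, the two 2019 click tables feed nothing that anyone consumes, so they are the dump candidates. A real environment has far more assets and edges, which is why you want the map built automatically.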


Sometimes entire data pipelines are outdated, unused or inefficient. This is your chance to renovate. 


Step 3: plan your migration pilot

Any complex process can benefit from first running a pilot: a microcosm of the larger process. A pilot lets you see what works and what doesn’t – on a manageable scale. It lets you learn from experience and improve for the future without being pulled under by a grand failure.


In a data warehouse migration pilot, you pick a team or a line of business that is likely to benefit significantly from the migration AND will also willingly cooperate and provide feedback. You go through the entire migration process in miniature: everything from making a data migration map to checking that all your migrated data assets made the journey unscathed.



The aim of your pilot is to create a data migration project template and/or a migration “kit” that can be used as the migration is rolled out to other teams or lines of business.


Step 4: run the data migration pilot

Full speed ahead! (Well, more like partial speed ahead. This is a pilot.)


It’s essential while you’re in the middle of the migration to keep careful track of both your source and target data warehouses. Did everything in this database segment make it over? Where is X data asset now? 
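One common way to answer "did everything make it over?" is to compare row counts and checksums table by table. Below is a minimal Python sketch of that idea. The table names and rows are invented sample data standing in for whatever you would fetch from each warehouse, and `fingerprint` and `reconcile` are names made up for this example.

```python
import hashlib


def fingerprint(rows):
    """Return (row count, order-independent checksum) for a list of row tuples."""
    digest = hashlib.sha256()
    count = 0
    # Sort the text form of each row so row order does not affect the checksum.
    for text in sorted(repr(row) for row in rows):
        digest.update(text.encode("utf-8"))
        digest.update(b"\n")
        count += 1
    return count, digest.hexdigest()


def reconcile(source_tables, target_tables):
    """Compare each source table with its copy in the target."""
    report = {}
    for name, rows in source_tables.items():
        if name not in target_tables:
            report[name] = "missing in target"
            continue
        src = fingerprint(rows)
        tgt = fingerprint(target_tables[name])
        if src == tgt:
            report[name] = "ok"
        elif src[0] != tgt[0]:
            report[name] = f"row count differs ({src[0]} vs {tgt[0]})"
        else:
            report[name] = "content differs"
    return report


# Hypothetical sample data, for illustration only.
source = {
    "customers": [(1, "Ann"), (2, "Bo")],
    "orders": [(10, 1, 99.5), (11, 2, 15.0)],
    "returns": [(7, 10)],
}
target = {
    "customers": [(2, "Bo"), (1, "Ann")],  # same rows, different order
    "orders": [(10, 1, 99.5)],             # one row lost along the way
}

report = reconcile(source, target)
for table, status in report.items():
    print(table, "->", status)
```

One caveat: two platforms may represent the same value differently (dates, decimals, nulls), so a real comparison usually needs to normalize values before hashing. At warehouse scale you would also compute the counts and checksums inside each warehouse rather than pulling rows out.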


If you’re moving your Amazon fulfillment center from Timbuktu to Kalamazoo, it makes sense to start by moving one department (especially if you’ve never orchestrated this kind of move before). So as you’re moving the electronics department, you keep notes on what went smoothly, what didn’t, and how you could make it go better. Because you need to be able to send out a DVD player if an order comes in, you’ll also need to keep track of where your DVD players are at any given moment. Are they in Timbuktu? In Kalamazoo? In transit?


Your business data is probably in at least as much demand by your users as DVD players are by consumers. If a business user needs a report on customer lifetime value, she needs to be able to locate and access the most up-to-date version of the report now, without a whole investigation.


Automated data lineage keeps a running, up-to-date map of both your source and target data environments and the data flow between them. A data lineage map is the go-to for your data team and business users alike in figuring out where they should look for the data asset they are currently seeking.
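To make the "Timbuktu, Kalamazoo, or in transit?" question concrete, here is a deliberately tiny Python sketch of a location register for data assets. It is a hand-rolled illustration, not a description of how a lineage platform does it. The class `MigrationTracker`, its methods, and the asset names are all hypothetical.

```python
from enum import Enum


class Location(Enum):
    SOURCE = "source"
    IN_TRANSIT = "in transit"
    TARGET = "target"


class MigrationTracker:
    """Keeps track of where each data asset currently lives."""

    def __init__(self, assets):
        # Every asset starts out in the source warehouse.
        self._where = {name: Location.SOURCE for name in assets}

    def start_move(self, name):
        self._require(name, Location.SOURCE)
        self._where[name] = Location.IN_TRANSIT

    def finish_move(self, name):
        self._require(name, Location.IN_TRANSIT)
        self._where[name] = Location.TARGET

    def locate(self, name):
        return self._where[name]

    def _require(self, name, expected):
        # Refuse out-of-order steps, e.g. finishing a move that never started.
        actual = self._where[name]
        if actual is not expected:
            raise ValueError(f"{name} is {actual.value}, expected {expected.value}")


# Hypothetical asset names, for illustration only.
tracker = MigrationTracker(["customer_ltv_report", "orders"])
tracker.start_move("customer_ltv_report")
tracker.finish_move("customer_ltv_report")
tracker.start_move("orders")

print(tracker.locate("customer_ltv_report").value)
print(tracker.locate("orders").value)
```

The explicit state check is the design choice worth noting: an asset cannot be marked as arrived unless its move was started, so the register stays consistent with what actually happened.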


Step 5: migrate, test, repeat

Ahhhhh… all your pilot data has been successfully transferred to your new data warehouse. 



We certainly hope you were taking notes. 


Now it’s time to evaluate what happened. Your goal is to come up with insights for making the process smoother, along with the practical tools, templates, and workflows that put those insights into practice.


When you’ve milked your pilot migration for all it’s worth, expand your reach and start migrating the other departments, whether a few at a time or all at once.


Special cloud data warehouse migration tips

If you’ve decided to migrate from an on-prem data warehouse to a cloud data warehouse, such as from Vertica to Redshift, pay extra attention to the discovery and planning stages. Don’t move on until you know how the applications and services you had on your on-prem warehouse will be structured and accessed in the cloud.


Also, pay particular attention to information security and regulatory compliance requirements, and confirm that they will be addressed at the level you’ve come to expect with an on-prem system, or better.


Fulfilling your migration dreams

All psyched for your data warehouse migration?


Order NOW and you’ll get it with FREE shipping! And FREE returns. Arrives before the holidays! And if you have Data Warehouse Migration Prime, it can be ready the next day!


Yeah, we wish.


Maybe someday we’ll get there. In the meantime, data warehouse migration takes a wee bit more work than a one-click order. Octopai’s data intelligence platform, with its automated data lineage, is your solid partner in both the planning stages (e.g. discovering all your data assets and pipelines, deciding what to keep and what to dump) and the execution stages (e.g. maintaining continually updated maps of both your source and target systems throughout the migration).


Put in the thought and work, and support it with the right tools, and your migration dreams CAN become reality.

