How Do You Migrate Systems?
5 July 2019
/by Jakub Białas/
Well-prepared system migrations can go smoothly and exactly as planned, or just the opposite, they can go sideways. They can take minutes, or they may drag on for hours or even days. In the worst case, everybody involved is unhappy and the system is frozen.
Let's take a look at three examples of IT projects where I’ve led the system migration and the problems that I encountered.
1) The HRD System was a domain registration system. It contained a very large database, which was migrated all at once. After switching the system, it turned out that the new system wasn’t stable. However, you couldn’t go back to the old version, because the migration had to be done quickly over the weekend. The resulting issues spilled into the next 6 months.
2) In the case of the Kozaczek portal, a synchronization was prepared instead of a one-off system migration. Thanks to this, it was possible to launch the new portal in parallel with the existing one, with the new service operating in "read-only" mode. Data from the old site was regularly "injected" into the new database.
3) The "Zeberka" project consisted of synchronization with the Kozaczek portal, and improving it. Instead of copying the data directly into the database, the data was saved through the API, which solved the problems with autogenerated content: the new portal could easily generate it during the "data injection" period. In this project, I had already applied the best migration practices described below.
During the migration of systems in the above-described projects I noticed some recurring problems:
- When implementing a migration, you often have access to only part of the data, and production data often contains non-standard cases.
- After migrating the data, the final tests often show that something is missing or wrong. You then have to correct the migration and repeat it.
- Problems with missing data won’t be noticed until after the production launch of the application, when there’s already new data in the database which wasn’t in the old system.
- A long shutdown of the system during the upgrade annoys both the client and the development team, especially when other problems appear while switching systems.
Fortunately, my experience from these projects has shown me which problems are both significant and recurring. In response, I've created a few principles that I always follow when writing data migrators. I'm happy to share them here.
1. Always record the data source clearly and concisely in the database structure. How much information we save depends on the type of data we import, but most often we record:
- The name of the source the data originates from. This column is insignificant when all data comes from a single site, but it really matters when we import data from several sites. It's best to define an enum type with a unique value for each source.
- The external identifier, that is, the id of the object in the source it comes from. Most often it is an id or UUID column; for users, it is often an email. It is important that this value is unique and allows us to find the object in the source.
- The URL of the object. It is not strictly necessary, it is a redundant value, and it can't always be generated, but it often turns out to be useful, because opening such a link immediately gives us a preview of the object in the old system. This is most useful when the source objects are files.
2. When migrating the data, we always check whether the migrated object already exists in the new system, and if so, instead of adding a new object, we update the existing one. Thanks to this, we can run the same migration several times without risking duplicate data. Our migrations then become a synchronization of the new system with the old one, and the ability to rerun the migration gives us many advantages:
- If there is an error in the migration and some of the data has been omitted, we can fix the migration and run it again without having to wipe the already imported data from the new system.
- If we add new hooks or triggers to the new system, we won’t have to worry about the data that we had imported, we’ll only need to restart the migration.
- We can run a migration several times before finally switching over to the new system, and repeatedly synchronize it with the old application. This allows us to test and view data in the new system while we are still working on the old one.
- We can migrate data to the new system well before its production launch, meaning we are better prepared to switch systems.
3. When migrating data, we use the new application's API. Using the API ensures that all hooks and triggers in the new system are invoked. If we inject data directly into the database, we have to program and call all these procedures manually. Of course, writing data through the API may be slower than writing directly to the database, but that is a much smaller issue than the data inconsistency that can arise when you forget to call important procedures.
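To make the three principles more concrete, here is a minimal Python sketch, not the code from any of the projects above. Everything in it is an assumption made for illustration: the `Source` enum, the `Article` record, the in-memory `NewSystemApi` class standing in for the new application's API, and the slug hook standing in for autogenerated content. It shows source tracking (source name, external identifier, optional URL), an update-or-create write keyed on the source and external identifier, and writes that always pass through the API so that hooks run.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Optional, Tuple


class Source(Enum):
    """Hypothetical enum: one unique value per old system we import from."""
    OLD_PORTAL = "old_portal"
    OLD_FORUM = "old_forum"


@dataclass
class Article:
    source: Source                    # where the record came from
    external_id: str                  # id of the object in the old system
    title: str
    source_url: Optional[str] = None  # redundant, but handy for previews
    slug: str = ""                    # autogenerated by the hook below


class NewSystemApi:
    """In-memory stand-in for the new application's API (an assumption)."""

    def __init__(self) -> None:
        self._articles: Dict[Tuple[Source, str], Article] = {}

    def _run_hooks(self, article: Article) -> None:
        # Stand-in for the hooks and triggers the new system runs on each write.
        article.slug = article.title.lower().replace(" ", "-")

    def upsert_article(self, source: Source, external_id: str, title: str,
                       source_url: Optional[str] = None) -> Article:
        """Update the record if it was already imported, otherwise create it."""
        key = (source, external_id)
        article = self._articles.get(key)
        if article is None:
            article = Article(source, external_id, title, source_url)
            self._articles[key] = article
        else:
            article.title = title
            article.source_url = source_url
        self._run_hooks(article)
        return article

    def get(self, source: Source, external_id: str) -> Optional[Article]:
        return self._articles.get((source, external_id))

    def count(self) -> int:
        return len(self._articles)


def migrate(api: NewSystemApi, source: Source, old_rows: Iterable[dict]) -> None:
    """Push rows from the old system through the API; safe to run repeatedly."""
    for row in old_rows:
        api.upsert_article(source, str(row["id"]), row["title"], row.get("url"))
```

Calling `migrate` twice with the same rows leaves the record count unchanged, and calling it again after a title changes in the old system updates the existing record and regenerates its slug. In a real system the lookup would more likely rely on a unique database constraint on the source and external identifier columns than on an in-memory dictionary.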
The main determinant of a migration's complexity is the amount of data in the system. The more data there is, the longer the migration takes, the more special cases occur, and the longer it takes to fix migration-related errors.
When building information systems, we often have to deal with cases where the client is already using an application that they want to replace. When writing an application from scratch, we design a new database structure for it, which is almost never consistent with the previous system. At the end of the project, the old system needs to be replaced with the new one. The client expects to find the data from the old system in the new application, which means it first has to be imported there.
The data to be migrated is usually measured in hundreds of megabytes, and often in hundreds of gigabytes. Moving it, of course, has to be automated. This is a very important process, and migration scripts should be carefully prepared in advance and tested accordingly. Otherwise, switching systems will prove long and stressful for those preparing the new environment, and the customer will be dissatisfied that they and their clients have been frozen out of their system.
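One simple way to test a migration script before the switch is a reconciliation check that compares identifiers in the old system with the external identifiers stored in the new one. The sketch below is only an illustration under assumptions: the function names are hypothetical, and it assumes both sides can export their identifiers as plain strings.

```python
from typing import Dict, Iterable, List


def find_missing_ids(old_ids: Iterable[str],
                     new_external_ids: Iterable[str]) -> List[str]:
    """Return ids present in the old system but absent from the new one."""
    migrated = set(new_external_ids)
    # Sorted as strings, so "10" comes before "2".
    return sorted(i for i in set(old_ids) if i not in migrated)


def reconciliation_report(old_ids: Iterable[str],
                          new_external_ids: Iterable[str]) -> Dict[str, object]:
    """Summarize how completely the old data has been migrated."""
    old = set(old_ids)
    new = set(new_external_ids)
    missing = find_missing_ids(old, new)
    return {
        "old_count": len(old),
        "new_count": len(new),
        "missing": missing,
        "ok": not missing,
    }
```

Running such a report after every test migration helps surface omitted records before the production launch, rather than after new data has already been written to the new system.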
So finally, the conclusion: the migration principles developed over the three projects mentioned above worked perfectly in the Zeberka project, where the migration was smooth and both the developers and the client were stress free. Zeberka is a new project and will go live in just 2 weeks. It's currently running in beta, which is just synchronizing with the old portal. The beta version will be available to users in a week, and in 2 weeks we will disable the old portal and replace it with the beta version. At that point, the entire system migration phase will be successfully completed.