The Next Magenta Cloud Project
How did we manage to scale our migration process so that we could transfer up to 70 TB of data per day, consistently, for three months? The Next Magenta Cloud project has been the subject of a few of my previous blog posts, but now that the project is complete and the dust has settled, I feel a proper send-off is required.
The project was by far the largest that Vamosa have ever delivered (more than 1,000 times the size of any previous project) and required some creative thinking and rapid development to bring us to a point where the challenge seemed possible. To recap, the migration requirement was to bring the existing Magenta Cloud file sharing platform, which was hosted by a third party, onto a new Open Telekom Cloud (OTC)/Nextcloud platform. The files, owned by individual users and companies, would have to be moved over a dedicated fixed line between data centres within a six-month period – including setup and testing.
The software used for file sharing was changing from an existing third-party solution to a new, open-source Nextcloud platform hosted on OTC. A physical “lift and shift” approach between the data centres had been ruled out, as some transformation of the data was required to make it suitable for hosting within Nextcloud.
Moving and transforming data is, of course, our speciality, and we have been doing it successfully for over 15 years. The challenge in this case was the data volume, and the requirement to continuously refresh the migrated data as users continued to upload and modify files on the source platform. The total volume was expected to be around 7 petabytes (PB), which would require a consistent daily transfer of up to 70 terabytes (TB). This daily volume was around ten times the total volume of our largest previous migration. Clearly, this required some serious consideration and planning about how we could meet the requirement.
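For a sense of scale, the figures quoted above can be sanity-checked with some back-of-envelope arithmetic (decimal units assumed, i.e. 1 PB = 1,000 TB):

```python
# Back-of-envelope check of the volumes quoted above.
# Assumes decimal units: 1 PB = 1000 TB, 1 TB = 8000 Gbit.
TOTAL_VOLUME_PB = 7   # total data to migrate
DAILY_RATE_TB = 70    # sustained daily transfer

# How many days of transfer at the sustained daily rate?
days_needed = TOTAL_VOLUME_PB * 1000 / DAILY_RATE_TB
print(days_needed)            # 100.0 days, i.e. just over three months

# Sustained line rate needed to move 70 TB every 24 hours:
gbit_per_s = DAILY_RATE_TB * 8000 / 86_400
print(round(gbit_per_s, 2))   # 6.48 Gbit/s, around the clock
```

A sustained rate of roughly 6.5 Gbit/s, day and night for a hundred days, gives a feel for what the dedicated line between the data centres had to carry.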
Our current software, Migration Architect, is able to scale across several servers, but there is still some manual configuration involved and this becomes more complex in larger projects. It was clear from the outset that a project of this size would require a different approach.
Since the content was being brought into the OTC platform, we decided to build a solution that would retain all of the benefits of our methodology and approach while being deployable and fully scalable on OTC. The plan was to develop the solution with a view to reuse in future large migration projects, and to make it deployable on any cloud platform. Given the tight time constraints, however, we quickly realised that building a fully reusable toolset would have to follow at a later stage.
The functionality within OTC allowed us to scale the migration process easily across multiple servers, sharing common message queues to ensure that work was distributed evenly and without the risk of conflicts. The performance was extremely impressive, and the only limitations were the capacity of the dedicated data line and the ability of the source platform to cope with the demand. As a result, we were actually asked on several occasions to slow our processing down!
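The core pattern is simple: many workers, spread across many servers, pull tasks from one shared queue, so each file is claimed exactly once and the load spreads evenly. A minimal in-process sketch of that idea in Python is below; the real project used a distributed message queue on the OTC platform, and all names here are illustrative:

```python
import queue
import threading

# Shared queue of migration tasks (here just ids; in practice, file
# references). Each task can be claimed by exactly one worker.
tasks = queue.Queue()
for file_id in range(1000):
    tasks.put(file_id)

migrated = []
migrated_lock = threading.Lock()

def worker():
    """Pull tasks until the queue is empty; no two workers get the same task."""
    while True:
        try:
            file_id = tasks.get_nowait()
        except queue.Empty:
            return
        # ...fetch, transform and transfer the file here...
        with migrated_lock:
            migrated.append(file_id)
        tasks.task_done()

# Eight concurrent workers standing in for worker processes on many servers.
threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the workers finish, every task has been processed exactly once, with no coordination needed beyond the queue itself; that is what makes the pattern scale so cleanly across servers.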
Other benefits came through our ability to run parallel streams of activity focused on data quality and reporting. This allowed us to provide a dynamic dashboard showing current status and performance. This was a major benefit to the business users who were keen to track progress on a regular basis. We were also able to carry out detailed validation of every migrated item to guarantee that all data had been migrated successfully.
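As a sketch of what per-item validation can look like, the snippet below compares a checksum of each source item against its migrated counterpart and reports any mismatches or missing items. The function names and data shapes are illustrative, not the project's actual API:

```python
import hashlib

def checksum(data: bytes) -> str:
    """Content fingerprint used to compare source and target copies."""
    return hashlib.sha256(data).hexdigest()

def validate(source_items: dict, target_items: dict) -> list:
    """Return the ids of source items that are missing from the target
    or whose migrated content does not match the source byte-for-byte."""
    failures = []
    for item_id, src_data in source_items.items():
        tgt_data = target_items.get(item_id)
        if tgt_data is None or checksum(src_data) != checksum(tgt_data):
            failures.append(item_id)
    return failures
```

Run against every migrated item, a check like this yields either an empty failure list (migration verified) or a precise worklist of items to re-transfer, which is also exactly the kind of data a status dashboard can aggregate.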