A recent email search showed that I have been working as part of a content migration team in Glasgow for 16 years! When I first joined Vamosa in January 2007, I had already been working on moving data from file systems to the earliest versions of Microsoft SharePoint. This meant I had some background in transforming data and automating the process through scripting.
The Origins of Vamosa
Vamosa had started their migration journey by moving websites into content management systems. From there, they developed a formal methodology around the distinct stages of data migration. With some bespoke software (notably, the Vamosa Content Migrator – VCM) they were able to structure a repeatable migration process. They were also able to provide documentation to show that all data had been migrated successfully. This was a major advantage for customers who had previously moved data manually or with ad-hoc scripts. The greatest benefit, however, was that this process could be built to incorporate transformation of the data during the migration.
That was the key differentiator for customers wishing to move from one content management system to another. In such cases, the structure and metadata of the platforms could differ significantly.
Exciting times followed, and global migration projects came and went as both the Vamosa organisation and our project sizes grew. We were acquired by T-Systems in 2010. By that time, we had realised that the original Java-based VCM software was struggling to keep up with the rapidly increasing volumes of data being retained in customers’ content management systems. It was clear to everyone at Team Vamosa that we would need a more robust solution to cope with larger and more complex migrations.
The Migration Architect Years
The changing market evolved to include a huge variety of content management systems (CMS) along with an increase in complexity. There was also a dramatic increase in the volume of data being migrated. All of the knowledge of an organisation was stored within the CMS, and as this became a more central focus of collaboration, its importance to businesses grew.
Changes under T-Systems
Under the new umbrella of T-Systems, it was agreed that an upgrade to our migration software was required. This led to a move in 2016 from VCM to Migration Architect. There were several major changes in this version of the software, all focused on improving the resilience and stability of the platform when running on multiple servers. This led to the move from the previous Java back-end to .NET, and also the introduction of a NoSQL document database (RavenDB) in place of the previous relational database. This brought much-needed stability and a certain degree of scalability to the platform. As before, the new software opened doors and allowed us to tackle larger and more complex migration projects. There is, however, still a requirement for skilled technical resources to deploy and use the software, which in turn can bring other limitations.
Migration Architect presently copes admirably, but although it is technically scalable across multiple servers, the instances require significant manual effort to keep them in sync and to distribute workloads evenly across the available servers.
Working with a distributed team over multiple geographical locations is undoubtedly a challenge. Nevertheless, we have delivered projects for major customers with up to 6 terabytes (TB) of data across 12 virtual servers. Impressive though this is, we are aware that it is pushing the limit of what is achievable with the current platform. We are now actively looking at further evolution of our core software.
The New Frontier
Late in 2021, we had an enquiry about migration of a large cloud file-sharing and storage platform to a new home within Telekom’s Open Telekom Cloud (OTC). The volumes were staggering: potentially 7 petabytes (PB) of data for 4.5 million users.
Transferring the Data
Due to complexities in accessing the data, a physical move between data centres had been ruled out. This left the only option as an API transfer between the source and target platform over a dedicated data line. The source and target systems were different, so an element of transformation was required during migration. There was also a requirement for no downtime, so a constant refresh and sync between source and target was required. This was all to be achieved within a six-month end-to-end timeline.
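To illustrate the constant refresh-and-sync pattern described above, here is a minimal sketch in Python, assuming both platforms' APIs can report a last-modified time per item. All names are illustrative; the real source and target APIs, and the transformation logic, are far more involved than this.

```python
# Minimal sketch of a timestamp-based refresh-and-sync loop (illustrative only).

def sync_pass(source: dict, target: dict, last_sync: float) -> float:
    """Copy items changed since last_sync from source to target.

    Each item is represented as (payload, modified_time); returns the
    new high-water mark to use as last_sync for the next pass.
    """
    newest = last_sync
    for key, (payload, modified) in source.items():
        if modified > last_sync:
            # A real migration would transform the payload here to fit
            # the target platform's structure and metadata model.
            target[key] = (payload, modified)
        newest = max(newest, modified)
    return newest

# Tiny in-memory demonstration
source = {"a.txt": ("v1", 1.0), "b.txt": ("v2", 5.0)}
target = {}
mark = sync_pass(source, target, last_sync=0.0)   # full first pass
source["a.txt"] = ("v1-edited", 6.0)              # user edits during migration
mark = sync_pass(source, target, last_sync=mark)  # incremental refresh picks it up
```

Running passes like this on a schedule keeps the target converging on the source without taking the source platform offline, at the cost of a final cut-over sync when users switch.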
Our process could cope with the logical challenge, but we knew immediately that our current Migration Architect software would not be capable of scaling sufficiently to handle the data volumes involved. Due to the aggressive timeline, there was a requirement for the migration of at least 50 terabytes (TB) of data per day. To put this in perspective, this was around 10 times more than the overall total volume of our largest previous migration, so it was clear that a new approach was required!
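The 50 TB/day figure follows from the volumes and timeline above. A back-of-the-envelope check, approximating six months as 180 days:

```python
# Rough throughput check using the figures from the project:
# ~7 PB total, six-month end-to-end timeline.
total_tb = 7 * 1000          # 7 PB expressed in TB (decimal units)
timeline_days = 6 * 30       # six months, approximated as 180 days

raw_rate = total_tb / timeline_days   # ~39 TB/day just to move the data once
```

The raw rate comes out at roughly 39 TB/day; allowing headroom for retries and the continuous re-sync passes pushes the practical target to around 50 TB/day.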
The solution to this challenge was the development of a cloud deployable migration platform. This allowed us to rapidly scale the migration to the point where the physical limitation of the fixed data line was the only blocker. In addition to these performance gains, the cloud deployment allowed for much closer integration of the processes. It also allowed for dynamic reporting of the current status to be presented to stakeholders through a simple web interface.
At this point, it was clear that the benefits of this approach, and the likelihood of similar scenarios and even larger migrations in the future, would require the new cloud deployable solution to become our standard and recommended approach.