Migrations are not easy: they take time, energy and effort to make them successful. BigQuery makes it easier with customizable tools and years of expertise to help out with your journey to the cloud. In line with our commitment to providing an open and flexible platform, we have built these migration tools following an open approach which enables flexibility and choice for you and your partners when migrating to BigQuery.
Comprehensive solution for migration to BigQuery
Today, we are announcing the preview of BigQuery Migration Service, a set of free-to-use tools to help you with your end-to-end migration needs. This service speeds up Teradata to BigQuery migrations with tooling covering migration planning, data transfer, automated SQL/script conversion and data verification. Support for additional data warehouses is coming soon.
Assessment: Plan and manage migration risks and costs
When we work with customers and partners on migrations, the most important step is to understand their ecosystem, requirements and business goals. We use this information to create a custom migration plan to help prepare and execute migrations. Over and over again, we have seen that identifying and addressing migration complexities ahead of time leads to reduced TCO and lower risk migrations.
To that end, we are excited to announce private preview of our automated assessment tool as part of the BigQuery Migration Service. Assessment leverages our many years of experience helping some of the large organizations in the world modernize with BigQuery. It provides an easy and automated way to collect statistics from your legacy warehouse and generates a state analysis report consisting of:
- List of database objects, data I/O patterns and dependencies
- Automated query translation coverage and results
- Query-to-Object mapping (e.g., which tables, views, functions it uses)
- User-to-Table mapping (e.g., which users access which tables)
- Table correlations (e.g., tables which are often joined or sub queried)
- List of BI/ETL tools in use
The summary report helps you efficiently prioritize and have a clear understanding of all the components and the amount of work required to execute a migration. In addition, all the underlying assessment data is made available within a BigQuery dataset for complete customization and ad-hoc analysis for you and your migration partners.
SQL translation: Reduce manual effort, time and errors
One of the hardest pieces of a data warehouse migration is modernizing legacy business logic, such as SQL queries, scripts and stored procedures. This process normally involves substantial manual query rewrites and verifications, which is time consuming and error prone. Today, we are excited to announce the public preview of batch and interactive SQL translation which helps automate much of this process, thus speeding up your path to a successful migration. Batch and interactive SQL translation provides fast, semantically correct and human readable translations of legacy objects with no ongoing dependencies post migration. It supports a broad range of Teradata artifacts including DML, DDL and BTEQ. Translations can be run in batch mode or ad-hoc directly from the BigQuery SQL workspace. Early users of SQL translation saw ~95% successful translations on 10M+ queries, leaving only ~5% of queries for manual review with their migration partners.
Interactive SQL translation provides a split view within the BigQuery SQL editor. Users can type in SQL queries in non-BigQuery dialects and view the translated BigQuery SQL immediately. Interactive SQL translation provides users with a live, real time SQL translation tool which allows users to self-serve translation of their queries in parallel with a centralized large-scale SQL migration effort. This not only reduces the time and effort for analysts to migrate their queries, but also increases how quickly they learn to leverage the modern capabilities of BigQuery.
Data Validation: Verify correctness of data
Data validation is a crucial step in data warehouse migration projects. It compares structured and semi-structured data from the source and target to confirm data and logic has been moved correctly. The GCP Data Validation Tool (DVT) is an open sourced CLI tool that leverages open-source frameworks. It offers customized multi-level validation functions to compare source and target tables on the table level, column level, and row level. It is also flexible, meaning that new validation rules can easily be plugged in as you see fit. Furthermore, to facilitate automation, orchestration and scheduling, it can also be integrated with Cloud Functions, Cloud Run, and Composer for recurring validation.
Customize BigQuery Migration Service to your needs
With BigQuery Migration Service we are fast tracking, simplifying and de-risking your migration so that you can modernize your data warehouse with BigQuery, a truly serverless and modern data warehouse, with confidence. We are starting with Teradata migration capabilities, but will add support for additional data warehouses soon.
We have built end-to-end tools with openness top-of-mind which you and your migration partner of choice can customize to help ensure a successful migration. From assessment which streamlines metrics collection and exposes the raw dataset for full customization to our open source Data Validation Tool, where you and your partner can add custom validation logic, we are committed to giving you migrations tools you can customize to your unique needs.
Our tools are freely available to help speed up your migration. If you would like to leverage our tools for an upcoming proof-of-concept or migration, reach out to your favorite GCP partner, your GCP sales rep or check out our documentation. We look forward to partnering with you on your journey to the cloud.
By: Manish Dalwadi (Product Manager, BigQuery) and Hamza Arian (Software Engineering Manager, BigQuery)
Source: Google Cloud Blog