Frequently Asked Questions

The following are frequently asked questions regarding migrating workloads from Netezza to Hadoop/Spark using the Impetus Workload Migration Solution.

 

The Impetus Workload Migration Solution is an accelerated service where the Impetus team actively collaborates with the customer. We offer workload migration as a service for the following reasons:

  • The automated conversion percentage depends on the customer’s unique environment and requirements.

  • Our underlying patent-pending configurable, self-learning, and extensible grammar engine needs time to learn the patterns specific to that environment.

Depending on your technology stack, the new converted environment may allow running the same BI tools. However, the Impetus Workload Migration accelerators may automatically alter the underlying queries to ensure minimal end-user impact.

The Impetus Workload Migration Solution also supports the migration of Oracle Web Report, SSRS, and other reporting tools to modern Agile BI tools like Tableau along with the Netezza migration.

Impetus sets up the data access based on the consumer’s behavior patterns such as the following:

  • Data access tools such as Spark, Hive JDBC, and ODBC

  • BI and analytical tools such as Tableau and SAS

  • Ad hoc querying tools such as Hue, Zeppelin, JDBC, and ODBC

  • Sub-second BI response tools such as Kyvos Insights

  • Interactive/search/discovery applications such as HBase, Phoenix, Solr, and more.

The converted code is SQL, which can be managed by your current SQL resources.

  • You can also use SQL editors and interfaces offered by Hadoop distributions or cloud platforms, such as Apache Zeppelin or Hue, and other tools, such as Toad, via JDBC access.

  • For stored procedures, the lightweight Java/Scala code is accompanied by Maven projects for easy maintenance.

  • You can also use the Impetus Data Blending tool to manage the end-to-end flow and code in a browser-based UI.

The overall pricing depends on the number of unique queries running in Netezza. Impetus can establish the cost after the risk-free assessment phase.

Impetus recommends launching the migration with a risk-free, comprehensive 4-to-8-week assessment, which includes a pilot (to prove the automated conversion, build confidence, and demonstrate ROI) and an end-to-end use case. After the successful completion of this initial phase, the overall migration usually takes between 3 and 20 months, depending on the size of the Netezza workloads.

We’ve observed that the Impetus Workload Migration Solution can save up to 70% of the time required for a manual migration. This is because we automate the entire process, end-to-end, from assessment and migration to validation and execution.

Depending on the tool and usage pattern (pushdown queries or other), the ETL tool will either be integrated with the new environment or, if needed, replaced with Hive/Spark-based ETL.
The Impetus Data Blending tool can be used for managing the end-to-end ETL flow on Hive/Spark code in a browser-based UI.

No. There is no dependency on the Impetus tool after the migration is complete because the converted code is open-source compliant.

The Impetus Workload Migration Solution either ensures that the existing end-to-end SLAs are met or recommends the Hadoop cluster size needed to meet them. If a cluster-sizing recommendation is needed, it also includes cost comparisons.

The Impetus Workload Migration Solution converts stored procedures in two phases:

  • First, it converts the SQL portion to Hadoop SQL.

  • Then it converts the procedural logic, such as loops and cursors, to a lightweight Java or Scala wrapper.
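
As a rough illustration of the two phases, the sketch below shows what a converted procedure might look like. It is not output from the Impetus tooling: the class name, table names, SQL strings, and the `query` function (a stand-in for a JDBC call to Hive or Spark) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: every name here is an assumption for illustration only.
class ProcedureWrapperSketch {

    // Phase 1 (assumed result): SQL portions of the procedure, expressed as Hive/Spark SQL.
    static final String SELECT_REGIONS = "SELECT region FROM sales_regions";
    static final String REFRESH_TEMPLATE =
        "INSERT INTO region_totals SELECT region, SUM(amount) FROM sales "
        + "WHERE region = '%s' GROUP BY region";

    // Phase 2 (assumed result): the cursor loop becomes a plain Java loop.
    // `query` stands in for a JDBC call returning the first column of each row;
    // real code would use a PreparedStatement rather than String.format.
    static List<String> run(Function<String, List<String>> query) {
        List<String> executed = new ArrayList<>();
        for (String region : query.apply(SELECT_REGIONS)) { // replaces the cursor loop
            String stmt = String.format(REFRESH_TEMPLATE, region);
            query.apply(stmt);   // execute the per-row statement
            executed.add(stmt);  // keep a log of what was run
        }
        return executed;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the cluster, so the sketch runs without Hive or Spark.
        Function<String, List<String>> fake = sql ->
            sql.equals(SELECT_REGIONS) ? List.of("EAST", "WEST") : List.of();
        run(fake).forEach(System.out::println);
    }
}
```

Keeping the SQL as plain strings and the control flow in Java mirrors the split described above: SQL resources can maintain the queries, while the wrapper carries only the looping logic.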

The Impetus Workload Migration accelerator supports Ab Initio migration to open source Spark/Hadoop code. The output code can be managed through a UI in the Impetus Data Blending product, Talend, or other tools that have Spark/Hive integrations, or through AWS EMR, AWS Glue, or AWS Data Pipeline in the cloud.
The Impetus Data Blending (commercial) tool can be used for managing the end-to-end ETL flow on Hive/Spark code in a browser-based UI.

All the UDFs and stored procedures are converted or replaced by a rich set of Spark/Hive UDFs or reusable components. These UDFs can be used for Netezza workloads or any greenfield development.

The Impetus Workload Migration Solution supports AWS, Azure, and generic IaaS cloud providers that have Hive 1.3+ and/or Spark 1.6+ in the form of IaaS or PaaS, such as EMR and HDInsight.

The Impetus Workload Migration Solution supports all Hadoop distros, including HDP, CDH, MapR, IBM Hadoop, Oracle Hadoop Appliance, AWS EMR, and Azure HDInsight that support Hive 1.3+ and/or Spark 1.6+.

The Impetus Workload Migration Solution converts SQL and any Netezza UDFs to Hive/Spark-compatible SQL queries. Impetus leverages a patent-pending configurable, self-learning, and extensible grammar engine for the conversion.
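
To give a feel for the kind of dialect differences such a conversion has to handle, here is a deliberately simplified sketch. It uses a couple of regular-expression rewrite rules rather than a grammar engine, so it is not how the Impetus engine works; the class name and rule set are assumptions. The two rules shown map Netezza's `BYTEINT` type to Hive's `TINYINT` and drop the Netezza-specific `DISTRIBUTE ON` clause.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: a toy regex rewriter, not a real SQL grammar engine.
class DialectRewriteSketch {

    // Ordered rewrite rules: pattern on the Netezza SQL -> Hive replacement text.
    static final Map<Pattern, String> RULES = new LinkedHashMap<>();
    static {
        // Netezza's 1-byte integer type corresponds to TINYINT in Hive.
        RULES.put(Pattern.compile("\\bBYTEINT\\b", Pattern.CASE_INSENSITIVE), "TINYINT");
        // DISTRIBUTE ON is Netezza-specific; this sketch simply drops it.
        RULES.put(Pattern.compile(
            "\\s*\\bDISTRIBUTE\\s+ON\\s+(RANDOM|\\([^)]*\\))",
            Pattern.CASE_INSENSITIVE), "");
    }

    // Applies every rule in order and returns the rewritten statement.
    static String toHive(String netezzaSql) {
        String sql = netezzaSql;
        for (Map.Entry<Pattern, String> rule : RULES.entrySet()) {
            sql = rule.getKey().matcher(sql).replaceAll(rule.getValue());
        }
        return sql;
    }

    public static void main(String[] args) {
        System.out.println(toHive(
            "CREATE TABLE flags (id INTEGER, flag BYTEINT) DISTRIBUTE ON (id)"));
    }
}
```

Regex rules like these break down quickly on nested expressions, comments, and string literals, which is one reason a grammar-based approach is the more robust choice for real workloads.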

The Impetus Workload Migration Solution is an accelerated end-to-end service that enables the migration of Netezza to a low-cost and scalable Hadoop/Spark platform. In addition, it ensures that ingestion and consumption tools and processes, SLA, and DevOps work seamlessly after the migration is complete.

The cloud serves as a low-cost, scalable environment that can lower capital investments, such as hardware, as well as operational costs, through on-demand, usage-based pricing and outsourced infrastructure. The cloud also allows for the possibility of additional capabilities, such as advanced analytics, machine learning, streaming data, unstructured data, and more.

Many Netezza users are exploring various growth options to achieve greater scalability and tighter integration with their enterprise data lake and cloud computing initiatives.

IBM offers its customers the IBM PureData System or IBM Db2 Analytics Accelerator for analytics. For cloud-based deployments, IBM has IBM dashDB. But these are expensive options and carry the risk of vendor lock-in.

Netezza users want to move their increasingly important enterprise analytics processing to a new platform, shifting away from dependence on a single-vendor architecture in order to take advantage of the flexibility that the open source (Hadoop/Spark) revolution is delivering.

Hadoop is an open source, low-cost, scalable platform for data warehousing. It also offers more capabilities to match the evolving data landscape, such as advanced analytics, machine learning, streaming data, unstructured data, and more.
