Single source of truth from 500+ data feeds — Fortune 500 firm implemented an enterprise data lake on cloud (AWS)
A scalable, one-click data ingestion solution for data pipelines and use cases with built-in robust security, governance, and metadata management
Challenge
A Fortune Global 500 insurance brokerage and risk management company wanted a single source of truth and a comprehensive data repository to improve cost efficiency, performance, and security. They also wanted to migrate their use cases as-is without compromising data accuracy or SLAs, and without disrupting existing production applications or business users.
Requirements
The brokerage firm wanted a self-service ingestion platform that business users could use directly. They had specific governance and compliance needs and wanted to integrate with enterprise tools to reduce time-to-market. They also wanted a solution that would dynamically handle load fluctuations while accounting for SLAs, priority, and cost.
Additionally, the firm was looking for a cloud-based data lake that would track project costs, provision user authorization for business data access needs, and help them to:
- Enable robust network policies for on-cloud systems adhering to compliance specifications
- Ensure audit certification and compliance
- Enable automated detection, masking, and encryption of PII, PHI, and SPI data
- Ensure high availability and operational readiness of production systems
- Leverage a self-service web portal for data ingestion, tracking, and debugging
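The automated detection and masking requirement above can be illustrated with a minimal sketch. It assumes simple regular-expression rules for two kinds of PII (email addresses and US Social Security numbers); the pattern set and the names `PII_PATTERNS`, `mask_value`, and `mask_record` are hypothetical and are not taken from the firm's platform, which would rely on vetted detection rules and proper encryption rather than placeholder substitution.

```python
from __future__ import annotations

import re

# Hypothetical detection rules, for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def mask_value(text: str) -> tuple[str, list[str]]:
    """Return the masked text and the sorted list of PII kinds detected in it."""
    found = []
    for kind, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            found.append(kind)
            # Replace every match with a placeholder such as <EMAIL>.
            text = pattern.sub(f"<{kind.upper()}>", text)
    return text, sorted(found)


def mask_record(record: dict[str, str]) -> tuple[dict[str, str], dict[str, list[str]]]:
    """Mask every field of a record and report which fields contained PII."""
    masked, report = {}, {}
    for field, value in record.items():
        masked_value, kinds = mask_value(value)
        masked[field] = masked_value
        if kinds:
            report[field] = kinds
    return masked, report
```

The per-field report is one possible way to feed an audit trail: it records which fields were flagged without retaining the sensitive values themselves.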
Impact
- Supported ingestion of 500+ batch and real-time data feeds on the platform, with multi-TB data volumes
- Reduced release cycle time from 8 weeks to 2 weeks
- Saved USD 50K per month in OpEx through continuous monitoring and optimization
- Reduced ingestion time from hours to minutes through a powerful web interface for business users
Impetus Technologies Inc. migrated all the existing applications to the cloud with zero downtime for the business. A fully secured data lake on the AWS platform with enterprise governance and security compliance was implemented with the following capabilities:
- Data pipelines connecting Teradata, Oracle, and MS SQL to AWS S3
- Consumption layer enabled with Tableau, Presto, Jupyter Notebook, and Hue
- Voltage-enabled encryption and Ranger-powered access policies
- Self-service data ingestion
- Automated analysis of data quality profiles and trends
- High availability of all critical components
- Single-click deployment for data pipelines and platform
- Serverless data pipelines leveraging Lambda, Step Functions, EMR, and CloudWatch
- Single Jenkins pipeline to onboard 100+ Spring Boot, Apigee, Angular, Node.js, and Java applications as Docker containers
- High resiliency leveraging HAProxy to route network traffic between containers
- End-to-end DevOps for data platform and use cases
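As an illustration of the serverless pipeline capability listed above, the following sketch builds a two-step AWS Step Functions definition in Amazon States Language: a Lambda task validates an incoming feed, then a Spark step runs on an existing EMR cluster. The state names, the function name `build_ingestion_state_machine`, and all ARNs, cluster IDs, and S3 paths are assumptions made for the example; the case study does not describe the actual workflow.

```python
import json


def build_ingestion_state_machine(validate_lambda_arn: str,
                                  cluster_id: str,
                                  spark_script_s3: str) -> dict:
    """Build a States Language definition: validate a feed, then transform it on EMR."""
    return {
        "Comment": "Illustrative feed-ingestion workflow (hypothetical names)",
        "StartAt": "ValidateFeed",
        "States": {
            "ValidateFeed": {
                "Type": "Task",
                # Step Functions service integration for invoking a Lambda function.
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {
                    "FunctionName": validate_lambda_arn,
                    "Payload.$": "$",  # pass the execution input through to the function
                },
                "Retry": [{
                    "ErrorEquals": ["States.TaskFailed"],
                    "IntervalSeconds": 30,
                    "MaxAttempts": 2,
                    "BackoffRate": 2.0,
                }],
                "Next": "TransformOnEmr",
            },
            "TransformOnEmr": {
                "Type": "Task",
                # The .sync suffix makes the workflow wait for the EMR step to finish.
                "Resource": "arn:aws:states:::elasticmapreduce:addStep.sync",
                "Parameters": {
                    "ClusterId": cluster_id,
                    "Step": {
                        "Name": "TransformFeed",
                        "ActionOnFailure": "CONTINUE",
                        "HadoopJarStep": {
                            "Jar": "command-runner.jar",
                            "Args": ["spark-submit", "--deploy-mode", "cluster",
                                     spark_script_s3],
                        },
                    },
                },
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    definition = build_ingestion_state_machine(
        "arn:aws:lambda:us-east-1:123456789012:function:validate-feed",  # placeholder
        "j-EXAMPLE",                                                     # placeholder
        "s3://example-bucket/jobs/transform.py",                         # placeholder
    )
    print(json.dumps(definition, indent=2))
```

Generating the definition from code rather than hand-editing JSON is one way to keep such pipelines compatible with a single-click deployment process, since the same function can be parameterized per feed.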
Moving from disparate sources to a unified data lake created a single source of truth for the insurance firm, giving it a unified, clear, and current view of its business.