Comparing key features of Amazon MSK and Confluent
January 2023
How to choose the right tool for seamless Kafka services
Apache Kafka is a real-time event streaming platform that helps enterprises gain reliable insights for quick decision-making and improved customer experience. While it meets enterprise streaming requirements, maintaining and managing Kafka clusters adds operational overhead. To reduce this overhead, many enterprises run Apache Kafka through a managed service such as Amazon MSK (Amazon Managed Streaming for Apache Kafka) or Confluent Cloud.
The headline difference between Amazon MSK and Confluent is often framed as cloud-native versus cloud-hosted, but this is just the tip of the iceberg. There are many subtler differences that enterprises need to consider when choosing the right service.
This blog will compare Confluent and Amazon MSK to help you understand which fits your requirements best.
Introduced by the creators of Kafka, Confluent Cloud is a simple, scalable, resilient, and secure event streaming platform with pre-built, fully managed Kafka connectors that make it easy to connect to popular data sources and sinks.
Amazon Managed Streaming for Apache Kafka (MSK), which runs open-source versions of Apache Kafka, is a fully managed service that lets you create, update, and delete clusters and produce and consume data using your existing applications, plug-ins, and tools. It also reduces infrastructure provisioning and monitoring overhead by scaling up or down as required.
Confluent vs. MSK
Confluent Cloud clusters are self-service, on-demand, and can be provisioned as:
Basic: mainly used for experimentation, early development, and basic use cases
Standard: used for production-ready use cases that need an extended feature set and elastic scaling up to 1 GBps
Dedicated: recommended for production-critical workloads at GBps+ scale that require private networking
On the other hand, Amazon MSK comes with two deployment options:
Provisioned, which requires you to manage broker instances and storage
Serverless, which automatically provisions and scales compute and storage resources
The section below compares Confluent clusters with MSK (provisioned and serverless variants).
Infrastructure
Creating and managing Kafka clusters can be tedious. Confluent and MSK provide easy-to-set-up infrastructure and allow you to choose from multiple options depending on your requirements so that you can focus on building use cases rather than deployment.
| Criteria | Confluent Cloud | Amazon MSK – Serverless | Amazon MSK – Provisioned |
| --- | --- | --- | --- |
| Deployment options | Supports multi-cloud (AWS, GCP, Azure) and hybrid deployment | AWS-native, fully managed streaming service for Kafka | AWS-native, partially managed service for Kafka with deployment options for capacity planning |
| Pricing | Pay-per-use: depends on ingress and egress throughput | Pay according to the number of partitions, throughput, and duration | Depends on the number and type of brokers and storage |
Scalability
Kafka is highly scalable. However, managing brokers and partitions according to the load is cumbersome. Auto-scaling enables customers to automatically balance the load and take care of idle brokers and storage.
| Criteria | Confluent | Amazon MSK – Serverless | Amazon MSK – Provisioned |
| --- | --- | --- | --- |
| Cluster scaling | Automatic resource allocation to manage consumer lag according to ingress/egress throughput. Ingress throughput: up to 50 MBps per CKU (Confluent Unit for Kafka); egress throughput: up to 150 MBps per CKU | Auto-scalable resources. Ingress throughput: up to 200 MBps per cluster; egress throughput: up to 400 MBps per cluster | Auto-scalable storage, but broker count and type need to be scaled manually |
| Re-balancing | Self-balancing clusters for automated load balancing | Self-managed | After adding new brokers, partitions need to be reassigned using native Kafka tools |
| Storage | Infinite data storage available | Up to 250 GB per partition and up to 120 partitions per cluster (broker and partition count can be increased through support) | Up to 16 TB storage per broker and up to 30 brokers per cluster |
| Retention | Message retention in topics ranges from 1 hour to infinite; 7 days of retention for metrics and logs | Message retention of 4 hours (can be increased through a support case) | Default retention for new topics of up to 7 days |
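As noted above, provisioned MSK requires partitions to be reassigned with native Kafka tools after new brokers are added. The sketch below shows one way to build the JSON plan that Kafka's `kafka-reassign-partitions.sh` tool accepts. The topic name, broker IDs, and the simple round-robin placement are assumptions made for illustration; this is not the placement logic Kafka's own plan generator uses, and a real plan should also account for racks and existing replica locations.

```python
import json


def build_reassignment_plan(topic, num_partitions, broker_ids, replication_factor):
    """Build a reassignment plan that spreads replicas round-robin over brokers.

    The output follows the JSON layout read by kafka-reassign-partitions.sh:
    {"version": 1, "partitions": [{"topic", "partition", "replicas"}, ...]}.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the number of brokers")

    partitions = []
    for partition in range(num_partitions):
        # Start each partition on a different broker, then take the next
        # (replication_factor - 1) brokers, wrapping around the list.
        replicas = [
            broker_ids[(partition + offset) % len(broker_ids)]
            for offset in range(replication_factor)
        ]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": replicas}
        )
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    # Hypothetical example: a cluster grown from 3 to 4 brokers (IDs 1-4).
    plan = build_reassignment_plan("events", 6, [1, 2, 3, 4], 3)
    print(json.dumps(plan, indent=2))
```

The printed JSON can be saved to a file and passed to `kafka-reassign-partitions.sh` with `--reassignment-json-file` and `--execute`. Confluent's self-balancing clusters take care of this step automatically.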
Operational management
Operational overheads include:
Monitoring your cluster’s vital statistics
Setting alarms for abnormal behavior
Monitoring logs for debugging and analysis, including best practices for cost optimization
| Criteria | Confluent | Amazon MSK – Serverless | Amazon MSK – Provisioned |
| --- | --- | --- | --- |
| Monitoring and logging | Monitoring of multiple cluster-level metrics (throughput, storage, topics, connectors, etc.) from the dashboard. Free aggregation of key metrics at the topic and cluster level. Integration with third-party tools such as Prometheus, Datadog, and Grafana | Only consumer, topic, and consumer group metrics are available. No additional cost | Free basic cluster-level monitoring. Option to configure enhanced broker-, topic-, and partition-level monitoring at an additional cost. Integration with third-party tools such as Prometheus, Datadog, and Grafana |
| Updates and bug fixes | Rolling upgrades to the latest stable Kafka version with zero intervention. High availability guaranteed with non-disruptive upgrades | Kafka version and upgrades managed internally by AWS | Rolling upgrades that maintain high availability and support cluster I/O throughout the version upgrade |
| Tech support | Expert 24×7 enterprise-level Kafka support | General AWS support | General AWS support |
Eco-system integration
Core Kafka comprises brokers, topics, logs, partitions, clusters, producers, and consumers. The broader Kafka eco-system builds on Kafka Core and adds Kafka Streams, Kafka Connect, Kafka REST Proxy, and the Schema Registry.
| Criteria | Confluent | Amazon MSK – Serverless | Amazon MSK – Provisioned |
| --- | --- | --- | --- |
| Connectors | Drag-and-drop configuration for 130+ pre-built and self-managed Confluent source and sink connectors | Connectors need to be configured using EC2 instances | Can be integrated with community-built connectors using custom plug-ins |
| Kafka eco-system | Compatible with the fully managed Confluent Schema Registry. Provides a fully managed solution for creating and managing ksqlDB clusters | Compatible with the fully managed AWS Glue Schema Registry | Compatible with the fully managed AWS Glue Schema Registry. Can be integrated with Confluent Schema Registry and ksqlDB by installing them on EC2 instances |
Comparative analysis: Confluent Cloud vs. Amazon MSK
To put the comparison in perspective, we ran a 30-minute test with one topic of 100 partitions on both Confluent Cloud and Amazon MSK. The following configuration was the same on both platforms:
Topics and partitions:
Number of topics = 1, number of partitions = 100
Replication factor = 3
Connectors:
Source: Confluent Datagen, max tasks = 20
Sink: Confluent S3 sink, max tasks = 50, flush count = 50K
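The test setup above could be captured as Kafka Connect configuration along the following lines. This is a minimal sketch, not the exact configuration used in the test: the topic name, bucket, region, Datagen quickstart, and output format are placeholders, and mapping "flush count" to the S3 sink's `flush.size` property is an assumption. The property names follow the self-managed Confluent Datagen and S3 sink connectors; fully managed Confluent Cloud connectors use a different set of keys.

```python
import json

# Topic settings taken from the test description; the name is hypothetical.
TOPIC_SPEC = {"name": "benchmark-topic", "partitions": 100, "replication_factor": 3}


def datagen_source_config(topic, max_tasks=20):
    """Config for the self-managed Confluent Datagen source connector."""
    return {
        "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
        "kafka.topic": topic,
        "quickstart": "orders",  # assumption: the dataset used is not stated
        "tasks.max": str(max_tasks),
    }


def s3_sink_config(topic, bucket, region, max_tasks=50, flush_size=50_000):
    """Config for the self-managed Confluent S3 sink connector."""
    return {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "topics": topic,
        "s3.bucket.name": bucket,  # placeholder bucket
        "s3.region": region,       # placeholder region
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        # Assumption: JSON output; the source does not state the format.
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
        "flush.size": str(flush_size),  # records written per S3 object
        "tasks.max": str(max_tasks),
    }


if __name__ == "__main__":
    topic = TOPIC_SPEC["name"]
    print(json.dumps(datagen_source_config(topic), indent=2))
    print(json.dumps(s3_sink_config(topic, "example-bucket", "us-east-1"), indent=2))
```

Keeping the connector settings in one place like this makes it easier to apply an identical workload to each platform, which is what makes the throughput and cost figures comparable.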
| Cluster type | Cluster configuration | Launch time | Ingress throughput average | Egress throughput average | Cost (per hour) |
| --- | --- | --- | --- | --- | --- |
| Confluent Cloud (Standard) | Fully managed cluster and connectors | 1 min | 46.63 MB/s | 45.52 MB/s | $17.190 |
| Confluent Cloud (Dedicated) | Number of CKUs = 2; fully managed connectors | 3-4 hours | 55.06 MB/s | 55.05 MB/s | $30.943 |
| MSK Serverless | Source/sink connector VM: m5.4xlarge each | 5 mins | 59.5 MB/s | 60 MB/s | $23.39 |
| MSK Provisioned | Cluster broker type: m5.large; number of brokers: 3 (1 per zone); source/sink connector: 4 workers each; MCUs/worker: 4 | 25-30 mins | 66.3 MB/s (aggregated) | 62.7 MB/s (aggregated) | $8.85 |
While configuring connectors and launching clusters is easier in Confluent, MSK costs less for comparable throughput.
Launch times for standard Confluent and MSK serverless, both fully managed, are comparable (about 5 minutes or less). However, dedicated Confluent clusters take more than 2 hours to launch and about the same time to upscale. In comparison, provisioned MSK takes approximately 30 minutes to launch and about the same time to upscale.
Weighing launch time against cost, provisioned MSK, which also has options to configure managed connectors, comes out ahead in this test.
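One way to read the results is to normalize hourly cost by average ingress throughput. The sketch below does that arithmetic using only the figures from the table above; the choice of metric (cost per MB/s of ingress) is an assumption made for illustration, and other normalizations, such as egress or combined throughput, are equally defensible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkRow:
    name: str
    ingress_mb_s: float   # average ingress throughput, MB/s
    egress_mb_s: float    # average egress throughput, MB/s
    cost_per_hour: float  # USD per hour


# Figures copied from the results table of the 30-minute test.
ROWS = [
    BenchmarkRow("Confluent Cloud (Standard)", 46.63, 45.52, 17.190),
    BenchmarkRow("Confluent Cloud (Dedicated)", 55.06, 55.05, 30.943),
    BenchmarkRow("MSK Serverless", 59.5, 60.0, 23.39),
    BenchmarkRow("MSK Provisioned", 66.3, 62.7, 8.85),
]


def cost_per_ingress_mb_s(row):
    """Hourly cost divided by average ingress throughput (USD per MB/s)."""
    return row.cost_per_hour / row.ingress_mb_s


def rank_by_cost_efficiency(rows):
    """Return rows ordered from cheapest to most expensive per MB/s of ingress."""
    return sorted(rows, key=cost_per_ingress_mb_s)


if __name__ == "__main__":
    for row in rank_by_cost_efficiency(ROWS):
        print(f"{row.name:28s} {cost_per_ingress_mb_s(row):.3f} USD per MB/s per hour")
```

On this metric, provisioned MSK ranks first and dedicated Confluent last. The ranking reflects a single 30-minute workload, so it is an indication, not a general pricing rule.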
Which one to choose – Confluent or Amazon MSK?
Benefits of Confluent Cloud
Cloud-agnostic architecture: Users can extend consistent data architecture to multi-cloud, on-premises, or private cloud environments
Fully managed: Feature-rich platform with built-in connectors and seamless integration with fully managed Schema Registry and ksqlDB
Drag-and-drop UI for an improved experience
Reasons to choose Amazon MSK
Despite Confluent Cloud's many benefits, enterprises often choose Amazon MSK over Confluent for the following reasons:
Enhanced network security: Apache Kafka on Amazon MSK is deployed within your VPC, which keeps Kafka network traffic off the public internet. Enterprises that treat security as their primary concern therefore tend to prefer MSK over Confluent.
Seamless integration with AWS services: For enterprises whose infrastructure is already hosted on AWS, MSK is a natural choice, as it integrates seamlessly with a wide range of AWS services such as Glue ETL, Glue Schema Registry, Kinesis, and Lambda.
Cost-effective: Our comparative analysis of MSK and Confluent showed that, for the same throughput, you can save up to 30% in cost by configuring optimized MSK clusters.
Impetus has helped multiple Fortune 100 companies take advantage of Kafka seamlessly using Amazon MSK. To know how we can help you choose the right tools to achieve your business goals, write to us at inquiry(at)impetus(dot)com.