Comparing key features of Amazon MSK and Confluent

How to choose the right tool for seamless Kafka services

Apache Kafka is a real-time event streaming platform that helps enterprises gain reliable insights for quick decision-making and improved customer experience. While it meets enterprise streaming requirements, maintaining and managing Kafka adds significant overhead. To reduce this overhead, many enterprises use Amazon MSK (Managed Streaming for Apache Kafka) or Confluent Cloud for event streaming with Apache Kafka.

The headline difference between Amazon MSK and Confluent is often framed as cloud-native versus cloud-hosted. But this is just the tip of the iceberg. There are many subtler differences that enterprises need to consider when choosing the right service.

This blog will compare Confluent and Amazon MSK to help you understand which fits your requirements best.

Introduced by the creators of Kafka, Confluent Cloud is a simple, scalable, resilient, and secure event streaming platform with pre-built, fully managed Kafka connectors that make it easy to connect to popular data sources and sinks.

Amazon Managed Streaming for Apache Kafka (MSK), which runs open-source versions of Apache Kafka, is a fully managed service that lets you create, update, and delete clusters and produce and consume data using your existing applications, plug-ins, and tools. It also reduces infrastructure provisioning and monitoring overhead by scaling up or down as required.

Confluent vs. MSK

Confluent Cloud clusters are self-service, on-demand, and can be provisioned as:

  • Basic, mainly used for experimentation, early development, and basic use cases
  • Standard, used for production-ready use cases leveraging an extended feature set and elastic scaling up to 1 GBps
  • Dedicated, recommended for production-critical throughput at GBps+ scale that requires private networking

On the other hand, Amazon MSK comes with two deployment options:

  • Provisioned, which requires you to manage broker instances and storage
  • Serverless, which automatically provisions and scales compute and storage resources

The section below compares Confluent clusters with MSK (provisioned and serverless variants).

Infrastructure

Creating and managing Kafka clusters can be tedious. Confluent and MSK provide easy-to-set-up infrastructure and allow you to choose from multiple options depending on your requirements so that you can focus on building use cases rather than deployment.

| Criteria | Confluent Cloud | Amazon MSK – Serverless | Amazon MSK – Provisioned |
|---|---|---|---|
| Deployment options | Supports multi-cloud (AWS, GCP, Azure) and hybrid deployment | AWS-native fully managed streaming service for Kafka | AWS-native partially managed service for Kafka with deployment options for capacity planning |
| Pricing | Pay-per-use: depends on ingress and egress throughput | Pay according to the number of partitions, throughput, and duration | Depends on the number and type of brokers and storage |

Scalability

Kafka is highly scalable. However, managing brokers and partitions according to the load is cumbersome. Auto-scaling enables customers to automatically balance the load and take care of idle brokers and storage.
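To make the "cumbersome" part concrete: in plain Kafka, brokers added to a cluster do not pick up existing partitions on their own, so someone has to write out a new replica assignment. The sketch below generates such a plan in the JSON shape accepted by Kafka's `kafka-reassign-partitions` tool through `--reassignment-json-file`. The topic name, broker IDs, and the simple round-robin placement are illustrative assumptions, not a recommended layout.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor):
    """Spread each partition's replicas round-robin across broker_ids."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor cannot exceed the broker count")
    partitions = []
    for p in range(num_partitions):
        # Start each partition on a different broker so leaders are spread out.
        replicas = [
            broker_ids[(p + r) % len(broker_ids)]
            for r in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical example: a 6-partition topic after growing from 3 to 4 brokers.
plan = build_reassignment(
    "orders", num_partitions=6, broker_ids=[1, 2, 3, 4], replication_factor=3
)
print(json.dumps(plan, indent=2))
```

Self-balancing or serverless offerings take this step off your hands; with manually scaled brokers it remains your job.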

| Criteria | Confluent | Amazon MSK – Serverless | Amazon MSK – Provisioned |
|---|---|---|---|
| Cluster scaling | Automatic resource allocation to manage consumer lag according to ingress/egress throughput; ingress throughput up to 50 MBps per CKU, egress throughput up to 150 MBps per CKU | Auto-scalable resources; ingress throughput up to 200 MBps per cluster, egress throughput up to 400 MBps per cluster | Auto-scalable storage, but broker count and type need to be scaled manually |
| Re-balancing | Self-balancing clusters for automated load balancing | Self-managed | After adding new brokers, partitions need to be reassigned using native Kafka tools |
| Storage | Infinite data storage available | Up to 250 GB per partition and up to 120 partitions per cluster (broker and partition count can be increased through support) | Up to 16 TB storage per broker and up to 30 brokers per cluster |
| Retention | Message retention in topics ranges from 1 hour to infinite; 7 days retention for metrics and logs | Message retention of 4 hours (can be increased through a support case) | Default retention of new topics for up to 7 days |
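As a rough illustration of how the limits in the table above translate into sizing decisions, the sketch below estimates how many Confluent CKUs a workload would need and checks whether it fits within the MSK Serverless throughput ceiling. It assumes that per-CKU limits add up linearly, which is a simplification, and it uses the numbers exactly as quoted in the table (throughput values are in the same unit as the table). The function names are hypothetical.

```python
import math

# Limits as quoted in the comparison table (a snapshot, not a guarantee).
CKU_INGRESS_LIMIT = 50
CKU_EGRESS_LIMIT = 150
SERVERLESS_INGRESS_LIMIT = 200
SERVERLESS_EGRESS_LIMIT = 400


def ckus_needed(ingress, egress):
    """CKUs required so that both ingress and egress stay within per-CKU limits."""
    return max(
        1,
        math.ceil(ingress / CKU_INGRESS_LIMIT),
        math.ceil(egress / CKU_EGRESS_LIMIT),
    )


def fits_msk_serverless(ingress, egress):
    """True when the workload stays under the serverless throughput ceiling."""
    return ingress <= SERVERLESS_INGRESS_LIMIT and egress <= SERVERLESS_EGRESS_LIMIT


# Storage ceilings implied by the table, before any support-case increases.
SERVERLESS_MAX_STORAGE_GB = 250 * 120   # GB per partition x partitions per cluster
PROVISIONED_MAX_STORAGE_TB = 16 * 30    # TB per broker x brokers per cluster

print(ckus_needed(60, 60), fits_msk_serverless(60, 60))
print(SERVERLESS_MAX_STORAGE_GB, PROVISIONED_MAX_STORAGE_TB)
```

Treat the result as a starting point for capacity planning; real limits also depend on partition counts, connection counts, and request rates.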

Operational management

Operational overheads include:

  • Monitoring your cluster’s vital statistics
  • Setting alarms for abnormal behavior
  • Monitoring logs for debugging and analysis, including best practices for cost optimization
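
One common alarm of this kind watches consumer lag, the gap between the latest offset in a partition and the offset a consumer group has committed. The sketch below shows only the arithmetic behind such a check. The offsets and threshold are hard-coded hypothetical values; in practice they would come from whichever monitoring integration you use.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Per-partition lag: latest offset minus last committed offset.

    A partition with no committed offset is treated as committed at 0.
    """
    return {p: end_offsets[p] - committed_offsets.get(p, 0) for p in end_offsets}


def partitions_in_alarm(lag_by_partition, threshold):
    """Partitions whose lag exceeds the alarm threshold, in sorted order."""
    return sorted(p for p, lag in lag_by_partition.items() if lag > threshold)


# Hypothetical snapshot for a three-partition topic.
end = {0: 1_000, 1: 5_000, 2: 750}
committed = {0: 990, 1: 1_200, 2: 750}

lag = consumer_lag(end, committed)
alarm = partitions_in_alarm(lag, threshold=1_000)
print(lag, alarm)
```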

| Criteria | Confluent | Amazon MSK – Serverless | Amazon MSK – Provisioned |
|---|---|---|---|
| Monitoring and logging | Monitoring of multiple cluster-level metrics like throughput, storage, topic, connectors, etc., from the dashboard; free aggregation of key metrics at the topic and cluster level; third-party tools integration like Prometheus, Datadog, Grafana, etc. | Only consumer, topic, and consumer group metrics are available; no additional cost required | Free monitoring at the basic cluster level; option to configure enhanced broker, topic, and partition-level monitoring at an additional cost; third-party tools integration like Prometheus, Datadog, Grafana, etc. |
| Updates and bug fixes | Rolling upgrades to the latest stable Kafka version with zero intervention; high availability guaranteed with non-disruptive upgrades | Kafka version and upgrades internally managed by AWS | Rolling upgrades that maintain high availability and support cluster I/O throughout the version upgrade |
| Tech support | Expert 24×7 enterprise-level Kafka support | General AWS support | General AWS support |

Eco-system integration

Core Kafka comprises brokers, topics, logs, partitions, clusters, producers, and consumers. In contrast, the Kafka eco-system consists of Kafka Core, Kafka Streams, Kafka Connect, Kafka REST Proxy, and the Schema Registry.

| Criteria | Confluent | Amazon MSK – Serverless | Amazon MSK – Provisioned |
|---|---|---|---|
| Connectors | Drag-and-drop configuration for 130+ pre-built and self-managed Confluent source and sink connectors | Need to be configured using EC2 instances | Can be integrated with community-built connectors using custom plug-ins |
| Kafka eco-system | Compatible with fully managed Confluent Schema Registry; provides a fully managed solution for creating and managing ksqlDB clusters | Compatible with fully managed AWS Glue Schema Registry | Compatible with fully managed AWS Glue Schema Registry; can be integrated with Confluent Schema Registry and ksqlDB by installing them on EC2 instances |

Comparative analysis: Confluent Cloud vs. Amazon MSK

To put it in perspective, we ran one topic with 100 partitions on both Confluent and MSK for 30 minutes. The following configurations were consistent for both:

Topics and Partitions

Number of topics = 1, number of partitions = 100

Replication factor = 3

Connectors

Source: Confluent Datagen, max tasks = 20

Sink: Confluent S3 sink, max tasks = 50, flush count = 50K
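
For readers who want to reproduce a similar setup, the connector settings above might look roughly like the following Kafka Connect configurations. This is a sketch, not the configuration used in the test: the property names follow the self-managed Datagen and S3 sink connectors (fully managed Confluent Cloud connectors may use different keys), "flush count" is assumed to correspond to `flush.size`, and the topic, bucket, region, quickstart schema, and output format are hypothetical placeholders.

```python
import json


def datagen_source_config(topic, max_tasks):
    """Connect config for a Datagen source writing synthetic records to `topic`."""
    return {
        "connector.class": "io.confluent.kafka.connect.datagen.DatagenConnector",
        "kafka.topic": topic,
        "quickstart": "orders",          # assumed sample schema
        "tasks.max": str(max_tasks),
    }


def s3_sink_config(topic, bucket, region, max_tasks, flush_size):
    """Connect config for an S3 sink reading from `topic`."""
    return {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "topics": topic,
        "s3.bucket.name": bucket,
        "s3.region": region,
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",  # assumed format
        "flush.size": str(flush_size),   # records written per S3 object
        "tasks.max": str(max_tasks),
    }


# Placeholder names; only the task counts and flush size come from the test setup.
source = datagen_source_config("benchmark-topic", max_tasks=20)
sink = s3_sink_config(
    "benchmark-topic", "example-bucket", "us-east-1", max_tasks=50, flush_size=50_000
)
print(json.dumps({"source": source, "sink": sink}, indent=2))
```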

| Cluster type | Cluster configuration | Launch time | Ingress throughput average | Egress throughput average | Cost (per hour) |
|---|---|---|---|---|---|
| Confluent Cloud (Standard) | Fully managed cluster and connectors | 1 min | 46.63 MB/s | 45.52 MB/s | $17.190 |
| Confluent Cloud (Dedicated) | Number of CKUs = 2; fully managed connectors | 3-4 hours | 55.06 MB/s | 55.05 MB/s | $30.943 |
| MSK Serverless | Source/sink connector VM: m5.4xlarge each | 5 mins | 59.5 MB/s | 60 MB/s | $23.39 |
| MSK Provisioned | Cluster broker type: m5.large; no. of brokers: 3 (1 per zone); source/sink connector: 4 workers each; MCUs/worker: 4 | 25-30 mins | 66.3 MB/s (aggregated) | 62.7 MB/s (aggregated) | $8.85 |

While configuring connectors and launching clusters is easier in Confluent, provisioned MSK costs less for comparable throughput.

Standard Confluent and MSK Serverless clusters, both fully managed, launch in comparable time (5 minutes or less). Dedicated Confluent clusters, however, take more than 2 hours to launch and about as long to scale up. In comparison, provisioned MSK takes approximately 30 minutes to launch and about the same time to scale up.

Comparing time and cost, provisioned MSK, which also has options to configure managed connectors, comes out ahead.
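
One way to put the four runs on a common footing is to divide hourly cost by average ingress throughput. The sketch below does this with the figures from the results table above. The metric itself is our own simplification: it ignores egress, launch time, and operational effort.

```python
# (average ingress in MB/s, cost in USD per hour), copied from the results table.
results = {
    "Confluent Cloud (Standard)": (46.63, 17.190),
    "Confluent Cloud (Dedicated)": (55.06, 30.943),
    "MSK Serverless": (59.5, 23.39),
    "MSK Provisioned": (66.3, 8.85),
}


def cost_per_mbps(ingress_mbps, cost_per_hour):
    """Hourly cost for each MB/s of sustained ingress."""
    return cost_per_hour / ingress_mbps


# Cheapest option per unit of throughput first.
ranked = sorted(results, key=lambda name: cost_per_mbps(*results[name]))

for name in ranked:
    ingress, cost = results[name]
    print(f"{name}: ${cost_per_mbps(ingress, cost):.3f} per hour per MB/s of ingress")
```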

Which one to choose – Confluent or Amazon MSK?

Benefits of Confluent Cloud

  • Cloud-agnostic architecture: Users can extend consistent data architecture to multi-cloud, on-premises, or private cloud environments
  • Fully managed: Feature-rich platform with built-in connectors and seamless integration with fully managed Schema Registry and ksqlDB
  • Drag-and-drop UI for an improved experience

Reasons to choose Amazon MSK

Although Confluent Cloud has many benefits, enterprises often choose Amazon MSK over Confluent for the following reasons:

Enhanced network security: Apache Kafka on Amazon MSK is deployed within your VPC, so Kafka traffic never has to leave for the public internet. Enterprises that treat security as their primary concern therefore prefer MSK over Confluent.

Seamless integration with AWS services: With most enterprise infrastructure already hosted on AWS, MSK seems a natural choice as it integrates seamlessly with a wide range of AWS services like Glue ETL, Glue Schema Registry, Kinesis, Lambda, etc.

Cost-effective: A comparative analysis between MSK and Confluent showed that, for the same throughput, you can reduce costs by up to 30% by configuring optimized MSK clusters.

Impetus has helped multiple Fortune 100 companies take advantage of Kafka seamlessly using Amazon MSK. To know how we can help you choose the right tools to achieve your business goals, write to us at inquiry@impetus.com.
