MCP (Model Context Protocol)
The DataFlow MCP server helps generate DataFlow manifests and migrate Kafka Connect configurations to DataFlow. It does not need access to Kubernetes or Prometheus: it only generates and validates YAML. Use it from your IDE (e.g. Cursor) to create manifests and migrate connectors.
For operational guidance (Helm, production manifests, fault tolerance), use Agent Skills alongside MCP.
Features
| Tool | Description |
|---|---|
| generate_dataflow_manifest | Generate a DataFlow YAML manifest from a description (source/sink type, optional configs and transformations). |
| validate_dataflow_manifest | Validate a YAML manifest (apiVersion, kind, spec.source, spec.sink). |
| migrate_kafka_connect_to_dataflow | Migrate a Kafka Connect configuration (one or two connectors: source + sink) into a DataFlow manifest with notes on migration boundaries. |
| list_dataflow_connectors | Reference of supported connectors (sources and sinks). |
| list_dataflow_transformations | Reference of transformations with examples. |
Validation depth
validate_dataflow_manifest performs only shallow structural checks (apiVersion/kind and the presence of source/sink type + config). It does not enforce the same allow-list or field rules as the operator's validating webhook (pkg/providers + dataflow/api/v1/dataflow_validation.go), so a manifest that passes here can still be rejected by the operator.
list_dataflow_connectors is reference metadata for the IDE and may omit connectors that the operator already supports (for example nessie). See docs/provider-types-inventory.md for how the MCP and operator lists relate.
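To make "shallow structural check" concrete, here is a minimal Python sketch of the kind of check described above. It is an illustration only, not the server's implementation: the function name shallow_validate, the error strings, the placeholder apiVersion value, and the exact field layout (spec.source.type, spec.source.config) are assumptions based on the description in this section.

```python
from typing import Any


def shallow_validate(manifest: dict[str, Any]) -> list[str]:
    """Return structural problems; an empty list means the shallow check passed.

    Hypothetical sketch: the field layout is assumed, not taken from the server.
    """
    errors: list[str] = []
    # Top-level identity fields must be present and non-empty.
    for key in ("apiVersion", "kind"):
        if not manifest.get(key):
            errors.append(f"missing {key}")
    spec = manifest.get("spec")
    if not isinstance(spec, dict):
        errors.append("missing spec")
        return errors
    # Each endpoint needs a type and a config block; their contents are not inspected.
    for endpoint in ("source", "sink"):
        section = spec.get(endpoint)
        if not isinstance(section, dict):
            errors.append(f"missing spec.{endpoint}")
            continue
        if not section.get("type"):
            errors.append(f"missing spec.{endpoint}.type")
        if "config" not in section:
            errors.append(f"missing spec.{endpoint}.config")
    return errors


# A manifest with an unknown sink type still passes, because no allow-list is applied.
manifest = {
    "apiVersion": "example/v1",  # placeholder value, not the real API group
    "kind": "DataFlow",
    "spec": {
        "source": {"type": "kafka", "config": {}},
        "sink": {"type": "not-a-real-sink", "config": {}},
    },
}
print(shallow_validate(manifest))
```

The point of the example is the last call: a check of this depth accepts a connector type the operator would reject, which is why the webhook remains the authority.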
Docker image
The server is published to GitHub Container Registry. Recommended image for production use:
Image: ghcr.io/dataflow-operator/dataflow-mcp:25979c2
You can also use ghcr.io/dataflow-operator/dataflow-mcp:latest for the latest build.
Running with Docker
The server communicates over stdin/stdout, so the -i flag is required:
docker run -i --rm ghcr.io/dataflow-operator/dataflow-mcp:25979c2
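Because the transport is stdin/stdout, a client drives the container by writing JSON-RPC 2.0 messages to its stdin, one per line. The Python sketch below shows what that could look like, assuming the standard MCP stdio framing and protocol revision 2024-11-05. The helper names and the client name are hypothetical, and list_tools_via_docker additionally assumes that Docker is installed and that the server writes exactly one response line per request on stdout.

```python
import json
import subprocess

IMAGE = "ghcr.io/dataflow-operator/dataflow-mcp:25979c2"


def frame(message: dict) -> str:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    return json.dumps(message, separators=(",", ":")) + "\n"


def initialize_request(request_id: int = 1) -> dict:
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed protocol revision
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},  # placeholder
        },
    }


def initialized_notification() -> dict:
    # Notifications carry no "id" because no response is expected.
    return {"jsonrpc": "2.0", "method": "notifications/initialized"}


def list_tools_request(request_id: int = 2) -> dict:
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def list_tools_via_docker() -> dict:
    """Start the container, perform the handshake, and return the tools/list response."""
    proc = subprocess.Popen(
        ["docker", "run", "-i", "--rm", IMAGE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        proc.stdin.write(frame(initialize_request()))
        proc.stdin.flush()
        proc.stdout.readline()  # assumed: one line holding the initialize result
        proc.stdin.write(frame(initialized_notification()))
        proc.stdin.write(frame(list_tools_request()))
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())
    finally:
        proc.stdin.close()
        proc.terminate()
        proc.wait()
```

Without -i, Docker does not keep stdin open, so the container never receives these messages.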
Connecting in Cursor
Add the server to MCP settings (e.g. ~/.cursor/mcp.json or project settings).
Via Docker (recommended)
{
  "mcpServers": {
    "dataflow": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "ghcr.io/dataflow-operator/dataflow-mcp:25979c2"
      ]
    }
  }
}
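If the settings file already lists other MCP servers, the dataflow entry has to be merged in rather than pasted over them. A small Python sketch of that merge follows. The helper names are hypothetical, and it assumes the file is plain JSON with a top-level mcpServers object, as in the snippet above.

```python
import json
from pathlib import Path

# Same entry as the Docker configuration shown above.
DATAFLOW_ENTRY = {
    "command": "docker",
    "args": ["run", "-i", "--rm", "ghcr.io/dataflow-operator/dataflow-mcp:25979c2"],
}


def add_dataflow_server(settings: dict) -> dict:
    """Return a copy of the settings with the dataflow server registered."""
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))  # copy so the input is not mutated
    servers["dataflow"] = DATAFLOW_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Read the settings file if it exists, merge the entry, and write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_dataflow_server(current), indent=2) + "\n")
```

For example, update_settings_file(Path.home() / ".cursor" / "mcp.json") would target the user-level file mentioned above.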
Via local binary
If you built the server locally:
{
  "mcpServers": {
    "dataflow": {
      "command": "/absolute/path/to/dataflow-mcp/target/release/dataflow-mcp",
      "args": []
    }
  }
}
Via cargo (development)
{
  "mcpServers": {
    "dataflow": {
      "command": "cargo",
      "args": ["run", "--release", "--manifest-path", "/path/to/dataflow-mcp/Cargo.toml"]
    }
  }
}
Testing with MCP Inspector
MCP Inspector is an interactive browser-based tool for testing and debugging MCP servers.
npx @modelcontextprotocol/inspector /path/to/dataflow-mcp/target/release/dataflow-mcp
The web UI opens at http://localhost:6274. Select stdio transport and run tools with JSON parameters.
Examples
Generating a Kafka → PostgreSQL manifest
Use generate_dataflow_manifest with:
- source_type: "kafka"
- sink_type: "postgresql"
- source_config: "{\"brokers\":[\"localhost:9092\"],\"topic\":\"input-topic\",\"consumerGroup\":\"dataflow-group\"}"
- sink_config: "{\"connectionString\":\"postgres://user:pass@host:5432/db\",\"table\":\"output_table\"}"
- name: "kafka-to-postgres" (optional)
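In this example source_config and sink_config are JSON-encoded strings, not nested objects, which is why the quotes are escaped. When calling the tool from a script, it may be easier to build those strings with a JSON serializer than to escape them by hand. A minimal Python sketch, where the helper name generate_manifest_arguments is hypothetical and the parameter names are taken from the example above:

```python
import json
from typing import Optional


def generate_manifest_arguments(
    source_type: str,
    sink_type: str,
    source_config: dict,
    sink_config: dict,
    name: Optional[str] = None,
) -> dict:
    """Build the arguments object for generate_dataflow_manifest."""
    args = {
        "source_type": source_type,
        "sink_type": sink_type,
        # The configs are sent as JSON strings, so serialize them here.
        "source_config": json.dumps(source_config),
        "sink_config": json.dumps(sink_config),
    }
    if name is not None:  # name is optional
        args["name"] = name
    return args


arguments = generate_manifest_arguments(
    "kafka",
    "postgresql",
    {"brokers": ["localhost:9092"], "topic": "input-topic", "consumerGroup": "dataflow-group"},
    {"connectionString": "postgres://user:pass@host:5432/db", "table": "output_table"},
    name="kafka-to-postgres",
)
print(json.dumps(arguments, indent=2))
```

The printed object can be pasted into MCP Inspector as the tool's JSON parameters.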
Migrating a Kafka Connect JDBC Sink
Use migrate_kafka_connect_to_dataflow with the connector config as JSON:
{
  "name": "jdbc-sink",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "connection.url": "jdbc:postgresql://pg:5432/mydb",
    "table.name.format": "events",
    "topics": "events"
  }
}
The response includes a DataFlow YAML manifest and notes on migrated and unsupported options.
Generating a Kafka → ClickHouse manifest
Use generate_dataflow_manifest with:
- source_type: "kafka"
- sink_type: "clickhouse"
- source_config: "{\"brokers\":[\"localhost:9092\"],\"topic\":\"input-topic\",\"consumerGroup\":\"dataflow-group\"}"
- sink_config: "{\"connectionString\":\"clickhouse://default@localhost:9000/default?dial_timeout=10s\",\"table\":\"output_table\"}"
- name: "kafka-to-clickhouse" (optional)
Validating a manifest
Pass the YAML manifest to validate_dataflow_manifest in the config parameter. The response either confirms that the manifest is valid or lists the errors found.