Job Details

Lead Cloudera Streaming Architect (CDP | NiFi | Kafka | Flink | Kudu | SSB)

  2026-04-06     HCL Global Systems     All cities, AK
Description:

About the Role

We are seeking a Lead Cloudera Streaming Architect with deep, hands-on experience across the Cloudera CDP streaming stack, including NiFi, Kafka, Flink, Kudu/Impala, and SQL Stream Builder (SSB). This is a highly technical, architecture-plus-implementation role responsible for designing, delivering, and optimizing mission-critical real-time data pipelines at enterprise scale.

If you have personally built end-to-end CDP/CDF streaming pipelines and can execute complex ingestion, transformation, CDC, and Kudu write-path use cases on day one, this role is for you.

What You'll Do

Streaming Architecture & Implementation

  • Architect and build real-time data pipelines using the full Cloudera Data Platform (CDP) streaming suite: NiFi → Kafka → Flink → Kudu/Impala → SSB
  • Own architectural decisions, patterns, and best practices for streaming, CDC, state management, schema evolution, and exactly-once delivery.
  • Develop complex NiFi flows involving controller services (DBCP/JDBC), stateful processors, record processors, schema registry integrations, batch-to-stream conversions, and high-volume ingestion patterns.
  • Build and optimize Flink SQL or DataStream API jobs with:
      ◦ Kafka sources/sinks
      ◦ event-time windows
      ◦ watermarks
      ◦ state management
      ◦ checkpointing / savepoints
      ◦ exactly-once guarantees
  • Design and tune Kudu tables (PKs, partitioning, distribution, upserts, deletes, merges).
  • Build and deploy streaming SQL jobs using Cloudera SQL Stream Builder (SSB).
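To illustrate the Flink SQL skills listed above, a minimal sketch of a Kafka-to-Kudu job with event-time windows and watermarks might look like the following (topic, table, connector options, and field names are hypothetical and depend on your connector versions):

```sql
-- Hypothetical Flink SQL job: Kafka source with an event-time watermark,
-- a tumbling-window aggregation, and results written to a Kudu table.
CREATE TABLE orders_src (
  order_id STRING,
  amount   DECIMAL(10, 2),
  event_ts TIMESTAMP(3),
  -- Tolerate up to 5 seconds of out-of-order (late) events
  WATERMARK FOR event_ts AS event_ts - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'orders',
  'properties.bootstrap.servers' = 'broker:9092',
  'scan.startup.mode' = 'earliest-offset',
  'format' = 'json'
);

CREATE TABLE order_totals_sink (
  window_start TIMESTAMP(3),
  total        DECIMAL(12, 2),
  PRIMARY KEY (window_start) NOT ENFORCED
) WITH (
  'connector'    = 'kudu',          -- assumes a Kudu connector is installed
  'kudu.masters' = 'kudu-master:7051',
  'kudu.table'   = 'order_totals'
);

-- One-minute tumbling windows keyed on event time
INSERT INTO order_totals_sink
SELECT window_start, SUM(amount) AS total
FROM TABLE(
  TUMBLE(TABLE orders_src, DESCRIPTOR(event_ts), INTERVAL '1' MINUTE))
GROUP BY window_start;
```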
Use Case Delivery

You must be able to deliver the following four core use cases immediately:
  1. NiFi → Snowflake → Impala/Kudu ingestion pipeline
  2. Kafka → Flink streaming (real-time processing)
  3. Flink → Kafka sink with exactly-once semantics
  4. CDC ingestion via NiFi, Flink CDC, or SSB (incremental keys, late events, deletes)
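For use case 3 above, exactly-once delivery to Kafka generally pairs Flink checkpointing with a transactional Kafka producer. A hedged Flink SQL sketch (topic, source table, and prefix names are illustrative):

```sql
-- Hypothetical exactly-once Kafka sink. Requires checkpointing enabled on
-- the job (e.g. execution.checkpointing.interval) so that Kafka
-- transactions are committed when checkpoints complete.
CREATE TABLE enriched_events_sink (
  event_id STRING,
  payload  STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'enriched-events',
  'properties.bootstrap.servers' = 'broker:9092',
  'format' = 'json',
  'sink.delivery-guarantee' = 'exactly-once',
  'sink.transactional-id-prefix' = 'enriched-events-job'
);

-- raw_events_src is assumed to be a previously defined source table
INSERT INTO enriched_events_sink
SELECT event_id, payload FROM raw_events_src;
```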

Optimization, Monitoring & Governance
  • Tune NiFi, Kafka, and Flink clusters for performance, throughput, and stability.
  • Implement schema governance, error handling, back-pressure strategies, and replay mechanisms.
  • Work closely with platform engineers to optimize CDP components and CDF deployments.
  • Provide architectural guidance, documentation, and mentorship to engineering teams.
Required Experience

You must have hands-on, production-grade experience with ALL of the following:

Cloudera CDP / CDF
  • CDP Public Cloud or Private Cloud Base
  • Cloudera Flow Management (NiFi + NiFi Registry)
  • Cloudera Streams Messaging (Kafka, SMM)
  • Cloudera Stream Processing (Flink, SSB)
  • Kudu / Impala ecosystem
Apache NiFi (Advanced)
  • Building complex flows (not just admin/ops)
  • QueryDatabaseTable / GenerateTableFetch / MergeRecord
  • Record-based processors & schema registry
  • JDBC / DBCP controller services
  • Stateful processors & incremental ingestion
  • NiFi → Snowflake integration
  • NiFi → Kudu ingestion patterns
Apache Kafka
  • Kafka brokers, partitions, retention, replication, consumer groups
  • Schema registry (Avro/JSON)
  • Designing topics for high-throughput streaming
Apache Flink
  • Flink SQL + DataStream API
  • Event-time processing, watermarks, windows
  • Checkpointing, savepoints, state backends
  • Kafka source/sink connectors
  • Exactly-once semantics
  • Flink CDC a plus
Apache Kudu
  • Table design (PKs, partition strategies)
  • Upserts, deletes, merge semantics
  • Integration with Impala
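The Kudu table-design points above (primary keys, partition strategies, upserts) can be sketched in Impala DDL; the table, columns, and partition values here are hypothetical:

```sql
-- Hypothetical Kudu table created through Impala: composite primary key,
-- hash partitioning for write throughput, plus range partitioning by month.
CREATE TABLE customer_events (
  customer_id BIGINT,
  event_month STRING,
  event_type  STRING,
  amount      DOUBLE,
  PRIMARY KEY (customer_id, event_month)
)
PARTITION BY HASH (customer_id) PARTITIONS 16,
             RANGE (event_month) (
               PARTITION VALUE = '2026-01',
               PARTITION VALUE = '2026-02'
             )
STORED AS KUDU;

-- UPSERT inserts a new row or replaces an existing one by primary key
UPSERT INTO customer_events VALUES (42, '2026-01', 'purchase', 19.99);
```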
SQL Stream Builder (SSB)
  • Creating jobs, connectors, materialized views
  • Deploying and monitoring Flink SQL jobs in CDP
CDC (Change Data Capture)
  • CDC via NiFi or Flink CDC or SSB
  • Handling late-arriving events
  • Handling deletes, updates, schema evolution
  • Incremental key tracking
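One way to meet the CDC requirements above is Flink CDC's MySQL connector, which surfaces inserts, updates, and deletes as a changelog stream. A hedged sketch, assuming the flink-cdc connector is installed (host, credentials, database, and sink names are hypothetical):

```sql
-- Hypothetical Flink CDC source reading the MySQL binlog; updates and
-- deletes arrive as retract/upsert rows in the changelog stream.
CREATE TABLE customers_cdc (
  id    BIGINT,
  name  STRING,
  email STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector'     = 'mysql-cdc',
  'hostname'      = 'mysql-host',
  'port'          = '3306',
  'username'      = 'cdc_user',
  'password'      = '******',
  'database-name' = 'sales',
  'table-name'    = 'customers'
);

-- Writing into an upsert-capable sink (e.g. a Kudu table keyed on id)
-- applies updates and deletes in place by primary key.
INSERT INTO customers_kudu_sink
SELECT id, name, email FROM customers_cdc;
```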
General Requirements
  • 8+ years in data engineering / streaming
  • 3-5+ years specifically with CDP/CDF streaming
  • Strong SQL and distributed system fundamentals
  • Experience in financial services, healthcare, telecom, or other high-volume industries preferred
Nice to Have
  • Kubernetes experience running NiFi/Kafka/Flink operators
  • Snowflake ingestion patterns (staging, COPY INTO)
  • Experience with Debezium
  • CI/CD for data pipelines
  • Security (Kerberos, Ranger, Atlas)
What Success Looks Like

In the first 90 days, you will:
  • Deliver at least two of the four required streaming use cases end-to-end
  • Establish architectural patterns for NiFi, Flink, and Kudu pipelines
  • Optimize one existing pipeline for throughput, latency, and reliability
  • Become the subject-matter expert for Data in Motion on CDP
Apply If You Can Demonstrate
  • You have personally built NiFi → Kafka → Flink → Kudu pipelines
  • You understand event-time processing and exactly-once delivery
  • You have designed Kudu tables and worked with Impala
  • You have authored and deployed SSB SQL streaming jobs
  • You can speak to real-world CDC implementations


Apply for this Job

Please use the APPLY HERE link below to view additional details and application instructions.

Apply Here
