At CommerceIQ, you will:
- Design, implement, and maintain robust and scalable data pipelines that support machine learning applications and real-time decision-making systems.
- Work closely with ML engineers, analysts, and product teams to understand data needs and translate them into efficient data engineering solutions.
- Build and maintain workflows using tools like Apache Airflow, ensuring data is available reliably and on time across the platform (a minimal example of this kind of orchestration follows this list).
- Develop ETL/ELT pipelines using PySpark and Python, and optimize them for performance and cost at scale in a production environment.
- Own and manage critical parts of the data infrastructure, ensuring high availability, consistency, and security of large-scale distributed data processing systems.
- Proactively monitor, troubleshoot, and enhance data workflows to ensure quality, reliability, and performance SLAs are consistently met.
- Participate in code reviews and technical design discussions, and mentor junior data engineers within the team.
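
As a minimal sketch of the kind of orchestration work described above (assuming Airflow 2.4+ with the Apache Spark provider installed; the DAG name, schedule, script path, and connection ID are all hypothetical), a daily PySpark ETL job might be wired up like this:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="daily_sales_etl",                        # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",                               # Airflow 2.4+ scheduling argument
    catchup=False,
) as dag:
    # Submit a PySpark application to the cluster configured under "spark_default".
    transform_sales = SparkSubmitOperator(
        task_id="transform_sales",
        application="/opt/jobs/transform_sales.py",  # hypothetical PySpark ETL script
        conn_id="spark_default",
    )
```

In practice, a production version of such a DAG would also carry retries, alerting, and SLA settings so the monitoring and reliability responsibilities above can be met.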
Experience:
4–6 years of hands-on experience designing, building, and deploying large-scale data processing pipelines in production environments.
Skillset:
- Proficiency in Python is a must, with strong software engineering fundamentals and experience writing clean, maintainable code.
- Extensive hands-on experience with PySpark and distributed data processing frameworks.
- Production experience with Apache Airflow or similar workflow orchestration tools.
- Solid understanding of data modeling, performance tuning, and optimization for large datasets.
- Experience working with cloud-based data infrastructure (e.g., AWS, GCP, or Azure) is a strong plus.
- Experience supporting ML pipelines or working in ML-driven environments is an advantage.
- Strong sense of ownership, attention to detail, and a passion for building high-quality data solutions that deliver business value.