Role Overview
We are looking for a Data Engineer to design, build, and maintain a robust, scalable on-premise data infrastructure. You will focus on real-time and batch data processing using platforms such as Apache Pulsar and Apache Flink, work with databases such as MongoDB and ClickHouse, and deploy services using containerisation technologies like Docker and Kubernetes.
This role is ideal for engineers with strong systems knowledge and a passion for building efficient, low-latency data pipelines in a non-cloud, on-prem environment.
Key Responsibilities
Data Pipeline & Streaming Development
Design and implement real-time data pipelines using Apache Pulsar and Apache Flink to support mission-critical systems.
Develop high-throughput, low-latency data ingestion and processing workflows across streaming and batch workloads.
Integrate internal systems and external data sources into a unified on-prem data platform.
Data Storage & Modelling
Design efficient data models for MongoDB, ClickHouse, and other on-prem databases to support analytical and operational workloads.
Optimise storage formats, indexing strategies, and partitioning schemes for performance and scalability.
Infrastructure & Containerisation
Deploy, manage, and monitor containerised data services using Docker and Kubernetes in on-prem environments.
Performance, Monitoring & Reliability
Monitor the performance of streaming jobs and database queries; fine-tune for efficiency and reliability.
Implement robust logging, metrics, and alerting solutions to ensure data system availability.
Identify pipeline bottlenecks and proactively implement optimisations.
Requirements
Education:
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (preferred but not required).
Experience:
Hands-on experience in data engineering, particularly with on-premise infrastructure.
Skills:
Hands-on experience with streaming technologies such as Apache Pulsar, Apache Flink, or similar.
Familiarity with MongoDB, ClickHouse, and other NoSQL or columnar storage databases.
Proficiency in Python, Java, or Scala for data processing and backend development.
Experience deploying and managing containerised systems using Docker and Kubernetes.
Strong understanding of Linux-based systems, including system tuning and resource monitoring.
About the Company
Blurgs AI is a deep-tech startup focused on maritime and defence data-intelligence solutions, specialising in multi-modal sensor fusion and data correlation.
Our flagship product, Trident, provides advanced domain awareness for maritime, defence, and commercial sectors by integrating data from diverse sensors such as AIS, Radar, SAR, and EO/IR.
At Blurgs AI, we foster a collaborative, innovative, and growth-driven culture.
Our team is passionate about solving real-world challenges, and we prioritise an open, inclusive work environment where creativity and problem-solving thrive.
We encourage new hires to bring their ideas to the table, offering opportunities for personal growth, skill development, and the chance to work on cutting-edge technology that impacts global defence and maritime operations.
Join us to be part of a team that's shaping the future of technology in a fast-paced, dynamic industry.
