N-iX is a global software development company founded in 2002, connecting 2,400+ tech professionals across 40+ countries. We deliver innovative technology solutions in cloud computing, data analytics, AI, embedded software, IoT, and more to global industry leaders and Fortune 500 companies. Join us to create technology that drives real change for businesses and people across the world.
The Senior Observability Engineer will drive the observability platform to new heights, with a strong focus on ClickHouse as a telemetry storage solution. You will lead the migration from a custom Cosmos DB telemetry system to ClickHouse, ensuring robust alerting, notifications, and telemetry capabilities. This individual contributor role is hybrid and reports to the Sr. Manager.
Our client is committed to building trust and making the world more agreeable for our employees, customers, and communities. Here, you have the opportunity to be heard, exchange ideas openly, contribute meaningfully, and be proud of your work as part of a team making a global impact.
About the Team
Our Cloud Engineering team runs on collaboration, curiosity, and innovation. We build mission-critical cloud solutions that support millions of people and businesses globally. We value agile principles, DevOps practices, and infrastructure as code. In everything we do, we aim for reliability, scalability, and security.
Responsibilities
- Lead the migration and transformation of telemetry storage from custom Cosmos DB solutions to ClickHouse, building a scalable and reliable end-to-end observability platform.
- Architect, implement, and maintain alerting and notification systems integrated with ClickHouse for critical services and applications.
- Develop, deploy, and operate high-throughput telemetry pipelines, ensuring accurate and actionable monitoring across cloud environments.
- Collaborate with engineering and product teams to define and champion observability best practices.
- Design and build dashboards and visualization tools to enable proactive monitoring, detection, and resolution of incidents.
- Work with DevOps and development teams to automate collection, ingestion, and retention policies for logs, metrics, and traces.
- Drive continuous improvement in system performance, stability, and reliability through effective observability.
- Participate in on-call rotations, incident response, and root cause analysis to enhance monitoring and alerting capabilities.
Requirements
- 5+ years’ engineering experience in cloud observability platforms, infrastructure, and telemetry systems.
- Deep experience in alerting, notifications, and monitoring at scale.
- Advanced expertise with ClickHouse, or similar high-performance analytical databases, for telemetry storage and querying.
- Hands-on experience migrating telemetry/storage solutions (preferably from Cosmos DB to ClickHouse or equivalent).
- Solid understanding of telemetry pipelines, cloud-native monitoring, and best practices.
- Experience with dashboarding and visualization tools (Grafana, Kibana, or similar).
- Strong scripting and automation skills (Python, Bash, Terraform or equivalent).
- Proven collaboration and communication skills across cross-functional teams.
We offer*:
- Flexible working format - remote, office-based or flexible
- A competitive salary and good compensation package
- Personalized career growth
- Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
- Active tech communities with regular knowledge sharing
- Education reimbursement
- Memorable anniversary presents
- Corporate events and team buildings
- Other location-specific benefits
*not applicable for freelancers