Senior Site Reliability Engineer (#1967)

Colombia
Work type: Office/Remote
Technical Level: Senior
Job Category: Software Development

N-iX is a global company with Ukrainian roots that helps businesses across the world develop successful software products. Founded in 2002, N-iX has come a long way and expanded its presence to eight countries spanning Europe, the US, and Latin America. Today, we are a strong community of 2,000+ professionals and a reliable partner for global industry leaders and Fortune 500 companies.

We are looking for a Senior Site Reliability Engineer interested in working for an innovative hospitality company that uses cutting-edge technologies and has new development activities and challenges ahead. Across our international teams, all of our engineers share a desire to develop brilliant products, reliable and resilient systems, and their own skills. We run our service in Azure. Take the chance to make a valuable contribution and enhance your professional skills.

About the job:

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that cloud services, both our internally critical and our externally visible systems, have reliability and uptime appropriate to customer needs and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.

On the SRE team, you’ll have the opportunity to manage the complex challenges of scale that are unique to the project, while using your expertise in coding, algorithms, complexity analysis, and large-scale system design. You will provide scalable, reliable, durable, and secure services using a customer-first approach while innovating technically. You will understand our customers' needs and how we can meet them.

Responsibilities:

  • Develop and improve the whole lifecycle of services.
  • Establish and improve monitoring capabilities to reduce outage frequency and duration.
  • Create sustainable systems through automation and uplifts.
  • Develop and scale systems sustainably through mechanisms such as automation, and evolve systems by pushing for changes that improve reliability and velocity.
  • Lead designs of major software components, systems, and features to improve the availability, scalability, latency, and efficiency of the client’s services.
  • Analyze and support services before they go live via system design consulting, developing software platforms and frameworks, and capacity planning.
  • Conduct post-incident analysis and reviews.

Requirements:

  • 5+ years of quality engineering experience.
  • Ideally, strong experience with Azure services and capabilities, but experience with other cloud services (AWS, Google Cloud Platform, etc.) will be considered.
  • Confidence and strong experience with Kubernetes.
  • Recent and fluent Terraform and Chef platform experience.
  • Extensive expertise in software development/testing, development operations, and site reliability engineering.
  • Experience with Unix/Linux administration; an appreciation of systems internals (e.g., filesystems, system calls) is a bonus.
  • Experience with Continuous Integration and Deployment (CI/CD) and release orchestration.
  • Cloud-agnostic approach, with flexibility to work across various cloud platforms.
  • Strong English communication skills (B2/C1 level).

Nice to have:

  • Bachelor's degree in Computer Science, similar technical field of study, or equivalent practical experience.
  • Experience in distributed systems, storage systems, or databases.
  • Experience designing, analyzing, and troubleshooting large-scale distributed systems.
  • Experience programming in one or more of the following languages: C, C++, Java, Python, JavaScript, Go, Perl, or Ruby.
  • Systematic problem-solving approach, combined with excellent communication skills and a sense of ownership and drive.
  • Experience in configuring application monitoring with Azure Monitor and Application Insights.
  • Experience with Infrastructure as code.
  • Experience with Service Mesh.

We offer:

  • Flexible working format - remote, office-based or flexible
  • A competitive salary and good compensation package
  • Personalized career growth
  • Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
  • Active tech communities with regular knowledge sharing
  • Education reimbursement
  • Memorable anniversary presents
  • Corporate events and team-building activities
  • Other location-specific benefits