Senior Site Reliability Engineer (SRE) (#4607)

South America, Ukraine
Work type: Office/Remote
Technical Level: Senior
Job Category: Software Development
Project: American brand for home crafters

 

N-iX is a global software development company that brings together over 2,400 professionals across Europe and the Americas. We’re actively expanding in Colombia, inviting local tech talent to join our international projects for industry-leading companies and Fortune 500 firms. Enjoy the freedom to work remotely or from our regional hubs, collaborate with experts worldwide, and grow your career in a fast-paced, innovative environment.

The Senior Site Reliability Engineer (SRE) will play a key role in designing, maintaining, and scaling the infrastructure and automation systems that ensure the reliability, availability, and performance of the company's critical applications and services. This position requires deep expertise in cloud-native platforms, infrastructure as code (IaC), CI/CD, and modern observability practices. The role involves a blend of software engineering and systems engineering skills to build resilient, secure, and scalable infrastructure.

Responsibilities:

  • Design and implement scalable infrastructure solutions using AWS cloud services (e.g., EC2, EKS, RDS, MSK, Lambda, CloudFront).
  • Develop and maintain infrastructure automation using Terraform, Helm, and ArgoCD within GitOps workflows.
  • Architect and manage multi-region, highly available systems to ensure business continuity and disaster recovery.
  • Lead incident response, postmortems, and root cause analysis efforts to improve system reliability and performance.
  • Define and enforce Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets.
  • Build and maintain CI/CD pipelines using GitHub Actions, ensuring efficient and secure software delivery.
  • Implement and manage observability stacks including Datadog, CloudWatch, and Prometheus/Grafana.
  • Enforce compliance and security best practices, including IAM policies, secrets management, and audit logging.
  • Collaborate with software engineering, security, and infrastructure teams to define and implement reliable architecture.
  • Conduct cost optimization, capacity planning, and performance tuning for cloud workloads.
  • Mentor junior engineers and contribute to knowledge sharing and process improvements.

Requirements:

  • Bachelor’s degree or foreign equivalent in Computer Science, Computer Engineering, Information Technology, or a related field.
  • 5+ years of professional experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles.
  • Strong proficiency with AWS (EKS, RDS, S3, IAM, Lambda) and with infrastructure provisioning via CloudFormation or Terraform.
  • Proficient in Terraform, Helm, Kubernetes, and CI/CD automation.
  • Solid understanding of networking, DNS, TLS, load balancers, and container orchestration.
  • Experience with monitoring/alerting tools (e.g., Datadog, Prometheus, Grafana).
  • Strong scripting skills in Python, Bash, or Go.

Nice to have:

  • Master’s degree in Computer Science or related field.
  • AWS Certified Solutions Architect or DevOps Engineer certifications.
  • Experience with Crossplane, OpenSearch, or multi-cloud architecture.
  • Prior experience in SRE teams or initiatives.

 

We offer*:

  • Flexible working format: remote, office-based, or a combination of both
  • A competitive salary and good compensation package
  • Personalized career growth
  • Professional development tools (mentorship program, tech talks and training sessions, centers of excellence, and more)
  • Active tech communities with regular knowledge sharing
  • Education reimbursement
  • Memorable anniversary presents
  • Corporate events and team-building activities
  • Other location-specific benefits

*Not applicable to freelancers
