Lead Site Reliability Engineer (#14769928)

Work type:
Flexible (Office/Remote)
Technical Level:
Job Category:
Software Development
Fully digital bank in UAE

Eager to join the biggest fintech company, one predicted to shake up the banking industry in the Middle East? In this role you will work for an Abu Dhabi investment firm that is building a digital bank with a banking license and seed capital of $545 million.

About the job:

   Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that NeoFintech Cloud's services, both internally critical and externally visible, have reliability and uptime appropriate to customers' needs and a fast rate of improvement. Additionally, SREs keep an ever-watchful eye on our systems' capacity and performance.
   On the SRE team, you'll have the opportunity to manage the complex challenges of scale that are unique to NeoFintech, while using your expertise in coding, algorithms, complexity analysis, and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving, and openness is key to its success.
   Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a blame-free environment. We promote self-direction to work on meaningful projects.

   As a Site Reliability Engineer (SRE), you will provide scalable, reliable, durable, and secure global NeoFintech services using a customer-first approach while innovating technically. You will understand our customers' needs and how we can meet them.


  • Provide technical leadership for the local team and work closely with engineering team tech leads and cloud leadership.
  • Develop and scale systems sustainably through mechanisms such as automation, and evolve systems by pushing for changes that improve reliability and velocity.
  • Provide guidance to other team members on managing the availability and performance of mission-critical services, on building automation to prevent problem recurrence, and on building automated responses for non-exceptional service conditions.
  • Manage execution of project priorities, deadlines, and deliverables.
  • Lead the design of major software components, systems, and features to improve the availability, scalability, latency, and efficiency of Client’s services. Analyze and support services before they go live via system design consulting, developing software platforms and frameworks, capacity planning, and launch reviews.

Minimum qualifications:
  • Experience with Azure services and capabilities.
  • Extensive expertise in software development/testing, development operations, and site reliability engineering.
  • Experience with algorithms and data structures and/or Unix/Linux systems internals (e.g., filesystems, system calls) and administration.
  • Experience with Continuous Integration and Deployment (CI/CD) and release orchestration.

Nice to have:
  • Bachelor's degree in Computer Science, similar technical field of study, or equivalent practical experience.
  • Experience in distributed systems, storage systems, or databases. Experience designing, analyzing, and troubleshooting large-scale distributed systems.
  • Experience programming in one or more of the following languages: C, Terraform, C++, Java, Python, JavaScript, Go, Perl, or Ruby.
  • Systematic problem-solving approach, combined with excellent communication skills and a sense of ownership and drive.
  • Experience configuring application monitoring with Azure Monitor and Application Insights.
  • Experience with infrastructure as code.
  • Experience with service mesh.
