We are looking for a Middle Big Data Engineer (Python+Azure) to join our team!
Client Overview:
Our client is involved in a large-scale Data Transformation project, with a focus on solidifying the foundation of their data operations. They are aiming to ensure that data is accurate, consistent, and available at critical times to support their business needs.
Project Objectives:
The project aims to build and maintain robust data pipelines, scalable storage systems, and efficient processing mechanisms using Azure technology.
The goal is to support the client's data-driven decision-making by ensuring clean, transformed, and readily accessible data for reporting, analytics, and machine learning across the organization.
Responsibilities:
- Design and implement data pipelines to collect, clean, and transform data from various sources.
- Build and maintain data storage and processing systems, including databases, data warehouses, and data lakes.
- Follow data governance policies and procedures.
- Collaborate with data analysts, data scientists, and other stakeholders to understand and meet their data needs.
- Participate in code reviews, performance tuning, best-practice discussions, and brainstorming sessions within the team.
- Work with Big Data Solution Architects to design, prototype, implement, and optimize data ingestion pipelines.
- Ensure solutions are production-ready in terms of operational, security, and compliance standards.
- Participate in daily project and agile meetings, providing technical support for issue resolution.
- Communicate clearly and concisely with the business about item status and blockers.
- Maintain comprehensive knowledge of the client's data landscape.
Requirements:
- 2+ years of design & development experience with big data technologies.
- Proficiency in Python and PySpark.
- 2+ years of development experience with cloud technologies such as Azure.
- Strong skills in querying and manipulating data from various databases (relational and big data).
- Experience in writing effective and maintainable unit and integration tests for ingestion pipelines.
- Familiarity with static analysis and code quality tools, and experience building CI/CD pipelines.
- Excellent communication, problem-solving, and leadership skills.
- Experience working on high-traffic and large-scale software products.
Nice to Have:
- Experience with data visualization tools (e.g., SSRS, Power BI).
- Knowledge of machine learning algorithms and their applications in big data.
- Familiarity with data privacy regulations (e.g., GDPR, CCPA).
We offer:
- Flexible working format - remote, office-based, or hybrid
- A competitive salary and good compensation package
- Personalized career growth
- Professional development tools (mentorship program, tech talks and training sessions, centers of excellence, and more)
- Active tech communities with regular knowledge sharing
- Education reimbursement
- Memorable anniversary presents
- Corporate events and team buildings
- Other location-specific benefits