BIG DATA INFRASTRUCTURE, DATA PROCESSES AND DATA PIPELINES
FULLY REMOTE, EUROPE
Our client is a global fintech on a fast-track growth trajectory, looking to expand its fully remote team.
In this role, you will contribute to building the data ecosystem, ensure its performance and reliability, and take responsibility for the accuracy of the processed data.
On a day-to-day basis you will be:
- Translating complex business requirements into scalable technical solutions.
- Designing and building data structures on MPP platforms such as Amazon Redshift and Google BigQuery.
- Designing and building highly scalable data pipelines using AWS tools such as Glue (Spark-based), Data Pipeline, and Lambda (a minimal Glue job sketch follows this list).
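For a flavor of this kind of work, here is a minimal sketch of a Glue (PySpark) job that reads a catalog table, filters invalid rows, and writes Parquet to S3. The database, table, column, and bucket names are hypothetical placeholders, not part of the posting.

```python
import sys

from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

# Standard Glue job bootstrap.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog (hypothetical database/table names).
payments = glue_context.create_dynamic_frame.from_catalog(
    database="raw", table_name="payments"
)

# Drop obviously invalid rows before loading downstream.
valid = payments.filter(lambda row: row["amount"] is not None)

# Write Parquet back to S3 (hypothetical bucket path).
glue_context.write_dynamic_frame.from_options(
    frame=valid,
    connection_type="s3",
    connection_options={"path": "s3://example-lake/clean/payments/"},
    format="parquet",
)
job.commit()
```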
To be successful in this role, you will need:
- In-depth understanding of data structures and algorithms.
- Work experience with Google Cloud Platform: BigQuery, Dataflow, TensorFlow, and Data Catalog.
- Experience in designing and building dimensional data models to improve the accessibility, efficiency, and quality of data.
- Experience in building real-time data processes.
- Experience in designing and developing ETL data pipelines and orchestrating data processes.
- Proficiency in writing advanced SQL and expertise in SQL performance tuning (see the BigQuery example after this list).
- Programming experience in building high-quality software. Python or Scala preferred.
- Strong analytical and communication skills.
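As an illustration of the dimensional-modeling and SQL-tuning items above, here is a hedged sketch of creating a star-schema fact table in BigQuery, using partitioning and clustering to keep query scans cheap. All dataset, table, and column names are hypothetical.

```python
from google.cloud import bigquery

client = bigquery.Client()  # uses application-default credentials

# A fact table in a star schema, partitioned by day and clustered on the
# most selective join/filter keys -- a common BigQuery tuning pattern.
# Dataset, table, and column names are hypothetical.
ddl = """
CREATE TABLE IF NOT EXISTS analytics.fact_payments (
  payment_id   STRING    NOT NULL,
  merchant_key INT64     NOT NULL,  -- FK to a dim_merchant dimension
  currency_key INT64     NOT NULL,  -- FK to a dim_currency dimension
  amount       NUMERIC   NOT NULL,
  event_ts     TIMESTAMP NOT NULL
)
PARTITION BY DATE(event_ts)
CLUSTER BY merchant_key, currency_key
"""
client.query(ddl).result()  # block until the DDL job finishes
```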
NICE-TO-HAVE SKILLS
- Work experience with AWS tools for processing data (DMS, Glue, Data Pipeline, Kinesis, Lambda, etc.).
- Experience using Java, Spark, Hive, Oozie, Kafka, NiFi, and MapReduce (see the Kafka consumer sketch after this list).
- Work or project experience with big data and advanced programming languages.
- Experience with, or advanced coursework in, data science and machine learning.
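For the Kafka item above, a minimal consumer sketch using the kafka-python package; the topic name, broker address, and event fields are hypothetical placeholders.

```python
import json

from kafka import KafkaConsumer

# Minimal consumer loop; topic and broker are hypothetical.
consumer = KafkaConsumer(
    "payments-events",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # A real pipeline would validate, enrich, and forward the event here;
    # printing stands in for that downstream processing.
    print(event["payment_id"], event["amount"])
```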