Job Expired

Intermediate Data Engineer needed at Relx

Job Title: Intermediate Data Engineer

Job Location: Gauteng, Cape Town

Deadline: April 08, 2024

Responsibilities

As an intermediate data engineer on our team, you will work on new product development in a small-team environment, writing production code in both run-time and build-time environments. You will help propose and build data-driven solutions to high-value customer problems by discovering, extracting, and modeling knowledge from large-scale natural-language datasets. You will prototype new ideas, collaborating with data scientists as well as product designers, other data engineers, front-end developers, and a team of expert legal data annotators. You will get the experience of a start-up culture alongside the large datasets and other resources of an established company. You will also:

  • Build and scale data infrastructure that powers real-time data processing of billions of records in a streaming architecture (a brief sketch of this kind of pipeline follows this list)
  • Build scalable data ingestion and machine learning inference pipelines
  • Build general-purpose APIs to deliver data science outputs to multiple business units
  • Scale up production systems to handle increased demand from new products, features, and users
  • Provide visibility into the health of our data platform (a comprehensive view of data flow, resource usage, data lineage, etc.) and optimize cloud costs
  • Automate and manage the life cycle of the systems and platforms that process our data
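
The posting itself contains no code, but as a rough, non-authoritative sketch of the kind of streaming consumer the first responsibility describes, assuming Kafka with the official Java client: the broker address, group id, and topic name are hypothetical placeholders, and the lower-casing transform merely stands in for an enrichment or model-inference step.

    import java.time.Duration
    import java.util.Properties
    import org.apache.kafka.clients.consumer.KafkaConsumer
    import org.apache.kafka.common.serialization.StringDeserializer
    import scala.jdk.CollectionConverters._

    object StreamingIngestSketch {
      def main(args: Array[String]): Unit = {
        // Hypothetical connection settings, for illustration only.
        val props = new Properties()
        props.put("bootstrap.servers", "localhost:9092")
        props.put("group.id", "ingest-sketch")
        props.put("key.deserializer", classOf[StringDeserializer].getName)
        props.put("value.deserializer", classOf[StringDeserializer].getName)

        val consumer = new KafkaConsumer[String, String](props)
        consumer.subscribe(List("documents").asJava)

        try {
          while (true) {
            // Poll a batch of records and apply a per-record transformation;
            // in a real pipeline this is where enrichment or model inference
            // would run before writing downstream.
            val records = consumer.poll(Duration.ofMillis(500)).asScala
            records.foreach { record =>
              val processed = record.value().toLowerCase // placeholder transform
              println(s"offset=${record.offset()} value=$processed")
            }
          }
        } finally {
          consumer.close()
        }
      }
    }

A production version would commit offsets deliberately, handle deserialization failures, and scale out by partition; the sketch only shows the consume-transform-emit shape.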

Requirements

  • Master's degree in Software Engineering, Data Engineering, Computer Science, or a related field
  • 2-3 years of relevant work experience
  • Strong Scala or Java background
  • Knowledge of AWS, GCP, Azure, or another cloud platform
  • Understanding of data modeling principles
  • Ability to work with complex data models
  • Experience with relational and NoSQL databases (e.g., Postgres, Elasticsearch/OpenSearch, graph databases such as Neptune or Neo4j)
  • Experience with technologies that power analytics (Spark, Hadoop, Kafka, Docker, Kubernetes) or other distributed computing systems
  • Knowledge of API development and machine learning deployment

How to Apply for this Offer

Interested and qualified candidates should Click here to Apply Now
