Job Title: Intermediate Data Engineer
Job Location: Johannesburg, Gauteng
Deadline: November 09, 2024
About the Role
- At LexisNexis we develop the legal profession’s most innovative products for data analysis, visualization, and research. We use the latest techniques in AI, machine learning, and data visualization to uncover insights about judges’ rulings, build forecasts of likely outcomes, and reveal critical connections in massive datasets spanning the law, news, and finance.
Responsibilities
- As a data engineer on our team, you will work on new product development in a small team environment, writing production code in both run-time and build-time environments. You will help propose and build data-driven solutions for high-value customer problems by discovering, extracting, and modeling knowledge from large-scale natural language datasets. You will prototype new ideas, collaborating with other data scientists as well as product designers, data engineers, front-end developers, and a team of expert legal data annotators. You will get the experience of a start-up culture combined with the large datasets and many other resources of an established company. You will also:
- Build and scale data infrastructure that powers real-time data processing of billions of records in a streaming architecture
- Build scalable data ingestion and machine learning inference pipelines
- Build general-purpose APIs to deliver data science outputs to multiple business units
- Scale up production systems to handle increased demand from new products, features, and users
- Provide visibility into the health of our data platform (a comprehensive view of data flow, resource usage, data lineage, etc.) and optimize cloud costs
- Automate and manage the life cycle of the systems and platforms that process our data
Requirements
- Master's degree in Software Engineering, Data Engineering, Computer Science, or a related field
- 2-3 years of relevant work experience
- Strong Scala or Java background
- Knowledge of AWS, GCP, Azure, or another cloud platform
- Understanding of data modeling principles
- Ability to work with complex data models
- Experience with relational and NoSQL databases (e.g. Postgres, Elasticsearch/OpenSearch, graph databases such as Neptune or Neo4j)
- Experience with technologies that power analytics (Spark, Hadoop, Kafka, Docker, Kubernetes) or other distributed computing systems
- Knowledge of API development and machine learning deployment
How to Apply for this Offer
Interested and qualified candidates should click here to apply now.