Senior Data Platform Engineer

Reporting to: Head of Engineering and Data Science
Location: Remote (London & South East England for meetings)

Contract: Permanent

About EntityX

EntityX is a high-growth, investment-funded business. We are a dedicated team of engineers and data scientists building the next generation of privacy-first audience intelligence.

EntityX's technology uses a range of in-house and third-party NLP and AI techniques to merge the curated knowledge of Wikipedia with billions of daily data points about consumer media consumption. This allows us to generate a deep understanding of consumers and culture (‘Cultural Intelligence’) and deliver digital ad targeting solutions that have zero dependency on personal data (‘Cultural Activation’).

Role Overview

We are looking for a Senior Data Platform Engineer to join our team of 10 and build our next-generation data architecture. We are currently evolving from a BigQuery-centric workflow toward a sophisticated, model-driven system of distributed services and GPU-backed inference.

EntityX currently analyses roughly 20TB of historical data and processes 1 billion events a day generated by consumers viewing 25 million documents, but we are rapidly ratcheting up our capabilities. This role is at the heart of that expansion, focusing on the engineering required to run complex models over millions of documents a day and billions of views without breaking the cluster or blowing the budget. You will be responsible for developing the robust, high-performance wrappers and pipelines that power our systems, ensuring that they remain performant, resilient, and cost-effective as we scale.

This position is scoped for a Senior Engineer, but we are eager to accommodate Staff or Principal-level candidates capable of driving the entire architectural roadmap. Title and compensation will be levelled accordingly.

Key Responsibilities

  • Scaling Data Pipelines: Build and maintain the distributed systems required to process 25M documents every day. 

  • Maintain Data Marts: Own and continually improve the core datasets (data marts) that EntityX is built on.

  • Cost-Effective Architecture: Take ownership of our data warehouse spend and data processing costs, ensuring that, as our data grows, costs stay sustainable through smart storage and query management.

  • Model Inference: Build the high-performance infrastructure that allows models to be used at scale.

  • Knowledge Sharing: Act as a sounding board and expert within the team for big data best practices, helping us avoid common pitfalls as we scale our data from 20TB to >50TB over the next two years.

Requirements

  • Large-Scale Data Experience: You have significant experience working with high-volume data environments (think billions of rows or hundreds of terabytes) and understand the specific challenges they present.

  • Python & Go: We use both for our ingestion and backend services. You should be proficient in one and happy to work in (and learn) the other.

  • SQL & Big Data Warehousing: Deep experience with SQL-based modeling in a big data context. We use BigQuery, but experience with Snowflake, Redshift, or similar at scale is perfectly fine. Experience with modern transformation frameworks like dbt is essential.

  • Distributed Systems: A solid understanding of how to build systems that handle large batch jobs and high-throughput data streams without falling over.

  • Pragmatic Architecture: You favour well-designed, resilient systems and avoid over-engineering. You know how to build for failure and ensure data integrity.

Bonus Points

  • Ad Tech & Privacy: Any exposure to the digital advertising ecosystem or an interest in the "privacy-first" movement (e.g., working without cookies/PII).

  • Kubernetes & Orchestration: We use Kubernetes with Argo Workflows to orchestrate jobs; experience with either would be helpful.

  • Terraform: Experience using Infrastructure as Code to manage cloud environments (AWS for compute/apps, BigQuery for warehousing).

  • Low-Latency Engineering: Experience building or optimising APIs for sub-20ms response times.

  • Vector Databases: Experience with handling embeddings or using vector search tools (e.g., Pinecone, Milvus, or BigQuery Vector Search).

What We Offer

  • Opportunity to work on cutting-edge AI and data analysis technology

  • A collaborative, authentic work environment that values talent and self-motivation

  • Chance to make a significant impact in a growing start-up and help build the team (both hiring and mentoring)

  • Competitive salary and benefits package

  • Remote-first with occasional in-person collaboration in London (no fixed hybrid schedule)

  • Continuous learning and professional development opportunities

At EntityX, we believe in the power of diverse perspectives and are committed to creating an inclusive environment for all employees. We encourage applications from candidates of all backgrounds.