William Harvey Research Institute, Centre for Translational Bioinformatics 

 Data Scientist / Researcher
Our team are working at the intersection of healthcare and technology, applying machine learning and AI to structured biomedical data warehouses. To drive forward innovation on all fronts we are looking for a technically skilled data scientist, with an interest in data wrangling, visualization and analysis. This is a unique opportunity to be part of the Centre for Translational Bioinformatics at QMUL, a new and impactful team working across the translational spectrum with a diverse community of clinicians and research scientists, involved in disease research, drug discovery, imaging, clinical trials and healthcare delivery.

To unlock this data for researchers, we are building data warehouses in eMedLab, a £9m MRC investment in cloud computing. Our team have an open source / open data ethos and we strongly believe in supporting the open source software development community, developing data integration tools such as tranSMART-I2B2. It will be your task to design and populate these data warehouses using your technical data management and analysis skills, in an extended team working with software developers, statisticians and clinician scientists. Our team are uniquely placed for far-reaching impact, working across the dynamic London data science network, including Health Data Research-UK and the Alan Turing Institute.

This post is co-funded by two MRC-funded projects working on immune-mediated inflammatory diseases (IMIDs). CLUSTER is investigating childhood arthritis and uveitis, while IMID-Bio-UK is a consortium of scientists who are building a biobank and data warehouse focused on IMIDs, including RA, psoriasis, lupus, Sjogren’s and autoimmune liver disease.

To be considered for this role you will have good experience in the Linux environment, including the use of command line utilities for data wrangling. You should also have demonstrated expertise in one or more of the following programming languages: Python, Perl, Java or R (including Shiny). Experience with cloud computing, virtualisation and SQL databases, including schema and query design, performance optimization and security, will also be an advantage. Knowledge of biomedicine is desirable although not essential; we actively encourage multi-disciplinarity in our team and would welcome applications from data scientists in other fields.

You will have a good first degree, Masters or PhD (or equivalent) in a data-science-oriented field with an emphasis on large-scale scientific data management. The post holder will be expected to play a central role in evolving and populating the eMedLab cloud data warehouse infrastructure, as well as establishing collaborations with other groups.

This is a full-time appointment and is funded for 3 years in the first instance. Starting salary will be in the range £34,522-£40,568 per annum inclusive of London Allowance. Benefits include 30 days' annual leave, a defined benefit pension scheme and an interest-free season ticket loan.

