Senior Data Scientist - Hinxton, United Kingdom - Wellcome Sanger Institute

    Wellcome Sanger Institute Hinxton, United Kingdom

    Full time
    Description

    Do you want to help us improve human health and understand life on Earth? Make your mark by shaping the future to enable or deliver life-changing science to solve some of humanity's greatest challenges.

    Senior Research Data Scientist

    We seek a Senior Research Data Scientist with machine learning knowledge to work between the Haniffa Lab and the Lotfollahi Lab. This collaborative project will leverage investments in observing biological systems at cellular resolution (such as the Human Cell Atlas (HCA) data) to develop state-of-the-art generative machine learning (ML) tools that can model cellular behaviours across various modalities and scales. You will work within an interdisciplinary team of life and computer/ML scientists, with a shared goal of improving our understanding of the rules of life and using this to improve health for all. This role will bridge both groups, and the successful candidate will be responsible for delivering their portfolio of scientific research projects as part of the broader team strategy.

    About the role

    You will be expected to work on and lead specific computational biology/machine-learning projects at the intersection of single-cell biology, spatial 'omics and machine learning. Your work will contribute to the overall aims of this collaboration which are to address two fundamental yet significant questions: "Can we predict cellular changes in time?" and "Can we build a generative model to predict the progression and treatment rationale for skin diseases?". To achieve this, you will work with open-source software, proposing, developing, and maintaining new solutions to analyse and interpret large-scale single-cell datasets.

    Our teams are well-positioned to tackle this problem, with experience in both generating and analysing datasets that include millions of cells across multiple tissues and conditions (e.g., disease, healthy), a detailed understanding of the training of large-scale ML models, and a track record of undertaking large data-science projects. The Haniffa Lab team leverages large-scale single-cell omics to address various areas, with a focus on human development, the developing immune system, and skin health and disease across the human lifespan. The Lotfollahi Lab develops machine and deep learning tools that exploit such high-content biological data to build predictive and generative models that drive high-throughput biological discovery and validation in mechanistic biology and translational health care.

    You will be responsible for

    • Independently manage and lead machine learning research projects and write up the outcomes as scientific publications for submission to journals or machine learning conferences (ICLR, ICML, CVPR, etc.).
    • Collaborate with team members to propose, develop, and evaluate new machine learning models that enable understanding of single-cell data and their application in drug discovery.
    • Work with Ph.D. students and postdocs in collaborating teams on developing solutions for interdisciplinary scientific problems in biology, providing supervision and training to junior members of the team.
    • Contribute to writing scientific papers on biotechnology and biology.
    • Distill your developed solutions into open-source and easy-to-install packages with documentation that facilitates the usage of your solution for downstream users, including biologists and bioinformaticians.
    • Present your research and analysis pipelines to internal and external audiences.

    About You

    You will be supported in your personal and professional development and have the opportunity to lead peer-reviewed publications around using genetics and genomics approaches to guide drug discovery and present them at national and international conferences.

    Essential Skills

    To be successful in the Senior Data Scientist role, you will have the following:

    • Ph.D. or MSc with equivalent research experience in a relevant quantitative discipline (e.g., Computer Science, Computational Biology, Bioinformatics, Physics, Engineering, or Applied Statistics/Mathematics)
    • Proven experience using advanced statistical techniques, machine learning, and modern deep learning techniques.
    • Previous ML work experience in a scientific/academic environment (RA roles and internships are considered work experience)
    • Strong knowledge of Python, including core data science libraries such as Scikit-Learn, SciPy, TensorFlow, and PyTorch.
    • Knowledge of good software development practices and collaboration tools, including git-based version control, Python package management, and code reviews.
    • Excellent communication skills, with the ability to explain complex machine learning algorithms and statistical methods to non-technical stakeholders.
    • Evidence of related work experience as a researcher in the area of machine learning
    • Strong publication record, first author position ideal
    • Proven record in driving projects independently and working as part of an interdisciplinary team

    In addition to the above technical skills, you will also have the following:

    • Ability to quickly understand scientific, technical, and process challenges and break down complex problems into actionable steps
    • Ability to think independently and critically.
    • Ability to work in a frequently changing environment with the capability to interpret management information to amend plans
    • Ability to present your analysis to a diverse set of researchers, including physicists, mathematicians, bioinformaticians, biologists and clinicians, and to build on constructive criticism.
    • Interest in understanding biological systems, especially biological perturbations and cellular mechanisms.
    • Ability to prioritize, manage workload, and deliver agreed activities consistently on time
    • Demonstrate good networking, influencing and relationship building skills
    • Strategic thinking, with the ability to see the "bigger picture"
    • Ability to build collaborative working relationships with internal and external stakeholders at all levels
    • Ability to solicit and act on constructive feedback from other team members, and to provide constructive feedback to them
    • Demonstrates inclusivity and respect for all

    Relevant publications of the groups

    • Lotfollahi, M., Naghipourfar, M., Luecken, M. D., Khajavi, M., Büttner, M., Wagenstetter, M., Avsec, Ž., Gayoso, A., Yosef, N., Interlandi, M. & others. Mapping single-cell data to reference atlases by transfer learning. Nature Biotechnology 1–10.
    • Lotfollahi, M., Wolf, F. A. & Theis, F. J. scGen predicts single-cell perturbation responses. Nature Methods 16, 715–721.
    • Lotfollahi, M., Rybakov, S., Hrovatin, K., Hediyeh-Zadeh, S., Talavera-López, C., Misharin, A. V. & Theis, F. J. Biologically informed deep learning to query gene programs in single cell atlases. Nature Cell Biology.
    • Goh, I., Botting, R. A., Rose, A., Webb, S., ... & Haniffa, M. Yolk sac cell atlas reveals multiorgan functions during human early development. Science.
    • Stephenson, E., Reynolds, G., Botting, R. A., Calero-Nieto, F. J., Morgan, M. D., Tuong, Z. K., Bach, K., Sungnak, W., Worlock, K. B., ... & Haniffa, M. Single-cell multi-omics analysis of the immune response in COVID-19. Nature Medicine.
    • Jardine, L., Webb, S., Goh, I., Quiroga Londoño, M., Reynolds, G., ... & Haniffa, M. Blood and immune development in human fetal bone marrow and Down syndrome. Nature.
    • Gopee, N., Huang, N., ... & Haniffa, M. A human prenatal skin cell atlas reveals immune cell regulation of skin morphogenesis. bioRxiv.