Machine Learning Engineer, Computational Biology
Paris
Research (All Teams) – Research
Permanent
Hybrid
InstaDeep, founded in 2014, is a pioneering AI company at the forefront of innovation. With strategic offices in major cities worldwide, including London, Paris, Berlin, Tunis, Kigali, Cape Town, Boston, and San Francisco, InstaDeep collaborates with giants like Google DeepMind and prestigious educational institutions like MIT, Stanford, Oxford, UCL, and Imperial College London. We are a Google Cloud Partner and a select NVIDIA Elite Service Delivery Partner. We have been listed among notable players in AI and among fast-growing companies, including Europe's 1000 fastest-growing companies in 2022 by Statista and the Financial Times. Our recent acquisition by BioNTech has further solidified our commitment to leading the industry.
Join us to be a part of the AI revolution!
InstaDeep is seeking talented Machine Learning Engineers to join our Research Team in Paris, who are working at the intersection of machine learning and bioinformatics to address diverse challenges in bio-sequence design and life science applications. Some of the technologies we are developing include multi-modal generative models for biological data, ML-enhanced simulations of biological and quantum mechanical systems, and efficient strategies for low-data, lab-in-the-loop experimental regimes.
The ideal candidate will have a strong foundation in machine learning and software engineering, with computational biology experience a great bonus! As a Machine Learning Engineer, you will work closely with our Research Scientists and Engineers to support our ambitious research goals and develop our research infrastructure, playing a key role in the implementation and validation of machine learning models along with data curation and management technologies. If you are passionate about leveraging machine learning to solve complex biological problems and driving advancements in life sciences, we encourage you to apply and join our innovative team.
Responsibilities
- Lead the engineering components of long-term research projects encompassing all stages of the project lifecycle. Responsibilities include data generation pipelines, database management, development and maintenance of codebases, as well as the design and execution of analysis pipelines and reporting mechanisms.
- Collaborate closely with the Core ML and Engineering teams to integrate and optimise cutting-edge methodologies for the distribution and scaling of large-scale (billion-parameter-plus) ML models.
- Align with engineering leads across other critical projects to improve standardisation and methodological best practices across the company.
- Develop and maintain robust, high-quality software solutions. Ensure code is modular, well-documented, and integrates smoothly with continuous integration systems. Work in collaboration with Research Scientists, Engineers, and technical leads from various projects to uphold high coding standards and foster standardisation and methodological best practices across the Research Team.
- Deploy machine learning models and associated processes across large-scale, distributed computing infrastructures, including CPUs, GPUs, and TPUs, utilising both in-house and cloud-based platforms.
- Manage the efficient, reproducible, and performant handling of complex, multi-modal biological data. This includes optimising data generation, storage, and retrieval processes, particularly through advanced database management systems such as SQL databases.
- Actively contribute to the team's research initiatives, including publishing results and participating in open-source projects.
- Report and present experimental results and research findings clearly and effectively, both internally and externally, verbally and in writing.
Requirements
- Master's-level degree in Computational Science, Machine Learning or a related scientific field.
- Experience using Deep Learning frameworks like PyTorch, TensorFlow and/or JAX.
- Strong software engineering experience (Object-Oriented Programming, Unit Testing, Profiling, CI, Docker) via previous work or contributions to open-source projects.
- Excellent communication skills and collaborative spirit.
Desirables
- Experience in professional research teams, either in industry or through PhD/post-doctoral positions.
- Experience with computational biology and biological data curation and management.
- Experience applying computational modelling techniques such as sequence alignment, protein structure prediction and inverse folding, molecular dynamics simulations, and quantum chemistry calculations with density functional theory.
- Published scientific papers in related domains such as ML or bioinformatics.
* Important: All applicants must submit their CV/Resume and Cover letter in English. *
Our commitment to our people
We empower individuals to celebrate their uniqueness here at InstaDeep. Our team comes from all walks of life, and we’re proud to continue encouraging and supporting applicants from underrepresented groups across the globe. Our commitment to creating an authentic environment comes from our ability to learn and grow from our diversity, and how better to experience this than by joining our team? We operate on a hybrid work model with guidance to work at the office at least 2 to 3 days per week to encourage close collaboration and innovation. We are continuing to review the situation with the well-being of InstaDeepers at the forefront of our minds.
Right to work: Please note that you will require the legal right to work in the location you are applying for.