Key Responsibilities: Leadership & Management
- Cross-functional Collaboration: Partner closely with teams in medicinal chemistry, biology, and analytics to define synthesis plates and experiments, analyze results, and lead the iterative Design-Make-Test-Learn (DMTL) cycle.
- Stakeholder Coordination: Orchestrate interactions between Molecure and external partners, including the consortium member IChO PAN (Institute of Organic Chemistry, Polish Academy of Sciences), across the chemistry, biology, and machine learning (ML) disciplines.
- Platform Ownership: Evaluate and integrate external models and APIs, design human-in-the-loop tools for chemists, and own documentation and technology transfer processes.
- Scientific Leadership: Mentor a small Data Science (DS)/ML team, set rigorous engineering and scientific standards, lead code reviews, ensure experimental rigor, and contribute actively to intellectual property (IP) filings, scientific publications, and regulatory-ready documentation.
Key Responsibilities: R&D
- Modeling: Design, train, and validate Deep Learning (DL)/ML models for de novo design (including diffusion, VAE, GAN, flow models, and genetic RL) and predictive tasks (affinity, selectivity, ADMET/PK/Tox, docking/ranking) for both protein and mRNA-binding targets.
- Active Learning & RL: Develop and implement closed-loop systems that utilize biological readouts from the lab (activity and orthogonal assays) to dynamically update policy/reward models and prioritize the next batch of molecules for synthesis.
- Cheminformatics Stack: Maintain and develop RDKit pipelines, implement multi-objective scoring systems, build robust filters for toxic and reactive structures (PAINS and other SMARTS-based alerts) and for poor synthetic accessibility, and implement novelty and diversity metrics.
- Structure/Sequence Modeling: Integrate conventional docking and scoring methods with ML surrogates; leverage advanced models, such as transformers and equivariant Graph Neural Networks (GNNs), for protein and RNA structure and sequence data; provide specialized support for RNA-targeted small-molecule modeling.
- Data & MLOps: Architect and maintain the data lake and feature store; ensure data governance and lineage (using tools like DVC/MLflow); oversee containerization (Docker/K8s) and Continuous Integration/Continuous Deployment (CI/CD) pipelines; and manage scalable training environments on cloud or High-Performance Computing (HPC) clusters to ensure reproducible science.