Senior Data Scientist
ONCOSHOT PTE. LTD.

About Oncoshot:
Oncoshot is committed to transforming oncology research through data science and machine learning. Our federated data platform supports secure, privacy-compliant data exchanges between hospitals and pharmaceutical companies, helping accelerate clinical trials. We leverage cutting-edge machine learning models, including LLMs (Large Language Models), to derive actionable insights from large-scale healthcare data.
As a Senior Data Scientist, you will lead the development of advanced machine learning models to support patient-trial matching, outcome prediction, and data-driven insights. You will work with large-scale healthcare datasets to develop models that scale across multi-cloud environments and ensure compliance with global data privacy standards. This role offers the opportunity to work with innovative technologies, including LLMs, to extract meaningful insights from both structured and unstructured data.
Key Responsibilities:
- Lead the design and development of machine learning models for clinical trial optimization and patient matching.
- Design and implement a robust evaluation framework to assess LLM performance, safety, and reliability in a healthcare context.
- Collaborate with the clinical team to understand clinical trial workflows and translate medical requirements into technical specifications.
- Build scalable data pipelines and models that operate across AWS, GCP, and Azure cloud environments.
- Utilize LLMs to enhance natural language understanding for clinical data and unstructured healthcare records.
- Ensure that privacy-preserving techniques such as federated learning are implemented to maintain data security and regulatory compliance (HIPAA, GDPR).
- Collaborate with backend engineers and data engineers to deploy models into production environments, ensuring seamless integration into Oncoshot’s platform.
- Conduct exploratory data analysis to identify trends and actionable insights for clinical trial operations.
Required Skills:
- 7+ years of experience in machine learning and deep learning, with a focus on NLP and LLMs.
- 3+ years of experience in clinical informatics, with an understanding of healthcare coding standards (RxNorm, ICD-10, SNOMED CT, LOINC, HGVS nomenclature) and interoperability standards (FHIR and HL7).
- Proficiency in Python and machine learning frameworks like TensorFlow, PyTorch, or Scikit-learn.
- Experience working with LLMs, RAG, TAG, and other NLP-based techniques to process unstructured data.
- Experience installing and deploying LLMs locally and using frameworks to test and validate the models.
- Strong expertise in working with cloud platforms (AWS, GCP, Azure) for scalable model deployment.
- In-depth knowledge of data privacy regulations (HIPAA, GDPR) and experience implementing privacy-preserving data models.
- Hands-on experience with ETL processes, data wrangling, and large-scale healthcare datasets.
- Familiarity with AI-DevOps frameworks and experience working with DevOps engineers to implement an AI-DevOps architecture that governs the life cycle of AI models from development to production.