Data Engineer (AI Enablement)

Manpower

  • Job Reference: 159160
  • Industry: Information and Communications Technology

Key Responsibilities

  • Build and maintain scalable data pipelines using Python
  • Write production-grade Python code specifically for data processing, transformation, and ETL workflows
  • Perform data cleaning, preprocessing, and feature preparation for analytics and AI use cases
  • Use data analysis and manipulation tools to handle large datasets efficiently
  • Develop reusable Python modules for data ingestion and pipeline automation
  • Perform exploratory data analysis (EDA) to understand data patterns and quality issues
  • Optimize data workflows for performance, scalability, and reliability
  • Support data requirements for AI/ML and Generative AI systems
  • Build data services and APIs to support downstream AI applications
  • Ensure data quality, consistency, and observability across pipelines

Python & Data Libraries (Hands-on Experience Mandatory)

Candidates must have solid practical experience with:

  • Pandas — data manipulation, transformation, and analysis
  • NumPy — numerical operations and array-based processing
  • Matplotlib — data visualization and reporting
  • scikit-learn — basic ML workflows and model evaluation
  • PyTorch — deep learning and AI model experimentation

AI / Generative AI Enablement

  • Prepare and structure datasets for ML and LLM-based systems
  • Support integration of AI models into data pipelines and applications
  • Enable workflows for Generative AI use cases (RAG systems, agent workflows)
  • Familiarity with LLM providers and models such as OpenAI, Anthropic, LLaMA, and Mistral
  • Exposure to AI orchestration frameworks such as LangChain, AutoGen, and CrewAI

Core Requirements

  • Solid hands-on Python coding expertise focused on data systems (critical requirement)
  • Ability to write clean, efficient, production-grade Python code
  • Thorough understanding of data structures, ETL pipelines, and data workflows
  • Experience working with large-scale structured and unstructured data
  • Solid SQL skills for data extraction and manipulation
  • Understanding of data modeling and analytics workflows
  • Ability to support end-to-end data-to-AI pipelines

Good to Have

  • Experience with big data or distributed processing systems
  • Understanding of vector databases and embedding-based retrieval systems
  • Experience building APIs or services for data/AI systems
  • Familiarity with cloud platforms (AWS, Azure, GCP)
  • Exposure to production monitoring and data observability tools

Location

Singapore

Recruiter

Saraja Dornala

+65 6232 8811

***email_hidden***