Data Engineer (AI Enablement)
Manpower
- Job Reference: 159160
- Industry: Information and Communications Technology
Key Responsibilities
- Build and maintain scalable data pipelines using Python
- Write production-grade Python code specifically for data processing, transformation, and ETL workflows
- Perform data cleaning, preprocessing, and feature preparation for analytics and AI use cases
- Use data analysis and manipulation tools to handle large datasets efficiently
- Develop reusable Python modules for data ingestion and pipeline automation
- Perform exploratory data analysis (EDA) to understand data patterns and quality issues
- Optimize data workflows for performance, scalability, and reliability
- Support data requirements for AI/ML and Generative AI systems
- Build data services and APIs to support downstream AI applications
- Ensure data quality, consistency, and observability across pipelines
Python & Data Libraries (Hands-on Experience Mandatory)
Candidates must have solid practical experience with:
- Pandas — data manipulation, transformation, and analysis
- NumPy — numerical operations and array-based processing
- Matplotlib — data visualization and reporting
- scikit-learn — basic ML workflows and model evaluation
- PyTorch — deep learning and AI model experimentation
AI / Generative AI Enablement
- Prepare and structure datasets for ML and LLM-based systems
- Support integration of AI models into data pipelines and applications
- Enable workflows for Generative AI use cases (RAG systems, agent workflows)
- LLM providers and model families: OpenAI, Anthropic, LLaMA, Mistral
- Exposure to AI orchestration frameworks such as LangChain, AutoGen, and CrewAI
Core Requirements
- Solid hands-on Python coding expertise focused on data systems (critical requirement)
- Ability to write clean, efficient, production-grade Python code
- Thorough understanding of data structures, ETL pipelines, and data workflows
- Experience working with large-scale structured and unstructured data
- Solid SQL skills for data extraction and manipulation
- Understanding of data modeling and analytics workflows
- Ability to support end-to-end data-to-AI pipelines
Good to Have
- Experience with big data or distributed processing systems
- Understanding of vector databases and embedding-based retrieval systems
- Experience building APIs or services for data/AI systems
- Familiarity with cloud platforms (AWS, Azure, GCP)
- Exposure to production monitoring and data observability tools
Location
Singapore
Recruiter
Saraja Dornala
+65 6232 8811
***email_hidden***