Data Engineer

KNOWLEDGESG GLOBAL PTE. LTD.

Key Responsibilities

Data Engineering & Integration

  • Design, build, and optimize ETL/ELT pipelines using Apache Spark, PySpark, Databricks, Azure Synapse, or equivalent platforms.
  • Develop scalable batch and real-time data processing solutions.
  • Integrate data from Core Banking, Payments, Treasury, Trade Finance, CRM, Compliance, and Risk systems.
  • Develop and maintain enterprise data models, including 3NF, dimensional, and Data Vault 2.0 models.

Streaming & Modern Data Platforms

  • Build and operationalize real-time streaming pipelines using Kafka, Confluent, or Azure Event Hubs.
  • Support data platform modernization initiatives, including migration from legacy platforms (e.g., Teradata, DB2) to cloud-native environments such as Snowflake, Databricks, or Azure Synapse.
  • Implement scalable cloud-based data lake and data warehouse architectures.

Data Quality & Governance

  • Implement data quality, validation, lineage, and observability frameworks using tools such as Great Expectations, Deequ, or dbt.
  • Collaborate with Governance and Security teams to ensure compliance with enterprise data standards.
  • Support metadata management, cataloging, and lineage initiatives using Azure Purview, Apache Atlas, or Collibra.

Regulatory & Compliance Support

  • Support regulatory reporting and risk data flows, including MAS 610, MAS 649, Basel III / Basel IV, IFRS 9 / IFRS 17, and BCBS 239.
  • Ensure that data security controls, including encryption, tokenization, masking, RBAC, and audit logging, are implemented.

DevOps & MLOps

  • Develop CI/CD pipelines using Azure DevOps or GitHub Actions, with Terraform for infrastructure provisioning.
  • Collaborate with Data Scientists and AI teams to deploy ML feature stores and model-serving pipelines.
  • Support automation and Infrastructure-as-Code (IaC) initiatives.

Required Technical Skills

Programming Languages

  • Python
  • PySpark
  • SQL
  • Scala

Data Platforms

  • Azure Data Lake
  • Azure Synapse Analytics
  • Databricks
  • Snowflake

Data Orchestration

  • Apache Airflow
  • Azure Data Factory (ADF)
  • dbt

Streaming Technologies

  • Apache Kafka
  • Confluent Platform
  • Azure Event Hubs

Data Governance

  • Azure Purview
  • Apache Atlas
  • Collibra

Security & Compliance

  • Encryption
  • Tokenization
  • Role-Based Access Control (RBAC)
  • Audit Logging

DevOps & Infrastructure

  • Terraform
  • Azure DevOps
  • GitHub Actions

Qualifications

  • Bachelor's or Master's degree in Computer Science, Data Engineering, Information Technology, or a related discipline.
  • 6–10 years of Data Engineering experience.
  • Minimum 3 years of experience within Banking, Financial Services, Insurance, or Capital Markets environments.
  • Strong experience designing and implementing cloud-based data platforms on Azure and/or AWS.
  • Hands-on experience with batch and real-time data processing frameworks.
  • Understanding of regulatory reporting and risk data management frameworks.