Data Engineer

CADENZA SOLUTIONS PTE. LTD.

·Provide technical vision and create roadmaps to align with the long-term technology strategy

·Proficient in building data ingestion pipelines to ingest data from heterogeneous sources such as RDBMS, Hadoop, flat files, REST APIs and AWS S3

·Key player in the Hadoop Data Ingestion team, which enables the data science community to develop analytical/predictive models, and implement ingestion pipelines using Hadoop ecosystem tools such as Flume, Sqoop, Hive, HDFS, PySpark, Trino and Presto SQL

·Work on governance aspects of the Data Analytics applications, such as documentation, design reviews, metadata, etc.

·Extensively use DataStage jobs and Teradata/Oracle utility scripts to perform data transformation/loading across multiple FSLDM layers

·Review and help streamline the design for big data applications and ensure that the right tools are used for the relevant use cases

·Engage users to achieve concurrence on the proposed technology solution. Review solution documents with the Functional Business Analyst and Business Unit for sign-off

·Create technical documents (functional/non-functional specification, design specification, training manual) for the solutions. Review interface design specifications created by the development team

·Participate in the selection of products/tools via RFP/POC

·Provide inputs to help with the detailed estimation of projects and change requests

·Execute continuous service improvement and process improvement plans

·8-10 years of Data Engineering experience in the banking domain, including implementation of Data Lakes, Data Warehouses, Data Marts, Lakehouses, etc.

·Experience in data modeling for a bank's large-scale data warehouses and business marts on Hadoop-based databases, Teradata, Oracle, etc.

·Expertise in the Big Data ecosystem, such as Cloudera (Hive, Impala, HBase, Ozone, Iceberg), Spark, Presto and Kafka

·Experience with a metadata tool such as IDMC, Axon, Watson Knowledge Catalog, Collibra, etc.

·Expertise in designing frameworks using Java, Scala and Python, and in creating applications and utilities with these languages

·Expertise in operationalizing machine learning models, including optimizing feature pipelines, deployment using batch/API, model monitoring and implementation of feedback loops

·Knowledge of building reports/dashboards using a reporting tool such as Qlik Sense or Power BI

·Expertise in integrating applications with DevOps tools

·Knowledge of building applications on MPP appliances such as Teradata, Greenplum or Netezza is mandatory

·Domain knowledge of the banking industry, including subject areas such as Customer, Products, CASA, Cards, Loans, Trade, Treasury, General Ledger, Origination, Channels, Limits, Collaterals, Campaigns, etc.

Minimum Qualification Level

Bachelor's Degree or equivalent