Karthikeyan Siva Baskaran

Data Engineer

Singapore, Singapore · 10 yrs 7 mos experience

Most Likely To Switch: Highly Stable

Key Highlights

Expert in building scalable data pipelines.
Strong architectural knowledge in Big Data Analytics.
Proficient in both batch and real-time data processing.

Stackforce AI infers this person is a Big Data and Cloud Data Engineering expert.

Skills

Core Skills

Azure, Data Engineering, Real-time Streaming, Big Data Engineering, Real-time Data Processing, Machine Learning, Cloud Data Engineering, Big Data Pipeline Development, Data Quality, Data Visualization

Other Skills

Adobe Photoshop, Adobe Premiere Pro, After Effects, Apache Spark, Azure Stream Analytics, Azure Synapse Analytics, CDC Tool, Cascading Style Sheets (CSS), Data Lake, Data Pipeline, Data Validation Framework, Data Warehouse, Databricks, Debezium, Delta Lake

About

Senior Data Engineer at Grab with experience across the Big Data ecosystem, cloud technologies, software engineering, DevOps, and Business Intelligence. A passionate technologist with a demonstrated history of sound architectural and technical knowledge in Big Data analytics. I have strong experience building distributed, scalable systems and end-to-end batch and real-time streaming data pipelines that give visibility into meaningful data through reporting (BI) and ML applications. I also have extensive experience in solutioning, developing, and operationalising complex Big Data pipelines, data lakes, and large-scale data processing systems, both on-premise and in the cloud.

Strong expertise and sound technical and architectural knowledge in the following tech stacks:
★ Big Data Ecosystem: HDFS, Hive, Sqoop, Spark, NiFi, HBase, Kafka
★ Database: Oracle, SQL Server, MySQL
★ Data Warehouse: Snowflake, Azure Synapse Analytics
★ Cloud: Azure, AWS
★ Big Data Environment: Databricks, Hortonworks, Azure, Snowflake Data Cloud
★ Languages: Scala, Python, SQL, Shell Scripting
★ BI Tool: Tableau Desktop, Power BI
★ Streaming: Spark Structured Streaming, Azure Stream Analytics
★ API Framework: Python Flask
★ Orchestration: Apache Airflow, Azure Data Factory, Apache NiFi, Apache Oozie
★ DevOps: Docker, Kubernetes
★ CDC Tool: Attunity, Debezium
★ SQL Build Tool: dbt

🎯 Avid learner exploring different Big Data and cloud technologies.

Experience

Total Experience: 10 yrs 7 mos

Average Tenure: 2 yrs 7 mos

Current Experience: 4 yrs 9 mos

Grab

Senior Data Engineer

Jul 2021 – Present · 4 yrs 9 mos · Singapore

Tiger Analytics

3 roles

Senior Data Engineer

Jul 2020 – Jul 2021 · 1 yr

★ Built an enterprise data and analytics platform in Azure. Architected an end-to-end data pipeline and developed the star schema data model to build fact and dimension tables in the data warehouse.
★ Developed a real-time streaming application POC using Azure Stream Analytics and Azure IoT Hub for generator sensor data, visualizing generator cooling status in real time and raising alerts when the rule engine detects an anomaly.

Azure, Data Warehouse, Data Pipeline, Star Schema, Real-time Streaming, Azure Stream Analytics, +2

Data Engineer

Jul 2019 – Jun 2020 · 11 mos

★ Developed a near real-time streaming application and built a Databricks Delta Lake to capture CDC data using Debezium, Kafka, and Spark Structured Streaming. This CDC pipeline helps the enterprise bring in changing data in near real time, in a robust and scalable way, to form a unified data platform for analytics.
★ Productionized machine learning models using the Python Flask framework. Designed and implemented an end-to-end architecture for automating price recommendation, which speeds up the business workflow for quoting cost and price using a rule engine.

Databricks, Delta Lake, Debezium, Kafka, Spark Structured Streaming, Python Flask, +2

Senior Software Engineer

Aug 2018 – Jun 2019 · 10 mos

★ Built an enterprise data lake platform on Azure cloud for a large healthcare device company by integrating storage services and PaaS offerings for data engineering and analytics. Migrated data from the source system to Azure Data Lake Storage (ADLS) and Snowflake via a CDC tool (Attunity). Built big data pipelines for use cases ranging from CDC to GDPR.
★ Built customized real-time NiFi alerting and automated log capture, and implemented multi-threading inside the NiFi processor group, to address NiFi's cumbersome existing log-tracking process. Using the NiFi REST API and some configuration changes, developed an automated pipeline that captures logs in Spark tables and triggers email alerts for errors and for statuses (started, stopped, and died) of the NiFi machine(s). This is a customized approach to handling thread-level concurrency at the processor-group level in Apache NiFi.

Azure, Data Lake, CDC Tool, NiFi, Spark, PaaS, +2

Capgemini

Associate Consultant

Jan 2018 – Aug 2018 · 7 mos · India

★ Developed a generic database/data warehouse benchmarking framework to evaluate different cloud data warehouses.
★ Developed a data validation framework using a hashing algorithm and Apache Spark to ensure data veracity for Big Data workloads.

Data Warehouse, Data Validation Framework, Hashing Algorithm, Apache Spark, Data Engineering, Data Quality

Cognizant

2 roles

Program Analyst

Jun 2016 – Jan 2018 · 1 yr 7 mos

★ Real-time Fleet Analytics: ATM vehicle tracking POC for real-time data processing, using IoT sensors simulated in Python.
★ Migrated data from various source systems to the Hadoop ecosystem. Used Tableau extensively to build sophisticated dashboards that support actionable decisions and enable sales.

Real-time Data Processing, Hadoop, Tableau, Data Engineering, Data Visualization