Deepak Sahu

Data Engineer

Bangalore Urban, Karnataka, India · 10 yrs experience
Highly Stable

Key Highlights

  • 9+ years in Big Data technologies.
  • Expert in data engineering and analytics.
  • Strong background in compliance and data quality.

Skills

Core Skills

Data Engineering · Big Data Analytics · Compliance · Data Science · Software Development

Other Skills

AWS · Agile Methodologies · Airflow · Amazon Redshift · Apache Airflow · Apache Kafka · Automation · C · C++ · CCPA · Cassandra · Core Java · Data Analytics · Data Architecture · Data Integration

About

Thanks for stopping by my profile. I am a Big Data technology-driven professional with 9+ years of experience in data gathering, modelling, analysis, validation, and architecture/solution design, building next-generation analytics platforms.

Interests: Analytics, Business Intelligence, Data Science, Big Data, Machine Learning, Predictive Analytics, and Product Management. I also have a keen interest in developing and implementing new concepts in distributed computing technologies: Spark, Hadoop, MapReduce, NoSQL databases, AWS, etc.

  • Strong analytical and technical background with good problem-solving skills for developing effective, complex business solutions.
  • Experience in Spark Core, Spark SQL, Spark Streaming, Sqoop, and HDFS.
  • Experience writing Lambda functions in Python to automate tasks on AWS (see the sketch below).
  • Experience designing and implementing Big Data projects using the Hadoop ecosystem: HDFS, Hive, MapReduce.
  • Worked on performance optimizations for Spark and Cassandra.
  • Fair knowledge of SQL Server and of NoSQL databases such as Cassandra.
  • Specialization in the field of Big Data Analytics.
  • Experience with Elasticsearch, MongoDB, and Redis.
  • Experience using Git, Gradle, Jenkins, VersionOne, and Apache Subversion.
  • Excellent communication skills with internal stakeholders and senior management, including the ability to work effectively with business users in a global, multicultural team environment.
  • Involved in different phases of the SDLC, including requirement gathering and development with quality assurance.

My insatiable curiosity drives me to constantly push the boundaries of Big Data technologies. Let's connect and embark on exciting Big Data adventures!
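As a rough illustration of the AWS automation mentioned above, here is a minimal, hypothetical sketch of a Python Lambda function that archives objects from one S3 prefix to another; the bucket and prefix names are assumptions for illustration, not project specifics.

```python
# Hypothetical housekeeping task: move objects from a raw prefix to an
# archive prefix. Bucket and prefixes are placeholder assumptions.
import boto3

s3 = boto3.client("s3")

BUCKET = "example-data-lake"
RAW_PREFIX = "raw/"
ARCHIVE_PREFIX = "archive/"


def lambda_handler(event, context):
    """Copy each object under RAW_PREFIX to ARCHIVE_PREFIX, then delete the original."""
    moved = 0
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=RAW_PREFIX):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            new_key = ARCHIVE_PREFIX + key[len(RAW_PREFIX):]
            s3.copy_object(Bucket=BUCKET, Key=new_key,
                           CopySource={"Bucket": BUCKET, "Key": key})
            s3.delete_object(Bucket=BUCKET, Key=key)
            moved += 1
    return {"moved": moved}
```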

Experience

Onepay

Data Engineering

Jun 2024 – Present · 1 yr 9 mos

  • Reconciliation Framework Development: Engineered a robust solution utilizing Databricks and Apache Airflow to optimize data reconciliation processes
  • Terraform Automation: Transitioned manually created resources in Databricks to a Terraform-managed setup, enhancing resource management and historical tracking
  • Monorepo Structure: Established a monorepo for the data platform, fostering improved collaboration and development efficiency
  • Integration Pipelines: Developed multiple vendor integration pipelines with integrated quality checks, ensuring reliable and high-quality data flow
  • Data Quality Enhancement: Upgraded the Deequ framework within the data pipeline, significantly bolstering data quality and compliance
  • Engineered a 90% cost reduction by migrating the SCD implementation from DynamoDB to Delta Live Tables, optimizing data processing efficiency and resource allocation (sketched below).
  • Led end-to-end design and delivery of financial products including loans and credit cards while enabling cross-functional collaboration as a horizontal team.
Databricks · Apache Airflow · Terraform · Data Quality · Data Integration · Data Engineering +1
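A minimal sketch of the SCD Type 2 pattern referenced above, using the Delta Live Tables Python API. It runs only inside a Databricks DLT pipeline, and the table, path, and column names (customers_raw, customer_id, updated_at) are illustrative assumptions rather than the actual Onepay schema.

```python
# Sketch of an SCD Type 2 flow in Delta Live Tables (assumed names and paths).
import dlt
from pyspark.sql import functions as F


@dlt.view
def customers_raw():
    # Assumed landing location for incoming change records (Auto Loader).
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/landing/customers/")
    )


# Target table that will hold the full change history.
dlt.create_streaming_table("customers_scd2")

dlt.apply_changes(
    target="customers_scd2",
    source="customers_raw",
    keys=["customer_id"],
    sequence_by=F.col("updated_at"),
    stored_as_scd_type=2,  # keep history rows instead of overwriting in place
)
```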

6sense

Lead Data Engineer

Aug 2021 – Jun 2024 · 2 yrs 10 mos · Remote

  • Spearheaded the development and implementation of a privacy and compliance data pipeline, ensuring alignment with GDPR, CCPA, and other regulatory standards (a rough sketch of the erasure step follows this entry).
  • Streamlined workflows by automating manual processes, resulting in increased efficiency and reduced operational overhead.
  • Orchestrated the automation of privacy notifications and communications to data subjects, enhancing transparency and compliance.
  • Post-acquisition, led and managed data integration, process, tech, and system migrations, the release of a new product as part of the integration, and all things around core data.
  • Ideated, designed, automated, and maintained the following pipelines:
  • Slintel Dashboard People and Company Data
  • Slintel Website Data
  • 6Sense Sales Intelligence
  • 6Sense Slintel Data Privacy and Compliance
  • PS:
  • Joined Slintel, then a Series A startup, as Lead Data Engineer.
  • Slintel was acquired by 6sense in late 2021, and I joined 6sense as part of the acquisition.
  • Skills: Spark · Amazon Web Services (AWS) · System Architecture · Full-Stack Development · Python · SQL · Hive · SingleStore · MongoDB · Elasticsearch
GDPR · CCPA · Data Pipeline · Automation · Data Integration · Data Engineering +1
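To illustrate the privacy pipeline described above, here is a minimal sketch of the core step of a GDPR/CCPA-style erasure job in PySpark: removing records for data subjects who have requested deletion. The paths, dataset, and the email join key are assumptions for illustration, not the actual 6sense implementation.

```python
# Sketch of an erasure step: drop rows matching deletion requests (assumed schema/paths).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("privacy-erasure-sketch").getOrCreate()

people = spark.read.parquet("s3://example-bucket/people/")               # assumed dataset
requests = spark.read.parquet("s3://example-bucket/deletion_requests/")  # assumed DSR table

# Keep only rows whose email does not appear in the deletion-request table.
retained = people.join(
    requests.select("email").distinct(),
    on="email",
    how="left_anti",
)

retained.write.mode("overwrite").parquet("s3://example-bucket/people_clean/")
```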

Grab

Senior Software Engineer, Data Services | GrabPay

Jun 2018 – Jul 2021 · 3 yrs 1 mo · Bengaluru Area, India

  • Grab is the largest ride-hailing & FinTech company in Southeast Asia.
  • Grab Financial Group (GFG) is the largest FinTech startup in Southeast Asia (part of Grab Holding). We provide the following services: Payments, Loyalty Program, Remittance, Lending, and Insurance.
  • Roles and Responsibilities:
  • Spearheaded the design, development, and deployment of our next-generation data pipeline and warehousing infrastructure to meet future needs.
  • Built and managed scalable ETL pipelines from various internal and external data sources spanning a variety of systems (OLTP databases, text-based data files, etc.).
  • Developed and managed the new ETL pipeline infrastructure end to end, including its orchestration, monitoring, and alerting (see the sketch below).
  • Tools and Technologies:
  • Worked on some cutting edge high-performance technologies on the Amazon Web Services Cloud. These include EMR, Spark, Presto, Hive, MySQL, S3 and more.
  • Working on the design for multi-tenant ingestion.
  • Basic knowledge of Docker, K8s
  • Designed and developed a data processing platform to support SQL workloads at a high scale.
  • Worked on productionisation of a low latency data query platform using Presto.
  • Implemented RBAC on the data through Presto.
  • Worked on the data catalog using Alation.
  • Worked on end-to-end development of several tools to optimize and establish data pipelines using boto3, Spark, the Perfios API, etc.
  • Implemented a cost-effective solution for Presto using Qubole.
  • Languages used: Scala, Python, Go
  • Worked with some visualization and monitoring tools like Prometheus, Grafana.
ETL · Data Pipeline · AWS · Spark · Presto · Data Engineering +1
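A minimal sketch of the kind of batch ETL step described in this entry: pulling a table from an OLTP MySQL source over JDBC and landing it as partitioned Parquet on S3 for downstream Hive/Presto queries. Connection details, table, and paths are placeholder assumptions, and the MySQL JDBC driver is assumed to be on the cluster classpath.

```python
# Sketch of an OLTP-to-S3 batch extract (assumed connection details and paths).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("oltp-to-s3-sketch").getOrCreate()

orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mysql://example-host:3306/shop")  # assumed source database
    .option("dbtable", "orders")
    .option("user", "etl_user")
    .option("password", "****")
    .load()
)

(
    orders
    .withColumn("ingest_date", F.current_date())   # partition column for the lake
    .write.mode("overwrite")
    .partitionBy("ingest_date")
    .parquet("s3://example-warehouse/orders/")      # assumed lake location
)
```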

Axis Bank

Data Engineer

May 2017 – Jun 2018 · 1 yr 1 mo · Mumbai, Maharashtra, India

  • My responsibilities at Axis Bank (as a consultant from Kogentix) included, but were not limited to:
  • End-to-end migration of the data warehouse for various reports using Sqoop, HDFS, Spark, and Cassandra/HBase.
  • Actively contributed to the Data Science team for a renowned client in the banking domain.
  • Time optimization and performance tuning of all reports.
  • Gained experience with the Spark framework, optimizing transformations and actions in Spark.
  • Implemented machine learning for different client use cases.
  • Migrated PL/SQL code to PySpark for performance optimization.
  • Facilitated data cleansing and enrichment through PySpark.
  • Used models such as random forest and logistic regression, and clustering techniques such as K-Means (see the sketch below).
  • Contributed to the design and implementation of Kogentix in-house products as well.
  • Worked closely with the client on site, understanding requirements on the fly and acting on them.
Sqoop · HDFS · Spark · Cassandra · Machine Learning · Data Engineering +1
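As a rough illustration of the modelling work mentioned above, here is a minimal Spark MLlib sketch of training a random forest classifier. The input path, feature columns, and the churned label are illustrative assumptions about a banking-style dataset, not the client's actual schema.

```python
# Sketch of a random forest classifier in Spark MLlib (assumed schema and paths).
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import RandomForestClassifier

spark = SparkSession.builder.appName("rf-sketch").getOrCreate()

# Assumed input with numeric features and a 0/1 numeric label column "churned".
df = spark.read.parquet("s3://example-bucket/customer_features/")

assembler = VectorAssembler(
    inputCols=["age", "balance", "num_transactions"],  # assumed feature columns
    outputCol="features",
)
train, test = assembler.transform(df).randomSplit([0.8, 0.2], seed=42)

rf = RandomForestClassifier(labelCol="churned", featuresCol="features", numTrees=100)
model = rf.fit(train)
predictions = model.transform(test)
```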

Tech Mahindra

Software Engineer

Jan 2016 – Apr 2017 · 1 yr 3 mos · Bhubaneshwar Area, India

  • Worked on the development and support programme for a project, implementing changes as per client requirements.
  • Developed moderately complex software applications using Spark SQL and Scala for clients, in accordance with the applicable software development methodology and release processes (see the sketch below).
  • Data management: fixed data issues and ensured data accuracy across servers.
  • Performed query optimization.
  • Built shell scripts to schedule log generation and deletion.
Spark SQL · Scala · Data Management · Software Development
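A minimal sketch of the kind of Spark SQL reporting query this role describes. The original work was in Scala; a PySpark equivalent is shown here for consistency with the other sketches, and the table and column names are illustrative assumptions.

```python
# Sketch of a Spark SQL aggregation with a broadcast hint (assumed tables/columns).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-sketch").getOrCreate()

# Assumes "transactions" and "customers" are registered in the metastore.
daily_totals = spark.sql("""
    SELECT /*+ BROADCAST(c) */
           c.region,
           t.txn_date,
           SUM(t.amount) AS total_amount
    FROM   transactions t
    JOIN   customers    c ON t.customer_id = c.customer_id
    GROUP  BY c.region, t.txn_date
""")

daily_totals.write.mode("overwrite").saveAsTable("reporting.daily_totals")
```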

Education

Indian Institute of Management, Lucknow

Data Science Internship — Core Data Science concepts and implementation

Jan 2018 – Jan 2018

National Institute of Science and Technology (Autonomous, NBA and NAAC Accredited)

Bachelor of Technology - BTech — Electronics and Communications Engineering

Jan 2011 – Jan 2015
