Vikesh Malik

Data Engineer

Glasgow, Scotland, United Kingdom · 4 yrs 11 mos experience

Key Highlights

  • Achieved 50-70% improvement in data processing speed.
  • Delivered high-priority projects on time and within scope.
  • Expert in building efficient data pipelines using PySpark.
Stackforce AI infers this person is a Data Engineer specializing in Big Data solutions for SaaS and Fintech industries.

Skills

Core Skills

Data Engineering, Big Data

Other Skills

Bitbucket, C (Programming Language), Cassandra, Cloudera, Continuous Integration and Continuous Delivery (CI/CD), Data Analysis, Data Extraction, Data Modeling, Data Solutions, Databricks, ETL, HBase, HQL, HTML, Hadoop

About

I have always found joy in structured data sets and in transforming them into meaningful, actionable insights. In my previous role at EXL, I developed new PySpark scripts to reduce overall batch processing time. I spearheaded the development of a new data processing pipeline that improved processing speed by 50-70%. This dramatically decreased the time it took to analyze data, which in turn allowed us to make faster, data-driven decisions. Skills: Big Data, Hadoop, MySQL, Apache Spark, Apache Sqoop, Hive, Git, Bitbucket, Linux, Data Warehouse, ETL, AWS (S3, Athena, EMR), Presto, JIRA, SQL, Cloudera, Databricks.

Experience

JPMorganChase

Associate Data Engineer - II

Jan 2025 - Present · 1 yr 2 mos · Glasgow, Scotland, United Kingdom · On-site

  • Migrated legacy Hive-based data processing workflows to the Databricks cloud platform, modernizing data infrastructure and improving scalability.
  • Developed high-performance data pipelines using PySpark, reducing batch processing time by 8 hours and achieving a 50% efficiency improvement in system operations.
  • Performed end-to-end code testing and validation across multiple environments (dev, UAT, production), ensuring robust application performance and user experience.
  • Performed comprehensive data validation across multiple business logic scenarios to ensure data accuracy, integrity, and compliance with business requirements.
  • Investigated and mapped business logic workflows to identify optimization opportunities and ensure accurate system behaviour.
  • Successfully delivered two high-priority projects on time and within scope, meeting all stakeholder requirements and business objectives.
  • Tech Stack: Spark, MySQL, Cloudera, Hive, Python, Bitbucket, Jira, IntelliJ, Data Pulse, Databricks, Jules, Airflow
Databricks, PySpark, MySQL, Cloudera, Hive, Bitbucket

EXL

Software Engineer-II (Data Engineer)

Mar 2024 - Dec 2024 · 9 mos · India · Remote

  • Designed, built, and optimized the data architecture and data pipelines using suitable design patterns and effective data models, making data accessible to business data analysts, data scientists, and business users to enable data-driven decision making.
  • Gathered, defined, and refined requirements; led project design and implementation.
  • Designed data models for complex analysis needs.
  • Optimized data pipelines and data storage to improve performance and scalability.
  • Wrote complex SQL views for data analysts and data scientists to build models on.
  • Introduced best practices to proactively and continuously improve data-related practices within the team.
  • Built data pipelines in PySpark to reduce the system's batch processing time, increasing efficiency.
  • Maximised performance benefits for clients, delivering testable, maintainable, and modern data solutions in HQL and PySpark.
  • Created reports from sales and fraud data.
  • Migrated the existing HQL-based system to PySpark, reducing batch processing time by 30-50%.
PySpark, SQL, HQL, Data Modeling, ETL, Data Engineering

Iris Software Inc.

Associate Engineer (Data Engineer)

Sep 2022 - Jan 2024 · 1 yr 4 mos · Noida, Uttar Pradesh, India · Hybrid

  • Citi Bank - Anti Money Laundering Project
  • Built data pipelines in PySpark to reduce the system's batch processing time, leading to a 43% efficiency increase.
  • Maximised performance benefits for clients, delivering testable, maintainable, and modern data solutions in HQL and PySpark.
  • Worked with an AML system that processes years of data from the warehouse to track legal activities of customers.
  • Developed a data pipeline to find records eligible for deletion based on retention period.
PySpark, HQL, Data Solutions, Data Engineering, Big Data

IBM

Associate System Engineer

Jan 2021 - Sep 2022 · 1 yr 8 mos · Bengaluru, Karnataka, India · Remote

  • Customer Inventory
  • Monitored system performance using recognised and agreed criteria.
  • Modified current systems to enhance workflows and meet new needs.
  • Partnered with users to understand and define system requirements.
  • Extracted maximum value from existing data by leveraging new open data sources.
  • Built data pipelines in PySpark to reduce system processing time, leading to a 20% efficiency increase.
PySpark, System Monitoring, Data Extraction, Data Engineering, Big Data

Education

National Institute of Technology, Tiruchirappalli

Master’s Degree — Computer Application

Aug 2016 - Jul 2019

National Level Exam

GATE 2020 Qualified — Computer Science

Jan 2019 - Jan 2020

Mahatma Jyotiba Phule Rohilkhand University

Bachelor's degree — Computer Application

Aug 2012 - Sep 2015

Uttar Pradesh State Board of High School and Intermediate Education (UPMSP)

XII

Jul 2010 - Jun 2011

Uttar Pradesh State Board of High School and Intermediate Education (UPMSP)

X — Mathematics

Jul 2008 - Jun 2009