Raj Kumar Mittal

Data Engineer

Long Beach, California, United States · 6 yrs 2 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Expert in building data pipelines and ETL processes.
  • Proven track record in optimizing data extraction and processing.
  • Strong background in IoT and data analytics.
Stackforce AI infers this person is a Data Engineer specializing in Big Data and IoT solutions.

Skills

Core Skills

Data Engineering · Big Data Processing · IoT Development

Other Skills

AWS · AWS DataSync · Agile & Waterfall Methodologies · Agile Methodology · Airflow · Algorithms · Apache NiFi · Artificial Intelligence (AI) · Artificial Neural Networks · Azure Databricks · Convolutional Neural Networks (CNN) · Data Analysis · Data Pipeline · Deep Learning · ETL

Experience

Total Experience: 6 yrs 2 mos
Average Tenure: 1 yr 7 mos
Current Experience: 1 yr 4 mos

Keck Medicine of USC

Data Engineer

Feb 2025 - Present · 1 yr 4 mos · Los Angeles, California, United States · Remote

Walmart Global Tech

Data Engineer

May 2024 - Feb 2025 · 9 mos · Sunnyvale, California, United States · Hybrid

Goldman Sachs

Data Engineer

Aug 2022 - Apr 2024 · 1 yr 8 mos · United States · Remote

  • Designed Hive tables and processed large datasets in varied formats, including importing and exporting databases to and from HDFS.
  • Developed Spark and Python programs for regular expression (regex) projects in the Hadoop/Hive environment on Linux and Windows.
  • Built data pipelines in AWS, achieving a 25% reduction in data extraction time from S3 buckets and improving data availability for analytical teams.
  • Built and maintained Extract, Transform, Load (ETL) pipelines using Databricks.
  • Queried big data and designed and implemented data extraction pipelines, scheduling and automating tasks from data fetching and cleaning through model testing with DAGs in Airflow.
  • Designed and developed SSIS packages to import and export data from MS Excel, SQL Server, and flat files.
  • Developed data pipeline programs with Apache Spark (PySpark), aggregated data with Hive, formatted data as JSON for visualization, and generated dashboards.
  • Implemented load balancing strategies for data processing clusters, improving resource utilization and reliability by 30%.
  • Applied Agile methodology to build an internal application.
  • Developed Spark applications using Python.
  • Implemented version control for data pipelines, ensuring data lineage and traceability and reducing development conflicts by 25%.
  • Scheduled and automated processes by writing Python programs (DAGs) in Apache Airflow.
  • Worked closely with data scientists to understand the data requirements for their experiments.
Hadoop · Hive · Spark · Python · AWS · ETL · +6 more

Bitinfocom Technologies

Data Engineer

Feb 2019 - Jul 2021 · 2 yrs 5 mos · India · On-site

  • Worked as a module lead on an IoT project - Connected Vehicle Data Validation and Analytics Platform (CVDAP), focusing on server-side development for enhancing integration between cars and the Ira (intelligent Realtime Assist) mobile application.
  • Developed a module that pushes the latest firmware directly to on-road vehicles, using the MQTT protocol as a pub-sub messaging system.
  • Worked with Hadoop infrastructure to store data in HDFS and used Hive SQL to migrate the underlying SQL codebase in AWS.
  • Used Pandas, NumPy, SciPy, and Scikit-learn in Python for scientific computing and data analysis.
  • Ingested streaming data from external REST APIs into Hadoop, applying data transformations with Pig and Hive.
  • Utilized PySpark, the Python API for Apache Spark, for large-scale data processing and analytics tasks.
  • Created and presented data dashboards using Tableau and ggplot2, improving stakeholder understanding by 20% through clear data visualization techniques.
  • Created on-demand tables over S3 files using AWS Lambda functions and AWS Glue, written in Python.
  • Analyzed and examined customer behavioral data stored in MongoDB, resulting in a 10% improvement in customer engagement strategies.
IoT · Hadoop · Python · AWS · Tableau · Data Analysis · +2 more

Education

University of Maryland Baltimore County

Master's degree — Information Science

Aug 2021 - Aug 2023

Guru Gobind Singh Indraprastha University

Bachelor of Technology - BTech

Jan 2017 - Jan 2021

G R M Public School

12th — non-medical

Jan 2015 - Jan 2016

Mamta Modern Sr. Sec. School

10th

Jan 2013 - Jan 2014
