Subrat Kumar 🇮🇳 — DevOps Engineer

CCA Spark and Hadoop Developer | AWS Cloud | Spark | Hadoop | Python | Java 8 | JEE | Airflow | Hive | Linux

Data Engineering professional with 10 years of experience in Python and Java for Spark and Hadoop Big Data applications across the Hadoop ecosystem (HDFS, Hive, Sqoop, Apache Airflow, Oozie, MapReduce, etc.), and in Java for Java/JEE enterprise applications.
Knowledge of real-time processing technologies such as Spark Streaming and Kafka, and of interactive SQL tools on Hadoop such as Impala.
Extensive experience in designing, developing and monitoring Spark- and Hadoop-based applications on Windows, UNIX and Linux platforms.
Experienced with the major components of the Hadoop ecosystem, including HDFS, MapReduce, Hive, Pig, Sqoop, YARN and NoSQL databases such as HBase, and with their integration with other components.
Extensive knowledge of and experience with data pipeline tools such as Apache Airflow and Oozie, along with their integration.
Experienced in designing, developing and automating ETL and batch processes on the Hadoop ecosystem, and in handling a variety of data using the right Hadoop components on different Hadoop distributions such as MapR, Hortonworks and Cloudera.
Designed and built a Data Extraction / Ingestion, Quality, Aggregation / Provisioning framework using Sqoop, Flume, Kafka, Spark, MapReduce and Hive.
Experienced in Java/JEE/JDBC/Servlets/JSP, Spring, Struts, Hibernate and REST Web Services, with a strong command of query languages such as SQL and PL/SQL.
Experienced in software architecture, design and development, having performed a spectrum of roles across the entire Software Development Life Cycle.
Strong analytical and reasoning skills, with extensive working experience in data structures, algorithms, design patterns, etc.
Strong engineering professional with a Master's degree in AI and a Bachelor's degree in Computer Science & Engineering.

Stackforce AI infers this person is a Data Engineering and Big Data specialist with a strong focus on machine learning applications.

Location: Bengaluru, Karnataka, India

Experience: 12 yrs 2 mos

Skills

Data Engineering
Machine Learning
Big Data
Software Development
Java
Embedded Systems
Robotics

Career Highlights

10 years of experience in Data Engineering and Big Data.
Expert in building scalable data pipelines and machine learning systems.
Strong background in both software development and data analytics.

Work Experience

ConcertAI

Lead Data Engineer (2 yrs 9 mos)

Senior Data Engineer (2 yrs)

IBM

Data Engineer: Big Data (2 yrs 2 mos)

ITC Infotech

Associate IT Consultant - Software Engineering (3 yrs 1 mo)

UST Global

Software Developer (10 mos)

Scala

Software Engineer R&D (1 yr 4 mos)

ROBOSAPIENS TECHNOLOGIES Pvt. Ltd.

Research Engineer (5 mos)

Education

Master of Technology - MTech at Birla Institute of Technology and Science, Pilani

PG Level Diploma Degree (DAI & ML) at University of Hyderabad

Bachelor of Technology - BTech at The ICFAI University, Dehradun

Contact

subrat.dataengineer@gmail.com LinkedIn

Skills

Core Skills

Data Engineering, Machine Learning, Big Data, Software Development, Java, Embedded Systems, Robotics

Other Skills

AWS Glue, Advanced Deep Learning, Amazon EC2, Amazon Redshift, Amazon S3, Amazon Web Services (AWS), Apache Airflow, Apache Kafka, Apache Spark, Apache Spark MLlib, Apache Spark SQL, Apache Sqoop, Artificial Intelligence (AI), Azure Databricks, C (Programming Language)

About

Experience

Total Experience: 12 yrs 2 mos
Average Tenure: 2 yrs 5 mos
Current Experience: 4 yrs 9 mos

ConcertAI

2 roles

Lead Data Engineer

Aug 2023 – Present · 2 yrs 9 mos · Hybrid

Working with top pharmaceutical clients to deliver real-world data using the advanced data analytics products RWD360, RWD360+Claims, NLP360, Precision360, Patient360, PT360+Claims, Site360 and Genome360, and to process it using ConcertAI's Patient Solutions AI Data Platform.
Responsibilities:
Build production machine learning systems at scale by designing pipelines and engineering infrastructure.
Provide support for deployed data applications and analytical models by being a trusted advisor to Data Scientists and other data consumers by identifying data problems and guiding issue resolution with partner Data Engineers and source data providers.
Design and implement scalable data architecture and data pipelines.
Solve complex problems with multi-layered data sets, and optimize existing machine learning libraries and frameworks.
Maintain awareness of relevant technical and product trends through self-learning/study, training classes, and job shadowing.

Docker, Linux, Data Structure & Algorithms, Amazon Web Services (AWS), Python (Programming Language), Amazon Redshift, +12 more

Senior Data Engineer

Jul 2021 – Jul 2023 · 2 yrs · Hybrid

Responsibilities:
Create and maintain optimal data pipeline architecture.
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS 'big data' technologies.
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data infrastructure needs.
Create data tools for analytics and data scientist team members that assist them in building and optimizing our product into an innovative industry leader.
Work with data and analytics experts to strive for greater functionality in our data systems.
Support software developers, database architects, data analysts and data scientists on data initiatives and ensure optimal data delivery architecture throughout projects.

Docker, Linux, Data Engineering, Amazon Web Services (AWS), Data Structures, Python (Programming Language), +12 more

IBM

Data Engineer: Big Data

May 2019 – Jul 2021 · 2 yrs 2 mos · Bengaluru, Karnataka, India

Worked with the Data Engineering and Analytics team for Autonomous Driving, building a platform to parse and segregate the large volumes of data coming from the car's sensors and cameras.
Responsibilities:
Extract, sort, convert and ingest automated driving data into the data lake using big data technologies.
Develop prototypes, technical proofs of concept and code for various solutions, and convert platform architecture blueprints into working artifacts.
Deploy and manage the full life cycle of the Big Data solution, including requirement analysis, platform selection, technical architecture design and testing.
The ingested data is then used to gain insights and to improve the car's software and features.

Linux, Data Engineering, Apache Airflow, Big Data, Data Structures, Python (Programming Language), +7 more

ITC Infotech

Associate IT Consultant - Software Engineering

Apr 2016 – May 2019 · 3 yrs 1 mo · Bengaluru, Karnataka, India

Developed ZEAS (Z Labs Enterprise Analytics System), a data lake management tool built by an incubator within the Product Engineering Services Business Unit at ITC Infotech. The incubator is a group of passionate engineering professionals focused on leveraging recent technologies and frameworks to solve complex business problems in Analytics and Big Data, Enterprise Mobility, DevOps and Cloud Infrastructure. ZEAS can be used to ingest, prepare, cleanse, transform, store, control, analyze and dashboard all data in a single place.
Responsibilities:
Ingested multiple types of data from files (CSV, XLS, XML, JSON) and databases (Oracle, MySQL, DB2) into HDFS and created Hive tables for data lineage.
Wrote multiple Spark jobs for complex data transformations and data transfer to HDFS.
Created Hive queries that helped analysts and stored the refined data in partitioned tables.
Developed a Change Data Capture (CDC) feature using Sqoop and Java.
Technologies Used: Java, Scala, Spark, Hadoop, Hive, Sqoop, Kafka, Elasticsearch, etc.

Apache Kafka, Linux, Data Engineering, Hive, Big Data, Data Structures, +12 more

UST Global

Software Developer

Jun 2015 – Apr 2016 · 10 mos · Bengaluru, Karnataka, India

Linux, Data Structures, Hibernate, Oracle Database, SQL, GitHub, +6 more

Scala

Software Engineer R&D

Jan 2014 – May 2015 · 1 yr 4 mos · Bengaluru, Karnataka, India

Worked on Scala's Dashboard, which outputs reports of the current content playlist length for any given time period, using category breakdown, category duration, and how they are tracking against goals, broken down as percentages of the total time for all content played. The information is displayed in a list as well as in a pie chart. The website requires user authentication. The user selects either a player or a player group with player categories (player groups are defined in Content Manager, and the reporting engine reads from the list of those available).
Responsibilities:
Wrote Spring Service interfaces and their implementations.
Injected Spring Services into Controller classes.
Wrote Spring DAOs and their implementations with Hibernate.
Wrote Hibernate components.
Worked on enhancements and was responsible for the efficient operation of the Java web application.
Technologies Used: Java, Spring, Hibernate, JDBC, Servlets, SQL, REST Web Services, etc.

Linux, RESTful WebServices, Data Structures, Hibernate, Oracle Database, SQL, +6 more

ROBOSAPIENS TECHNOLOGIES Pvt. Ltd.

Research Engineer

Jul 2011 – Dec 2011 · 5 mos · Delhi, Delhi, India · On-site

During my internship, I developed expertise in robotics and marketing, with a strong emphasis on embedded system design, programming, and achieving targeted marketing goals. I gained practical experience in building and deploying robotics applications such as edge avoiders, obstacle avoiders, sound detectors, light seekers, and vision-based robotics systems using Embedded C and microcontrollers.
Responsibilities:
Developed various robotic applications such as edge avoiders, obstacle avoiders, sound detectors, light seekers, and vision-based robotics systems.
Conducted robotics workshops at colleges and institutions across India, earning recognition for my engaging delivery and impactful sessions.
Contributed to the efficient testing and verification of robotics equipment through troubleshooting and fault resolution, improving overall project outcomes.

C (Programming Language), Data Structures, Embedded C, Microcontrollers, Embedded Systems, Robotics