Nagarjuna Putluri

Software Engineer

Hyderabad, Telangana, India · 11 mos experience

Key Highlights

Expert in Big Data Engineering and Data Pipeline Development
Proven track record in cloud data engineering with Microsoft Azure
Skilled in migrating and transforming large datasets

Stackforce AI infers this person is a Big Data Engineer with expertise in cloud data solutions and retail analytics.

Contact

Skills

Core Skills

Big Data Engineering, Data Pipeline Development, Data Migration, ETL Development

Other Skills

Hadoop, Hive, Spark, Python, Sqoop, Atomic Scheduler, DB2, SQL Server, Teradata, Kafka

Experience

11 mos

Total Experience

5 mos

Average Tenure

Current Experience

Walmart

Big Data Engineer

Feb 2019 – Present · 7 yrs 4 mos · Bentonville, Arkansas, United States

Walmart Canada Finance Project:
As part of the Canada Data Lake team, key data sources for Finance were discovered, analyzed, and landed in the data lake to support analytical products. After landing those key data sources, additional data pipelines were engineered to create enriched consumption-layer datasets and power Flash Sales 2.0, the key reporting deliverable from this pioneering effort. In addition, the team worked to onboard eCommerce reporting from a soon-to-be decommissioned Oracle system to the data lake and drive all of its reporting from this platform.
Walmart Canada ECommerce Migration Project:
Due to the increasing volume and diversity of data held in multiple RDBMS, and to serve the business with better and faster analytical solutions, the Walmart Canada eCommerce division decided to migrate all of its core data into the Canadian Data Lake, powered by big data technologies. The migration project consisted of bringing financial, operational, and marketing data from various eCommerce channels and brick-and-mortar stores into a single silo. As part of the team, I engineered the data movement from various sources into our lake and ensured data consistency, integrity, and availability in the new architecture.
Responsibilities:
Designed and developed data applications using Hadoop, Hive, Spark, Python, Sqoop, Atomic Scheduler, DB2, SQL Server, Teradata, and Kafka.
Created a data pipeline to ingest data from Hive into MS SQL Server using Python and shell scripting.
Ingested data from a live Kafka feed into Hive and Druid tables used for near-real-time reports in Tableau.
Created a Hadoop equivalent of the Oracle Model Iterator in Python, as required, to aggregate vendor-related data.
Created data warehouse applications on top of Hadoop and Teradata.
Created Hive external and managed tables and implemented partitioning and bucketing for efficient data processing.
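As a rough illustration of the partitioning and bucketing mentioned above, the following is a minimal pure-Python sketch, not code from the project. It groups rows into one directory-like partition per value of a partition column and assigns each row to a bucket by hashing a second column. The column names, sample rows, and the `layout` and `bucket_for` helpers are hypothetical, and `zlib.crc32` is only a stand-in for Hive's own hash function.

```python
import zlib
from collections import defaultdict


def bucket_for(key: str, num_buckets: int) -> int:
    """Map a key to a bucket index in [0, num_buckets)."""
    # Assumption: CRC32 is a deterministic stand-in; Hive uses its own hash.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def layout(rows, partition_col, bucket_col, num_buckets):
    """Group rows by (partition, bucket), mimicking a partitioned, bucketed table."""
    files = defaultdict(list)
    for row in rows:
        # One partition per distinct value, named like a Hive partition directory.
        partition = f"{partition_col}={row[partition_col]}"
        bucket = bucket_for(str(row[bucket_col]), num_buckets)
        files[(partition, bucket)].append(row)
    return dict(files)


# Hypothetical sample rows; the column names are illustrative only.
rows = [
    {"sale_date": "2019-03-01", "vendor_id": "V001", "amount": 120.0},
    {"sale_date": "2019-03-01", "vendor_id": "V002", "amount": 75.5},
    {"sale_date": "2019-03-02", "vendor_id": "V001", "amount": 42.0},
]
files = layout(rows, "sale_date", "vendor_id", num_buckets=4)

for (partition, bucket), group in sorted(files.items()):
    print(partition, f"bucket={bucket}", len(group))
```

A query that filters on the partition column only has to read the matching partition, and a join on the bucketed column can pair up buckets with the same index, which is the usual reason for combining the two.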

Cognizant

Big Data Developer

Aug 2018 – Jan 2019 · 5 mos · Portland, Oregon Area

Responsibilities:
Worked with Microsoft Azure components including Azure Databricks (ADBX), Azure Data Lake (ADLS), Azure Blob Storage, Azure Data Factory, and Azure Data Storage Explorer.
Developed working code and automated it using Azure Databricks notebooks.
Used PySpark, SparkSQL, Hive, and Python as required in this project to perform the ETL operations.
Worked on Azure Data Factory to build several data pipelines.
Used VSTS as the tracking tool.
Utilized Spark's in-memory capabilities to handle large datasets.
Worked on Azure Blob storage to send huge amounts of data to Retalon.
Worked with Hive functions to perform the required data transformations.
Worked with DataFrames in Spark to create various datasets.
Loaded several tables into Databricks, using Parquet format with Snappy compression for tables, and also used GZip format to save files into Azure Data Lake.
Created Hive external tables to perform ETL on data produced on a daily basis.
Performed data validation by loading the files into tables and validating them with SQL commands.
Worked to resolve tickets generated when the data pipelines failed.
Followed Agile methodologies to analyze, define, and document the applications that support functional and business requirements.
Environment: Microsoft Azure, VSTS, Azure Databricks (ADBX), Azure Data Lake (ADLS), Azure Blob Storage, Azure Data Factory, Azure Data Storage Explorer, PySpark, SparkSQL, Hive, Scala, JDA, Sim Tool, Hortonworks.

Scepter technologies inc.

Hadoop/Spark Developer

Jan 2018 – Jul 2018 · 6 mos · Columbia, Maryland, United States

Project Description:
The project objective was to build a centralized analytical data store for the whole organization, able to serve the analytical needs of all business units. Migrated various data sources from existing enterprise data warehouses to HDFS.
Responsibilities:
Worked on migrating data from traditional RDBMS to HDFS.
Ingested data into HDFS from Teradata and MySQL using Sqoop.
Took part in developing a Spark application to perform ETL operations on the data.
Redesigned existing MapReduce jobs as Spark transformations and actions using Spark RDDs, DataFrames, and the Spark SQL APIs.
Used Hive partitioning and bucketing and performed various kinds of joins on Hive tables.
Created Hive external tables to perform ETL on data produced on a daily basis.
Migrated ETL jobs to Pig scripts to perform transformations, joins, and some pre-aggregations before storing the data in HDFS.
Validated the data being ingested into Hive for further filtering and cleansing.
Developed Sqoop jobs for performing incremental loads from RDBMS into HDFS and then applied Spark transformations.
Loaded data into Hive tables from Spark using the Parquet columnar format.
Created Oozie workflows to automate and productionize the data pipelines.
Migrated MapReduce code into Spark transformations using Spark and Scala.
Collected and aggregated large amounts of log data using Kafka and staged the data in HDFS for further analysis.
Used Sqoop to extract and load incremental and non-incremental data from RDBMS systems into Hadoop.
Worked on various enterprise data warehouses as part of the migration project.
Worked with Tableau connected to Impala to develop interactive dashboards.
Followed Agile methodologies.
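
The incremental loads mentioned in the responsibilities above can be illustrated with a small sketch. It is not the project's code: it uses an in-memory SQLite table as a stand-in for the source RDBMS and mimics the idea behind Sqoop's append-mode incremental import (`--incremental append` with `--check-column` and `--last-value`), where each run pulls only rows whose check column exceeds the previously recorded value. The `orders` table, its columns, and the `incremental_extract` helper are hypothetical.

```python
import sqlite3


def incremental_extract(conn, table, check_column, last_value):
    """Return rows with check_column > last_value and the new high-water mark."""
    # Table and column names are interpolated for brevity; they are trusted here.
    cur = conn.execute(
        f"SELECT * FROM {table} WHERE {check_column} > ? ORDER BY {check_column}",
        (last_value,),
    )
    rows = cur.fetchall()
    col_index = [d[0] for d in cur.description].index(check_column)
    # Keep the old mark when nothing new arrived.
    new_last = rows[-1][col_index] if rows else last_value
    return rows, new_last


# Hypothetical source table standing in for the RDBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0), (3, 30.0)]
)

# First run: everything above the initial mark of 0 is extracted.
first_batch, mark = incremental_extract(conn, "orders", "order_id", 0)

# A new row arrives; the second run picks up only that row.
conn.execute("INSERT INTO orders VALUES (4, 40.0)")
second_batch, mark2 = incremental_extract(conn, "orders", "order_id", mark)
print(len(first_batch), second_batch, mark2)
```

Persisting the returned mark between runs is what keeps each load incremental; a non-incremental load would simply omit the filter and re-read the whole table.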