Harshitha A

Data Engineer

United States · 5 yrs 11 mos experience

Key Highlights

  • 8+ years of experience in Data Engineering.
  • Expert in building and managing data pipelines.
  • Proficient in both AWS and Azure cloud services.
Stackforce AI infers this person is a Big Data Engineer with expertise in cloud-based data solutions.

Skills

Core Skills

Data Engineering, Big Data Analytics, Cloud Services

Other Skills

Python (Programming Language), Scala, AWS Lambda, AWS Glue, AWS Step Functions, Amazon Elastic MapReduce (EMR), Amazon S3, PySpark, SQL, Apache Airflow, Apache Spark, lambda, MapReduce, Glue, Java

About

Passionate Data Engineer with 8+ years of experience building and managing data pipelines, creating data workflows, and leveraging cloud services. Specialized in handling large datasets, migrating and transforming data, and using Hadoop-related technologies to build a data-driven ecosystem. Deep understanding of distributed system architecture and the principles of parallel computing.

Extensive experience with Kafka for real-time data processing. Hands-on with Hadoop distributions such as Cloudera, Hortonworks, and AWS EMR. Experienced in analyzing large datasets using PySpark scripts and Hive queries. Experienced with Sqoop for importing and exporting data between HDFS and RDBMS, including transferring large datasets from Teradata to HDFS. Led data analysis and integration projects involving Hadoop and ETL processes.

Skilled in AWS Cloud services including EMR, Redshift, S3, Athena, SNS, EC2, and Glue for big data analytics. Proficient in Azure Cloud services including ADLS, Azure Databricks, Azure Functions, Azure SQL Data Warehouse, Azure Synapse Analytics, and Azure Data Factory.

Extensive SQL query expertise for backend database analysis. Hands-on experience with SQL databases and engines such as SQL Server, Hive, Oracle, MySQL, DB2, and PostgreSQL. Strong knowledge of NoSQL databases such as HBase, Cassandra, DynamoDB (AWS), and MongoDB, and their integration with Hadoop.

Familiar with deployment automation using Jenkins, containerization with Docker, and workflow orchestration with Airflow. Experienced with visualization tools such as Tableau, Looker, and Power BI. Strong understanding of version control with Git and GitHub. Involved in unit, integration, and acceptance testing to ensure data quality and functionality.

Skills: Data Modeling, Data Engineering, Big Data Analytics, Object-Oriented Programming (OOP), Data Warehousing.
Programming Skills: Python, SQL, Scala, Hadoop, PySpark.
Cloud Services: AWS S3, EMR, EC2, AWS Glue, Lambda services, AWS Redshift, Azure Data Factory, Azure Databricks, ADLS, Synapse, Snowflake.

Experience

5 yrs 11 mos
Total Experience
1 yr 5 mos
Average Tenure
--
Current Experience

Availity

Data Engineer

Feb 2025 - Present · 1 yr 4 mos

Python (Programming Language), Scala, AWS Lambda, AWS Glue, AWS Step Functions, Amazon Elastic MapReduce (EMR), +9

Cloudflare

Sr Data Engineer

Jul 2023 - Jan 2025 · 1 yr 6 mos

Glue, Java, Python (Programming Language), Amazon Redshift, Amazon S3, Amazon Elastic MapReduce (EMR), +14

Verizon

Big Data Engineer

May 2022 - Jun 2023 · 1 yr 1 mo · Remote

Hive, PySpark, JSON, Extract, Transform, Load (ETL), +14

City of Hope

Data Engineer

Feb 2021 - Apr 2022 · 1 yr 2 mos · Los Angeles, California, United States · Hybrid

Hive, PySpark, Data lake storage, Hadoop, Azure SQL, Azure Databricks, +16

Datafactz

Hadoop/Spark Developer

Jan 2019 - Nov 2020 · 1 yr 10 mos · India · On-site

MapReduce, Agile Methodologies, Hive, PySpark, Apache Spark Streaming, JSON, +16

Extended web apptech

Java Developer

Jul 2017 - Dec 2018 · 1 yr 5 mos · On-site

Git, HTML, lambda, jQuery, Linux, Java, +11
