Vaibhav Rai — Software Engineer
SKILLS
- Programming Languages: Python, Bash, SQL, COBOL
- Databases: MySQL, Oracle, PostgreSQL, Redshift, BigQuery
- Libraries: PySpark, pandas, NumPy, matplotlib
- Project Management Tools: Confluence, Git, Jira
- Cloud: AWS, GCP
- ETL Tools: Meltano
- Additional Technologies: Docker, Databricks (Delta Live Tables ETL framework; Databricks Utilities: widgets, file system, mounts)
- AWS Services: EC2, S3, Lambda
- GCP Services: Cloud Run, BigQuery, Pub/Sub, Cloud Storage, Composer, Dataform
EXPERIENCE
Big Data Engineer, Infosys, Gurgaon (Oct 2022 - Present)
- Automated ETL processes with the Meltano ELT tool, extracting data from MSSQL and loading it into BigQuery.
- Developed and containerized the ETL pipeline with Docker, storing images in Artifact Registry and deploying them via Cloud Run for serverless execution.
- Created Python scripts that generate Apache Airflow DAG files, managed through Google Cloud Composer, to automate scheduling and orchestration of ETL tasks.
- Uploaded DAG files to Google Cloud Storage and integrated them with Apache Airflow for workflow automation and task management.
- Led the migration from legacy mainframe COBOL technologies to ETL solutions built on the Databricks Medallion Architecture.
- Developed PySpark scripts to automate execution of the Databricks DLT pipeline, supporting the conversion of COBOL and Easytrieve files into PySpark and DLT SQL.
Big Data Engineer, NITS Solutions (Aug 2021 - Oct 2022)
- Led migration projects, moving ETL pipelines from an AWS S3, Oracle, and React stack to an AWS S3, Python with PostgreSQL, and React stack.
- Analyzed summary tables using PySpark.
- Developed an API using Java and Apache Spark to collect data from the different sources provided by the client.
- Created ETL processes in Python (pandas), PostgreSQL, and shell script to process records on a schedule via cron jobs.
- Worked in an Agile methodology.
Software Developer, Indian Agricultural Statistics Research Institute (IASRI) (Dec 2018 - Aug 2021)
- Created Python pandas scripts to analyze a large agricultural dataset.
- Worked on an ETL pipeline that ingested data into a MySQL database and analyzed it with the Python pandas and matplotlib libraries.
- Worked in an Agile methodology.
- Used Git for version control.
Stackforce AI infers this person is a Big Data Engineer with expertise in SaaS and cloud technologies.
Location: Gurugram, Haryana, India
Experience: 8 yrs 3 mos
Skills
- Big Data Engineering
- ETL
- Data Analysis
- Database Management
Career Highlights
- Expert in automating ETL processes using Meltano.
- Proficient in cloud technologies like AWS and GCP.
- Strong background in data analysis and database management.
Work Experience
IBM
Senior Data Engineer (1 yr 6 mos)
Infosys
Technical Analyst (2 yrs)
NITS Solutions
Associate Big Data Engineer (1 yr 3 mos)
Indian Agricultural Research Institute
Senior Research Fellow (2 yrs 7 mos)
Infozech Software Private Limited
Software Developer (11 mos)
Education
Certification at Ivy Professional School
Master of Computer Applications at Bhai Parmanand Institute of Business Studies