Anurag Ambuja

AI Researcher

Bengaluru, Karnataka, India · 14 yrs 9 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Expert in architecting scalable data pipelines.
  • Proficient in both cloud and on-prem data solutions.
  • Strong background in big data technologies and data governance.
Stackforce AI infers this person is a Data Engineering expert with a focus on cloud-based big data solutions.

Skills

Core Skills

Data Pipeline Architecture, Data Modeling, Data Engineering, Data Warehousing, Big Data Technologies

Other Skills

Airflow, All, Amazon Redshift, Ansible, Any, Apache Airflow, Apache Impala, Apache Kafka, Apache Oozie, Apache Spark, Apache Sqoop, Bash, Big Data, Big Data Analytics, Data Analysis

About

👋 Hey there! I'm Anurag Ambuja, a versatile Data Engineer | Analyst | ML Engineer dedicated to turning raw data into actionable insights that fuel business growth and innovation.

💼 In my current role, I specialize in architecting robust data pipelines, optimizing workflows, and implementing scalable solutions to meet modern business needs.

🛠️ Here's what I bring to the table:

- Data Pipeline Architecture and ETL Development:
  * Build robust ETL processes with Google Cloud Data Fusion, Dataproc, or custom Python scripts, maintaining code integrity with Docker and Git.
  * Migrate structured data to Hadoop/Hive via Sqoop or Spark; adept in Hive programming for data manipulation.
  * Craft end-to-end pipelines using Data Build Tool (dbt), Apache Airflow, and Apache Spark, ensuring seamless data flow.
- Big Data Technologies:
  * Leverage Hadoop and Spark for efficient handling of large datasets.
  * Write optimized Spark jobs in PySpark; skilled in parsing XML and JSON using Python.
- Cloud Platforms:
  * Proficient in Google Cloud Platform and on-prem solutions; adept at delivering scalable and cost-effective data solutions.
  * Exposure to Amazon Redshift and Azure SQL.
- Database Management:
  * Manage relational (SQL) and NoSQL (Redis) databases, ensuring data integrity and performance.
- Data Modeling and Warehousing:
  * Design and implement data models using Amazon Redshift, Google BigQuery, and Hive.
  * Specialize in crafting Data Lake solutions tailored to specific business needs.
- Data Quality and Governance:
  * Uphold data quality and governance standards, implementing robust validation checks and ensuring comprehensive data lineage.
- Documentation and Training:
  * Document release procedures thoroughly and provide comprehensive training for seamless implementation.
- Communication and Reporting:
  * Actively engage in meetings to provide transparent updates, aligning with business objectives.
- Automation and Visualization, Data Analysis:
  * Automate solutions using Airflow and create insightful Looker and Grafana dashboards for data visualization.
  * Analyze structured and unstructured data, enabling data-driven decision-making.

🚀 My mission is to empower organizations to leverage their data assets for informed decision-making and innovation.

🤝 Let's connect and explore how we can unlock the power of data together! Whether optimizing processes, architecting solutions, or maximizing data potential, I'm here to help you succeed. Reach out, and let's embark on this data-driven journey together!

Experience

Total Experience: 14 yrs 9 mos
Average Tenure: 2 yrs 11 mos
Current Experience: 3 yrs 1 mo

Astreya

Data Architect

Apr 2023 – Present · 3 yrs 1 mo · Bengaluru, Karnataka, India · Remote

Google Cloud Platform (GCP), SQL, Data Modeling, Data Pipelines, Python (Programming Language), Data Pipeline Architecture

Turing

Lead Data Engineer

Feb 2022 – Apr 2023 · 1 yr 2 mos · Remote

Google Cloud Platform (GCP), Amazon Redshift, LookML, Airflow, SQL, Data Engineering, +1 more

EPAM Systems

Lead Data Engineer

Dec 2021 – Jun 2022 · 6 mos · Bengaluru, Karnataka, India · Remote

Google Cloud Platform (GCP), SQL, Apache Airflow, Python, Data Modeling, Data Engineering, +1 more

IHS Markit

Senior Data Engineer

Sep 2020 – Dec 2021 · 1 yr 3 mos · Gurugram, Haryana, India · Remote

Google Cloud Platform (GCP), Python, PostgreSQL, Apache Airflow, SQL, Git, +3 more

Dunnhumby

2 roles

Lead Data Engineer

Promoted

Jan 2019 – Sep 2020 · 1 yr 8 mos · Gurugram, Haryana, India

Apache Airflow, Hadoop, InfluxDB, Google Cloud Platform (GCP), Python, Apache Oozie, +5 more

Senior Data Developer

Feb 2017 – Dec 2018 · 1 yr 10 mos · Gurugram, Haryana, India

PySpark, Python, Hive, Apache Airflow, Hadoop, Big Data Technologies, +1 more

Tata Consultancy Services

3 roles

IT Analyst

Promoted

Aug 2014 – Jan 2017 · 2 yrs 5 mos · On-site

Python, Shell Scripting, Perl, SQL DB2, Stored Procedures, Data Engineering

System Engineer

Promoted

Aug 2012 – Aug 2014 · 2 yrs · On-site

Python

Associate System Engineer

Aug 2010 – Aug 2012 · 2 yrs · On-site

Education

Liverpool John Moores University

Master of Science - MS, Data Science

Aug 2020 – Oct 2022

International Institute of Information Technology Bangalore

Post Graduate Diploma, Data Science

Aug 2020 – Sep 2021

Cochin University of Science and Technology

Bachelor of Technology - BTech, Computer Engineering

Aug 2006 – May 2010
