Anurag Ambuja - AI Researcher
Hey there! I'm Anurag Ambuja, a versatile Data Engineer | Analyst | ML Engineer dedicated to turning raw data into actionable insights that fuel business growth and innovation. In my current role, I specialize in architecting robust data pipelines, optimizing workflows, and implementing scalable solutions to meet modern business needs.

Here's what I bring to the table:
- Data Pipeline Architecture and ETL Development:
  * Build robust ETL processes with Google Cloud Data Fusion, Dataproc, or custom Python scripts, maintaining code integrity with Docker and Git.
  * Migrate structured data to Hadoop/Hive via Sqoop or Spark; adept in Hive programming for data manipulation.
  * Craft end-to-end pipelines using Data Build Tool (dbt), Apache Airflow, and Apache Spark, ensuring seamless data flow.
- Big Data Technologies:
  * Leverage Hadoop and Spark for efficient handling of large datasets.
  * Write optimized Spark jobs in PySpark; skilled in parsing XML and JSON using Python.
- Cloud Platforms:
  * Proficient in Google Cloud Platform and on-prem solutions; adept at building scalable, cost-effective data solutions.
  * Exposure to Amazon Redshift and Azure SQL.
- Database Management:
  * Manage relational (SQL) and NoSQL (Redis) databases, ensuring data integrity and performance.
- Data Modeling and Warehousing:
  * Design and implement data models using Amazon Redshift, Google BigQuery, and Hive.
  * Specialize in crafting Data Lake solutions tailored to specific business needs.
- Data Quality and Governance:
  * Uphold data quality and governance standards, implementing robust validation checks and ensuring comprehensive data lineage.
- Documentation and Training:
  * Document release procedures thoroughly and provide comprehensive training for seamless implementation.
- Communication and Reporting:
  * Actively engage in meetings to provide transparent updates aligned with business objectives.
- Automation, Visualization, and Data Analysis:
  * Automate solutions using Airflow and create insightful Looker and Grafana dashboards for data visualization.
  * Analyze structured and unstructured data, enabling data-driven decision-making.

My mission is to empower organizations to leverage their data assets for informed decision-making and innovation.

Let's connect and explore how we can unlock the power of data together! Whether optimizing processes, architecting solutions, or maximizing data potential, I'm here to help you succeed. Reach out, and let's embark on this data-driven journey together!
Stackforce AI infers this person is a Data Engineering expert with a focus on cloud-based big data solutions.
Location: Bengaluru, Karnataka, India
Experience: 14 yrs 9 mos
Skills
- Data Pipeline Architecture
- Data Modeling
- Data Engineering
- Data Warehousing
- Big Data Technologies
Career Highlights
- Expert in architecting scalable data pipelines.
- Proficient in both cloud and on-prem data solutions.
- Strong background in big data technologies and data governance.
Work Experience
Astreya
Data Architect (3 yrs 1 mo)
Turing
Lead Data Engineer (1 yr 2 mos)
EPAM Systems
Lead Data Engineer (6 mos)
IHS Markit
Senior Data Engineer (1 yr 3 mos)
dunnhumby
Lead Data Engineer (1 yr 8 mos)
Senior Data Developer (1 yr 10 mos)
Tata Consultancy Services
IT Analyst (2 yrs 5 mos)
System Engineer (2 yrs)
Associate System Engineer (2 yrs)
Education
Master of Science - MS at Liverpool John Moores University
Post Graduate Diploma at International Institute of Information Technology Bangalore
Bachelor of Technology - BTech at Cochin University of Science and Technology