Pulak Chandan

Associate Consultant

Gurugram, Haryana, India · 9 yrs 11 mos experience

Key Highlights

  • 9+ years of experience in Big Data systems.
  • Expert in building data pipelines and processing.
  • Strong foundation in data structures and algorithms.
Stackforce AI infers this person is a Big Data Engineer with expertise in data processing and cloud technologies.

Skills

Core Skills

Airflow, Continuous Integration, Data Engineering, Big Data

Other Skills

Algorithms, Amazon DynamoDB, Amazon EKS, Amazon ElastiCache, Amazon Elastic MapReduce (EMR), Amazon Redshift, Amazon Web Services (AWS), Apache Spark, Azkaban, Azure DevOps, Big Data Analytics, Continuous Integration and Continuous Delivery (CI/CD), Data Analytics, Data Structures, Data Warehousing

About

Pulak has 9+ years of hands-on experience with Big Data systems, pipelines and data processing using Hadoop, PySpark/Spark, Python/Scala, Databricks, Hive, SQL/NoSQL, AWS, Docker & K8s. He has strong CS fundamentals, including data structures, algorithms, networking & operating systems.

Experience

Total Experience: 9 yrs 11 mos
Average Tenure: 1 yr 11 mos
Current Experience: 2 yrs 2 mos

Thoughtworks

Senior Consultant

Apr 2024 - Present · 2 yrs 2 mos · Gurugram, Haryana, India

Boston Consulting Group (BCG)

Senior Data Engineer

May 2021 - Apr 2024 · 2 yrs 11 mos · Gurugram, Haryana, India

  • Industrialization of Systems for Model Training, Management & Deployment: Continuous Integration Framework, Cloud Infra Orchestration, Central Error Logging using Airflow, Docker, Azure DevOps & MLflow.
  • Implemented CI/CD pipelines for building & deploying Docker containers using Azure DevOps to speed up releases, reduce deployment risk, integrate user feedback faster & build trust in code quality.
  • Developed Airflow DAGs in Python & deployed Airflow Docker containers on VMs to enable parallel pipeline runs, optimize throughput, track progress during execution & set run-time configurations for dynamic behavior.
  • Leveraged the MLflow library to inspect model parameters & metrics, and as an artifact registry.
Snowflake Cloud, Airflow, Continuous Integration

HP

Software Engineer II

Feb 2020 - May 2021 · 1 yr 3 mos · Greater Bengaluru Area

  • HP is a pioneer in print and imaging technologies: ink, laser, large-format and 3D printing. HP generates a huge volume of sensor/telemetry data from its products in the market (PCs, printers, etc.). As part of HP R&D, developed a common data platform to ingest and process printer telemetry data from different clients (printers, mobile apps, desktop applications).
  • Developed a Kinesis Firehose-based data ingestion platform to load streaming data from various clients into the data lake & process it in batches.
  • Developed data models, designed tables for Redshift and implemented Databricks/Spark-based ETL pipelines to process enrichment and telemetry data sources.
  • Analysed the existing technology environment and architecture to develop technical recommendations for improving application performance.
  • Created Docker images and coordinated with the DevOps team to deploy pods in a Kubernetes cluster (EKS) using Azure Codeway and Terraform.
Airflow

Tavant

Senior Software Engineer

Mar 2019 - Feb 2020 · 11 mos · Bengaluru, Karnataka, India

  • Developed data models and created a Data Platform for a Chicago-based online food ordering & delivery marketplace that connects diners with local restaurants. The platform offers consistent, supportable, fast and scalable access to the data warehouse.
  • Optimized the logical data model for Hive and created data pipelines, delivering high-quality data solutions that are testable and adhere to SLAs.
  • Interfaced with business customers, gathered requirements and delivered complete data and reporting solutions owning the design, development, and maintenance.
  • Developed Python/PySpark-based ETL jobs that leveraged AWS data technologies to pull data from Cassandra, Salesforce, MySQL and other third parties into the data warehouse on Apache Hive.
  • Identified downstream implications of data load changes during new data source integrations (e.g., data quality, code failure, future challenges).

Infosys

System Engineer

Jun 2016 - Feb 2019 · 2 yrs 8 mos · India

  • Built an Enterprise Content Management solution for an Australia-based bank. It can handle large amounts of structured, semi-structured and unstructured data.
  • Designed, developed & tested solutions for integrating data from varied sources.
  • Developed Hive- and PySpark-based applications for file transformations.
  • Monitored & troubleshot production bugs by identifying the root cause and implementing code fixes.

Education

CV Raman College of Engineering (CVRCE), Bhubaneswar

Bachelor of Technology - BTech — Computer Science

May 2012 - Mar 2016
