Sai Arvind B

Software Engineer

Chennai, Tamil Nadu, India · 6 yrs 11 mos experience
Most Likely To Switch

Key Highlights

  • Expert in architecting data pipelines for large-scale data transfers.
  • Proven track record in optimizing data processing performance.
  • Strong background in cloud computing and data engineering.
Stackforce AI infers this person is a Data Engineering expert with a strong focus on AdTech and Fintech solutions.

Skills

Core Skills

Data Engineering · Apache Spark · AdTech · Cloud Computing · Backend Development

Other Skills

Airflow · Amazon Web Services (AWS) · Apache Ranger · Back-End Web Development · BigQuery · C++ · CI/CD · Cloud Administration · Cloud Development · Cloud Security · Core Java · Data Analysis · Data Pipelines · Dataproc · DevOps

About

I design and implement distributed systems and data pipelines, and solve complex data problems.

Experience

6 yrs 11 mos
Total Experience
1 yr 8 mos
Average Tenure
2 yrs 2 mos
Current Experience

PhonePe

Software Engineer - Data platform

Apr 2024 - Present · 2 yrs 2 mos

  • PHONEPE DATA WAREHOUSE TEAM : DATA ENGINEERING
  • Architected and developed a data pipeline facilitating data transfers of around 30 petabytes across diverse environments, including cluster-to-cluster, Iceberg to Hive, Hive to Hive, and between managed and external tables.
  • Set up Airflow for the entire organization and contributed a few patches to open-source Airflow.
  • Improved query performance on the reporting tables by 8x by changing their partition scheme at ingestion.
  • Reduced the runtime of the compaction process from 7 hours to 1 hour by parallelising the computation.
  • Reduced the number of small files generated in the stream ingestion pipeline from 700,000 per day to 6,000 per day.
  • Implemented Spark JVM metrics in the streaming pipeline.
  • Migrated batch and stream pipelines from Spark 2 to Spark 3.
Airflow · Data Pipelines · Apache Spark · Hive · Data Analysis · Data Engineering

Walmart Global Tech India

Software Engineer - Advertisement

Jan 2023 - Apr 2024 · 1 yr 3 mos · India

  • DATA CLEAN ROOM : ADTECH
  • Led high-level and low-level design for critical components (platform module, job executor and monitoring workflow using Airflow), ensuring scalability and performance
  • Successfully managed deployment processes across various environments, ensuring smooth application operations
  • TARGETING : ADTECH
  • Developed Spark applications, optimizing them for small clusters to process billions of rows efficiently
  • Led the migration of the scheduling framework from Jenkins to Airflow, enhancing workflow automation
  • Resolved bugs, executed production releases, and provided live support to ensure uninterrupted system
Airflow · Spark · Jenkins · Data Analysis · AdTech · Data Engineering

PayPal

Software Engineer

Aug 2021 - Jan 2023 · 1 yr 5 mos

  • MERCHANT REPORTING PLATFORM
  • Engineered and migrated monthly statement reports for millions of customers, transitioning from an on-premises Hadoop stack to Dataproc on Google Cloud, which reduced the execution time of Spark jobs from 10 hours to 2 hours
  • Played a key role in bug fixes, ticket resolution, and production pushes, while actively contributing to CI/CD pipelines and provid
  • DATABASE MONITORING
  • Designed and implemented a data pipeline, processing data from 200+ database instances, and integrated it with various monitoring tools
  • Conducted knowledge transfer sessions and interviews for the SDE II role, showcasing leadership and expertise
Hadoop · Dataproc · Spark · CI/CD · Data Engineering · Cloud Computing

Publicis Sapient

Software Engineer - Retail banking

Jul 2019 - Aug 2021 · 2 yrs 1 mo · Bangalore

  • NOTIFICATION BACKEND SYSTEM
  • Developed a low-latency, high-throughput notification backend system for financial transactions using Spark Streaming, which involved building ETL jobs on data coming in from Kafka to Hadoop services like HBase and Hive
  • INTERNET BANKING APPLICATION
  • account management domain of existing monolithic Internet Banking Application to loosely coupled, independently deployable microservices
  • Designed and implemented authentication workflows using OAuth and authentication trees, a feature supported by ForgeRock as the authorization server
  • Set up a CI/CD Jenkins pipeline to containerize and deploy microservices using Helm charts on to Google
Spark · Kafka · OAuth · CI/CD · Backend Development · Data Engineering

Education

National Institute of Technology, Tiruchirappalli

BTech - Bachelor of Technology — Electrical and Electronics Engineering

Jan 2015 - Jan 2019

Kendriya Vidyalaya

Higher Secondary

Jan 2013 - Jan 2015
