Mihir Shah

Software Engineer

Mumbai, Maharashtra, India · 9 yrs 9 mos experience
Highly Stable

Key Highlights

  • Managed over 10 million concurrent users seamlessly.
  • Led modernization of legacy systems to cloud-native architecture.
  • Developed observability platform reducing incident response time.
Stackforce AI infers this person is a Cloud Engineering and Site Reliability expert in B2C environments.

Skills

Core Skills

Cloud Computing, Site Reliability Engineering, Software Development, Cloud Engineering, Data Engineering, Process Automation

Other Skills

AWS, Amazon Machine Image (AMI), Amazon Web Services (AWS), Automation, Bash, C, CI/CD, Capacity Planning, Cloud-native Development, Cognos, Continuous Integration and Continuous Delivery (CI/CD), Cost Optimization, Data Governance, Data Masking, Data Quality Assurance

About

Experienced Site Reliability Engineer (SRE) with a demonstrated history of working in distributed systems and cloud computing. Skilled in Python, Java, and Bash, and in cloud platforms such as AWS and GCP. Certified Kubernetes Application Developer, AWS Developer, and Solutions Architect Associate. Certified Oracle Database SQL Expert and Oracle Certified Professional, Java SE 6 Programmer.

Experience

Sony Pictures Networks India

Principal Software Engineer

Apr 2025 - Present · 11 mos · Mumbai Metropolitan Region · On-site

  • Scaled a live streaming platform to handle a peak load of over 10 million concurrent users during the Asia Cup 2025, maintaining 99.99% uptime and a seamless viewing experience.
  • Spearheaded a critical platform modernization initiative, migrating high-traffic legacy core services to Amazon EKS (Elastic Kubernetes Service). This architectural shift improved system reliability, enhanced auto-scaling capabilities, and reduced operational costs by 20%.
  • Architected and led the development of a self-service, developer-centric observability platform. Automated the entire alert onboarding pipeline from Coralogix to Datadog, reducing the Mean Time to X (MTTx) for critical incidents by 30%. Concurrently mentored and managed a team of 10 junior engineers, focusing on best practices in cloud-native development and site reliability.
GitOps, Helm Charts, Google Cloud Platform (GCP), Incident Management, Observability Engineering, Datadog, +8

Dream11

Software Development Engineer 2

Apr 2020 - Apr 2025 · 5 yrs · Mumbai, Maharashtra, India · Hybrid

  • Handled 16 million concurrent users during IPL 2025.
  • Helped own application integration, capacity planning, development operations, and monitoring of 100+ production microservices using the four golden signals. Measured SLIs and defined alerts for SLOs.
  • Packaged and fine-tuned code artifacts into Amazon Machine Images (AMIs) to speed up provisioning during auto-scaling. This allowed the Auto Scaling group to run 100% on Spot Instances, optimizing cost.
  • Led the migration of microservice monitoring, alerting, and intelligent dashboards from New Relic to Datadog.
  • Participated in on-call activities to ensure system reliability and availability.
Incident Management, Monitoring, Amazon Machine Image (AMI), Datadog, Capacity Planning, Microservices, +2

Quantiphi, Inc.

Platform Engineer

Aug 2018 - Apr 2020 · 1 yr 8 mos · Mumbai, Maharashtra, India

  • DICOM Radimetrics Datalake solution - GCP
    • Automated provisioning of infrastructure and data ingestion pipelines using Terraform, containerized Elasticsearch and the UI using Docker for GKE deployments, and architected secure, HIPAA-compliant environments.
    • Designed, developed, and deployed a custom metrics, ELK monitoring, alerting, and logging system using Google Cloud Stackdriver to understand system performance. Created an analytical workbench for 100 end users.
  • Healthcare Insights data-hub platform - AWS
    • Set up an EKS cluster and wrote Helm charts to automate the deployment of business-logic microservices for high availability. Maintained source code versioning using Cloud Commit and built CI/CD pipelines using GitLab.
    • Implemented encryption at rest using KMS keys on S3 and encryption in transit using ACM SSL/TLS certificates for PHI data.
    • Implemented serverless Lambda scripts that reduced infrastructure costs by 30% across environments.
Terraform, Docker, Google Cloud Stackdriver, Elastic Search, Kubernetes, AWS, +2

Capgemini

Senior Software Engineer

Jul 2016 - Aug 2018 · 2 yrs 1 mo · Pune, Maharashtra, India

  • Morgan Stanley Data Masking QAPM (Offshore)
    • Responsible for ETL development to mask new applications and CR applications across MSSQL, Sybase, DB2, and flat files.
    • Responsible for process automation and enhancements.
    • Handled requirement-gathering walkthroughs, custom ETL mapplet and ETL code generation, Autosys job generation and execution, unit testing, dev peer review, QA peer review, deployment checklist creation, and functional documentation.
    • Developed an automation script for a frequent-schema-change ETL solution and data import/export functionality for DB2 Server.
    • Technologies: SQL, PL/SQL, IBM DB2, Sybase, Oracle, MS-SQL, DBArtisan, Informatica PowerCenter 9.0, Unix, Shell and Perl scripts, ALM, and JIRA (issue and project tracking, SDLC).
SQL, PL/SQL, ETL Development, Process Automation, Unix, Shell Scripting, +1

Education

Dwarkadas J. Sanghvi College of Engineering

Bachelor of Engineering (BE) — Computer Engineering

Jan 2013 - Jan 2016

Shri Baghubhai Mafatlal Polytechnic

Diploma — Computer Science Engineering

Jan 2010 - Jan 2013

St. Lawrence High School Kolkata

SSC

Jan 2000 - Jan 2010
