Bapi D.

SRE (Site Reliability Engineer)

Kolkata, West Bengal, India
4 yrs 8 mos experience
AI ML Practitioner · Highly Stable

Key Highlights

  • Expert in building reliable and scalable banking systems.
  • Strong background in cloud-native technologies and automation.
  • Proven track record in incident response and performance optimization.
Stackforce AI infers this person is a Fintech Infrastructure Engineer with expertise in cloud and automation.

Skills

Core Skills

SRE · DevOps · Monitoring · CI/CD

Other Skills

Prometheus · Grafana · Azure DevOps · Shell · Perl · Terraform · Docker · Kubernetes · AWS · Model Context Protocol (MCP) · Prompt Writing · Anthropic Claude · Shell Scripting · Cloud Computing · Cloud Security

About

As an SRE, I focus on building reliable, scalable, and secure systems that keep critical banking applications running smoothly. With hands-on experience across Azure, AWS, CI/CD pipelines, Kubernetes, Terraform, Docker, and observability tools, I work to improve system uptime, streamline deployments, and automate operational workflows.

I enjoy solving complex infrastructure problems, reducing toil, and implementing practices that enhance system performance and resilience. My work revolves around monitoring, incident response, performance optimization, and ensuring seamless production releases. I'm passionate about adopting cloud-native approaches, strengthening automation, and contributing to high-impact engineering teams. Always learning, always building, and always improving.

Key Skills: SRE • DevOps • Azure • AWS • Terraform • Docker • Kubernetes • CI/CD • GitHub/Azure DevOps • Monitoring • Logging • Observability • Incident Response • Automation

Open to learning, collaborating, and exploring modern reliability engineering challenges.

Experience

Hutech solutions

SENIOR SRE

Oct 2025 - Present · 6 mos · Bangalore Urban · On-site

  • Managed and supported Core Banking (CBS) and UPI/Payments systems in production and pre-production environments.
  • Designed and implemented monitoring solutions using Prometheus & Grafana, covering:
      • Application logs
      • Transaction metrics (UPI success/failure codes like 000, 909, 911)
      • Infrastructure metrics (CPU, disk, process count)
  • Developed custom exporters (Perl/Shell) to expose banking-specific metrics for Prometheus.
  • Configured alerting rules in Prometheus & Alertmanager to proactively detect:
      • Transaction failures
      • Service downtime
      • Log anomalies
  • Automated deployments using Azure DevOps pipelines, including:
      • Bulk deployment across multiple servers
      • Environment-specific configuration (UAT, PRE, PROD, DR)
  • Worked on CI/CD pipelines for application and observability stack deployments.
  • Implemented log aggregation solutions using rsync and Alloy-based pipelines for centralized logging.
  • Supported on-prem and hybrid infrastructure, including Linux/AIX servers.
  • Performed incident troubleshooting & root cause analysis (RCA) for production issues.
  • Ensured system reliability through cron automation, service monitoring, and health checks.
Prometheus · Grafana · Azure DevOps · Shell · Perl · SRE · +1

GlobalLogic

2 roles

Senior Software Engineer

Promoted

Jul 2024 - Sep 2025 · 1 yr 2 mos

  • Designed and implemented CI/CD pipelines using Azure DevOps/AWS for multiple environments (DEV, UAT, PRE, PROD), enabling faster and more reliable deployments
  • Developed prototype infrastructure setups using Terraform for rapid environment provisioning
  • Automated infrastructure provisioning using Terraform across cloud platforms like AWS and Azure
  • Built and managed containerized applications using Docker and orchestrated them using Kubernetes
  • Developed and maintained custom monitoring solutions using Prometheus, Grafana, and custom exporters for application, database, and log metrics
Azure DevOps · Terraform · Docker · Kubernetes · DevOps · CI/CD

Software Engineer

Jun 2022 - Jun 2024 · 2 yrs

  • Hitachi RM Project
  • Contributed to infrastructure automation and CI/CD pipeline setup for enterprise applications
  • Managed deployments using Azure DevOps pipelines, ensuring smooth multi-environment releases
  • Implemented monitoring and alerting solutions to improve system reliability and uptime
  • Supported production environments by troubleshooting deployment and performance issues
  • Worked closely with development teams to streamline build and release processes
Azure DevOps · Kubernetes · AWS · DevOps

Digihunk

Software Engineer

May 2021 - May 2022 · 1 yr · Remote

  • Built a POC using Amazon Bedrock for Generative AI use cases in governance platforms (GovOS)
  • Explored LLM-based automation for log analysis, intelligent summarization, and anomaly detection
  • Integrated AI capabilities with existing monitoring/logging systems for enhanced observability insights
  • Designed workflows to reduce manual analysis effort and improve incident response time
  • Automated infrastructure provisioning using Terraform, enabling consistent and repeatable environment setup
AWS · Terraform · DevOps

Education

JIS University

Bachelor of Technology - BTech

Kendriya Vidyalaya
