Sohom Bhattacharjee — DevOps Engineer

I'm a DevOps engineer who began as a sysadmin and never lost that deep appreciation for how systems actually work. I've led infrastructure transformations that significantly reduced deployment times, improved cluster efficiency, and enabled high availability at scale, across both startup and enterprise environments. I spend way too much time staring at logs. I write tools in Python, Bash, and Go, and can read C, C++, and Java (still debating if I should spend more time with Perl).

At Last9, I maintained a highly available, multi-region time-series data platform ingesting over 1.5 billion time-series per minute. I re-engineered the backend infrastructure to support rack-aware deployments, which reduced stateful cluster scaling time from 10–12 hours to a fully automated 2-hour operation, dramatically improving reliability and operational efficiency.

At Druva, I took a detour from SRE/DevOps and spent time working on two research projects involving the Kubernetes control plane and the VMDK/VDDK virtual storage layer.

I genuinely enjoy the challenge of migrations, whether it's orchestrating entire data-centre moves across regions or lifting and shifting legacy applications into the cloud. I’ve planned and executed complex transitions with zero downtime and a strong focus on reliability and repeatability.

💡 Things I work with regularly: Kubernetes, AWS, GCP, Docker, Prometheus, Terraform, Go, Python, GitOps, CI/CD

📦 Things I’ve worked with: Kafka, Riak, SolrCloud, ZooKeeper, VictoriaMetrics (cluster mode), Prometheus, GitLab, Postfix + Dovecot (Email), Prosody (XMPP), LVM

🛠️ What I care about: Simplicity in design, observability, reducing operational burden, teaching & mentoring

On the side, I tinker with my home lab (a 5TB NAS on a Raspberry Pi) and keep diving deeper into Linux, networking, and large-scale storage systems: the places where hardware and software shake hands.

Stackforce AI infers this person is a DevOps Engineer specializing in high-availability systems and cloud infrastructure.

Location: Bengaluru, Karnataka, India

Experience: 7 yrs 11 mos

Skills

Server Architecture
Performance Tuning
Site Reliability Engineering
Kubernetes
Engineering
Linux
DevOps

Career Highlights

Reduced stateful cluster scaling time from 10-12 hours to 2 hours.
Managed a time-series database ingesting over 1.5 billion time-series per minute.
Executed complex data-center migrations with zero downtime.

Work Experience

Cloudflare

Systems Engineer (4 mos)

E2E Cloud

Senior Site Reliability Engineer (5 mos)

Career Break (4 mos)

Last9 Inc

Site Reliability Engineer (2 yrs 1 mo)

Druva

Software Engineer (1 yr 11 mos)

People Interactive

Senior DevOps Engineer (6 mos)

DevOps Engineer (11 mos)

VoiceReach

DevOps Engineer (1 yr)

Azim Premji Foundation

Technical Consultant (5 mos)

Education

Bachelor of Computer Applications at S Nijalingappa College

Skills

Core Skills

Server Architecture
Performance Tuning
Site Reliability Engineering
Kubernetes
Engineering
Linux
DevOps

Other Skills

Linux System Administration
Ansible
Grafana
Terraform
Docker Products
Container Orchestration
Prometheus.io
Google Kubernetes Engine (GKE)
golang
API Testing
Python (Programming Language)
Git
GitOps
CI/CD
Kafka

Experience

Total Experience: 7 yrs 11 mos

Average Tenure: 1 yr

Current Experience: 4 mos

Cloudflare

Systems Engineer

Dec 2025 – Present · 4 mos · Bengaluru, Karnataka, India · Hybrid

E2E Cloud

Senior Site Reliability Engineer

Jun 2025 – Nov 2025 · 5 mos · Bangalore Urban, Karnataka, India · On-site

Building an AI/ML platform on top of Kubernetes
Revamping observability and alerting for the entire org
Automating bare-metal server provisioning and maintenance (OpenNebula)

Skills: Server Architecture, Performance Tuning, Linux System Administration

Career Break

Feb 2025 – Jun 2025 · 4 mos · Mountains

Went off to the mountains to complete my Basic Mountaineering Course.
Went off to Nepal to hike the Manaslu Circuit

Last9 Inc

Site Reliability Engineer

Jan 2023 – Feb 2025 · 2 yrs 1 mo · Pune, Maharashtra, India · Hybrid

Rewrote the backend infrastructure of our product to support rack-aware deployments. This reduced our stateful cluster scaling operations from 10-12 hours to a highly automated 2-hour operation.
Ran a TSDB for customers with three nines of availability in multiple regions across multiple deployments.
Owned the observability pipeline end-to-end.
Ran the TSDB at scale (peak ingestion > 1.5 billion time-series/min in our largest deployment).
Inter-region data-center migration (>30 TB of data) without any disruption to reads or writes.
Load testing, stress testing, and performance analysis of new software, with capacity planning based on the results.
Reduced toil on internal storage operations by writing an automation platform. New deployments now take less than 5 mins, as opposed to 45 mins.
Regular on-call rotations and capacity-planning discussions
Wrote tools in Python and Golang to support internal workloads and teams.
Managed a cluster with > 900 individual nodes
Building incident-response runbooks for the entire team
Mentoring and supporting junior engineers during on-call incidents.

Skills: Ansible, Grafana, Terraform, Site Reliability Engineering, Docker Products, Container Orchestration (+3 more)

Druva

Software Engineer

Jan 2021 – Dec 2022 · 1 yr 11 mos · Pune, Maharashtra, India

Software Engineer at Druva-Labs.
Worked on the Kubernetes Protection Project. We built a prototype operator that could perform backup and restore operations on Kubernetes StatefulSets (on top of AWS).
Worked on the VMware VMDK disk format for another project.
Performed API testing using Cypress.
Performed M&A analysis of other tools/companies from a technical perspective.

Skills: Engineering, golang, Terraform, Docker Products, Container Orchestration, Kubernetes

People Interactive

2 roles

Senior DevOps Engineer

Jul 2020 – Jan 2021 · 6 mos

Skills: Ansible, Linux, Grafana, Python (Programming Language), Terraform, Site Reliability Engineering (+4 more)

DevOps Engineer

Aug 2019 – Jul 2020 · 11 mos

Owner of the entire container platform, including the CI/CD layer and multiple staging and production clusters across multiple business units.
Maintained system stability during multiple AWS outages.
Migrated Redis live in prod without service disruption.
Built and maintained infra to run SolrCloud on top of ECS.
Built and maintained infra to run ZooKeeper on top of ECS.
Helped maintain multiple Kafka clusters.
Maintained internal CI/CD and deployment tooling built on top of GitLab.
Day-to-day operations, tasks, and optimizations.

Skills: Ansible, Linux, Grafana, Python (Programming Language), Terraform, Site Reliability Engineering (+3 more)

VoiceReach

DevOps Engineer

Aug 2018 – Aug 2019 · 1 yr · Mumbai, Maharashtra, India

I started as a sysadmin / DevOps Engineer where I managed day-to-day operations using Terraform, Ansible, Packer and Jenkins.
To better understand how the database we were running (Riak) worked, I grokked the manual and performed benchmarks. This resulted in a significant improvement in the uptime of our Riak cluster. I also set up monitoring for the JVM and Riak.
Wrote a couple of Python tools to make bulk import/export from Riak fast. After this, exporting the entire DB took 2 hours as opposed to 1 day. This enabled us to perform faster database backups and migrations.
Volunteered to write a data-cleaning tool for the data team. Automating away repetitive tasks reduced their toil, cutting the time taken to clean data (before ingestion) from 6 hours to 1 hour.

Skills: Ansible, Linux, Python (Programming Language), Terraform, Docker Products, Git (+2 more)

Azim Premji Foundation

Technical Consultant

Apr 2017 – Sep 2017 · 5 mos · Bengaluru Area, India

I was responsible for training the Content Development team at Azim Premji Foundation on Linux and other FOSS tools that can be used in a classroom context for teaching high school children.

Skills: Linux