Preetish Kumar Tripathi — SRE (Site Reliability Engineer)

Accomplished hands-on technical leader with 16+ years of experience and a proven track record of architecting and delivering reliable, scalable systems in a variety of areas, such as Operating Systems, Telecom, Ad-serving engines, Finance, Streaming services, and Hybrid Sales/Marketing events.

Technical Expertise: Stateless/Stateful Microservices, Loosely coupled REST APIs, Application Orchestration, Application Monitoring, Linux/NetBSD/FreeBSD/Solaris/Solaris Zones, chroot, Docker, Kubernetes, OpenShift, Hadoop, Kafka, Storm, JDG, Redis, Cassandra, Neo4j, MySQL, Chef, Foreman, Puppet, CloudStack, Xen, AWS, Cobbler, PXE, Elasticsearch, Kibana, DNS, DHCP, TCP/IP Stack, Perl, Python, Ruby, Golang, C#, Bash and a few more.

My learnings from the last 2 decades (will change over time):

Trying to make a pure Software Engineer, Developer or Programmer into an SRE is like "giving the thieves the key to your home".
There is nothing like a bad company to work for: you learn what not to do, and that is more enlightening.
"It is no measure of health to be well adjusted to a profoundly sick society" (company/team). This is by Jiddu Krishnamurtiji.
Culture should not just come from the top down; it can also come from the bottom up.
"Never hire someone who created their resume using AI": if a person cannot write 3000 words on their own, they are no good.
Never join a company where they ask you "How will you design a parking lot?", "What are the classes and methods you will write to do some shit?" or "Debug this piece of code on WordPad". I would prefer that someone ask me "What is the relationship between /dev/random and the second law of thermodynamics?" or "Why is load average a logarithmic decay function?"
"Keep your critic close, build him a hut in your courtyard; without water and without soap, he washes your character clean." (Sant Kabir Das)

Stackforce AI infers this person is a seasoned Site Reliability Engineer with expertise in cloud infrastructure and high-availability systems.

Location: Bengaluru, Karnataka, India

Experience: 15 yrs 4 mos

Skills

Site Reliability Engineering
Cloud Computing
Data Management
DevOps
Infrastructure Management
Automation

Career Highlights

16+ years of experience in technical leadership.
Expert in architecting scalable systems across multiple industries.
Proven track record in Site Reliability Engineering.

Work Experience

Red Hat

Principal SRE (1 yr 4 mos)

Goldcast

SRE@goldcast.io (1 yr)

PhonePe

Sr Manager SRE@PhonePe (3 yrs)

Goods And Services Tax Network

VP Infrastructure and DevOps (2 yrs)

Walmart Global eCommerce

Tech@Walmartlabs ( SRE/TDO ) (2 yrs)

Citrix

DevOps@Citrix (Dev Ops ) (2 yrs)

Yahoo!

Technical Lead Unix/Linux (Dev Ops) (2 yrs)

Red Hat Software Services

Engineer (2 yrs)

Highly Stable

Skills

Other Skills

AWS
Agile Methodologies
Business Process Improvement
C#
Capacity Planning
Continuous Integration
Continuous Process Improvement
Critical Thinking
Data Warehousing
E-commerce
Emotional Intelligence
Global Operations
Golang
Hadoop
Kubernetes

Experience

Total Experience: 15 yrs 4 mos
Average Tenure: 2 yrs
Current Experience: 1 yr 4 mos

Red Hat

Principal SRE

Jan 2025 – Present · 1 yr 4 mos

I am a boomerang: I worked for Red Hat from 2008 to 2010.
Presently working as an SRE on Azure Red Hat OpenShift (our team takes care of thousands of OpenShift clusters on Azure and maintains an SLA of 99.95%).
Working on the ARO-RP repo, written in Golang, which takes care of the entire lifecycle of an ARO cluster (https://github.com/Azure/ARO-RP, https://github.com/preetisht).
Worked on the Fluent Bit upgrade, and now working on some very critical security fixes in our RP infra (this requires working on ARO-RP and some internal Microsoft repos).
Also working on improving our monitoring (C#).
Still learning OpenShift, Kubernetes, Golang and C#.

Site Reliability Engineering, OpenShift, Kubernetes, Golang, C#, Cloud Computing

Goldcast

SRE@goldcast.io

Oct 2022 – Oct 2023 · 1 yr

The most fun and toughest company I have worked for in terms of scaling complexity and end-user workflow.
I joined as an SRE (no prefix or suffix). Very young engineering team and very young leadership (please look at goldcast.io; I have never seen a UI so great, and kudos to the backend team too). The product is like Zoom on steroids, and web based.
I learnt a lot about how a UI is developed and deployed, the bad things a web UI can do to bring down the backend, how the backend of a streaming service works (similar to Netflix, YouTube, Zoom meetings etc.) and a thing or two about AWS (especially RDS and Lambda functions). I also learnt how one scales up and down in a business like Goldcast, where one has a window of only 5 minutes to scale from literally 5,000 users to 100,000 users.
Goldcast was like a one-year crash course on how online marketing events work, where there might be 10 speakers and 100,000 attendees, with a real-time conversation going on alongside everything else one can think of in an interactive webinar.

Site Reliability Engineering, AWS, Streaming Services, Cloud Computing

PhonePe

Sr Manager SRE@PhonePe

Jan 2019 – Jan 2022 · 3 yrs · Bengaluru Area, India

To understand the scale at which PhonePe works, see https://www.phonepe.com/pulse/explore/transaction/2024/4/
Joined as an IC4. My first task was working on staging infra in the new DC, followed by migrating our Hadoop stack from the old DC to the new one (IIRC we were doing 3.2 billion events per day, about 3 terabytes of data per day; I hope I am correct). Once that was done, I, along with two great engineers (I was learning from them), took care of PhonePe's data warehouse.
Corona came, and at almost the same time we had a re-org in which PhonePe's leadership decided to restructure the company into multiple Business Verticals. I started with 2 business verticals and 2 direct reportees (2 other engineers from a consulting company taking care of the entire Hadoop stack). Over the next 2 years, our team grew into a 15-member team taking care of approximately 50% of PhonePe's clusters (MariaDB, Mesos, Elasticsearch, RabbitMQ, Aerospike, nginx etc.).
Our team also migrated the on-prem application stack of the Insurance business to Azure in 2 months (Dev, Stage and Prod environments; not perfect, but we abided by the SEBI rules and passed all the security requirements).
By the end of my tenure at PhonePe, the team was taking care of the entire Merchant Ecosystem, Recharge and Bill Pay, the Fraud and Risk Analysis Platform and Insurance, and had started working on Mutual Funds and a few other backend services (kindly look at https://www.phonepe.com/pulse/explore/transaction/2024/4/ to get an idea of the scale of the Business Verticals).
The team did some experimental projects, such as a config search engine, Project Nirvana (form-based deployment of dev/stage infra) and a few more, which did not see the light of day since we had to work under a lot of governmental regulations and had to concentrate on expanding our infra (during my tenure, we went from 10 million to more than 120 million transactions per day).

Team Management, Site Reliability Engineering, Hadoop, Data Warehousing, Data Management

Goods and Services Tax Network

VP Infrastructure and DevOps

Jan 2017 – Jan 2019 · 2 yrs · New Delhi Area, India

Headed Infra and DevOps at the Goods and Services Tax Network, the technical backbone of one of the largest and most complicated taxation networks in the world, serving ~12 million taxpayers (business entities). This was a historic move by the Government of India from Value Added Tax to Goods and Services Tax.
Led a team of engineers (MSPs) taking care of Infrastructure (Network, Storage, Security, Systems), Platform (Redis, Kafka, Hadoop, Storm, JDG, Kubernetes, Docker, MySQL) and Application (40+ different microservices).
Headed the Business Intelligence and Fraud Analytics team, reporting directly to the CEO. Responsible for providing GST insights to various departments of the Government of India, including the PMO, the Finance Secretary and the Chief Economic Advisor of India. Had the privilege of regularly meeting the then Finance Secretary of India, the Chief Economic Advisor of India, Finance Ministers of a few states, high-ranking IRS officers from Direct Taxes, people from the FIU, the India Head of the World Bank, and high-ranking officials from the taxation departments of various countries, to discuss Business Intelligence and Fraud Analytics. Also gave a talk to Tax Officers from 37 states/union territories on how to interpret data on certain kinds of GST fraud.
The Goods and Services Tax Network was the company that made me realise that Computer Science has nothing to do with computers or science; it is an art of problem solving. My CEO (he would sit for hours explaining to us what kind of data he wanted and which tables we should look into), CTO (he is from a core tech background, and I had worked with him at Yahoo and Walmart), SVP (PhD + Colonel), very senior IRS officers and the present CEO of GSTN would all literally come up with algorithms to catch fraud and to get more insights into our data; my team and I would just codify them. I literally worked with an IES officer writing Do and R code.

Team Management, DevOps, Infrastructure Management

Walmart Global eCommerce

Tech@Walmartlabs ( SRE/TDO )

Jan 2015 – Jan 2017 · 2 yrs · Bengaluru Area, India

Initiated a project on running Cassandra clusters in Docker containers, with orchestration done using Chef + Python.
Worked on PCI check automation for our entire fleet (20,000+ hosts).
Worked on an inventory management system, for which I wrote the client that captures data from the systems at regular intervals and POSTs it to our framework. Also wrote a couple of backend APIs for group aggregation.
Wrote spec files for packaging the DevOps tools we built for automation purposes.
Automated sanity checks for our build jobs.
Worked as Technical Duty Officer (TDO), taking part in P1/P2 incidents for our 5 ecommerce pillars. As a TDO, it was my responsibility to do whatever it took to bring the site back up. We were 7 TDOs across three time zones (US, Europe, Asia).

DevOps, Site Reliability Engineering, Automation

Citrix

DevOps@Citrix (Dev Ops )

Jan 2013 – Jan 2015 · 2 yrs · Bengaluru Area, India

Worked on Foreman, Cobbler, Puppet, CloudStack, Xen, AWS, Zenoss, Proteus IPAM, StrongMail and Sendmail.
Managed around 7,000 bare-metal hosts and 3,000 VMs.
Wrote host build/decommission automation in Perl, which took care of adding/deleting the host profile in Cobbler, registering/unregistering it with Foreman and Puppet, adding/deleting DNS entries, and adding/deleting the Zenoss entry.
Wrote SAN auditing for our entire fleet in Perl.
Wrote CloudStack load testing that would create up to 100 VMs, run sanity checks on them and then destroy them.

DevOps, Automation, Infrastructure Management

Yahoo!

Technical Lead Unix/Linux (Dev Ops)

Jan 2011 – Jan 2013 · 2 yrs · Bangalore

Was Technical Lead, DevOps, for the Advertising Platform, which had a fleet of around 16,000 boxes responsible for targeted ads.
Worked on the data center consolidation project, moving from 13 DCs to 6 DCs.
Worked on migrating monitoring from Nagios to an in-house Monitoring as a Service.
Wrote an application endpoint data monitoring framework in Perl.
Set up log monitoring using Flume and Splunk for a component generating 50 TB of compressed logs per month.

DevOps, Infrastructure Management

Red Hat Software Services

Engineer

Jan 2008 – Jan 2010 · 2 yrs · Pune

Worked on issues related to the kernel, filesystems, software RAID, LVM, multipath, Qpid AMQP messaging, Condor grid and performance for all the services provided with RHEL.
Wrote detailed knowledge base articles for rhn.com (now known as access.redhat.com).
Worked with some great engineers like Alan Cox (kernel maintainer), Neil Horman (TCP/IP stack maintainer) and Ulrich Drepper (glibc maintainer).