Preetish Kumar Tripathi

SRE (Site Reliability Engineer)

Bengaluru, Karnataka, India · 15 yrs 3 mos experience
Highly Stable

Key Highlights

  • 16+ years of experience in technical leadership.
  • Expert in architecting scalable systems across multiple industries.
  • Proven track record in Site Reliability Engineering.
Stackforce AI infers this person is a seasoned Site Reliability Engineer with expertise in cloud infrastructure and high-availability systems.

Skills

Core Skills

Site Reliability Engineering · Cloud Computing · Data Management · DevOps · Infrastructure Management · Automation

Other Skills

AWS · Agile Methodologies · Business Process Improvement · C# · Capacity Planning · Continuous Integration · Continuous Process Improvement · Critical Thinking · Data Warehousing · E-commerce · Emotional Intelligence · Global Operations · Golang · Hadoop · Kubernetes

About

Accomplished hands-on technical leader with 16+ years of experience and a proven track record of architecting and delivering reliable and scalable systems in a variety of areas, such as operating systems, telecom, ad-serving engines, finance, streaming services, and hybrid sales/marketing events.

Technical expertise: stateless/stateful microservices, loosely coupled REST APIs, application orchestration, application monitoring, Linux/NetBSD/FreeBSD/Solaris/Solaris Zones, chroot, Docker, Kubernetes, OpenShift, Hadoop, Kafka, Storm, JDG, Redis, Cassandra, Neo4j, MySQL, Chef, Foreman, Puppet, CloudStack, Xen, AWS, Cobbler, PXE, Elasticsearch, Kibana, DNS, DHCP, TCP/IP stack, Perl, Python, Ruby, Golang, C#, Bash and a few more.

My learnings from the last 2 decades (these will change over time):

  • Trying to make a pure software engineer, developer or programmer into an SRE is like "giving the thieves the key to your home".
  • There is nothing like a bad company to work for. You learn what not to do, and that is more enlightening.
  • "It is no measure of health to be well adjusted to a profoundly sick society" (or company/team). This is by Jiddu Krishnamurtiji.
  • Culture should not just come from the top down; it can also come from the bottom up.
  • "Never hire someone who created their resume using AI." If a person cannot write 3000 words on their own, they are no good.
  • Never join a company where they ask you "how will you design a parking lot", "what are the classes and methods you will write to do some shit", or "debug this piece of code on WordPad". I would prefer that someone ask me "what is the relationship between /dev/random and the second law of thermodynamics" or "why is load average a logarithmic decay function".

"Keep your critic close; build him a hut in your courtyard. Without water and without soap, he cleanses your character." (Sant Kabir Das)

Experience

Redhat company

Principal SRE

Jan 2025 - Present · 1 yr 2 mos

  • I am a boomerang: I worked for Red Hat from 2008 to 2010.
  • Presently working as an SRE on Azure Red Hat OpenShift (our team takes care of thousands of OpenShift clusters on Azure and maintains an SLA of 99.95%).
  • Working on the ARO-RP repo (Golang), which takes care of the entire lifecycle of an ARO cluster ( https://github.com/Azure/ARO-RP , https://github.com/preetisht ).
  • Worked on the Fluent Bit upgrade, and now working on some very critical security fixes in our RP infra (this requires working on ARO-RP and some internal Microsoft repos).
  • Also working on improving our monitoring (C#).
  • Still learning OpenShift, Kubernetes, Golang and C#.
Site Reliability Engineering · OpenShift · Kubernetes · Golang · C# · Cloud Computing

Goldcast

SRE@goldcast.io

Oct 2022 - Oct 2023 · 1 yr

  • The most fun and toughest company I have worked for in terms of scaling complexity and end-user workflow.
  • I joined as an SRE (no prefix or suffix). Very young engineering team and very young leadership (please look at goldcast.io; I have never seen a UI so great, and kudos to the backend team too). The product is like Zoom on steroids, and web based.
  • I learnt a lot about how a UI is developed and deployed, the bad things a web UI can do to bring down the backend, how the backend of a streaming service works (similar to Netflix, YouTube, Zoom meetings, etc.), and a thing or two about AWS (especially RDS and Lambda functions). I also learnt how one scales up and down in a business like Goldcast, where there is only a 5-minute window to scale from literally 5,000 users to 100,000 users.
  • Goldcast.io was like a one-year crash course on how online marketing events work, where there might be 10 speakers and 100,000 attendees, with a real-time conversation going on and everything one can think of in an interactive webinar.
Site Reliability Engineering · AWS · Streaming Services · Cloud Computing

Phonepe

Sr Manager SRE@PhonePe

Jan 2019 - Jan 2022 · 3 yrs · Bengaluru Area, India

  • To understand the scale at which PhonePe works, see https://www.phonepe.com/pulse/explore/transaction/2024/4/
  • Joined as an IC4. My first task was working on staging infra in the new DC, followed by the migration of our Hadoop stack from the old DC to the new one (IIRC we were doing 3.2 billion events per day, about 3 terabytes of data per day; I hope I am correct). Once that was done, I took care of PhonePe's data warehouse along with two great engineers (I was learning from them).
  • Corona came, and at almost the same time we had a re-org in which PhonePe's leadership decided to restructure the company into multiple business verticals. I started with 2 business verticals and 2 direct reports (2 other engineers from a consulting company taking care of the entire Hadoop stack). Over the next 2 years, our team grew into a 15-member team taking care of approximately 50% of PhonePe's clusters (MariaDB, Mesos, Elasticsearch, RabbitMQ, Aerospike, nginx, etc.).
  • Our team also migrated the on-prem application stack (insurance business) to Azure in 2 months (dev, stage and prod environments; not perfect, however we abided by the SEBI rules and passed all the security requirements).
  • By the end of my tenure at PhonePe, the team was taking care of the entire Merchant Ecosystem, Recharge and Bill Pay, the Fraud and Risk Analysis Platform, and Insurance, and had started working on Mutual Funds and a few other backend services (kindly look at https://www.phonepe.com/pulse/explore/transaction/2024/4/ to get an idea of the scale of the business verticals).
  • The team did some experimental projects, such as a config search engine, Project Nirvana (form-based deployment of dev/stage infra) and a few more, which did not see the light of day since we had to work under a lot of governmental regulations and had to concentrate on expanding our infra (during my tenure, we went from 10 million transactions to more than 120 million transactions per day).
Team Management · Site Reliability Engineering · Hadoop · Data Warehousing · Data Management

Goods and services tax network

VP Infrastructure and DevOps

Jan 2017 - Jan 2019 · 2 yrs · New Delhi Area, India

  • Headed the Infra and DevOps of the Goods and Services Tax Network, the technical backbone of one of the largest and most complicated taxation networks in the world, serving ~12 million taxpayers (business entities). This was a historic move by the Government of India from Value Added Tax to Goods and Services Tax.
  • Led a team of engineers (MSPs) taking care of Infrastructure (Network, Storage, Security, Systems), Platform (Redis, Kafka, Hadoop, Storm, JDG, Kubernetes, Docker, MySQL) and Application (40+ different microservices).
  • Headed the Business Intelligence and Fraud Analytics team, reporting directly to the CEO. Responsible for providing GST insights to various departments of the Government of India, including the PMO, the Finance Secretary and the Chief Economic Advisor of India. Had the privilege of regularly meeting the then Finance Secretary of India, the Chief Economic Advisor of India, Finance Ministers of a few states, high-ranking IRS officers from Direct Taxes, people from the FIU, the India Head of the World Bank, and high-ranking officials from the taxation departments of various countries, to discuss Business Intelligence and Fraud Analytics. Also gave a talk to tax officers from 37 states/union territories on how to interpret data on certain kinds of GST fraud.
  • The Goods and Services Tax Network was the company that made me realise that Computer Science has got nothing to do with computers or science. It's an art of problem solving. My CEO (he would sit for hours explaining to us what kind of data he wanted and which tables we should look into), CTO (he is from a core tech background, and I had worked with him at Yahoo and Walmart), SVP (PhD + Colonel), very senior IRS officers and the present CEO of GSTN would all literally come up with algorithms to catch fraud and to get more insights into our data; my team and I would just codify them. I literally worked with an IES officer writing Do and R code.
Team Management · DevOps · Infrastructure Management

Walmart global ecommerce

Tech@Walmartlabs ( SRE/TDO )

Jan 2015 - Jan 2017 · 2 yrs · Bengaluru Area, India

  • Initiated a project on running Cassandra clusters in Docker containers. The orchestration was done using Chef + Python.
  • Worked on PCI check automation for our entire fleet (20,000+ hosts).
  • Worked on an inventory management system, for which I wrote the client that captures data from the systems at regular intervals and POSTs it to our framework. Also wrote a couple of backend APIs for group aggregation.
  • Wrote spec files for packaging the DevOps tools we built for automation purposes.
  • Automated sanity checks for our build jobs.
  • Worked as Technical Duty Officer (TDO), taking part in P1/P2 incidents for our 5 ecommerce pillars. As a TDO, it was my responsibility to do whatever it took to bring the site back up. We were 7 TDOs across three timezones (US, Europe, Asia).
DevOps · Site Reliability Engineering · Automation

Citrix

DevOps@Citrix (Dev Ops )

Jan 2013 - Jan 2015 · 2 yrs · Bengaluru Area, India

  • Worked on Foreman, Cobbler, Puppet, CloudStack, Xen, AWS, Zenoss, Proteus IPAM, Strongmail, Sendmail.
  • Managed around 7,000 bare-metal hosts and 3,000 VMs.
  • Wrote host build/decommission automation in Perl, which took care of adding/deleting the host profile in Cobbler, registering/unregistering it with Foreman and Puppet, adding/deleting DNS entries, and adding/deleting the Zenoss entry.
  • Wrote SAN auditing for our entire fleet in Perl.
  • Wrote CloudStack load testing which would create up to 100 VMs, run sanity checks on them and then destroy them.
DevOps · Automation · Infrastructure Management

Yahoo!

Technical Lead Unix/Linux (Dev Ops)

Jan 2011 - Jan 2013 · 2 yrs · Bangalore

  • Was Technical Lead, DevOps, for the Advertising Platform, which had a fleet of around 16,000 boxes responsible for targeted ads.
  • Worked on the data center consolidation project, moving from 13 DCs to 6 DCs.
  • Worked on migrating monitoring from Nagios to an in-house Monitoring as a Service.
  • Wrote an application endpoint data monitoring framework in Perl.
  • Set up log monitoring using Flume and Splunk for a component generating 50 TB of compressed logs per month.
DevOps · Infrastructure Management

Red hat software services

Engineer

Jan 2008 - Jan 2010 · 2 yrs · Pune

  • Worked on issues related to the kernel, filesystems, software RAID, LVM, multipath, QPID AMQP messaging, Condor grid, and performance issues for all the RHEL-provided services.
  • Wrote detailed knowledge base articles for rhn.com (now known as access.redhat.com).
  • Worked with some great engineers like Alan Cox (kernel maintainer), Neil Horman (TCP/IP stack maintainer) and Ulrich Drepper (glibc maintainer).
