Pushkar Joshi — SRE (Site Reliability Engineer)

I'm a founding team member of Palindrome, where we've built an AI-native platform that transforms how wealth managers and private banks operate. I focus on creating and maintaining a robust cloud infrastructure, enabling our GenAI solutions to deliver seamless, secure experiences. Our platform at Palindrome eliminates time-consuming manual processes throughout the client journey, transforming everything from client onboarding to review cycles, risk assessments, and regulatory compliance. This technology empowers financial firms to expand their business, enhance client service, and substantially reduce administrative burden. We currently support institutions collectively managing assets exceeding $120B.

Prior to Palindrome, I was a Senior Site Reliability Engineer at DeepIntent, where I spearheaded the implementation of Kubernetes clusters on AWS and reduced incident response time by 40% through advanced monitoring solutions. Before that, at Vuclip India, I managed 40+ GKE Kubernetes clusters and orchestrated a major data center to GCP migration involving over 300 TB of data.

Throughout my 9+ year career, I've engineered cloud infrastructure solutions across media, advertising, insurance, and financial sectors. My most significant technical contributions include:
● Designing high-availability cloud architectures that maintained 99.99% uptime during critical migrations while supporting mission-critical applications.
● Creating comprehensive monitoring ecosystems with Prometheus, Grafana, and AlertManager that reduced mean time to resolution by 70% and provided actionable insights.
● Implementing infrastructure as code with Terraform across multiple cloud providers, ensuring consistent, secure, and scalable deployments.
● Developing automated CI/CD pipelines that accelerated deployment cycles from days to hours while strengthening security posture.

I've earned certifications including AWS Solutions Architect, Google Cloud Professional Architect, Kubernetes Administrator (CKA), Kubernetes Application Developer (CKAD), HashiCorp Certified Terraform Associate, and Red Hat Certified System Administrator. I thrive on tackling complex infrastructure challenges that deliver tangible business value, combining technical depth with a customer-centric approach to build reliable, efficient, and secure systems that scale.

SRE | DevOps | GCP | AWS | CKA | CKAD | RHCSA | Docker | Kubernetes | CI/CD | ELK | Jenkins | Prometheus | Grafana | New Relic | IBM Aspera

Stackforce AI infers this person is a Cloud Infrastructure Engineer with expertise in Site Reliability Engineering and DevOps in the Fintech sector.

Location: Pune, Maharashtra, India

Experience: 10 yrs 4 mos

Skills

Cloud Infrastructure
DevOps
Site Reliability Engineering
Cloud Management
Cloud Migration
IT Consulting
System Administration

Career Highlights

Designed high-availability cloud architectures with 99.99% uptime.
Reduced incident response time by 40% through advanced monitoring.
Accelerated deployment cycles from days to hours with CI/CD.

Work Experience

Palindrome

Lead SRE (1 yr 8 mos)

DeepIntent

Senior Site Reliability Engineer (7 mos)

Site Reliability Engineer II (3 yrs 5 mos)

Vuclip Inc.

Site Reliability Engineer (2 yrs)

Allstate India

IT Consultant (5 mos)

Associate IT Consultant (2 yrs 3 mos)

Education

Engineer’s Degree at MMCOE, Pune

Diploma in E&TC at T B Girwalkar Polytechnic


Highly Stable



Skills

Core Skills

Cloud Infrastructure
DevOps
Site Reliability Engineering
Cloud Management
Cloud Migration
IT Consulting
System Administration

Other Skills

AWS
Akamai
Amazon CloudWatch
Amazon EC2
Amazon Relational Database Service (RDS)
Amazon Route 53
Amazon S3
Aspera
Bash
C
CI/CD
Change Management
Cloud Security
Cloud Storage
Data Center


Experience

Total Experience: 10 yrs 4 mos
Average Tenure: 2 yrs 10 mos
Current Experience: 1 yr 8 mos

Palindrome

Lead SRE

Sep 2024 – Present · 1 yr 8 mos · Pune, Maharashtra, India · Hybrid

At Palindrome, I design, implement, and maintain a secure and scalable cloud infrastructure for web and mobile applications, powering our AI-native platform for wealth managers and private banks. Collaborating closely with our Product Engineering team, I deploy and manage cutting-edge technologies, ensuring seamless integration with clients' existing workflows while upholding the highest security standards essential for the financial services industry.
Key Responsibilities:
Infrastructure Management: Leveraging Terraform to provision and manage cloud resources that support our agentic AI solutions.
CI/CD Implementation: Developing automated pipelines that facilitate rapid deployment of new features and enhancements.
Cloud Security: Enforcing robust security measures in compliance with financial services regulations.
Monitoring & Optimization: Utilizing Prometheus and Grafana for real-time system health insights and optimizing resource utilization.

Terraform · CI/CD · Cloud Security · Prometheus · Grafana · Cloud Infrastructure +1

DeepIntent

2 roles

Senior Site Reliability Engineer

Promoted

Feb 2024 – Sep 2024 · 7 mos · Pune, Maharashtra, India · Hybrid

DeepIntent is a digital advertising company. It offers its clients the Healthcare Marketing Platform, the industry’s first and only DSP with in-platform optimization toward business outcomes. The DeepIntent platform connects marketers with patients through unique data, premium media partnerships across all devices, and custom integrations.
● Implementing monitoring and alerting solutions with Prometheus and Grafana, integrated with PagerDuty.
● Setting up the CI/CD pipeline with GitHub Actions.
● Creating and maintaining Kubernetes clusters across Production, Dev, Staging, and Release environments.
● Setting up Aerospike and Airflow on EC2 as well as on Kubernetes.
● Providing technical support to developers.
● Troubleshooting issues during and after application deployments on Kubernetes.
● Managing AWS services including EC2, RDS, VPC, S3, and Route 53.
● Configuring VPC peering between different AWS accounts and setting up routing.
● Taking regular backups of Grafana dashboards and data sources.
● Setting up the OAuth2 Proxy service on Kubernetes to authenticate external-facing URLs with Google G Suite accounts.
● Setting up Jenkins on Kubernetes for cost optimization and high availability.
● Working on the Linkerd service mesh.
● Setting up a logging system with Loki on Kubernetes.
● Creating Helm charts for different APIs/services.

Prometheus · Grafana · GitHub Actions · Kubernetes · AWS · Site Reliability Engineering +1

Site Reliability Engineer II

Sep 2020 – Feb 2024 · 3 yrs 5 mos · Pune, Maharashtra, India · Hybrid

● Implementing monitoring and alerting solutions with Prometheus and Grafana, integrated with PagerDuty.
● Setting up the CI/CD pipeline with GitHub Actions.
● Creating and maintaining Kubernetes clusters across Production, Dev, Staging, and Release environments.
● Setting up Aerospike and Airflow on EC2 as well as on Kubernetes.
● Providing technical support to developers.
● Troubleshooting issues during and after application deployments on Kubernetes.
● Managing AWS services including EC2, RDS, VPC, S3, and Route 53.
● Configuring VPC peering between different AWS accounts and setting up routing.
● Taking regular backups of Grafana dashboards and data sources.
● Setting up the OAuth2 Proxy service on Kubernetes to authenticate external-facing URLs with Google G Suite accounts.
● Setting up Jenkins on Kubernetes for cost optimization and high availability.
● Working on the Linkerd service mesh.
● Setting up a logging system with Loki on Kubernetes.
● Creating Helm charts for different APIs/services.

Prometheus · Grafana · GitHub Actions · Kubernetes · AWS · Site Reliability Engineering +1

Vuclip Inc.

Site Reliability Engineer

Sep 2018 – Sep 2020 · 2 yrs · Pune, Maharashtra, India

Deploying and maintaining new k8s clusters.
Setting up CD pipelines on Harness for k8s microservices.
Creating new infrastructure using Terraform on GCP and AWS.
Implementing Consul for service discovery in the k8s cluster.
Setting up Route 53 registration, DNS, and Nginx servers to route traffic.
Setting up new Prometheus, Grafana, and Thanos servers.
Writing shell and Python scripts for automation, including event handlers run as Jenkins jobs for pod, AWS, and EBS restarts.
Worked on multiple projects, such as the data center to GCP migration, server migrations, and integration of new microservices with existing services in GCP.
Implementing cost optimization solutions to reduce infrastructure bills.
Running POCs on different tools for process improvements.
Experience with AWS and GCP services such as Route 53, X-Ray, EC2, S3, ELB, VPC, CodePipeline, GKE, and GCS.
Building and developing the SRE culture in the organization to narrow the gap between development and operations and minimize downtime.

Terraform · Kubernetes · AWS · GCP · Prometheus · Grafana +2

Allstate India

2 roles

IT Consultant

Promoted

Apr 2018 – Sep 2018 · 5 mos

Coordinating with Dev/QA teams on CRs and bug fixes for the application in the Allstate Canada project.
Supporting development teams and change management.
Restoring and cloning different environments as per dev requirements.
Worked on release management tasks, production deployments, and patches, and coordinated them with the business.
Monitoring and managing AWS cloud infrastructure for production applications, with components like EC2, RDS, S3, and IAM.
Resolving incident, change, and problem management tickets, user-generated requests, and queries.
Package management: installation, upgrade, verification, and troubleshooting of RPM packages.