Danish Khan

DevOps Engineer

Hyderabad, Telangana, India · 12 yrs 10 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Expert in Kubernetes and cloud infrastructure management.
  • Proven track record in automating CI/CD processes.
  • Strong background in Site Reliability Engineering best practices.
Stackforce AI infers this person is a DevOps and Site Reliability Engineering expert in the Fintech and SaaS sectors.

Skills

Core Skills

Continuous Integration and Continuous Delivery (CI/CD), Google Cloud Platform (GCP), Site Reliability Engineering, Kubernetes, Amazon Web Services (AWS), Fintech, Galera Cluster, Terraform, DevOps

Other Skills

AI Software Development, Aerospike, Ansible, Apache Mesos, Automation, Backend Engineering, Bash Scripting, Bitbucket, Cassandra, Cloud Management, Confluence, Docker, Daemontools, Git, GitHub Copilot

About

Experienced in the DevOps/SRE domain, primarily skilled in the following tech stack:
  • Orchestration: Kubernetes, GKE, Mesos, OCP
  • Containers: Docker
  • SCM: GitLab, Bitbucket
  • Public Cloud: GCP, AWS
  • Databases: MariaDB Galera, Aerospike, Cassandra, Patroni-PostgreSQL
  • Config Management: SaltStack, Ansible
  • Monitoring and Alerting Tools: Riemann, InfluxDB, Grafana, Sensu, Prometheus, Kentik
  • Web Servers: Nginx, HAProxy (as a reverse proxy)
  • Project/Incident Management: Jira, ServiceNow
  • Content Management: Confluence
  • Clusters: Mesos, Kubernetes, Veritas, and Oracle RAC
  • Linux/Ubuntu server administration and upgrades

Experience

Lloyds Technology Centre India

Senior DevOps Engineer

Oct 2024 - Present · 1 yr 5 mos · Hyderabad, Telangana, India · Hybrid

  • Implemented Jenkins CI/CD pipelines for the Banking Access lab, ensuring seamless deployment and verification processes.
  • Developed automated monitoring and alerting systems using Dynatrace to enhance the platform's reliability and performance.
  • Collaborated with cross-functional teams to optimize application consumption for online banking customers.
Nexus IQ, Automation, Continuous Integration and Continuous Delivery (CI/CD), Google Kubernetes Engine (GKE), SonarQube, Google Cloud Platform (GCP), +4

Fortanix

Senior Site Reliability Engineer (R&D)

Nov 2023 - Oct 2024 · 11 mos · Bengaluru, Karnataka, India · Hybrid

  • Owned the global Fortanix DSM SaaS infrastructure, managing multiple geographically distributed on-prem Kubernetes clusters across the USA, UK, EU, APAC, Australia, and KSA. Oversaw dedicated environments for several Fortune 100 clients requiring strict data-sovereignty and GDPR compliance. Enabled full regional data-sovereignty by restricting access to approved clusters, contributing to $35M+ in annual revenue through secure, compliant, and highly reliable service delivery.
  • Improved platform reliability by implementing SRE best practices, enhancing observability, refining dashboards, tuning alerts, and strengthening monitoring workflows across all regions. Led monthly cyclic application upgrades across clusters, ensuring consistent releases, zero-downtime rollouts, and version uniformity.
  • Reduced operational toil by automating repetitive tasks using Bash and Ansible, improving team efficiency and minimizing manual work. Led the PoC and migration from VictorOps to SquadCast, improving alerting, routing efficiency, and on-call responsiveness. Performed weekly GC and disk cleanup across clusters to maintain optimal audit-log retention and system performance.
  • Specialized in Cassandra DB operations, including local/global quorum monitoring, replication and compaction checks, and safe node maintenance by disabling gossip during cleanup.
  • Tech Stack:
  • Kubernetes: Managed 200+ on-prem nodes across global regions.
  • Databases: Cassandra (quorum, replication, compaction, cluster health).
  • Automation: Bash, Ansible for config and infra tasks.
  • Monitoring/Logging: Observenic, Filebeat, Auditbeat, Metricbeat, Suricata.
  • Alerting: Sensu, VictorOps, SquadCast.
  • Infra Health: Pingdom for SaaS uptime and sanity checks.
  • Network Monitoring: Kentik for global network availability.
  • Additional: Served in 12-hour rotational on-call, proactively detecting and resolving production issues across all regions to ensure high availability and fast incident recovery.
Site Reliability Engineering, Kubernetes, Cassandra, Ansible, Bitbucket, Bash Scripting

L&T Technology Services

2 roles

DevOps Specialist

Mar 2023 - Nov 2023 · 8 mos · Bengaluru, Karnataka, India · On-site

  • Worked as a Senior SRE for the DNE (Digital Network Enabler) team at SKY Communications UK on its in-house ETL-based dashboard project. The setup comprised the following tech stack:
  • 1. Patroni PostgreSQL 3-node cluster.
  • 2. Airbyte setup.
  • 3. Superset setup.
  • 4. Python Prefect code.
  • 5. HAProxy with HA via Keepalived across 3 nodes.
  • 6. etcd 3-node cluster.
  • 7. Database benchmarking using Sysbench.
Kubernetes, Amazon Web Services (AWS)

Specialist- DevOps Engineer

Aug 2021 - May 2022 · 9 mos · Bengaluru, Karnataka, India · On-site

  • Worked as an SRE/DevOps Engineer on the SKY-ISP-SRE-Engineering team for its new digital modernization projects at the Harrisburg-London and Bluebird-Italy sites. This setup serves the mobile, data, and ISP services of 25 million+ subscribers across Europe.
  • Provided and commissioned PaaS for various SKY applications.
  • Created AWS instances via Terraform.
  • Built and customized AWS AMI images using Packer.
  • Gained knowledge of and exposure to AWS services such as EC2, VPC, IAM, S3, ELB, EBS, and Route 53.
  • Served on a two-member team with full responsibility for the Terraform and Packer configuration builds for this deployment on staging and production environments via GitLab SCM.
  • Detected and resolved production issues.
  • Monitored and governed the Bluebird infrastructure site and its reliability.
  • Performed post-audit upgrades and modifications to the setups.
Terraform, GitLab, Kubernetes, Ansible, Bash Scripting, Shell Scripting, +1

PhonePe

2 roles

Site Reliability Engineer-SDE2

May 2022 - Oct 2022 · 5 mos · Bengaluru, Karnataka, India

  • Worked as an SRE-SDE2 on the PhonePe Engineering team, as a member of a horizontal team working across various SRE teams and supporting UPI requests from 350+ million subscribers pan-India.
  • Used PhonePe Enterprise Cloud (PPEC) to commission and decommission VM and BM machines/instances from scratch for PhonePe.
  • Configured Nginx directives for internal, inbound, and outbound proxies to enable features such as page rendering and adding CSP and CORS headers to guard against clickjacking.
  • Shared the BAU workload across all PhonePe SRE teams, covering the features inside the PhonePe app such as Payments and UPI.
  • Business as Usual (BAU) activities included:
  • ◦ MariaDB Galera cluster DB truncate using Rolling Schema Upgrade (RSU) or Total Order Isolation (TOI), depending on the business use case.
  • ◦ MariaDB Galera cluster DB alter.
  • ◦ MariaDB backup/replication over a TLS-encrypted socat tunnel.
  • ◦ Aerospike NoSQL cluster node addition and removal.
  • ◦ Forward and reverse XDR configuration in Aerospike.
  • ◦ Namespace/schema addition in Aerospike.
  • ◦ Mesos slave/agent installation and registration with the Mesos master orchestration fleet.
  • ◦ Edge-infra/DMZ Nginx proxy configuration using the directives available in Nginx.
  • ◦ Primary SPOC for the PhonePe STG infra, the testing ground of the PhonePe Engineering team.
Apache Mesos, Galera Cluster, FinTech, SaltStack, Bash Scripting, Shell Scripting

Site Reliability Engineer

Sep 2020 - Aug 2021 · 11 mos · Bengaluru, Karnataka, India · Remote

  • Worked as an SRE on the PhonePe SRE-Engineering team for its new ICICI BK1 and Yes Bank BK3 setups, serving UPI requests from 250+ million subscribers pan-India.
  • Used PhonePe Enterprise Cloud (PPEC) to commission 137+ machines/instances from scratch for PhonePe's Yes Bank BK3 setup, as part of a planned migration of PhonePe's UPI traffic from Reliance-DC to NetMagic-DC.
  • SPOC for 137+ PhonePe servers/instances serving as the ICICI Bank gateway setup for UPI-based transactions with NPCI infrastructure.
  • This PhonePe setup had a mix of instance types:
  • ◦ Application servers.
  • ◦ Load balancers: Nginx, HAProxy.
  • ◦ Middleware: RabbitMQ servers.
  • ◦ Databases: MariaDB Galera Cluster, Aerospike NoSQL DB.
  • ◦ Automation: Salt master and Salt minions.
  • ◦ Jump servers.
  • ◦ DMZ servers, etc.
  • Served on a five-member team with full responsibility for monitoring and updating PhonePe's BK1 and BK3 infra setups for ICICI Bank and Yes Bank.
  • Worked on mission-critical infrastructure with DC/OS (Mesosphere), which behaves as an on-premises cloud.
  • Detected and resolved production issues, tracked in Jira, with respect to the set SLOs/SLIs.
  • Monitored and governed the BK1 and BK3 infrastructure sites and their reliability.
  • Performed post-audit upgrades and modifications to the BK1 and BK3 setups.
DevOps, Apache Mesos, InfluxDB, Site Reliability Engineering, Galera Cluster, FinTech, +4

DXC Technology

Senior System Administrator

Jan 2020 - Sep 2020 · 8 mos · Bengaluru, Karnataka, India · On-site

  • Worked as a Senior System Administrator for Deutsche Bank on behalf of DXC Technology.
  • Took part in a SAN migration project from VMAX2 to VMAX3 Symmetrix storage covering 7,500+ servers, including Oracle RAC clusters and Veritas clusters on Linux and Solaris boxes with SRDF enabled on production boxes.
  • Coordinated with the Dell EMC team on Symmetrix storage allocation and management, along with zoning and masking for the new SAN fabric Symmetrix LUNs.
  • Recommended, scheduled, and performed OS and hardware checks before and after SAN migration activities.
  • Raised ServiceNow incidents, with recommendations to stakeholders, for any issues found with the OS, clusters, or related hardware.
  • Replaced ASM and non-ASM LUNs in RAC and Veritas clusters respectively.
  • Replaced LUNs used inside the DG of Symmetrix Remote Data Facility (SRDF), which replicates data from one Symmetrix storage array to another over a storage area network or Internet Protocol (IP) network.
  • Worked with the SYMCLI command-line utility for LUN and storage identification.
  • Implemented many new pre-check points that improved running methodologies and streamlined the delivery process.
  • Responsible for maintaining the organization's effectiveness and efficiency by defining, delivering, and supporting strategic plans for implementing information technologies.
  • Project delivered:
  • PROD box SME for DHSO RAC clusters in the SAN migration delivery from the Unix Factory team. As part of the DHSO team, successfully delivered the migration and transition from the old VMAX2 array to the new VMAX3 array for Oracle RAC and Veritas clusters in Deutsche Bank's infrastructure. This included project execution, knowledge transfer, implementation of pre- and post-migration tasks, and training and documentation for new team members.
Bash Scripting, Shell Scripting

Wipro

Linux Lead Administrator

Nov 2019 - Jan 2020 · 2 mos · Noida, Uttar Pradesh, India

  • Worked as Linux Lead for the client Citibank in the Asia Pacific region on a data center migration from Singapore to India.
  • Provided Linux L3 recommendations and troubleshooting for issues escalated from L2.
  • Planned and approved the MOPs shared by L2 for Linux OS tasks required for the data center migration.
  • Project delivered:
  • Citibank Singapore data center migration to India.
Bash Scripting, Shell Scripting

Ericsson

Senior Solution Integrator

Sep 2013 - Nov 2019 · 6 yrs 2 mos · Gurgaon, India

  • Worked as Senior Solution Integrator for the Mediation department. Achievements:
  • Received an ACE Award for Idea Online Mediation Upgrade & Expansion.
  • Received an ACE Award for GrameenPhone CBiO Phase1 Convergence.
  • Received an Apple Watch as an award for Zain Kuwait Mediation Phase2 Upgrade.
  • Received a SPOT Award for Zain Saudi Arabia CBiO Phase1.
Bash Scripting, Shell Scripting

NSN - Nokia Solutions and Networks

3 roles

Fault Management Engineer

Jan 2013 - Sep 2013 · 8 mos

  • Worked at Nokia Solutions & Networks as a Fault Management (FM) Engineer in Data Charging & Care (RECC) for Vodafone India.
Bash Scripting, Shell Scripting

OSS Tools Engineer

Jun 2012 - Dec 2012 · 6 mos

  • Provided customer support for trouble resolution and technical queries on various NetAct OSS products.
  • Collected alarms, measurements, and radio network parameters from network elements (NEs) and managed NEs remotely.
  • Maintained a view of the entire network for online monitoring.
  • Performed customer acceptance testing on OSS, FM, CM, and PM in NetAct after commissioning and upgrades.
  • Performed NetAct upgrades with the Care team.
  • Integrated new network elements (BSC, MSC, HLR, MGW, MSS, CGS, GCS, RNC) into OSS.
  • Good understanding of alarm flow and measurement flow from network elements such as BSS, MSS, MGW, and RNC.
  • Troubleshot issues raised in CM, FM, and PM functionalities.
  • System administration (CPU load, process checks).
  • Took backups from NetAct using the backup tool.
  • Established and maintained connectivity between GNSC and the Circle OSS server.
  • Integrated R4 routers, EMC, and other switches for alarm management.
  • OSS user administration.
  • Daily OSS health checks and rectification of alarms in OSS.
  • Good understanding of the supervision process (wpmanmx), the measurement process (Mecmanmx), and other NetAct-related processes.
  • Supported the Care team and other vendors on hardware and software implementation/upgrade activities.
  • Routine administration of DCN routers and switches.
  • Defined IP routes, NAT, and X.25 configurations of NEs in routers.
  • Network user management.
Bash Scripting, Shell Scripting

OSS Tools Engineer

Jun 2012 - Dec 2012 · 6 mos

Bash Scripting, Shell Scripting

ValueFirst

2 roles

NOC-Associate

Mar 2011 - Jun 2012 · 1 yr 3 mos

Bash Scripting, Shell Scripting

Associate

Mar 2011 - Jun 2012 · 1 yr 3 mos

Bash Scripting, Shell Scripting

Education

Punjab Technical University

Bachelor of Technology (B.Tech.) — Electronics and Communications Engineering

Jan 2005 - Jan 2010

ISC Board

Jan 2005 - Present

ICSE Board

Jan 2003 - Present
