Saquib Zeya

DevOps Engineer

Patna, Bihar, India
6 yrs 7 mos experience
AI ML Practitioner · AI Enabled

Key Highlights

  • Over 6 years of experience in Site Reliability Engineering.
  • Expert in managing Kubernetes deployments across multiple cloud platforms.
  • Proven ability to optimize big data technologies for performance.
Stackforce AI infers this person is a Cloud Infrastructure and Site Reliability Engineering expert in the SaaS industry.

Skills

Core Skills

Site Reliability Engineering, Cloud Infrastructure, Data Analysis, Infrastructure Management, DevOps

Other Skills

API Development, AWS, Accern No-Code AI, Amazon S3, Apache Kafka, Apache Spark, Automation, Automation Frameworks, Azure, Bash, CI, CI/CD, Cloud Security, Computer Science, Debugging Code

About

Experienced Site Reliability Engineer | 6+ Years in Building Scalable, Reliable Systems and Optimizing Infrastructure

Experience

Total Experience: 6 yrs 7 mos
Average Tenure: 2 yrs 2 mos
Current Experience: --

Wand ai

Senior DevOps Engineer

Feb 2025 - Present · 1 yr 4 mos · Patna, Bihar, India · Remote

  • Managing Kubernetes deployments across AWS and Azure environments.
  • Specializing in optimizing, scaling, and monitoring big data technologies such as Kafka, Elasticsearch, Spark, and PostgreSQL.
  • Enhancing cloud security and implementing best practices.
  • Automating and optimizing operational processes to improve efficiency.
Kubernetes, AWS, Azure, Kafka, Elasticsearch, Spark (+5 more)

Accern (acquired by wand ai)

3 roles

Senior Site Reliability Engineer

Promoted

Oct 2024 - Feb 2025 · 4 mos · Remote

  • Defined, managed, and analyzed SLIs and SLOs to measure system performance and availability effectively.
  • Leveraged data analysis and statistical methods to identify performance trends, detect anomalies, and drive proactive optimizations.
  • Designed and implemented automation frameworks to reduce manual effort and improve efficiency.
  • Implemented and optimized monitoring, alerting, and logging tools to proactively mitigate issues and recommend improvements.
  • Partnered with internal teams to diagnose and resolve incidents, ensuring system reliability.
  • Collaborated with product engineering teams to promote and implement scalable, resilient system designs.
SLIs, SLOs, Data Analysis, Automation Frameworks, Monitoring Tools, Incident Management (+1 more)

Site Reliability Engineer

Promoted

Feb 2022 - Sep 2024 · 2 yrs 7 mos · Remote

  • Managed and built a versatile tech stack, including Kubernetes deployments across AWS and Azure environments.
  • Specialized in optimizing, scaling, and monitoring big data technologies such as Kafka, Elasticsearch, Spark, and PostgreSQL.
  • Enhanced cloud security by implementing best practices and security measures.
  • Tested and improved system integrity, application development processes, and other infrastructure-related components.
  • Leveraged open-source technologies and tools, including CI/CD pipelines and version control systems like Git, to streamline development and deployment workflows.
  • Automated and optimized operational processes to improve efficiency and reduce manual intervention.
  • Oversaw code deployments, fixes, and updates, and managed the overall release process.
  • Responded to system alerts and provided on-call support to ensure high availability and rapid resolution of incidents.
  • Estimated, planned, and executed projects, features, and integrations, ensuring alignment with business goals and timelines.
  • Stayed current with industry trends and continuously sought new technologies and methods to improve system performance and reliability.
Kubernetes, AWS, Azure, Kafka, Elasticsearch, Spark (+6 more)

Senior Application Support Engineer

Sep 2021 - Feb 2022 · 5 mos · Remote

Ocrolus

Application Support Engineer

May 2019 - Aug 2021 · 2 yrs 3 mos · Gurgaon, Haryana, India · On-site

  • Infrastructure Management: Managed infrastructure and applications like CURA, ensuring seamless operations and high availability.
  • Kubernetes Deployment: Deployed AWS (EKS) and GCP (GKE) Kubernetes clusters, provisioning infrastructure using Terraform for efficient resource management.
  • Automation: Automated workflows with Jenkins, streamlining CI/CD processes and improving deployment efficiency.
  • Helm Charts: Developed and deployed Helm charts for applications, facilitating easy deployment and management in Kubernetes.
  • Monitoring & Logging: Monitored systems and logs using Kibana, CloudWatch, New Relic, and RDS Performance Insights, ensuring timely issue detection and resolution.
  • Issue Resolution: Resolved critical issues using Docker, Kubernetes, and Elastic Beanstalk, maintaining system stability through debugging and service restarts.
  • API Development: Built and deployed RESTful APIs with Python (Flask, FastAPI), testing with Postman and Insomnia to ensure functionality.
  • Incident Management: Managed incidents via JIRA and PagerDuty, ensuring prompt resolution and service uptime.
  • Version Control: Made code changes, managed deployments in Bitbucket, and pushed updates to production environments.
  • Kubernetes Monitoring: Monitored and scaled Kubernetes clusters with tools like Weave, k9s, and kubectl, ensuring efficient resource utilization.
  • Redash Dashboards: Configured Redash for database queries and created dashboards to meet business requirements, effectively managing downtime.
Infrastructure Management, Kubernetes, Terraform, Automation, Monitoring, API Development (+2 more)

Skyway technocom consultants pvt ltd

Software Associate

May 2018 - May 2019 · 1 yr · Pune, Maharashtra, India · On-site

Magicpin

Data Analyst

Feb 2018 - May 2018 · 3 mos · Gurugram, Haryana, India

  • Responsible for supporting urgent data requests.

Education

Indian Institute of Technology, Patna

Master of Technology - MTech — Cloud Computing

Jun 2025 - Jun 2027

DIT UNIVERSITY

Bachelor of Technology - BTech — Information Technology

Jan 2014 - Jan 2018
