Jibu Chacko

SRE (Site Reliability Engineer)

Amsterdam, North Holland, Netherlands · 19 yrs 11 mos experience
Highly Stable

Key Highlights

  • Expert in deploying and managing Kafka clusters.
  • Proficient in AWS and cloud infrastructure management.
  • Strong leadership in Site Reliability Engineering practices.
Stackforce AI infers this person is a Site Reliability Engineer with expertise in cloud infrastructure and DevOps practices.

Skills

Core Skills

Site Reliability Engineering, Kafka Management

Other Skills

AWS Security, Amazon EC2, Amazon ECS, Amazon ELB, Amazon VPC, Amazon Web Services (AWS), American Welding Society (AWS), Ansible, Apache Kafka, Availability, Bare Metal Servers, Bash, CI, Cisco Certified, Citrix Products

About

At Booking.com, my focus as Senior Site Reliability Engineer centers on deploying robust Kafka Clusters, ensuring seamless schema registry integration, and fortifying our systems with SSL protocols. Our team's dedication to monitoring with tools like Grafana and Kafka Manager has been pivotal in optimizing performance and reliability. Previously, as an SRE Manager, I championed DevOps and infrastructure initiatives, leveraging my AWS certification and proficiency in server architecture. With a track record of enhancing processes and team leadership, I'm committed to driving business value through innovative SRE practices and sustainable solutions.

Experience

Total Experience: 19 yrs 11 mos
Average Tenure: 4 yrs
Current Experience: 8 mos

Together AI

Site Reliability Engineer

Oct 2025 - Present · 8 mos · Amsterdam, North Holland, Netherlands · Hybrid

Booking.com

3 roles

Senior Site Reliability Engineer

Promoted

Oct 2022 - Sep 2025 · 2 yrs 11 mos

  • Led the deployment and management of multi-node Kafka clusters across development, testing, and production environments.
  • Implemented Schema Registry, REST API, and SSL protocols.
  • Experienced with load balancers, disaster recovery, and traffic routing.
  • Proficient in monitoring technologies: Kafka Manager, Grafana, Kibana, and Kafka Tool.
Apache Kafka, Python (Programming Language), Server Architecture, Performance Tuning, Site Reliability Engineering, Kafka Management

Site Reliability Engineering Manager

Sep 2021 - Oct 2022 · 1 yr 1 mo

Reliability Engineering Manager

Sep 2021 - Oct 2022 · 1 yr 1 mo

  • Contributed upstream to **bmc-toolbox** (bmclib, bmcbutler, dora, actor), adding features and enhancing reliability and security across bare‑metal fleet management tools.
  • Led a cross-functional team to manage bare-metal infrastructure across multi-regional data centers, emphasizing reliability and SLO-based metrics to support dynamic workloads.
  • Established key SLIs/SLOs and defined incident management processes, directly contributing to improved service continuity and performance.
  • Implemented service‑mesh observability via Istio and Envoy filters; created custom statsd exporters to feed Datadog dashboards.
  • Drove compliance and security reviews, integrating IAM policies, KMS encryption, and HashiCorp Vault secrets management.

Walmart Global Tech

Site Engineering Manager

Aug 2019 - Aug 2021 · 2 yrs

  • Led centralized CI/CD strategy: migrated 50+ services to GitHub Enterprise monorepo, leveraging GitHub Actions and ArgoCD for canary deployments.
  • Implemented service‑mesh observability via Istio and Envoy filters; created custom statsd exporters to feed Datadog dashboards.
  • Drove compliance and security reviews, integrating IAM policies, KMS encryption, and HashiCorp Vault secrets management.
  • Spearheaded a centralized DevOps lifecycle framework for multi-platform releases, enhancing reliability and deployment speed across distributed systems.
  • Redesigned CI/CD pipelines and containerized microservices with Docker and Kubernetes, boosting security and scalability for health-related software deployments.
  • Collaborated with cross-functional teams to streamline operations and onboard support initiatives, producing detailed incident runbooks and automating key tasks.

Paytm Money

DevOps Manager

Apr 2018 - Aug 2019 · 1 yr 4 mos · Bengaluru, Karnataka, India

  • Built real‑time trading platform infrastructure: autoscaled EKS clusters, hardened network policies, and deployed Prometheus‑based alerting.
  • Designed and enforced SLOs for trading throughput and latency; created automated canary analysis with Flagger.
  • Directed AWS-based infrastructure and DevOps practices, automating deployment and configuration management using Kubernetes/EKS and Terraform.
  • Developed robust monitoring and alerting frameworks, ensuring high system availability and rapid incident resolution in a dynamic production environment.
  • Integrated CI/CD pipelines using GitLab CI, significantly reducing deployment times and improving release reliability.
  • Spearheaded migration from GitLab CI to GitHub Actions, optimizing CI/CD workflows.
  • Implemented security best practices across infrastructure provisioning, including IAM policies, KMS encryption, and secrets management in AWS.

Oracle

2 roles

Principal Technologist

Mar 2016 - Mar 2018 · 2 yrs

DevOps Engineer

Mar 2016 - Mar 2018 · 2 yrs

  • Migrated legacy applications to Oracle Cloud; authored Terraform modules and automated blue‑green deployments with Jenkins.
  • Migrated and optimized legacy infrastructure to Oracle Cloud using Docker and Terraform, enhancing scalability and deployment efficiency.
  • Automated performance tuning and environment setup with custom Bash scripts, supporting robust cloud-native deployments.
  • Experienced with multi-cloud environments, optimizing infrastructure across AWS and Oracle Cloud.

Radiant info systems ltd

3 roles

Project Lead

Jan 2015 - Mar 2016 · 1 yr 2 mos · Bengaluru Area, India

Senior DevOps

May 2009 - Oct 2014 · 5 yrs 5 mos · Bengaluru Area, India

  • BusIndia.com Principal Web Architect
  • BusIndia.com Mobile Version (m.busindia.com) Project Lead

Team Lead - Datacenter Operations

Apr 2006 - Sep 2008 · 2 yrs 5 mos · Bengaluru Area, India

  • Remotely administered and monitored servers.
  • Performed system administration tasks.
  • Led the data center team.

Radiant infosystems ltd

Engineering Team Lead

Apr 2006 - Mar 2016 · 9 yrs 11 mos

  • Led a team of 12 engineers in designing and operating secure, distributed cloud-based systems, with a focus on automation and proactive monitoring.
  • Developed custom scripts and tools to monitor and optimize system performance, directly contributing to increased uptime and system reliability.
  • Established enterprise-level monitoring frameworks (Nagios, Tivoli) and implemented process improvements to support mission-critical applications.
  • Improved resilience by building monitoring (Nagios, Graphite) and log aggregation (ELK) frameworks for enterprise clients.
  • Introduced CI/CD pipelines via Hudson/Maven and containerized dev environments, reducing onboarding time by 50%.

Education

SSM College of Engineering

Information Technology

Jun 2000 - Mar 2005

SSM College of Engineering, Komarapalayam

MSc (5 Year Integrated) — IT

Jan 2000 - Jan 2005

Mar Thoma College, Kuttapuzha P.O. Tiruvalla, Pathanamthitta- 689103

Associate's Degree — Mathematics

Jan 1998 - Jan 2000

St. Johns High School, Eraviperoor

SSLC — 10th

Jan 1995 - Jan 1998
