Kailash N

Senior Software Engineer

Arlington, Virginia, United States · 5 yrs 9 mos experience
Most Likely To Switch · Highly Stable

Key Highlights

  • Designed scalable cloud-native platforms and microservices.
  • Achieved 32% increase in developer productivity.
  • Led migration to modern infrastructure with significant cost savings.
Stackforce AI infers this person is a Cloud Infrastructure Engineer specializing in SaaS and microservices architecture.

Skills

Core Skills

Cloud Computing · Microservices · Infrastructure Management · Application Development · Data Science

Other Skills

Python · Kubernetes · GraphQL · Docker · Amazon Web Services (AWS) · Terraform · Nginx · DevOps · Continuous Integration and Continuous Delivery (CI/CD) · AWS · Infrastructure as code (IaC) · Site Reliability Engineering · Application Monitoring · Network Security · Amazon SQS

About

Software Engineer with 6 years of experience who loves turning concepts into scalable products. I specialize in designing distributed systems and building cloud-native platforms, microservices, and developer experience tools that accelerate engineering velocity and enhance reliability and system resilience. I’m passionate about driving innovation and efficiency in complex technology environments.

At TriNet Zenefits, I designed and implemented an Apollo Federated GraphQL architecture to streamline microservice communication and data retrieval, and led the migration from monoliths to gRPC-based microservices on Kubernetes with a service mesh to enhance scalability and fault isolation. I transitioned key infrastructure to AWS using a modular AWS CDK framework, modernized CI/CD pipelines and observability frameworks to reduce build times and improve system resilience, and built a remote developer platform that cut setup time from hours to minutes, increasing developer productivity by 32%.

Experience

Total Experience: 5 yrs 9 mos
Average Tenure: 5 yrs 9 mos
Current Experience: 5 yrs 9 mos

Zenefits

6 roles

Senior Software Engineer, Platform

Promoted

Jul 2022 - Present · 3 yrs 11 mos

  • Designed and implemented an Apollo Federated GraphQL architecture with 22+ subgraphs and a supergraph to unify microservices under a single interface, improving data integration and performance at scale.
  • Architected a scalable API gateway using schema stitching and federation to orchestrate data across multiple domains with minimal latency.
  • Led the modernization of payroll systems into gRPC-based microservices on Kubernetes and Envoy, enhancing scalability, resilience, and fault isolation.
  • Developed a shared Python library following the Strangler design pattern to support smooth transitions from monolith to microservice architecture.
  • Built a remote development platform on AWS EC2 and Docker, cutting setup time from hours to minutes and boosting developer productivity by 32%.
  • Built and managed multi-cluster Kubernetes (EKS) and serverless event-driven architectures using AWS Lambda, EventBridge, and Kafka for scalable communication.
  • Reduced build time by 60% (55→20 mins) and cut infrastructure costs by 45% by creating a selective testing solution, enabling EBS fast snapshot restore, and optimizing a weighted EC2 spot fleet.
  • Automated microservice deployments through a modular AWS CDK/Terraform framework and created CI/CD and AMI pipelines using Harness, Ansible, and Packer, ensuring consistent, secure releases.
  • Strengthened security and compliance with automated VPC segmentation, IAM guardrails, and network policies aligned with PCI-DSS, GDPR, and SOC2 standards.
  • Delivered 99.9% uptime through Datadog, Grafana, and Sentry-based observability, providing real-time visibility and anomaly detection.
  • Implemented ProxySQL to optimize connection pooling and query performance, reducing latency for high-traffic services.
  • Worked extensively with NGINX and uWSGI mules for high-performance API routing, proxy configurations, and application scalability across distributed environments.
Python · Kubernetes · GraphQL · Docker · Application Development · Amazon Web Services (AWS) · +6

Software Engineer, Infrastructure

May 2022 - Jul 2022 · 2 mos

  • Engineered AWS ECS-based application deployments using Terraform and Ansible, integrating AWS Secrets Manager for seamless, secure configuration management.
  • Built scalable CI/CD pipelines in Jenkins, streamlining deployments and improving developer agility across multiple services.
  • Applied Layer 7 proxy rules, security groups, and NACLs to enforce strict traffic control across non-production environments.
  • Optimized platform performance through custom CloudWatch metrics, autoscaling policies, EC2 spot fleets, and EBS fast snapshot restore, achieving faster rollouts at lower cost.
  • Created custom AMIs with automated patching and implemented VPC network mode to support multi-container workloads efficiently.
Amazon Web Services (AWS) · Python · Infrastructure as code (IaC) · Site Reliability Engineering · Docker · Cloud Computing · +1

Software Engineer, Infrastructure

Promoted

Jun 2020 - Apr 2022 · 1 yr 10 mos

  • Replaced AWS SWF with a database-backed orchestration engine, reducing ETA fleet size by 60% and improving reliability.
  • Migrated 7 production-critical services from DuploCloud to AWS ECS with zero downtime.
  • Consolidated 6 ECS clusters into a unified platform cluster, reducing infrastructure costs by 30%.
  • Built a log aggregation, monitoring, and alerting pipeline using Kinesis, Sumo Logic, PagerDuty, and Datadog, reducing MTTR by 40%.
  • Automated AMI creation via Ansible + Packer for secure, compliant image management.
  • Deployed Kinesis Firehose pipelines to stream CloudWatch metrics into Sumo Logic with low latency.
  • Developed Jenkins pipelines to automate deployments and streamline release processes.
  • Created CloudFront distributions to serve content from edge locations, minimizing the load on the frontend servers.
  • Reduced the Docker image size by almost 80% by removing frontend assets from the image, which in turn reduced build and deploy time.
Python · Amazon Web Services (AWS) · Continuous Integration and Continuous Delivery (CI/CD) · Terraform · Application Monitoring · Infrastructure as code (IaC) · +3

Software Engineer, Infrastructure

Promoted

Dec 2019 - Jun 2020 · 6 mos

  • Built an event-driven system in Python to automate the pull request workflow from code review through deployment to production.
  • Designed and implemented a Python-based rule engine to extract rules from YAML files, define states and check for pull requests within repositories, ensuring precise workflow management.
  • Configured webhooks on GitHub to handle events, facilitating seamless integration with the event-driven system.
  • Implemented forwarding of all GitHub webhook events to AWS SQS for efficient event management.
  • Used a queue poller to process events received via AWS SQS, ensuring precise and timely handling.
  • Utilized DynamoDB to centralize and scale pull request status maintenance, facilitating comprehensive tracking of code changes from development to production.
Amazon SQS · Python (Programming Language) · Continuous Integration and Continuous Delivery (CI/CD) · Application Development · Amazon DynamoDB · Cloud Computing

Software Engineer

May 2019 - Jul 2019 · 2 mos · Bengaluru, Karnataka, India · On-site

  • Automated client payroll data synchronization, streamlining operations, improving accuracy, and enabling seamless onboarding of new clients with third-party payroll providers.
  • Developed POCs with Selenium and Puppeteer to optimize UI automation and ensure reliable workflows.
  • Orchestrated the transfer of critical payroll data, significantly boosting operational efficiency and reducing manual errors.
  • Gained hands-on expertise in UI automation and data management while working in a fast-paced startup environment.
Python · Django REST Framework · UI Automation · Software Design · JavaScript · Node.js · +1

Data Science Intern

May 2018 - Nov 2018 · 6 mos · Bangalore Area, India · On-site

  • Developed an ML pipeline to classify valid vs. fraudulent companies, enhancing company validation accuracy.
  • Created interactive dashboards with drilldowns for tracking fulfillment strategies, facilitating data-driven decisions.
  • Automated ETL workflows for data extraction from Jira, improving analytics accessibility and integrity.
  • Implemented Celery-based task scheduling for reliable data extraction and ingestion, ensuring smooth operations.
Python (Programming Language) · SQL · Amazon Web Services (AWS) · Data Science

Education

PSG College of Technology

MSc Theoretical Computer Science, Tamil Nadu

Jan 2015 - Jan 2020

Jawahar Higher Secondary School

Higher Secondary

Jan 2015 - Present
