Arbaaz Khan

Software Engineer

Hyderabad, Telangana, India · 7 yrs 4 mos experience

Key Highlights

  • Architected AI-powered platforms improving failure analysis.
  • Reduced CI/CD cycle times by 60% through innovative tools.
  • Led migration to modern frameworks enhancing testing efficiency.
Stackforce AI infers this person is a Fintech and SaaS expert with deep experience in distributed systems and test automation.

Skills

Core Skills

Systems Design · Platform Engineering · Microservices · Event-Driven Architecture · API Development · Distributed Systems · Test Automation · Continuous Integration

Other Skills

Git · SQL · React.js · TypeScript · Python · PostgreSQL · Jenkins · Docker · GitOps · Cucumber · Jira · Java · AWS Lambda · Terraform · Go

About

Software Engineer with 7+ years building large-scale distributed systems, developer platforms, and reliability infrastructure at Apple, Arcesium (D.E. Shaw Group), and Direct Line Group. Architected an AI-powered failure analysis platform using RAG pipelines with vector search serving 50+ engineers, designed chaos engineering and dark canary infrastructure for high-frequency trading systems processing $5B+ daily volume, and built developer productivity tools that reduced CI/CD cycle times by 60%. Deep expertise in systems design, platform engineering, test intelligence, and building internal tools at scale. Delivered 3 internal tech talks on distributed testing and progressive delivery.

Experience

Total Experience: 7 yrs 4 mos
Average Tenure: 1 yr 10 mos
Current Experience: 1 yr 10 mos

Apple

Software Engineer

Jul 2024 - Present · 1 yr 10 mos · India · On-site

  • Architected a full-stack enterprise failure analysis platform (React 18/TypeScript + Apple Bricks UI, Python 3.12+ async FastAPI backend, PostgreSQL) serving 50+ engineers across Apple's largest PLM modernization, processing 144+ test suites and 1,200+ test cases per build with sub-second query latency
  • Engineered a RAG pipeline with Claude-3.5-Sonnet and Qdrant vector database using custom embedding models and LangChain orchestration for automated root cause classification, achieving 87% accuracy on failure categorization and reducing MTTR by 45%
  • Built real-time streaming dashboards with D3.js/Chart.js and WebSocket integration backed by Redis Pub/Sub, processing 10K+ build events/day; implemented OpenTelemetry-based distributed tracing across 15+ PLM microservices for end-to-end observability
  • Designed multi-level caching architecture (L1 in-memory LRU, L2 Redis, L3 PostgreSQL) with Jenkins API integration for automated Allure report ingestion, reducing analysis latency by 60% and handling 500+ concurrent queries with P99 < 200ms
  • Built CI/CD infrastructure with Jenkins, Spinnaker, and Docker using GitOps patterns; implemented PACT contract testing and automated schema evolution validation for asynchronous Kafka services across 15+ microservices with SLO-based deployment gates
  • Designed predictive test selection engine using test execution history correlation and code change analysis, reducing regression suite runtime from 45 min to 12 min while maintaining 99.7% defect detection rate (inspired by Google's TAP system)
  • Built automated test data management platform with factory patterns, synthetic data generation, and database seeding pipelines; implemented flaky test detection using statistical analysis (chi-squared tests on pass/fail distributions) with automatic quarantine, reducing flake rate from 8% to <1%
Git · SQL · Systems Design · Platform Engineering

Direct Line Group

Software Engineer

Dec 2023 - Jul 2024 · 7 mos · London Area, United Kingdom · On-site

  • Led migration of legacy E2E framework to Playwright 1.40+ with data-driven architecture, implementing parallel execution with test sharding across 12 worker threads and Bazel-like dependency graph for selective test execution, improving throughput by 35% and reducing flaky tests from 12% to <2%
  • Developed event-driven microservices (Java Spring Boot, AWS Lambda, Node.js) with SQS/SNS messaging and Terraform IaC; built zero-touch deployment pipelines on AWS CodeBuild with progressive delivery using feature flags (LaunchDarkly) and automated canary analysis
  • Designed A/B testing infrastructure with real-time statistical significance tracking (Bayesian analysis) for insurance applications serving 8M+ customers, enabling product teams to run controlled experiments with automated winner detection and rollout
  • Built performance testing platform using k6 with custom extensions and Grafana dashboards, simulating 10K concurrent users across insurance quote-to-bind flows; identified 3 critical bottlenecks and reduced P99 latency by 40% through database query optimization and connection pool tuning
  • Implemented visual regression testing using Playwright screenshot diffing with perceptual hashing, integrated into CI as automated quality gates catching 15+ UI regressions per sprint pre-production
Cucumber · Jira · Microservices · Event-Driven Architecture

Arcesium

Senior Software Engineer

Sep 2021 - Dec 2023 · 2 yrs 3 mos · Gurugram

  • Built Patt — in-house REST API automation framework (Python/Go) covering 2,000+ endpoints across asynchronous microservice-based post-trade systems handling $5B+ daily trading volume, with built-in contract validation, response schema diffing, and automatic regression detection using JSON deep-diff algorithms
  • Designed Dark Canary testing infrastructure using traffic mirroring (Istio service mesh) and shadow deployments to validate AI-driven trade throughput optimizations in production-like environments, catching 23 critical regressions before release with zero customer impact
  • Led development of distributed testing framework (Golang/Python) with custom Kubernetes CRD operators for declarative test orchestration, enabling parallel execution across 50+ pods with integrated Cassandra, Redis, and Kafka validation at scale
  • Architected event-driven data pipelines using Kafka (3 clusters, 200+ partitions) and Apache Flink for real-time trade reconciliation; executed data migrations for 12 customers across 500M+ records with zero downtime using dual-write patterns and shadow validation
  • Built chaos engineering framework using LitmusChaos on Kubernetes — simulated network partitions, pod failures, CPU/memory pressure, and Kafka broker failures across trading infrastructure; defined SLOs/SLIs and error budgets for 8 critical trading services
  • Migrated all automation to zero-touch execution with GitLab CI/CD and ArgoCD for GitOps-based deployments; implemented dynamic pipeline generation (Python Jinja2 templating) reducing pipeline config overhead by 70%
Jira · Git · API Development · Distributed Systems

Accenture

Associate Software Engineer

Jan 2019 - Sep 2021 · 2 yrs 8 mos · India

  • Admiral Group (Cardiff, Wales): Built BDD framework with Protractor/TypeScript serving 3 Agile squads; developed functional test suites at API/UI layers, reducing sprint cycle by 1 week via Zero Day Automation.
  • Implemented cross-platform mobile testing with Appium, XCUITest, and Detox across 20+ device configurations.
  • Led POC for Protractor-to-Playwright migration adopted org-wide
  • Royal London: Developed Selenium-based Java framework with keyword-driven architecture processing 400+ test scenarios; built CI/CD with Jenkins, Docker, and Kubernetes for distributed cross-device execution on AWS/GCP/Azure. Implemented performance baselines with JMeter for failover and scalability testing
Cucumber · Jira · Test Automation · Continuous Integration

Education

Graphic Era Deemed to be University

Bachelor of Technology - BTech — Computer Science and Engineering

Jan 2014 - Jan 2018

Spring Meadows Public School

XIIth Standard

Jan 2013 - Jan 2014

Spring Meadows Public School

Senior Secondary Board Xth Class

Jan 2002 - Jan 2012
