Spandana Raj Babbula

Software Engineer

Bengaluru, Karnataka, India · 10 yrs 3 mos experience
AI Enabled · Highly Stable

Key Highlights

  • Led infrastructure for Generative AI at Google.
  • Achieved 10x latency reduction for PaLM API.
  • Expert in Large Language Models and AI serving.
Stackforce AI infers this person is a Backend-heavy Infrastructure Engineer specializing in AI and Data Analytics.

Skills

Core Skills

AI Serving, Large Language Models (LLM), Generative AI, Data Infrastructure, Low Latency, Project Management, Big Data Analytics, Data Processing, Full-stack Development

Other Skills

Algorithms, Artificial Intelligence (AI), Batch Processing, Big Data, Bigtable, C, C++, Data Analysis, Data Analytics, Data Storage Technologies, Data Structures, Database Systems, Distributed Systems, JAX, MLOps

About

Impact-driven engineering leader with a track record of building large-scale, high-performance infrastructure. Experienced in zero-to-one projects and in scaling them from conception to successful products. Currently working on problems at the intersection of LLMs and infrastructure.

Experience

Google DeepMind

Senior Staff Software Engineer

Apr 2024 - Present · 1 yr 11 mos · Bengaluru, Karnataka, India · Hybrid

  • I work on Gemini inference and serving efficiency.
JAX, XLA, AI Serving, Large Language Models (LLM), Technical Leadership, TPU

Google

5 roles

Senior Staff Software Engineer

Oct 2023 - Mar 2024 · 5 mos

  • Core Labs (Applied AI) @ Google
  • I lead the GenAI Retrieval-Augmented Generation (RAG) infrastructure area, which includes vector databases and infrastructure for other semantic retrieval techniques that augment an LLM's knowledge for Q&A on private corpora. Several internal and external-facing products at Google use this infrastructure to solve business-critical problems with RAG.
  • Gemini API semantic retrieval is built on top of vector store infrastructure developed by my team: https://ai.google.dev/
Large Language Models (LLM), Generative AI, Technical Leadership, Data Infrastructure, Data Processing, Data Storage Technologies (+3 more)

Staff Software Engineer and Manager

Promoted

Apr 2021 - Oct 2023 · 2 yrs 6 mos

  • Generative AI APIs @ Google Labs
  • Owned performance evaluation and improvements for the Google PaLM API. Improved TPU inference latencies and end-to-end API latencies by understanding and experimenting with quantization, batching, model sharding, HBM usage and bandwidth constraints, TPU topologies, etc.
  • Improved latency of the PaLM API by up to ~10x, making it several times faster than other GenAI API offerings in the industry. The performance work I did was behind multiple GenAI products and features announced at Google I/O 2023, including NotebookLM, Google MakerSuite and "Help me write" in Gmail.
  • PaLM API: https://developers.generativeai.google/
  • NotebookLM: https://blog.google/technology/ai/notebooklm-google-ai/
  • MakerSuite: https://makersuite.google.com/
Large Language Models (LLM), Low Latency, Team Management, AI Serving, MLOps, Scalability

Senior Software Engineer, Search Ads A/B Experiments Analytics

Promoted

May 2019 - Apr 2021 · 1 yr 11 mos

  • Tech lead for Search Ads Analysis Infrastructure, an extremely fast and reliable infrastructure for A/B experiment analytics.
Batch Processing, Data Processing, Project Management, Team Management, Strategic Roadmaps, Technology Roadmapping (+1 more)

Software Engineer III, Actions-on-Google Data Platform & Analytics

Promoted

May 2017 - Apr 2019 · 1 yr 11 mos

  • Built a logging and analytics framework for Actions on Google that enables product and feature teams to compute metrics, platform insights, ranking features and developer analytics for Actions on the Google Assistant.
  • Built efficient, scalable and maintainable infrastructure using Google's big data technologies such as Flume, Mesa and Bigtable.
  • Collaborated with 10+ teams and product managers to evolve the infrastructure to meet product requirements and to help those teams use it successfully.
  • Key aspects of my work involve batch and streaming data pipelines, log processing, real-time analytics generation, database schema design, SQL query authoring, SQL pipelines, and building datasets that can be sliced and diced.
Pipelines, Batch Processing, Data Processing, Big Data Analytics

Software Engineer II, AdSense Publisher Optimization

Oct 2015 - Apr 2017 · 1 yr 6 mos

  • Built the backend infrastructure for Ad Balance, a feature that helps publishers show fewer, best-performing ads and provide a better visitor experience on their sites with minimal drop in earnings.
  • Launched new features for AdSense A/B experiments: automatic A/B experiments and autocomplete A/B experiments.
  • Launched new recommendation types for AdSense publishers: Matched content and account linking with Google Analytics.
Full-Stack Development

Xerox Research Center India

Research Intern

Jun 2014 - Jul 2014 · 1 mo · Bangalore

  • Worked towards efficient and accurate transductive classification in hypergraphs using a heat-kernel framework.
  • Developed a generic mathematical framework for classification in multi-view data represented as multi-layer graphs.
  • The framework learns a linear model that combines spectral information from multiple layers using the graph Laplacian.

Microsoft Research

Research Intern

May 2013 - Jul 2013 · 2 mos · Redmond

  • Research Intern, Cloud Systems and Applications Group
  • Worked towards an efficient online traffic engineering scheme that exploits information on transfer sizes and deadlines to pack long-running transfers across network paths and time, using a tailored approximate solution to the mixed packing and covering problem.
  • My work was part of the paper “Calendaring for wide area networks”, published in ACM SIGCOMM 2014.

Education

Indian Institute of Technology, Madras

Dual Degree (Bachelors + Masters) — Computer Science and Engineering

Jan 2010 - Jan 2015
