Khai Tran

Software Engineer

San Mateo, California, United States · 18 yrs 3 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • Expert in scaling AI systems for real-time applications.
  • Proven track record in architecting data privacy solutions.
  • Strong background in distributed systems and query optimization.
Stackforce AI infers this person is a SaaS expert with a focus on AI and data infrastructure.

Skills

Core Skills

Artificial Intelligence (AI) · Large Language Models (LLM)

Other Skills

Big Data · CUDA · Graphics Processing Unit · C/C++ · C# · Java · Perl · SQL Server · Oracle SQL · PostgreSQL · x86 Assembly · Unix · Windows · LaTeX · Matlab

About

3-in-1 engineer: data infra, AI, and distributed systems. Skilled in database internals, query optimization and execution, model inference, GPU kernel programming, AI modeling, LLM fine-tuning, and distributed systems. Strong interests in physics, neuroscience, and cell biology.

Experience

18 yrs 3 mos
Total Experience
2 yrs 3 mos
Average Tenure
3 yrs 9 mos
Current Experience

LinkedIn

2 roles

Senior Staff Software Engineer

Promoted

Sep 2022 - Present · 3 yrs 9 mos · Sunnyvale, CA

  • Having fun scaling different inference systems for Ads Ranking models by 10x, ranging from legacy DLRM models on TensorFlow to transformer-based models on PyTorch (see the serving system in https://arxiv.org/pdf/2602.11410) and small language models on SGLang.
  • Architected and built a data privacy policy enforcement system to protect LinkedIn from DMA violations (see https://arxiv.org/pdf/2502.01998).
Big Data · CUDA · Graphics Processing Unit · Artificial Intelligence (AI) · Large Language Models (LLM)

Staff Software Engineer

Sep 2016 - Nov 2020 · 4 yrs 2 mos

  • My main focus at LinkedIn was to provide portability for user data transformation code.
  • First project - building the near-realtime metrics platform
      o Built the near-realtime metrics platform from scratch by auto-generating streaming code with the Apache Beam API from offline data transformation batch scripts in Pig Latin or Hive. This was almost entirely a solo project: I investigated possible solutions, proposed the design, implemented the solution, developed a deployment system, and operated the service.
  • Second project - building a portable data transformation fluent API in Java that can be shared among online OLTP engines, nearline streaming engines, and offline batch engines.
  • Key features:
      o Declarative and type-safe, similar to JOOQ ( https://www.jooq.org )
      o Supports imperative logic through Java lambda functions, which are later converted into Transport UDFs
      o Can be translated into any SQL dialect using Coral
      o Easy to test on any type system (such as Avro GenericRecord, Spark Internal Row, …)
  • To provide those features, we architected the system with four components:
      o Code generator to auto-generate type-safe structures from a given schema
      o Core API and implementation with AST elements
      o SQL compiler to translate ASTs into Calcite relational algebra plans and then into target SQL using Coral
      o A portable row-at-a-time engine that can be used to unit test user data transformation code or be embedded in a host engine
  • Open source contributions:
      o Apache Calcite: converting Pig Latin scripts into Calcite relational algebra https://github.com/apache/calcite/pull/1265
      o A founding member and one of the major contributors of project Coral, a library for translating SQL among different dialects https://github.com/linkedin/coral
      o Designed and contributed to project Transport UDFs, a framework for writing performant user-defined functions (UDFs) that are portable across multiple engines https://github.com/linkedin/transport

Airbnb

Staff Software Engineer

Nov 2020 - Jun 2022 · 1 yr 7 mos

  • On the Data Infrastructure team.

Amazon Web Services

2 roles

Software Engineer

Dec 2015 - Sep 2016 · 9 mos · Palo Alto, CA

  • On the Redshift team, worked on Redshift query processing.

Software Engineer

Aug 2014 - Nov 2015 · 1 yr 3 mos · Palo Alto, CA

  • On the DynamoDB team (NoSQL services at AWS), worked on every aspect of the DynamoDB frontend.

Oracle

Senior Member of Technical Staff

Feb 2013 - Aug 2014 · 1 yr 6 mos · Redwood City, CA

  • Integrated Rapid, a hardware-software co-design system targeting large-scale data management and analysis, into Oracle RDBMS.
  • Designed and implemented cross-engine query optimization algorithms for the Oracle query optimizer. Worked with the Oracle query optimizer and query compilation code to split queries across two execution engines (patent filed).
  • Implemented a platform-independent representation of query execution plans for transporting them across different query execution engines (like a serialization language for query execution plans).
  • Designed a change propagation system that synchronizes data updates from Oracle RDBMS to Rapid and implemented a prototype for append-only insert statements.

Google

Summer intern

May 2012 - Aug 2012 · 3 mos · Mountain View, CA

  • Designed and implemented MapReduce-based checksum workers for computing checksums of ads tables in Google stats servers. Obtained a speedup of 120x on a cluster of 200 machines.
  • Designed and implemented map-only MapReduce-based expansion workers for aggregating data among different versions of ads tables.

Microsoft

Summer intern (MSR)

Jun 2011 - Aug 2011 · 2 mos · Redmond, WA

  • Optimized and tuned a system, called Deuteronomy, for faster performance.
  • Proposed a new threading model to avoid context-switching costs.
  • Improved the system performance by a factor of 10.

Oracle

Summer intern

Jun 2010 - Aug 2010 · 2 mos

  • Analyzed the system to find the bottlenecks of the hash-join operator at runtime.
  • Proposed solutions to eliminate the bottlenecks.

Microsoft Jim Gray Systems Labs

Research Assistant

Jan 2009 - Jan 2013 · 4 yrs · Madison, Wisconsin Area

  • Researched innovative concurrency control and data partitioning techniques for OLTP workloads on multicore systems:
  • Implemented a system in C to run simple database transactions using hardware Transactional Memory, spinlocks, and database locks for concurrency control. Worked with a hardware prototype that does not support C compilers.
  • Developed a new concurrency control approach for highly-partitioned OLTP workloads on multicore systems. Implemented the approach in a system running TPC-C transactions without using locking.
  • Developed a framework for automatically partitioning OLTP databases. Obtained a good partitioning solution for TPC-E with the framework.

University of Wisconsin-Madison

Teaching Assistant & Research Assistant

Sep 2007 - Dec 2008 · 1 yr 3 mos · Madison, Wisconsin Area

  • Research Assistant:
  • Integrated Cell-sort into the MapReduce framework on Cell processors.
  • Teaching Assistant:
  • Instructed students on programming tasks.
  • Graded student assignments and exams.

Education

University of Wisconsin-Madison

PhD — Computer Science

Jan 2009 - Jan 2012

University of Wisconsin-Madison

M.S. — Computer Science

Jan 2007 - Jan 2009

Hanoi University of Science and Technology

BS — Computer Science

Jan 2001 - Jan 2006
