Anil Katti

Co-Founder

San Francisco, California, United States
19 yrs 3 mos experience
AI ML Practitioner · AI Enabled

Key Highlights

  • Led modernization of Apple's ML inference stack.
  • Developed advanced on-device ML features for Apple products.
  • Co-founded a startup focused on robotics education.
Stackforce AI infers this person is a Machine Learning and Software Engineering expert in the Technology sector.

Skills

Core Skills

Machine Learning · Artificial Intelligence

Other Skills

Engineering Leadership · Software Engineering · Core ML · Algorithms · C · C++ · Video Compression · Digital Image Processing · Parallel Algorithms · Caching · Computer Science · Operating Systems · Video Standards · JavaScript · Data Structures

About

I am passionate about solving complex engineering problems and building great products. At Apple, my north star has been making machine learning and artificial intelligence accessible to more developers on our platforms. In the past, I have worked extensively on video coding, image processing, and parallel algorithms. As an engineering leader, I drive clarity during uncertain times and keep teams motivated and focused. I promote an open culture and advocate a decision-making framework built around optimizing user experience. I have recruited top-notch engineers and built great teams within Apple. I push for excellence while prioritizing team well-being and health.

Experience

South Park Commons

2 roles

Founder Fellow

Apr 2025 - Present · 11 mos

Member

Nov 2024 - Present · 1 yr 4 mos

Uttara Labs

Co-Founder, CTO

Aug 2024 - Present · 1 yr 7 mos · San Francisco Bay Area · Remote

Apple

3 roles

Senior Engineering Manager, On Device ML

Promoted

Jan 2022 - Jul 2024 · 2 yrs 6 mos

  • In this role I led initiatives to modernize and unify our inference stack around a new model format, an intermediate representation, and an ML compiler stack. This effort involved close collaboration with teams across AIML, Video Engineering, and Software Engineering to provide a solid platform and ecosystem that accelerates the journey from research to production.
  • My organization contributed significantly to the development of on-device infrastructure supporting Apple Intelligence features like Writing Tools, Image Playgrounds, App Intents, and Predictive Code Completion in Xcode. I was fortunate to represent the collective efforts of hundreds of talented engineers at WWDC 2024.
  • As the engineering leader for Core ML and its underlying frameworks at Apple, I was driven by a commitment to serving our amazing clients. We enabled advanced on-device experiences in apps like Camera, Keyboard, Siri, and other first-party services. We also supported third-party applications from companies like Adobe, Meta, and thousands more, helping them leverage Apple’s powerful hardware for on-device machine learning. Additionally, I spearheaded collaborations with Meta on the ExecuTorch Core ML integration and partnered with Hugging Face to revitalize Core ML’s open-source initiatives.
Engineering Leadership · Machine Learning · Artificial Intelligence

Engineering Manager, Core ML

Jan 2019 - Jan 2022 · 3 yrs

  • Led a team of engineers responsible for Core ML, Apple’s on-device inference and training framework. Shipped key public features including on-device training, model deployment, encryption, and the new model package format.

Senior Software Engineer, IMG

Sep 2015 - Jan 2019 · 3 yrs 4 mos

  • HLS (HTTP Live Streaming) is Apple’s media streaming technology and FPS (FairPlay Streaming) is Apple’s content protection technology. Shipped key public features in Apple’s content protection stack, including support for secure offline key management, dual expiry windows, secure key invalidation, secure stop for auditing, key preloading, Encrypted Media Extensions on WebKit, and HDCP monitoring and enforcement. Worked on different aspects of the player stack to support features like offline playback, HEVC support, player pre-warming, and an HLS stream validation tool, and built a heads-up display for visualizing streaming performance statistics.

Cisco Systems (Scientific Atlanta)

Software Engineer

Jul 2011 - Sep 2015 · 4 yrs 2 mos · Greater Atlanta Area

  • Shipped key features in AnyRes, a 4Kp60 real-time HEVC encoder, including hierarchical motion estimation and input / reference picture buffer management. Architected a modular encoder design to achieve real-time performance by exploiting CTU row-level and video frame-level parallelism. Built an in-house HEVC bitstream analyzer to help with algorithm development. The tool overlaid CTU partitions, prediction units, intra modes, inter mode directions, and motion vectors on reconstructed video frames for visualizing algorithm efficiency. Devised and ran experiments to assess the subjective quality of compressed video and evaluate algorithm efficiency.

The University of Texas at Austin

2 roles

Graduate Student

Promoted

Aug 2009 - May 2011 · 1 yr 9 mos · Austin, Texas Area

  • Courses: Algorithms, Parallel Algorithms, Digital Image and Video Processing, Introduction to Cognitive Sciences, Distributed Computing, Operating Systems Implementation, Autonomous Robots
  • Major: Theoretical Computer Science

Graduate Research Assistant

Aug 2009 - May 2011 · 1 yr 9 mos · Austin, Texas Area

  • Worked with Prof. Vijaya Ramachandran on cache replacement strategies for multicore processors. Carried out extensive theoretical research and published the work at IPDPS 2012 (a top-tier CS conference). A copy of my thesis is here.
  • Abstract:
  • We consider cache replacement algorithms at a shared cache in a multicore system which receives an arbitrary interleaving of requests from processes that have full knowledge about their individual request sequences. We establish tight bounds on the competitive ratio of deterministic and randomized cache replacement strategies when processes share memory blocks. Our main result for this case is a deterministic algorithm called global-maxima which is optimum up to a constant factor when processes share memory blocks. Our framework is a generalization of the application controlled caching framework in which processes access disjoint sets of memory blocks. We also present a deterministic algorithm called rr-proc-mark which exactly matches the lower bound on the competitive ratio of deterministic cache replacement algorithms when processes access disjoint sets of memory blocks. We extend our results to multiple levels of caches and prove that an exclusive cache is better than both inclusive and non-inclusive caches; this validates the experimental findings in the literature. Our results could be applied to shared caches in multicore systems in which processes work together on multithreaded computations like Gaussian elimination paradigm, fast Fourier transform, matrix multiplication, etc. In these computations, processes have full knowledge about their individual request sequences and can share memory blocks.

VMLogix

Software Engineer

Apr 2008 - Jul 2009 · 1 yr 3 mos · Bengaluru Area, India

  • Developed key features in virtual lab automation software, focusing on the implementation of the IP fencing feature and GuestAgent.

Techsouls

Co-Founder

Jul 2007 - Mar 2008 · 8 mos · Bengaluru Area, India

  • With a vision to make robotics education accessible in India, developed affordable robotic kits using Arduino, an open-source prototyping platform. Created robotics simulation software and course material, and conducted workshops in engineering schools.

Trilogy E-Business Software India Pvt. Ltd.

Technical Analyst

Jan 2006 - Jan 2007 · 1 yr · Bengaluru Area, India

  • Developed key features in Distribution Channel Management (DCM) and customized DCM to meet customer requirements.

Education

The University of Texas at Austin

Master of Science (MS) — Computer Science

Jan 2009 - Jan 2011

National Institute of Technology Karnataka

Bachelor of Engineering (BE) — Computer Science

Jan 2002 - Jan 2006

Jyothy Kendriya Vidyalaya

High School

Jan 1990 - Jan 2002
