Anthony Polyakov

Director of Engineering

Seattle, Washington, United States · 14 yrs 7 mos experience

Key Highlights

  • Expert in building scalable cloud infrastructure.
  • Proven track record in leading large engineering teams.
  • Innovative solutions for complex data processing challenges.
Stackforce AI infers this person is a SaaS and Cloud Infrastructure expert with a strong focus on database management and observability.

Skills

Core Skills

Cloud Infrastructure · Software Development · Database Management

Other Skills

compute services · runtime infrastructure · APIs · Kubernetes · multi-cloud · SDTK · observability · continuous integration · cloud services · gRPC · CI/CD · Bigtable · NoSQL · concurrent architecture · JDBC driver

About

Senior hands-on leader and manager of managers with a proven track record of delivering with teams of different sizes and spirits, from focused, fast-paced, startup-like groups to geo-distributed departments of 50+ engineers. I can build things from the ground up as well as work in mature environments with serious production workloads. Inspirer, speaker, executor, visionary, and someone who strongly believes in leading by example and keeping things simple. I love debugging and tracing complex problems. Eager learner and truth seeker. I enjoy tech work just as much as I enjoy building teams and helping people grow. My areas of technical interest include streaming data processing, distributed systems, reactive system design, functional programming, and in-memory processing. Some tech keywords: Java, Go, C/C++, Python, gRPC, REST, Linux, Docker, Kubernetes, OpenTelemetry.

Experience

Total Experience: 14 yrs 7 mos
Average Tenure: 1 yr 11 mos
Current Experience: --

NVIDIA

Director

Dec 2024 - Mar 2026 · 1 yr 3 mos · Seattle, Washington, United States · On-site

  • Compute Infrastructure, DGX Cloud

DataRobot

VP of Engineering

Jun 2022 - Dec 2024 · 2 yrs 6 mos · Vancouver, British Columbia, Canada

  • Head of the Global Platform team (80+ FTE) running foundational services (compute, runtime, storage, security, tenant context, and others) for all DataRobot components, including model training and AutoML, inferencing, GenAI, and Notebooks, and supporting the control plane for single- and multi-tenant SaaS and infrastructure provisioning. Global Platform also includes the Developer Experience team, which delivers observability, continuous integration services, and a service-bootstrapping scaffolding framework (for bootstrapping gRPC-based services integrated with the common k8s platform using Helm).
  • In this role I was tasked with helping move DataRobot from an on-premise monolithic application to a multi-cloud, multi-service distributed system.
  • I started by redefining the platform from a set of common tools into an actual product exposing well-defined APIs to customers. We built the technical strategy and operational principles for the team by working backwards from our customers (internal teams) and identified the foundational services to build first:
      • Common k8s-based runtime infrastructure for running DataRobot components in a secure and scalable way
      • Foundational compute services and APIs covering the typical use case patterns: a batch jobs API for offline training, FaaS-like runtime services for real-time inferencing, and a hosting API for running user-provided models and notebooks
      • SDTK (service development toolkit) for bootstrapping new services with all batteries included (RPC, o11y, deployment, pipelines, etc.), so that ML teams can develop new services outside the old monolith with all best practices and platform capabilities built in
  • In under a year we moved DataRobot to a new Kubernetes-based architecture and rolled it out across AWS, GCP, Azure, and self-managed k8s; in the following year we delivered the foundational compute services and SDTK. We modernized the DevEx stack, replacing in-house tools with modern o11y and CI/CD platforms integrated with SDTK.
compute services · runtime infrastructure · APIs · Kubernetes · multi-cloud · SDTK

Google

Senior Engineering Manager

Sep 2021 - Sep 2022 · 1 yr · Waterloo, Ontario, Canada

  • Bigtable lead in Canada.
  • Bigtable is Google's petabyte-scale, ultra-low-latency managed NoSQL database, powering the most demanding Google services internally (YouTube, Google Maps, Search, etc.) as well as the largest enterprises externally on GCP.
  • I built the Canadian team, focusing on making it a self-sufficient center of excellence rather than an extension of existing teams.
  • Together with the team, focused on one of the hardest issues in Bigtable: the noisy neighbour problem, where requests made by one client negatively affect other clients. We debugged and improved the concurrent architecture of the Bigtable API frontend services, reducing the number of noisy-neighbour-related tickets by 50%.
  • Independently, as a New Year hackathon project, developed a fully ANSI SQL compatible Bigtable JDBC driver based on Apache Phoenix.
  • Drove the latency and resource consumption improvement project, rearchitecting sidecars for critical-path authentication and authorization functionality into an in-process architecture. As a result we saved up to 10% of CPU resources for the Bigtable fleet.
Bigtable · NoSQL · concurrent architecture · JDBC driver · latency improvement · Database Management

Atlassian

Senior Principal Engineer

Jul 2020 - Sep 2021 · 1 yr 2 mos · Vancouver, British Columbia, Canada

  • Senior Principal Engineer in the Cloud Infrastructure team. Architect of the new observability platform for Atlassian. OpenTelemetry open source contributor. Brought OpenTelemetry to Atlassian, evangelized it and drove adoption, and inspired an "if you miss something, make a PR" culture, resulting in Atlassian being among the top 30 OpenTelemetry contributors.
  • Authored the design and drove production delivery of Observazaurus, a cross-Atlassian observability platform. It was a real-time stream data processing pipeline taking telemetry data from every Atlassian service (tens of terabytes a day) and intelligently processing it to:
      • allow for quotas and limits
      • control cardinality
      • fan out to hot and cold storage
      • detect anomalies in real time
      • provide meta-observability
      • harmonize dimensions
      • route data to the appropriate configurable observability backend (Splunk, SignalFx, cold storage, LightStep, etc.)
  • Lead for company-wide cross-functional technical advisory groups in SRE and Cloud Infrastructure, engaging principal engineers across the organization to work on strategic initiatives and drive company tech strategy.
observability platform · OpenTelemetry · real-time data processing · cross-functional collaboration · Cloud Infrastructure · Software Development

Amazon Web Services (AWS)

Software Development Manager

Feb 2018 - Jul 2020 · 2 yrs 5 mos · Vancouver, British Columbia, Canada

  • Ran the Aurora MySQL, RDS MySQL, and MariaDB teams (30 FTEs), the next-generation cloud databases at AWS. Responsible for all engineering and operational aspects of one of the largest database fleets in the world.
  • Among other things, my team launched Aurora Global Database and Aurora Multi-Master, delivered a major upgrade to the RDS control plane to support MySQL 8.0, and launched the new RDS Recommendations service.
  • Major contributor to the design of Aurora Global Database, a geographically distributed relational database with sub-second multi-region latencies and a global control plane. I drove the control plane and API design, developed a global metadata storage layer backing the control plane, and partnered with the engine and storage teams to build the global control plane for them.
  • I conceived and started RDS Recommendations, an intelligent database co-pilot that looks at customers' database fleets and delivers optimization and best-practice advice in real time. We designed it as an extensible reactive platform that consumes large amounts of telemetry, metadata, and control plane data and produces actionable recommendations for clients. The engine ran on an extremely large fleet of databases (millions of instances) and was scalable and extensible, so new types of recommendations could be added easily.
  • Reduced KTLO and engineering toil by 50% in one year. Fully automated the engine release process, going from multi-month to same-day engine releases.
  • Was the founding EM for AWS Location Service (5 FTEs growing to 15). Responsible for the geofencing and real-time tracking domains, building both the team and the technology from the ground up. Partnered with Product Management to deliver the concept, the product vision, and the very first customer demos of the product. Co-authored a new patent on a real-time geofencing algorithm. Drove the architecture of a high-performance distributed geofencing engine that processes real-time position data from clients and produces fencing events at AWS scale.
cloud databases · Aurora Global Database · RDS Recommendations · geofencing · Database Management · Software Development

CloudDBAppliance project

Senior Software Engineer

Jan 2017 - Dec 2017 · 11 mos · Greater Paris Metropolitan Region

  • Worked as a use case developer for CloudDBAppliance, a Horizon 2020 project sponsored by the European Commission. The aim of the use case was to develop a system computing real-time CVA risks using the capabilities delivered by CloudDBAppliance. CloudDBAppliance is a brand new appliance consisting of an ultra-fast operational data lake and an in-memory analytics engine. It aims to achieve multi-terabyte, multi-hundred-core scalability with predictable performance and strong consistency guarantees by utilising NUMA architecture, a proprietary K/V storage engine, and unique resource allocation algorithms.

Infoshare.pl

Speaker

May 2016 · Gdańsk, Pomorskie, Poland

  • Tech stage

Nordea Markets

3 roles

Head of Application Development, Core Services and Risk

Promoted

Jan 2016 - Jan 2018 · 2 yrs

  • Hands-on leader and solution architect. Managed software development teams in Denmark and Poland (60+ people in total) working on key parts of the new Nordea Capital Markets IT ecosystem in the Trading and Risk domain. Technical manager, solution architect, and owner of a number of critical systems and foundational components (messaging layer, service discovery, operational data stores). Responsible for designing and building:
      • A real-time FpML document-based trade vault delivering trade contract information and serving continuous queries for Capital Markets systems (a multi-terabyte MongoDB-based solution)
      • New FRTB-compliant Market Risk infrastructure, including a Scenario Engine and an IMA engine capable of running on-demand simulations fed into in-memory aggregation cubes (more than 10x capacity increase compared to the existing one; reactive, on-demand computations instead of an overnight batch)
      • The core messaging layer for Capital Markets infrastructure, based on Kafka and Confluent Platform
      • The core cloud-ready infrastructure architecture (dynamic service discovery, fluid machine-agnostic cluster-based deployment capable of canary rollouts, containerisation)
      • An improved Credit Risk system landscape that ensures a continuous delivery model with the vendor solution
application development · messaging layer · cloud architecture · real-time systems · Software Development · Cloud Infrastructure

Head of Risk IT

Sep 2015 - Jan 2016 · 4 mos

Head of Market Risk IT

Aug 2014 - Sep 2015 · 1 yr 1 mo

  • The Market Risk IT department consists of 25 developers and business analysts working for Capital Markets and responsible for market risk management systems. These are high-throughput systems calculating mission-critical risk figures on Nordea-wide portfolios.
  • The existing systems suffered from substantial capacity, stability, and scalability issues. There were no automated deployment processes, the systems required a lot of manual effort just to keep operating, the team's development process was very ad hoc, and communication with the business was not transparent.
  • I headed the team as a crisis manager with the aim of re-engineering existing systems, bringing in an engineering culture, building a strong development practice, and ensuring agile and transparent development processes.
  • In less than a year we:
      • identified the most problematic areas and developed an evolutionary strategy for re-engineering critical components. Following this strategy, we made dramatic improvements in throughput and stability in a smooth, continuous way, without business interruption or the need for massive regression testing
      • established a fully transparent agile process with strong ownership and ensured a continuous delivery chain. That required building the full continuous delivery pipeline and migrating the existing legacy code base to an automatic deployment framework
      • cleaned up old systems and built a new microservice-based set of components leveraging web-scale technologies like Redis and Kafka, employing reactive principles with RxJava, and following a stateless, scalable architecture
      • established a set of strong, mobile, and self-organized engineering teams
  • As a result, critical processing times dropped from 4 hours to several minutes, and we got highly concurrent, reusable components operating in real time with no global shared state, which made things like what-if analysis possible.
  • The system now handles millions of trades in a semi-real-time mode, allowing instant slice-and-dice using an in-memory OLAP cube.

Deutsche Bank

3 roles

Assistant Vice President

Feb 2010 - Mar 2012 · 2 yrs 1 mo

  • Led the project to build Deutsche Bank's FIX API Options platform
  • Led development of the electronic option orders (ATOM) platform
  • Led development of Autobahn FX Options (http://www.autobahnfx.com/options.html) and Autobahn FX Structured Products (http://www.autobahnfx.com/deposits.html), a market-leading FX Options platform

Team lead

Promoted

May 2009 - Jan 2010 · 8 mos

Lead developer

Sep 2007 - Apr 2009 · 1 yr 7 mos

SAP

ABAP developer

Nov 2006 - Aug 2007 · 9 mos

  • Developer in the SAP R/3 Globalization team, focusing on the SAP FI (financial) module. Owner of a few FI subsystems.

Competentum Group

Lead software developer

May 2005 - Nov 2006 · 1 yr 6 mos

Netcracker

Software developer

May 2004 - May 2005 · 1 yr

Education

Moscow Institute of Physics and Technology (State University) (MIPT)

Doctor of Philosophy - PhD — Applied Mathematics

Moscow Institute of Physics and Technology (State University) (MIPT)

MS

Jan 2001 - Jan 2007

Moscow Institute of Physics and Technology (State University) (MIPT)

MS — Applied physics and math

Jan 2001 - Jan 2007
