Arkaprava De

Engineering Manager

San Jose, California, United States · 11 yrs 9 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • Led multiple high-impact engineering teams at AWS.
  • Delivered scalable AI solutions with significant user adoption.
  • Expert in cloud computing and distributed systems.
Stackforce AI infers this person is a SaaS expert with a strong focus on cloud computing and AI solutions.

Skills

Core Skills

Leadership, Distributed Systems, Cloud Computing, Product Management, AI/ML, Project Management, Software Development, UI/UX Design, Software Design, Collaboration Tools, Cloud Infrastructure, User Growth Strategies, Performance Optimization, Performance Benchmarking, Database Management

Other Skills

Scalability, Executive Leadership, Trust and Safety, People Management, Stakeholder Management, AWS, Agile Software Development, Git, C, Java, C#, JavaScript, SQL, MySQL, C++

About

Engineering manager with over 11 years of experience in driving innovation and leading high-performing teams in AI, cloud platforms, and distributed systems. Built products from scratch and delivered scalable AWS SageMaker solutions. Led end-to-end product development, from vision to launch, across secure remote access, partner AI, and integrated IDEs. Skilled in cloud computing, distributed systems, AI/ML platforms, and performance optimization, with a proven impact on user growth, revenue, and platform efficiency. Passionate about leveraging technology for transformative solutions and strategic product growth.

Experience

Amazon Web Services (AWS)

4 roles

Engineering Manager

Promoted

Mar 2023 - Present · 3 yrs

  • Currently leading three engineering teams (15+ engineers) within SageMaker Studio, building scalable distributed notebook solutions on high-performance clusters (HyperPod). Experienced in managing backend and frontend products and in leading backend and full-stack engineers. Developed an AI-agent-based testing workflow using the Strands SDK that reduced manual operational overhead by 80%.
  • [2025 - present] Engineering manager for the launch of secure remote access to SageMaker compute
  • Drove the roadmap, execution, and marketing of the SageMaker remote access feature, which lets users connect securely to SageMaker compute such as GPUs. Supported 100+ PFRs and led the working-backwards plan to deliver an end-to-end solution. Led the AWS On Air live demo. The feature became one of the most widely adopted by SageMaker users, with ~30% WoW adoption among existing customers.
  • [2023 - 2024] 0-1 engineering manager for the launch of Partner AI Apps on SageMaker
  • Developed the vision, roadmap, and execution plan for a new platform on SageMaker that hosts third-party partner AI apps (such as AI observability, model evaluation, and AI security) with a team of 30+ engineers. Led cross-functional work across 6+ external partner engineering teams and led the product launch at re:Invent 2024 as part of the keynote (blog, video).
  • Formalized a post-launch 3-year plan covering operations and the North Star vision for the product.
  • [2022 - 2023] 0-1 Engineering manager of CodeEditor on AWS SageMaker
  • Led a team of 6 engineers, a product manager, and a designer to deliver the CodeEditor IDE (based on open-source VS Code) on SageMaker. Developed the vision and roadmap to secure investment for the product, and led cross-functional work across Legal, Console, and Platform to launch it at re:Invent 2023 as part of the AWS CTO’s keynote. The open-sourced code is maintained by our team, and adoption is currently growing 10% MoM, with 2.8MM ARR.
Leadership, Software Design, Scalability, Executive Leadership, Trust and Safety, People Management, +2

Senior Software Engineer

Jun 2021 - Dec 2023 · 2 yrs 6 mos

  • [2021 - 2022] Led the re:Invent 2022 goal of improving governance and collaboration on SageMaker by launching shared spaces with Jupyter notebooks, tag-based access control for granular access and improved cost monitoring, and support for multiple SageMaker domains (launch blog1, blog2)
  • [2020 - 2021] Led the roadmap and execution of the control plane infrastructure for Amazon SageMaker Studio Lab, a free notebook environment launched at re:Invent 2021. Drove user growth to over 300k external customers in 2+ years and implemented a fraud prevention system that deactivated 3k accounts weekly.
  • Led the launch of an AI-powered coding assistant and code scans on AWS SageMaker (launch blog)
  • Led the launch of JupyterLab 3 on AWS SageMaker Studio (launch blog) and the migration of 10k+ Studio accounts from JupyterLab 1 to the latest version.
  • Designed and launched the new AWS SageMaker Studio Lab (https://studiolab.sagemaker.aws/). Helped build the control plane, including its authentication and authorization layer, and launched it before re:Invent.
  • Led about 10 developers across different components for the Studio Lab launch, and served as the security point of contact for the customer-facing API and web app.
  • Previously on the AWS Aurora MySQL team, handling integration with AWS ML and performance improvements to Aurora MySQL.
Software Design, Scalability, Trust and Safety, Distributed Systems

SDE II

Promoted

Mar 2017 - Jun 2021 · 4 yrs 3 mos

  • [AWS RDS Aurora MySQL]
  • Designed and developed a performance service to benchmark new releases of Aurora MySQL. Coordinated with 4+ cross-functional teams to adopt the service across the RDS org (RDS MySQL, RDS Aurora Postgres, RDS Postgres, RDS Graph database), with support for benchmarking tools such as sysbench, tpcc, and tpch. The automated service replaced manual benchmark tests, cutting cost by 95% and DB engineers' manual effort for benchmark reporting by 90%.
  • Led benchmarking of the log replication agent and server across 25+ supported regions for the launch of RDS Global databases.
  • Led the benchmarking and launch of Graviton instance support for RDS Aurora MySQL and Postgres.
  • Built a distributed test execution infrastructure for Aurora MySQL that used Docker containers and ECS to cut the existing test run time from 8 hours to 25 minutes. Each workflow ran ~5000 tests distributed through a message queue, and ~20 workflows executed every day on average.
  • [Amazon Retail]
  • Worked on the migration of an internal system from a SQL to a NoSQL database, backfilling millions of records; created a backfill service to streamline the process.
  • Designed a content moderation pipeline in which moderators review content submitted by sellers/vendors and then publish it to the Amazon retail page.
  • Built a standalone service that pushes data from DynamoDB to Redshift for metrics generation using AWS Kinesis.
Software Design, Scalability, Performance Optimization

SDE I

Jun 2015 - Feb 2017 · 1 yr 8 mos

  • Designed and developed a content management system to manage and publish videos uploaded by the business team, which are then viewed by thousands of sellers on Amazon’s Seller Central platform.
  • Designed and implemented a new system to push millions of seller-activity records every day from a 3rd party system (Marketo) to Amazon’s Data Warehouse tables; removed the dependency on MUV (3rd party app).
  • Worked on the front end to build websites for sellers.

S&P Capital IQ

Software Engineer

Jun 2014 - Jun 2015 · 1 yr · Hyderabad Area, India

  • Optimized the workflow system, reducing processing time by 90%.
  • Worked on REST APIs in C# to support mobile applications.

Motifworks

Summer Intern

Jun 2013 - Jul 2013 · 1 mo · Bengaluru Area, India

  • Created a dashboard to view metrics on ASP.NET and Ruby on Rails

Education

Vellore Institute of Technology

Bachelor of Technology (B.Tech.) — Computer Science

Jan 2010 - Jan 2014
