Varad Ghodake

Software Engineer

Chicago, Illinois, United States · 4 yrs 6 mos experience

Key Highlights

  • Improved fault-tolerance by 50% in job submission platform.
  • Redesigned event processing engine for 2 million events/day.
  • Developed web-app reducing MTTR by 19%.
Stackforce AI infers this person is a Backend Engineer with strong expertise in Site Reliability Engineering and event-driven architectures.

Skills

Core Skills

Site Reliability Engineering · PostgreSQL · React.js

Other Skills

Apache Kafka · Apache ZooKeeper · C++ · Customer Service · Django · Django REST Framework · ElasticSearch · Go (Programming Language) · HTML Scripting · JavaScript · Linux · PHP · Python · Red Hat Linux

About

Software Engineering, Backend and System Design. More about me: https://varadghodake.github.io

Experience

IMC Trading

Software Engineer

Aug 2023 - Present · 2 yrs 7 mos · Chicago, Illinois, United States · On-site

Meta

Production Engineering Intern

May 2022 - Aug 2022 · 3 mos · San Francisco Bay Area · On-site

  • Ads Machine Learning team: Facebook Feature Engineering Platform

Georgia Institute of Technology

Graduate Research Assistant

Jan 2022 - Aug 2023 · 1 yr 7 mos · Atlanta, Georgia, United States

  • Georgia Tech - Research Network Operations Center (RNOC) under the Institute for People and Technology (IPaT). On-premise RedHat OpenShift cluster maintenance and improvements.

The D. E. Shaw Group

2 roles

Software Engineer

Jul 2019 - May 2021 · 1 yr 10 mos · Hyderabad, Telangana, India · On-site

  • As an active member of the Quant Systems SRE team, improved fault-tolerance of the job submission platform by 50% with streaming replicas using the PostgreSQL Patroni framework in a newly created hypercluster and saved 19% of the monthly error budget.
  • As a monitoring Subject Matter Expert, redesigned the event processing engine to facilitate event-sourcing with Kafka; raised the maximum event coverage capacity (SLO) to ~2 million events/day.
  • Conducted enterprise and DMZ Linux server patching and releases – servers that run ElasticSearch cluster, Vault secret store, and core infra services such as DNS, Puppet, and Kerberos.
  • Set up a multi-site and highly-available Artifact Repository to store ML models and CI/CD outputs with Anycast routing for low-latency uploads and downloads. I was a Directly Responsible Individual (DRI) for this system serving more than 600K artifacts with 99.9% availability SLO.
Site Reliability Engineering · Apache Kafka · PostgreSQL · ElasticSearch · Linux

Software Engineer Intern

May 2018 - Jul 2018 · 2 mos · Hyderabad, Telangana, India · On-site

  • Analyzed reported tickets and system-wide issues to improve operational efficiency. To mitigate some frequently occurring issues, designed a web app that provides insights.
  • Developed a visualization console with React front-end, Python backend, and DE Shaw proprietary JavaScript libraries.
  • Besides reducing MTTD (Mean Time To Detect), the project reduced by 19% the MTTR (Mean Time To Repair) for SREs fixing users’ NFS home directories, grid job submissions, active sessions, and group memberships.
React.js · Python · JavaScript

COEP's Satellite Initiative

Software Engineer

Feb 2016 - May 2018 · 2 yrs 3 mos · Pune · On-site

  • Attitude Determination and Control Subsystem

Skyline Labs

Software Developer

Dec 2015 - Jul 2017 · 1 yr 7 mos · Pune · Hybrid

Education

Georgia Institute of Technology

Master of Science - MS — Computer Science

Aug 2021 - May 2023

COEP Technological University

Bachelor of Technology - BTech — Computer Engineering

Jan 2015 - Jan 2019
