Hari Kishore Chaparala

Software Engineer

Irvine, California, United States · 10 yrs 1 mo experience
Most Likely To Switch

Key Highlights

  • Expert in optimizing big data performance.
  • Strong background in software engineering and system architecture.
  • Experience in bioinformatics and machine learning applications.
Stackforce AI infers this person is a Big Data and Software Engineering expert with a focus on performance optimization.

Skills

Core Skills

Apache Spark, Data Analytics, Database Management, R&D, Teaching, Data Management, Big Data, Algorithm Optimization, Software Engineering, System Architecture, Bioinformatics, Software Development, Machine Learning, NLP, Control Systems, Electrical Engineering, Embedded Systems, Prototyping

Other Skills

Open Data Analytics, EMR Spark Engine, Iceberg, Query optimization, Global secondary indexes, Vector indexes, Database internals, Prepare homework, Conduct discussion sessions, Grade assignments, Integrate Apache Iceberg, Big Data Management System, Apache Iceberg, AsterixDB, Big Data Management

Experience

Total Experience: 10 yrs 1 mo
Average Tenure: 1 yr 2 mos
Current Experience: 2 yrs 3 mos

Amazon Web Services (AWS)

Software Development Engineer - II

Mar 2024 - Present · 2 yrs 3 mos · Redmond, Washington, United States · On-site

  • Open Data Analytics
  • EMR Spark Engine and Open Table Formats Performance.
  • 2024: Spark-Redshift performance optimizations.
  • Current primary focus: Spark-Iceberg performance optimizations (query optimization and query execution improvements).
  • Read: https://aws.amazon.com/blogs/big-data/amazon-emr-7-1-runtime-for-apache-spark-and-iceberg-can-run-spark-workloads-2-7-times-faster-than-apache-spark-3-5-1-and-iceberg-1-5-2/
  • Read (EMR 7.12): https://aws.amazon.com/blogs/big-data/run-apache-spark-and-iceberg-4-5x-faster-than-open-source-spark-with-amazon-emr/
  • Write (EMR 7.12): https://aws.amazon.com/blogs/big-data/run-apache-spark-and-apache-iceberg-write-jobs-2x-faster-with-amazon-emr/
Open Data Analytics, EMR Spark Engine, Apache Spark, Iceberg, Query optimization, Data Analytics

Couchbase

Software Engineer

Jul 2023 - Mar 2024 · 8 mos · Santa Clara, California, United States · On-site

  • R&D Core - Global secondary indexes and vector indexes. Database internals.
Global secondary indexes, Vector indexes, Database internals, Database Management, R&D

UC Irvine Donald Bren School of Information and Computer Sciences

5 roles

Graduate Teaching Assistant

Apr 2023 - Jul 2023 · 3 mos · Irvine, California, United States

  • CS122D - Beyond SQL Data Management
  • Prepare homework, conduct discussion sessions, and grade assignments.
Prepare homework, Conduct discussion sessions, Grade assignments, Teaching, Data Management

Graduate Student Researcher

Jan 2023 - Jul 2023 · 6 mos · Irvine, California, United States

  • Integrated Apache Iceberg, a high-performance table format for data lakes, into Apache AsterixDB, a scalable, open-source Big Data Management System (BDMS). The initial version includes “Select query” support.
  • Feature: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17419
  • Advisor: Michael J. Carey
Integrate Apache Iceberg, Big Data Management System, Big Data, Database Management

Graduate Teaching Assistant

Sep 2022 - Dec 2022 · 3 mos · Irvine, California, United States

  • Teaching Assistant for CS 220P - Databases and Data Management

Graduate Teaching Assistant

Mar 2022 - Jul 2022 · 4 mos · Irvine, California, United States

  • Teaching Assistant for CS 122A - Introduction to Data Management.
  • Roles:
  • Administer Discussion sessions
  • Review course material
  • Hold office hours
  • Provide assistance in preparing assignments
Administer discussion sessions, Review course material, Hold office hours, Teaching, Data Management

Graduate Reader

Jan 2022 - Mar 2022 · 2 mos · Irvine, California, United States

  • Reader for the Course IN4MATX 172 - Project in Health Informatics

The Apache Software Foundation

Apache AsterixDB - Contributor

Jan 2023 - May 2023 · 4 mos

  • Integrated Apache Iceberg, a high-performance table format for data lakes, into Apache AsterixDB, a scalable, open-source Big Data Management System (BDMS). The initial version includes “Select query” support.
  • Feature: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17419
Integrate Apache Iceberg, Big Data Management System, Big Data, Database Management

Meta

Software Engineer Intern

Jun 2022 - Sep 2022 · 3 mos · Seattle, Washington, United States

  • Part of the Algorithmic Optimization team, working on Regional fluidity and Infrastructure scheduling. Added new signal features to the DSL used to define constraints for the MILP solver. This work improves plan explainability and reduces toil for end users.
Algorithmic Optimization, MILP solver, Plan explainability, Algorithm Optimization, Software Engineering

Gojek

Senior Software Engineer

Nov 2019 - Sep 2021 · 1 yr 10 mos · Bengaluru, Karnataka

  • Worked with an awesome team of engineers in Supply Incentives (Mobility Marketplace) to build systems that award bonuses to our driver-partners upon meeting a set of criteria aimed at improving the Booking Completion rate. Also collaborated closely with the data science team to engineer services driving supply positioning for improved serviceability in high-demand areas.
  • Built highly scalable, distributed systems handling more than 100k requests per minute
  • Removed all single points of failure from incentive systems by making service components highly available
  • Decommissioned our Monolith service, saving over 40K USD/year in infra costs
  • Set up Patroni for our PostgreSQL nodes for automatic failovers, HA, and easy vertical and horizontal scaling
  • Designed and executed a strategy to rescale overprovisioned virtual machines with minimal downtime and product impact. Saved >25K USD/year in Google Cloud Compute Engine costs
  • Integrated Cartography in Supply positioning for computing ETA and distance metrics
  • Onboarded our microservices to dynamic Protobuf updates and made architectural changes to Driver performance that cut dev effort and blockers by at least 3 days whenever a new service type or booking cancellation reason is introduced
  • Onboarded Incentive systems to KernelUX alerts/monitoring
  • Built a debugging utility that collates driver stats from our microservices and databases. This significantly reduced the time engineers need to identify false alarms and root causes during production incidents
  • Open-sourced flattening utilities for Google Protobuf struct deserialization
  • Worked on OpenJDK 11 and Ubuntu 20.04 migrations
  • Revamped our Driver Statistics service, making it robust to upstream failures. This solved a long-term issue of driver performance exceeding 100% and of wrongful bonus denials and payouts
  • Solved a long-term issue of wrongful bonus denials due to latency in the upstream. Brought down the number of such cases from ~500/week to 0/week

Golive Games

Machine Learning Engineer

Sep 2018 - Mar 2019 · 6 mos

  • Implemented models for keypoint extraction from online articles using Supervised Learning. Devised a search model based on Lucene and an article summary predictor using NLP. Wrote spiders to crawl data from various online sources.
Variant Normalization, Maximum Entropy Modelling, Copy-Number Variation, Bioinformatics, Software Development

Strand Life Sciences

Associate Software Engineer - IV

Jul 2017 - Nov 2019 · 2 yrs 4 mos · Bengaluru Area, India

  • Worked alongside Bioinformaticians and Data Scientists to develop software that helps academicians and clinical laboratories with Oncology studies and medical diagnoses. I worked on desktop development of Genespring, our data analysis software for genomics, and backend development for StrandOmics, our variant interpretation and reporting software.
  • Worked on Variant Normalization, Maximum Entropy Modelling, Variant LiftOver, and Copy-Number Variation (CNV) in StrandOmics, our Variant Interpretation and Reporting Platform; Noninvasive Prenatal Testing (NIPT) in the Clinical Research team; and the Kendrick Mass defect plot, Z-Score plot, and Method automation in Genespring, Strand Avadis.
  • Performed backend optimizations in StrandOmics, achieving severalfold improvements in time and memory. Optimizations involved MySQL database schema and query restructuring, cache handling, deployment of Bloom filters, and multi-threading. This let the platform, previously limited to 200K gene variants, scale to over 30 million variants with low latency.
  • Optimized the components used to determine the quality of RNA and DNA sequencing inputs. Brought down execution times from several hours to a couple of minutes by using interval trees for quick search of overlapping regions.
  • Optimized the component for calling Splice Variants, reducing the run time from over 8 hours to under an hour by using Segment Trees and doubly linked lists to consider only alignments that start from the first Exon. Also reduced the memory consumed when hashing millions of BAM reads by using Tries.
  • Developed a custom library on top of Vis.js for fast rendering of ellipsoids in confidence plots.
Supply Incentives, Distributed systems, PostgreSQL, Google Cloud, Software Engineering, System Architecture

Bhabha Atomic Research Centre

Electrical Engineering Intern

May 2016 - Jun 2016 · 1 mo · Mumbai, Maharashtra, India

  • Worked on conveyor control for sample irradiation by a 10 MeV electron beam accelerator, using PLCs with a SCADA interface. The motion was controlled by the number of passes and the speed, both set through the HMI. A feedback system was introduced between the electron beam intensity and the conveyor speed, giving it all the necessary features, including protection from power surges. Some generic TWIDO utility modules were also developed.

Dozee

Intern

Nov 2015 - Dec 2015 · 1 mo · Bangalore Urban, Karnataka, India

  • Prototyped air quality sensors, heart rate, and breathing rate monitors to help track hospitals’ indoor air quality and patients’ sleep quality. Implemented in Arduino with Python and Processing on a Raspberry Pi environment.
PLC, SCADA interface, Feedback system, Control Systems, Electrical Engineering

Indian Institute of Technology, Indore

2 roles

Student Mentor

Promoted

May 2015 - May 2017 · 2 yrs

Keypoint extraction, Supervised Learning, NLP, Machine Learning

Hostel Affairs Secretary

May 2015 - May 2016 · 1 yr

Indian Institute of Technology, Bombay

Research Assistant

May 2015 - Jun 2015 · 1 mo · Mumbai, Maharashtra, India

  • Modeled PV cells in Matlab and designed an algorithm for MPPT under Multi-Level Insolation.
Prototyped sensors, Implemented in Arduino, Embedded Systems, Prototyping

Education

UC Irvine

Master of Science - MS — Computer Science

Sep 2021 - Jun 2023

Indian Institute of Technology, Indore

Bachelor of Technology (B.Tech.)

Jan 2013 - Jan 2017
