Sarang Shrivastava

CEO

Menlo Park, California, United States · 9 yrs 8 mos experience
AI ML Practitioner · Highly Stable

Key Highlights

  • Expert in developing AI/ML products for 6+ years.
  • Led a team to build generative AI solutions.
  • Proven track record in document AI and NLP.
Stackforce AI infers this person is a Fintech and AI specialist with strong expertise in NLP and document processing.

Skills

Core Skills

Generative AI, Large Language Models, DocumentAI, Named Entity Recognition, Record Linking, Machine Learning, Record De-duplication, Semantic Understanding, NER, Text Segmentation, Relation Extraction, Document Processing, Microservices, Software Development

Other Skills

Algorithms, Apache Kafka, Apache Spark, Approximate Nearest Neighbour Search, Artificial Intelligence (AI), BERT, Big Data, C, C++, Data Science, Data Structures, Deep Learning, Discriminative Models, Distributed Systems, Document Storage

About

I possess nearly 8 years of industry experience, with over 6 years dedicated to developing AI/ML products. At Language Machines, I lead a small team of engineers in building a generative AI stack for video understanding in the sports and entertainment domain. My work has extensively involved experimenting with and applying GPT-4, GPT-4-V, and open-source large language models (LLMs) to a variety of tasks, including semantic segmentation of videos, character recognition in videos, transcript correction, and video summarization.

During my tenure in the Research and Development team at Goldman Sachs, my focus was on DocumentAI. I was tasked with developing a higher-level document representation layer designed for complex document structures: multi-column documents with numerous financial tables, paragraphs that reference other paragraphs throughout a document, and hierarchies of column headers in financial tables. I also addressed a broad range of firm-wide use cases across divisions such as Data Engineering, Client Onboarding, and the Investment Banking Division (IBD), working specifically on named entity recognition (NER), relation extraction, long document classification, retrieval/ranking, and RAG using state-of-the-art transformer-based models.

Experience

9 yrs 8 mos
Total Experience
2 yrs 8 mos
Average Tenure
1 yr 8 mos
Current Experience

Meta

Senior Research Scientist

Oct 2024 - Present · 1 yr 8 mos · Menlo Park, California, United States · Hybrid

  • Working at the intersection of GenAI and Ads in the Ranking & Foundational AI: Modeling Intelligence team, part of the Monetization pillar.

Language Machines

Founding AI Engineer

Mar 2023 - Sep 2024 · 1 yr 6 mos · Silicon Valley, California, United States · Hybrid

  • At Language Machines, I led a small team of engineers in building a generative AI stack for video understanding in the sports and entertainment domain. My work has extensively involved experimentation and application of GPT-4, GPT-4-V, and other open-source large language models (LLMs) to solve a variety of tasks, including semantic segmentation of videos, character recognition in videos, transcript correction, and video summarization.
Generative AI, Large Language Models, Video Understanding, GPT-4, GPT-4-V

Goldman Sachs

4 roles

Senior AI/NLP Engineer in Data Engineering

Aug 2022 - Mar 2023 · 7 mos

  • 1) Record linking across large databases
  • a) Trained a sentence transformer on database records, followed by an approximate nearest neighbour search on the embedding space of records to reduce the search space. This step focused on achieving high recall and reduced the search space from 10 million records to 100.
  • b) Used LLMs (T0pp and Flan) to bootstrap the training data required to train a high-precision discriminative model (SetFit).
  • c) Trained an ensemble of LLMs and SetFit as the final model to link records.
  • d) Improved linking accuracy by more than 10% above the baseline.
  • 2) Record de-duplication in a database
  • a) Curated a dataset of duplicate records by working closely with the business team.
  • b) Fine-tuned a BERT-based classification model to filter out non-human users, which were not relevant to the business.
  • c) Leveraged sentence transformers to map the textual descriptions of the remaining records into an embedding space.
  • d) Performed an approximate nearest neighbour search on the embedding space to create a graph-based representation of the entire database. The nodes of the graph represent the records and the edges represent the distance between them in the embedding space.
  • e) Used depth-first search to form clusters of duplicate records. Augmented the stopping condition of the algorithm by leveraging LLMs (Flan).
  • f) Fine-tuned SetFit to trim the clusters created in the previous step and remove the false positives introduced. Achieved an accuracy of 92% for this problem.
DocumentAI, Named Entity Recognition, Relation Extraction, Long Document Classification, Transformer Models
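
The graph-and-clustering part of the de-duplication work above can be illustrated with a minimal sketch. It is not the implementation used at the firm: the function names, the toy embeddings, and the values of `k` and `max_distance` are all assumptions made for illustration, brute-force search stands in for a real approximate nearest neighbour index, and the LLM-based stopping condition and SetFit trimming step are left out.

```python
from math import sqrt


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def build_neighbour_graph(embeddings, k=2, max_distance=0.1):
    """Link each record to at most k nearest neighbours within max_distance.

    Brute force is used here as a stand-in for an approximate
    nearest neighbour index (assumption for illustration only).
    """
    n = len(embeddings)
    graph = {i: set() for i in range(n)}
    for i in range(n):
        dists = sorted(
            (cosine_distance(embeddings[i], embeddings[j]), j)
            for j in range(n)
            if j != i
        )
        for dist, j in dists[:k]:
            if dist <= max_distance:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def duplicate_clusters(graph):
    """Iterative depth-first search; each connected component is one cluster."""
    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(sorted(component))
    return clusters


# Toy 2-D "embeddings": records 0/1 and 2/3 are near-duplicates, 4 is unique.
embeddings = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.03, 0.98], [-1.0, 0.0]]
clusters = duplicate_clusters(build_neighbour_graph(embeddings))
print(clusters)  # [[0, 1], [2, 3], [4]]
```

Thresholding edges by distance before clustering keeps unrelated records from being chained into one component, which is the kind of false positive the later trimming step is described as removing.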

Senior ML Engineer @ Research and Development Engineering

Dec 2021 - Aug 2022 · 8 mos

  • 1) Semantic understanding of tabular structures - identifying table components such as column headers, row headers, and captions, and the hierarchy between them
  • Focus areas: layout-aware language models, tabular data, classification, relation extraction
  • a) Curated a dataset of table components (column headers, row headers, captions, content cells) and the hierarchy between column headers.
  • b) Fine-tuned BERT and LayoutLMv1 for the component identification and hierarchy detection tasks, and showed that incorporating layout information in language models helps in tasks where the visual structure and layout of textual data are important.
  • 2) NER and relation extraction on layout-rich documents
  • Focus areas: language models vs. layout-aware language models, joint NER and relation extraction
  • a) Curated two datasets for NER and relation extraction tasks on visually rich documents: cover pages in credit agreements and directory pages in prospectuses.
  • b) Fine-tuned BERT, RoBERTa, LayoutLMv1, and LayoutLMv2 in various settings and showed that jointly training NER and relation extraction with layout-aware language models outperforms standard language models on layout-rich documents.
Record Linking, Approximate Nearest Neighbour Search, Sentence Transformers, Discriminative Models, BERT, Machine Learning

Associate (ML Engineer) @ Research and Development Engineering

Promoted

Jan 2020 - Dec 2021 · 1 yr 11 mos

  • 1) Mathematical constraint extraction - extraction of negative covenants from credit agreements
  • Focus areas: text segmentation, multi-label multi-class classification, relation extraction
  • a) Curated a dataset of fine-grained negative covenant extractions from public credit agreements sourced from SEC filings.
  • b) Developed a multi-class, multi-label text classifier with a BERT-based backbone for tagging passages with provisions in credit agreements.
  • c) Developed a Random Forest based model for identifying relations between comparators and trigger points. Extracted the fine-grained negative covenant using a combination of soft signals and inputs from the provisions and the relation model.
  • d) This model is a key component in saving the Credit risk desk X dollars.
  • 2) Stanza graph construction - identifying section headers, distinguishing enumeration list starts (folded and unfolded) from section references, and finding the hierarchy between them
  • Focus areas: BERT, classification, heuristics, representation learning
  • a) Curated a span detection dataset for section starts and section references from credit agreements.
  • b) Developed a model for distinguishing between section starts and section references using BERT and CNN-based techniques.
  • c) Developed an algorithm for finding links between section starts and created a tree-based representation of the entire document.
  • d) Converted raw PDF documents into navigable HTML for faster navigation and better readability.
  • 3) Ternary relation extraction in prospectuses - “Who is the legal advisor of Amazon for this fund offering?”
  • Focus areas: relation extraction, BERT with entity markers, Random Forest
  • a) Curated a ternary relation extraction dataset covering organizations, person names, and roles.
  • b) Developed a BERT-based model that leverages entity markers to enrich the entity embeddings sent to the relation head.
  • c) Reduced case review time during new client onboarding from a few hours to a couple of minutes.
Semantic Understanding, NER, Relation Extraction, Layout Aware Language Models
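
The entity-marker idea mentioned for the ternary relation extraction work can be illustrated with a small preprocessing sketch. This is only an assumed illustration of the general technique, not the model described above: the function name, the marker names (`PER`, `ROLE`, `ORG`), and the example sentence are hypothetical, and the BERT encoder and relation head that would consume the marked tokens are not shown.

```python
def add_entity_markers(tokens, spans):
    """Wrap each entity span in marker tokens such as [ORG] ... [/ORG].

    `spans` maps a marker name to a (start, end) token span with `end`
    exclusive. Spans are assumed to be non-empty and non-overlapping.
    """
    opens = {start: f"[{name}]" for name, (start, _) in spans.items()}
    closes = {end: f"[/{name}]" for name, (_, end) in spans.items()}
    marked = []
    for i in range(len(tokens) + 1):
        # Emit a closing marker before an opening one so adjacent spans nest correctly.
        if i in closes:
            marked.append(closes[i])
        if i in opens:
            marked.append(opens[i])
        if i < len(tokens):
            marked.append(tokens[i])
    return marked


# Hypothetical example sentence with a person, a role, and an organization.
tokens = "Jane Doe is the legal advisor of Acme Corp".split()
spans = {"PER": (0, 2), "ROLE": (4, 6), "ORG": (7, 9)}
marked = add_entity_markers(tokens, spans)
print(" ".join(marked))
# [PER] Jane Doe [/PER] is the [ROLE] legal advisor [/ROLE] of [ORG] Acme Corp [/ORG]
```

In a setup like this, the marker strings would typically be registered as extra tokens in the tokenizer so that the encoder's representations at the marker positions can be fed to the relation classifier.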

Technology Analyst (ML Engineer) @ Research and Development Engineering

Dec 2017 - Dec 2019 · 2 yrs

  • 1) Document processing pipeline for natural language processing extractors
  • a) Designed, built, and integrated a document processing pipeline for running various financial extractors on the fly. The main objective was to provide a seamless experience for anyone integrating new extractors into the existing system.
  • 2) Lex: the firm’s strategic document storage system
  • a) Reduced document transformation/indexing time in our document storage system from a few hours to almost a second. This involved breaking down a monolithic service into multiple scalable microservices.
Text Segmentation, Multi-Label Classification, Relation Extraction

Arista Networks

Software Engineer

Jul 2016 - Nov 2017 · 1 yr 4 mos · Bangalore

  • Part of the CVX (CloudVision eXchange) team, which develops CloudVision controllers that orchestrate behavior across a group of physical network devices running EOS, provide a single point of visibility and management to the customer, and serve as an integration point into other controllers, orchestration systems, or network management systems. CVX can run as a standalone VM or a cluster of VMs; smaller deployments can run it directly on the physical switches.
  • 1) Implemented virtual IP support for CVX clusters, giving customers a single point of contact within the cluster. The virtual IP actively follows the master node of the cluster.
  • 2) Implemented a directory-level replication service that uses the inotify and rsync Linux utilities: inotify detects when a particular file has changed, and rsync then synchronises it across the cluster.
  • 3) Implemented IPv6 support for Arista’s Zero Touch Provisioning (ZTP) feature, which configures a switch without user intervention. It provides an extensible solution for automatic configuration of as-yet unconfigured switches.
  • 4) Developed Google Groups support for ask-bot, which takes a query string and displays the most relevant results from Google Groups. Whenever a command fails or raises an error, the bot searches for and displays the most relevant results, saving the user from searching for the error manually.
Document Processing, Microservices, Document Storage

IIIT Hyderabad

Summer Intern

May 2015 - Jul 2015 · 2 mos · Hyderabad Area, India

  • NLP it is... :D
C++, C, Software Development

Indian Institute of Technology, Bombay

Summer Intern

May 2014 - Jul 2014 · 2 mos · Mumbai Area, India

  • Open Source is good...!!

Education

Motilal Nehru National Institute Of Technology

Bachelor's Degree — Computer Science and Engineering

Jan 2012 - Jan 2016
