Ravi S.

Co-Founder

United States · 13 yrs experience
AI ML Practitioner · AI Enabled

Key Highlights

  • 10+ years of experience in data engineering and research.
  • Expert in machine learning and natural language processing.
  • Published research in ACM and BIOSTEC.
Stackforce AI infers this person is a Data Engineering and AI specialist with a focus on educational and social media applications.

Skills

Core Skills

Natural Language Processing (NLP) · Machine Learning · Data Analysis · Data Engineering · Data Management · Artificial Intelligence (AI) · Cloud Computing · Software Development · Data Quality · Data Analytics · Education

Other Skills

AWS EC2 · AWS Glue · AWS Lambda · AWS S3 · Airflow · Algorithm Design · Amazon EC2 · Amazon S3 · Analytical Solutions · Analytics · Apache Airflow · Automation · BERT · Big Data Analytics · C

About

Ph.D. researcher and data engineering professional with 10+ years of experience across academia and industry, focusing on machine learning, natural language processing, and data-driven systems. Published in peer-reviewed venues such as ACM and BIOSTEC, with research exploring digital mental wellness, empathetic response generation, and social media analysis. Experienced in building NLP pipelines, designing ETL workflows, and working with streaming and batch data using tools like Spark, Kafka, and Airflow. Skilled in deploying models and data solutions using Python, PyTorch, and cloud-based services such as AWS and Azure. Committed to developing thoughtful, interpretable AI systems with a focus on human-centered impact.

Experience

Autocado

ML/NLP and ETL Developer

May 2023 – Aug 2023 · 3 mos · Tampa, Florida, United States · Hybrid

  • Implemented ELT pipelines to ingest and process 370+ XML feeds (PIES/ACES, 10M+ records) into AWS S3, using PySpark for schema validation, normalization, and transformation into curated CSV datasets for NLP model training.
  • Fine-tuned RoBERTa and DistilBERT models on 500+ high-confidence vendor descriptions, reducing perplexity from 3.1 to 1.9 and generating 90% high-quality product descriptions validated with BLEU/ROUGE metrics.
  • Developed 10+ RESTful APIs using FastAPI to serve processed catalog data and model-generated descriptions, enabling integration with Autocado’s e-commerce platform and QA dashboards.
  • Orchestrated data processing workflows with Apache Airflow, automating daily ingestion, transformation, and validation of 10M+ records across six microservices deployed with Docker.
  • Performed data validation and deduplication on heterogeneous vendor datasets, using Pandas and PySpark to enforce PIES/ACES schema compliance and ensure model-ready data quality.
  • Utilized Jupyter Notebooks for exploratory data analysis (EDA), identifying patterns in vendor data to optimize preprocessing and model performance.
Extract, Transform, Load (ETL) · Data Build Tool (DBT) · PySpark · Snowflake · Big Data Analytics · Analytics · +4 more

University of South Florida

Graduate Researcher and Teaching Associate

Aug 2018 – Jul 2024 · 5 yrs 11 mos · Tampa/St. Petersburg, Florida Area

  • Implemented NLP pipelines to ingest and process 6,000+ student responses into AWS S3, fine-tuning BERT for empathetic auto-response generation, achieving 20% improvement in emotional intelligence scores (Tech Stack: Python, BERT, spaCy, PyTorch, AWS EC2, AWS S3; Published in ACM J. Comput. Sustain. Soc., 2023).
  • Analyzed 9,000+ student texts to identify COVID-19 stressors (Tech Stack: Python, Sentence-BERT, NLTK, Pandas, AWS EC2, AWS S3; Presented at BIOSTEC 2021).
  • Developed ELT workflows to extract and transform 12M Twitter and 7.9M Reddit posts, using Sentence-BERT for sentiment analysis and community detection.
  • Built a recommendation system for 548,552 Amazon products, using NetworkX for graph-based co-purchase analysis and Pandas for data preprocessing, achieving 85% prediction accuracy (Tech Stack: Python, NetworkX, Pandas, Matplotlib; Aug 2020 – Dec 2020).
  • Orchestrated SQL-based pipelines to preprocess student communication datasets, integrating TextBlob for sentiment scoring and spaCy for entity recognition, achieving 90% accuracy in resilience metrics (Tech Stack: Python, TextBlob, spaCy, SQL, Jupyter Notebooks).
  • Developed 15+ TensorFlow-based labs for Web Systems and Social Media Mining courses, deploying Jupyter Notebooks to enhance student engagement by 25% through hands-on NLP projects (Tech Stack: Python, TensorFlow, Jupyter Notebooks).
  • Deployed scalable NLP models on AWS EC2, optimizing training and inference for large-scale social media datasets with PyTorch and distributed computing (Tech Stack: Python, PyTorch, AWS EC2).
PyTorch · SciPy · Git · Deep Learning · PySpark · Generative AI · +21 more

D-q.co | Co-Founder

Applied AI & Data Platform Engineer

Jan 2016 – May 2018 · 2 yrs 4 mos · New Delhi Area, India

  • Built DQ (Decency Quotient), a system that analyzes behavioral patterns by mining social network data through the Twitter (top 3,200 tweets) and Facebook APIs.
  • Deployed the Decency Quotient platform on AWS EC2 using Django, Nginx, and Gunicorn, ensuring scalability for real-time analytics.
  • Implemented ELT pipelines to ingest and process over 1M raw JSON records from Twitter and Facebook APIs into AWS S3, using PySpark to clean, enrich, and transform them into curated datasets for Transformer-based NLP model training.
  • Orchestrated ETL workflows with AWS Glue to automate data extraction and transformation for 1M+ social media records, reducing processing time by 20%.
  • Engineered serverless data preprocessing with AWS Lambda to trigger real-time transformations on incoming social media streams, improving pipeline scalability.
  • Designed optimized data schemas using SQL and Pandas for Amazon RDS, enhancing query performance by 30% for sentiment and behavioral analytics.
  • Designed and integrated 15+ Django-based RESTful APIs to serve decency scores and sentiment insights across 10+ internal dashboards and multiple third-party interfaces, enabling real-time user behavior analysis.
  • Utilized PySpark to perform data validation, normalization, and deduplication on large-scale social media datasets, ensuring model-ready data quality.
PyTorch · Data Management · Extract, Transform, Load (ETL) · Data Build Tool (DBT) · Cloud Computing · Git · +16 more

T.I.M.E. (Triumphant Institute of Management Education)

Data Analytics Developer & Subject Matter Expert

Sep 2015 – Dec 2017 · 2 yrs 3 mos · New Delhi Area, India

  • Mentored students for the GATE exam in subjects including Theory of Computation, Algorithms, Data Structures, Compiler Design, and Operating Systems, integrating real-world coding challenges to enhance problem-solving skills (Tech Stack: C, C++).
  • Designed and automated ETL pipelines using Python, Pandas, and Shell scripts to ingest and process 10K+ learner records for performance analysis across CS fundamentals (Algorithms, Data Structures, Automata Theory).
  • Developed backend scoring modules and concept-tracking logic integrated into lightweight dashboards using Jupyter, HTML, and Matplotlib, enabling longitudinal analysis of student progression.
  • Deployed processing workflows and reporting artifacts to AWS EC2 and maintained reproducibility and scheduling with Git and cron-based automation.
Extract, Transform, Load (ETL) · PySpark · Databases · Algorithm Design · Leadership · Data Analytics · +1 more

NIT Jalandhar

Assistant Professor - Department of Computer Science & Engineering

Jan 2015 – Aug 2015 · 7 mos · Jalandhar Area, India

  • Courses Taught: Advanced Computer Networks, Distributed Computing
Databases · Algorithm Design · Leadership · Management

Indraprastha Institute of Information Technology, Delhi

Research Assistant

Jun 2014 – Jan 2015 · 7 mos · New Delhi Area, India

  • Collaborated on a DST project, implementing data pipelines to process mobile gait sensor data and using MATLAB and NumPy for feature extraction and classification of five gait types (slow walk, walk, brisk walk, jogging, and running) with 90% accuracy (Tech Stack: Python, MATLAB, NumPy).
  • Developed NLP workflows to extract autobiographical insights from Twitter data, achieving 80% accuracy in narrative detection (Tech Stack: Python, Tweepy, NLTK, Pandas).
  • Used SQL and Pandas for data validation and preprocessing, improving analysis efficiency by 20% (Tech Stack: Python, Pandas, SQL).
Artificial Intelligence (AI) · Algorithm Design · Data Analytics · Leadership

Career break

Professional development

Oct 2012 – May 2013 · 7 mos · Delhi

  • Developed a mobile app with Java and Android Studio, processing accelerometer data to detect unsafe travel patterns and notify emergency contacts with 95% reliability (Tech Stack: Java, Android Studio, SQLite).
  • Implemented MATLAB-based algorithms for sensor data analysis, optimizing pattern recognition for intelligent decision-making (Tech Stack: MATLAB, Java).

eBay

Software Engineer

Aug 2011 – Sep 2012 · 1 yr 1 mo · Chennai Area, India

  • Worked on the Centralized Application Logging (CAL) team, part of the Platform Engineering group.

Indian Institute of Technology, Madras

Graduate Teaching Assistant

Jul 2009 – Jun 2011 · 1 yr 11 mos · Chennai Area, India

  • Courses: Programming in C, Fundamentals of Programming
Data Management · SQL · Software Development · Databases · Algorithm Design · Statistical Analysis · +2 more

Education

University of South Florida

Doctor of Philosophy - PhD — Computer Science & Engineering

Aug 2018 – Dec 2023

Indian Institute of Technology, Madras

Master of Technology - MTech — Computer Science and Engineering

Jul 2009 – Jun 2011

IETE DELHI

Bachelor of Technology - BTech — Computer Science & Engineering

Jan 2004 – Jan 2008
