Vikas Reddy

Data Scientist

United States · 3 yrs experience

Highly Stable · AI Enabled

Key Highlights

Achieved 35% improvement in enterprise answer accuracy.
Co-authored IEEE paper on multimodal AI.
Reduced release cycles from two weeks to five days.

Stackforce AI infers this person is a skilled AI/ML Engineer specializing in Generative AI and machine learning solutions for SaaS and Fintech industries.

Contact

vikasreddy270@gmail.com · LinkedIn

Skills

Core Skills

Generative AI · Machine Learning

Other Skills

Retrieval-Augmented Generation (RAG) · LangChain · Python · AI auditing · Grafana · Prometheus · GenAI evaluation · CNN-Transformer model · EEG · SQL · Spark-Hadoop · scikit-learn · TensorFlow · XGBoost · Docker

About

AI/ML Engineer with 5+ years of experience building and deploying production-grade machine learning and Generative AI systems across AWS and Azure. I specialize in end-to-end GenAI pipelines, from fine-tuning LLMs and embedding models to deploying RAG systems and agentic workflows at scale. My work has delivered measurable outcomes: a 35% improvement in enterprise answer accuracy, 90% SME-scored relevance on decision-support systems, and release cycles reduced from two weeks to five days through robust LLMOps practices. Beyond industry work, I am an IEEE-published researcher in multimodal AI, with contributions to real-time CNN-Transformer models for emotion recognition.

Core stack: Python · SQL · Spark · PySpark · Databricks · LangChain · OpenAI · Hugging Face · TensorFlow · PyTorch · AWS · Azure · Docker · Kubernetes · MLflow

Open to: AI/ML Engineer · Machine Learning Engineer · GenAI / LLM Engineer · Applied Scientist · Data Scientist

Experience

3 yrs

Total Experience

3 yrs

Average Tenure

Current Experience

Enigma Technologies, Inc.

Data Scientist

Aug 2024 – May 2026 · 1 yr 9 mos · United States · Remote

Built an enterprise-grade RAG + agent solution for customer-facing operations (tool-calling + retrieval only when needed), improving first-contact resolution/answer correctness by 35% with grounded responses and citation-driven traceability.
Designed a GenAI evaluation platform (golden sets, automated graders, regression suites, failure taxonomy, human review workflow) that cut evaluation cycle time from days to hours and prevented 25%+ post-release regressions via release gates.
Implemented AI auditing + guardrails (prompt-injection tests, jailbreak heuristics, PII/secret filters, bias checks, policy checks, safe tool-use constraints), reducing high-severity unsafe outputs by >60% in red-team runs.
Delivered ROI reporting with exec-ready scorecards (quality, containment, p95 latency, cost/request, deflection), driving 15–20% lower cost per resolved case through retrieval tuning, caching, and model routing.
Operationalized production reliability with SLIs/SLOs, Grafana/Prometheus dashboards, alerting, runbooks, and on-call readiness, improving incident MTTR by ~30% for AI services.
Enforced secure enterprise deployment patterns: RBAC + OIDC/JWT, and audit logs capturing user identity, prompts, tool calls, retrieved doc IDs, and model/version for end-to-end governance and compliance traceability.

Retrieval-Augmented Generation (RAG) · LangChain · Python · AI auditing · Grafana · Prometheus · +2

Vinjamuri Lab (Brain-Machine Interfaces)

Data Scientist

Jan 2024 – Jul 2024 · 6 mos · Baltimore, MD · On-site

https://vinjamurilab.cs.umbc.edu/
Co-authored an IEEE BSN ’24 paper on a multimodal CNN-Transformer model combining EEG and facial features for real-time emotion detection, achieving 97% accuracy with sub-10ms inference latency.
Designed and optimized the “EmoFormer” Vision Transformer for affective computing and applied model pruning and compression to reduce inference cost by 30%, setting new performance benchmarks on FER2013 (+8%) and AffectNet-7 (+5%).
Built an end-to-end real-time neuro-inference pipeline in Python, integrating model serving with strict runtime constraints, delivering 81% accuracy under a 200ms SLA for a commercial partner.

CNN-Transformer model · EEG · Python · Machine Learning

Infosys BPM

Data Scientist

Jun 2020 – Jul 2023 · 3 yrs 1 mo · India · Remote

Delivered the full ML lifecycle for fraud detection and forecasting models, from feature engineering to training, evaluation, and deployment, using Python, SQL, Spark-Hadoop, and scikit-learn/TensorFlow/XGBoost, improving model accuracy by ~20%.
Built scalable ETL and data pipelines with PySpark, Spark, Airflow, and SQL, processing 1M+ transactions per day and reducing pipeline runtime by ~40% for reliable analytics and training datasets.
Productionized model inference using Docker, Kubernetes, and Amazon SageMaker, supporting 100K+ API requests per day at sub-200ms latency with MLflow-based versioning and rollback.
Designed and executed A/B tests on fraud decision rules and applied two-proportion z-tests (SciPy) to validate statistically significant performance gains while keeping false positives within target thresholds.
Implemented model and data observability and governance using MLflow and Prometheus, tracking drift proxies, KPIs, and latency/accuracy trends with alerting, reducing recurring incidents and manual intervention by ~40% and ensuring security and compliance alignment.

Python · SQL · Spark-Hadoop · scikit-learn · TensorFlow · XGBoost · +4