Suraj Singh

Software Engineer

Bangalore Urban, Karnataka, India · 2 yrs 6 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • Achieved 300% faster responses for HR chatbot.
  • Drove 90% latency reduction in MVP development.
  • Refactored automation framework, cutting report time by 83%.
Stackforce AI infers this person is a Backend-heavy Software Engineer specializing in Fintech and SaaS solutions.

Skills

Core Skills

Generative AI, React.js, FastAPI, Web Scraping, Python, Unit Testing

Other Skills

AWS Lambda, Algorithms, Amazon CloudFront, Amazon EC2, Amazon S3, Apache Kafka, C (Programming Language), CLI, CSS3, Cloudflare, Data Analysis, Data Structure, Debugging, Django, Django REST Framework

About

Curiosity is the habit; shipping is the outcome. For a backend-leaning engineer who enjoys debugging the hard parts, the work revolves around building low-latency APIs, making systems observable, and reducing real user wait times. The journey started at VIT (Integrated M.Tech), grew through an optimization-heavy internship at Intel, accelerated at a fast-moving SF-based startup, and now runs in enterprise GenAI at TCS (Prime).

Recent work includes delivering an HR RAG assistant for a global bank using Redis Vector Search, grounded prompting, and semantic caching, cutting response latency by ~3x and trimming token costs 20–40%. At the startup, an MVP was built end-to-end: FastAPI services with GPT/Grok integrations, Pinecone retrieval, and a React chat UI; multithreading and streaming dropped p95 latencies by ~90%. Earlier at Intel, a 10k+ LOC framework was refactored by profiling bottlenecks (cProfile/Py-Spy), batching I/O, and caching, taking reports from 60 to 10 minutes.

These days, attention goes to resilient deployments (Docker, Kubernetes), AWS primitives (EC2, S3, Lambda, API Gateway, SQS), and clear SLOs with Prometheus/Grafana. What matters most: solving for correctness first, then speed and cost; documenting decisions; and leaving teams with cleaner code paths and better dashboards than they started with. Outside keyboards and dashboards, there’s a fondness for teaching what’s learned, long walks, and building small tools that make work lighter.

Core stack and interests: Go, Python, FastAPI, Django/DRF, REST, gRPC, GraphQL, OAuth2/JWT/SSO, PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch/OpenSearch, Kafka, Nginx, Docker, Kubernetes, Terraform, GitHub Actions/Jenkins, AWS (EC2, S3, Lambda, API Gateway, SQS), Cloudflare, Prometheus, Grafana, Pinecone, Vector Databases, RAG, Semantic Search, Multithreading, Streaming, p95/p99 latency, SLA

Experience

Total Experience: 2 yrs 6 mos
Average Tenure: 1 yr 3 mos
Current Experience: 1 yr 9 mos

Tata Consultancy Services

System Software Engineer

Sep 2024 – Present · 1 yr 9 mos · Bangalore Urban, Karnataka, India · On-site

  • Launched an HR-policy RAG chatbot for a global investment bank with Redis vector search, semantic caching, and grounded prompting, delivering 300% faster responses.
  • Drove 40–60% deflection and 25–50% lower AHT via better chunking, re-ranking, and intent flows.
  • Built a secure React chat UI with JWT/OAuth2 SSO, source citations, audit logs, and PII redaction; rate-limited via Cloudflare/API Gateway.
  • Reduced LLM latency and token spend 20–40% using semantic caching, prompt tuning, and streaming; improved grounded accuracy.
Generative AI, Retrieval-Augmented Generation (RAG), Redis, REST APIs, JSON Web Token (JWT), OAuth, +9

Donna

Founding Software Engineer

Jun 2024 – Aug 2024 · 2 mos · San Francisco, California, United States · Remote

  • Built an end-to-end MVP: FastAPI services with GPT/Grok integrations and Pinecone vector search, shipped with a React UI and Cloudflare-backed delivery.
  • Engineered ethical web-scraping pipelines and API integrations to enrich retrieval context, respecting robots.txt/ToS and rate limits.
  • Drove a 90% latency reduction via multithreading and streaming, materially improving UX and answer quality for the assistant.
Scrapy Framework, Selenium, FastAPI, Pinecone, Web Scraping, Multithreading

Intel Corporation

System Software Engineer

Aug 2023 – May 2024 · 9 mos · Bengaluru, Karnataka, India · On-site

  • Refactored a 10k+ LOC automation framework, cutting cyclomatic complexity via de-nesting and deduplication.
  • Reduced report runtime from 60 → 10 mins by profiling hot paths (cProfile/Py-Spy), batching I/O, and caching repeated work.
  • Toughened CI/CD with linting, static analysis, unit/API tests, and review gates, lifting pipeline reliability and the defect catch rate.
Python, pytest, Object-Oriented Programming (OOP), Unit Testing, Web Caching

Education

Vellore Institute of Technology

Integrated M.Tech. — Computer Software Engineering

Jan 2019 – Jan 2024