Suraj Singh — Software Engineer

Curiosity is the habit; shipping is the outcome. For a backend‑leaning engineer who enjoys debugging the hard parts, the work revolves around building low‑latency APIs, making systems observable, and reducing real user wait times. The journey started at VIT (Integrated M.Tech), grew through an optimization‑heavy internship at Intel, accelerated in a fast SF‑based startup, and now runs in enterprise GenAI at TCS (Prime).

Recent work includes delivering an HR RAG assistant for a global bank using Redis Vector Search, grounded prompting, and semantic caching, cutting response latency by ~3x and trimming token costs 20–40%. At the startup, an MVP was built end‑to‑end: FastAPI services with GPT/Grok integrations, Pinecone retrieval, and a React chat UI; multithreading and streaming dropped p95 latencies by ~90%. Earlier at Intel, a 10k+ LOC framework was refactored, and profiling bottlenecks (cProfile/Py‑Spy), batching I/O, and caching took reports from 60 to 10 minutes.

These days, attention goes to resilient deployments (Docker, Kubernetes), AWS primitives (EC2, S3, Lambda, API Gateway, SQS), and clear SLOs with Prometheus/Grafana. What matters most: solving for correctness first, then speed and cost; documenting decisions; and leaving teams with cleaner code paths and better dashboards than they started with. Outside keyboards and dashboards, there’s a fondness for teaching what’s learned, long walks, and building small tools that make work lighter.

Core stack and interests: Go, Python, FastAPI, Django/DRF, REST, gRPC, GraphQL, OAuth2/JWT/SSO, PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch/OpenSearch, Kafka, Nginx, Docker, Kubernetes, Terraform, GitHub Actions/Jenkins, AWS (EC2, S3, Lambda, API Gateway, SQS), Cloudflare, Prometheus, Grafana, Pinecone, Vector Databases, RAG, Semantic Search, Multithreading, Streaming, p95/p99 latency, SLA

Stackforce AI infers this person is a Backend-heavy Software Engineer specializing in Fintech and SaaS solutions.

Location: Bangalore Urban, Karnataka, India

Experience: 2 yrs 6 mos

Skills

Generative AI
React.js
FastAPI
Web Scraping
Python
Unit Testing

Career Highlights

Achieved 300% faster responses for HR chatbot.
Drove 90% latency reduction in MVP development.
Refactored automation framework, cutting report time by 83%.

Work Experience

Tata Consultancy Services

System Software Engineer (1 yr 9 mos)

Donna

Founding Software Engineer (2 mos)

Intel Corporation

System Software Engineer (9 mos)

Education

Integrated M.Tech. at Vellore Institute of Technology

Most Likely To Switch
AI ML Practitioner

Contact

surajsingh7346@gmail.com LinkedIn

Skills

Core Skills

Generative AI
React.js
FastAPI
Web Scraping
Python
Unit Testing

Other Skills

AWS Lambda
Algorithms
Amazon CloudFront
Amazon EC2
Amazon S3
Apache Kafka
C (Programming Language)
CLI
CSS3
Cloudflare
Data Analysis
Data Structure
Debugging
Django
Django REST Framework

Experience

Total Experience: 2 yrs 6 mos
Average Tenure: 1 yr 3 mos
Current Experience: 1 yr 9 mos

Tata Consultancy Services

System Software Engineer

Sep 2024 – Present · 1 yr 9 mos · Bangalore Urban, Karnataka, India · On-site

Launched an HR-policy RAG chatbot for a global investment bank with Redis vector search, semantic caching, and grounded prompting, delivering 300% faster responses.
Drove 40–60% deflection and 25–50% lower average handle time (AHT) via better chunking, re-ranking, and intent flows.
Built a secure React chat UI with JWT/OAuth2 SSO, source citations, audit logs, and PII redaction; rate-limited via Cloudflare/API Gateway.
Reduced LLM latency and token spend 20–40% using semantic caching, prompt tuning, and streaming; improved grounded accuracy.

Generative AI · Retrieval-Augmented Generation (RAG) · Redis · REST APIs · JSON Web Token (JWT) · OAuth · +9 more

Donna

Founding Software Engineer

Jun 2024 – Aug 2024 · 2 mos · San Francisco, California, United States · Remote

Built an end-to-end MVP: FastAPI services with GPT/Grok integrations and Pinecone vector search, shipped with a React UI and Cloudflare-backed delivery.
Engineered ethical web-scraping pipelines and API integrations to enrich retrieval context, respecting robots.txt/ToS and rate limits.
Drove a 90% latency reduction via multithreading and streaming, materially improving UX and answer quality for the assistant.

Scrapy Framework · Selenium · FastAPI · Pinecone · Web Scraping · Multithreading

Intel Corporation

System Software Engineer

Aug 2023 – May 2024 · 9 mos · Bengaluru, Karnataka, India · On-site

Refactored a 10k+ LOC automation framework, cutting cyclomatic complexity via de-nesting and deduplication.
Reduced report runtime from 60 → 10 mins by profiling hot paths (cProfile/Py-Spy), batching I/O, and caching repeated work.
Toughened CI/CD with linting, static analysis, unit/API tests, and review gates, lifting pipeline reliability and the defect catch rate.

Python · pytest · Object-Oriented Programming (OOP) · Unit Testing · Web Caching