Arnab Mitra

Software Engineer

Tempe, Arizona, United States
2 yrs 2 mos experience
AI Enabled · AI ML Practitioner

Key Highlights

  • Designed a star schema database improving query performance by 20%.
  • Developed a GPT-3.5-powered HR chatbot reducing workload by 40%.
  • Built an AWS-enabled CI/CD forecasting pipeline for high accuracy.
Stackforce AI infers this person is a Data Engineer and AI Specialist with a focus on SaaS and cloud solutions.

Skills

Core Skills

AI & ML, Cloud Computing, Data Engineering, Data Analytics, Accessibility Technology

Other Skills

AI Classification Systems, AI Solutions, AI Taxonomy Expansion, AWS, AWS CloudFormation, AWS Glue, AWS Lambda, AWS SageMaker, Accessibility Solutions, Algorithm Design, Algorithms, Amazon S3, Amazon Web Services (AWS), Anaconda, Applied Machine Learning

About

🚀 Building AI & ML Systems | LLMs, Data Pipelines, Cloud (AWS, Azure), ETL, MLOps

I build AI solutions that solve real-world problems and optimize data systems to make them scalable and efficient. With experience in machine learning, cloud-based ETL pipelines, and AI automation, I specialize in bridging the gap between raw data and actionable insights. Currently pursuing my M.S. in Data Science at Arizona State University, I am actively seeking Summer 2025 Data Science internships to apply my expertise in AI and data engineering.

Key Achievements & Impact

  • 🔹 Cloud & Data Engineering: Designed and optimized a star schema database for Indigo Airlines, improving query performance by 20% and streamlining cloud migration.
  • 🤖 AI-Powered Automation: Developed and deployed a GPT-3.5-powered HR chatbot on Azure, implementing a Retrieval-Augmented Generation (RAG) pipeline that reduced HR workload by 40% while ensuring automated monitoring and alerts.
  • 📈 Forecasting & Data Pipelines: Built an AWS-enabled CI/CD forecasting pipeline, automating data ingestion, preprocessing, and model training for high-accuracy predictions.
  • ⚡ Data Analytics & Optimization: Enhanced Nestlé’s Power BI reports, reducing refresh times by 50% through advanced SQL query optimization across 50+ tables, accelerating data-driven decisions.
  • 🌍 Accessible AI for Social Good: Developed an AI-powered image-to-speech conversion system for visually impaired users, leveraging ResNet50 for image feature extraction and LSTM with GloVe embeddings for contextual text generation.

Technical Expertise

  • ✅ Machine Learning & AI: TensorFlow, PyTorch, Scikit-learn, NLP, LLMs, multimodal architectures
  • ✅ Cloud & DevOps: AWS (Lambda, S3, Glue), Azure (Data Factory, SQL, Application Insights), CI/CD, Docker, MLflow
  • ✅ Data Engineering & ETL: Scalable ETL pipelines, data lakes & warehouses, data versioning with DVC
  • ✅ Programming & Tools: Python, SQL, Unix/Linux, Flask API deployment
  • ✅ Data Visualization & Analytics: Tableau, SQL-based reporting, Excel

Publication

  • 📄 "A Comparative Study of Demand Forecasting Models for a Multi-Channel Retail Company" (Springer) – A novel hybrid ML approach for accurate retail forecasting.

🚀 I am passionate about leveraging AI, automation, and cloud computing to build scalable, high-impact data solutions. Let’s connect!

📧 arnabmitra410@gmail.com | 📞 +1 (623) 286-5758

Experience

Total Experience: 2 yrs 2 mos
Average Tenure: 1 yr
Current Experience: --

Gen

2 roles

Software Engineer

Sep 2025 – Present · 9 mos · Tempe, Arizona, United States · Hybrid

Software Engineer Intern

Jun 2025 – Sep 2025 · 3 mos · Tempe, Arizona, United States · Hybrid

  • 🚀 Built and deployed AI-powered classification systems using large language models, prompt engineering, and cloud-native microservices to automate Salesforce case handling and categorize NPS survey comments.
  • Key Achievements:
  • 1. Salesforce Case Classification
  • Engineered a Ruby-based microservice integrated with Google Gemini 2.0 Flash-Lite to automate classification of Salesforce support cases, achieving 88% accuracy by combining AI outputs with agent notes.
  • Designed a two-step classification pipeline with 16 specialized prompts, reducing hallucinations and ensuring schema-controlled outputs for Issue Types, Subtypes, and Categories.
  • Optimized performance to deliver predictions in under 950 ms per case at $0.00035 per case, enabling scalability across 39K+ monthly chat cases for under $15/month.
  • Built and deployed with Docker + Kubernetes (GKE), secured using Google Cloud IAM, and integrated into Salesforce workflows (via Romulus) using REST APIs with validation and robust error handling.
  • Conducted a BigQuery-driven POC using 1,400+ historical cases to benchmark Gemini performance before production rollout.
  • 2. NPS Survey Comment Categorization
  • Developed a BigQuery + Gemini pipeline to categorize Qualtrics NPS survey comments into 10 predefined topics (e.g., Pricing/Value, Customer Service, Messaging).
  • Authored prompt-engineered SQL queries with strict schema definitions to ensure parseable and consistent classification results.
  • Designed an AI-driven taxonomy expansion workflow that surfaced 339 new recurring themes (e.g., Website Usability, Language Barriers), extending coverage of customer feedback.
  • Delivered a sandbox environment in BigQuery for stakeholders to validate category assignments and explore taxonomy recommendations.
  • Technical Stack:
  • Backend & APIs: Ruby, Sinatra, Puma, REST APIs, JSON Web Token
  • Cloud & Deployment: Docker, Kubernetes, Google Cloud IAM, GitHub Actions, TeamCity, Terraform, BigQuery, SQL
Large Language Models, Cloud-native Microservices, Prompt Engineering, Salesforce Integration, AI Classification Systems, AI & ML
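
The cost claim in this role can be checked with simple arithmetic. The sketch below uses only the figures quoted above ($0.00035 per case, 39K monthly cases, a $15 monthly ceiling); the function and constant names are illustrative and not taken from the actual service.

```python
# Figures quoted in the role description; names are illustrative only.
COST_PER_CASE_USD = 0.00035
MONTHLY_CASES = 39_000
MONTHLY_BUDGET_USD = 15.0


def monthly_cost(cases: int, cost_per_case: float) -> float:
    """Total model spend for a month of classified cases."""
    return cases * cost_per_case


def max_cases_within(budget: float, cost_per_case: float) -> int:
    """Largest whole number of cases that fits inside the budget."""
    return int(budget / cost_per_case)


cost = monthly_cost(MONTHLY_CASES, COST_PER_CASE_USD)
headroom = max_cases_within(MONTHLY_BUDGET_USD, COST_PER_CASE_USD)

print(f"Estimated monthly cost: ${cost:.2f}")
print(f"Cases covered by a ${MONTHLY_BUDGET_USD:.0f} budget: {headroom}")
```

At 39,000 cases the estimate comes to roughly $13.65, which is consistent with the stated ceiling of under $15 per month and leaves room for somewhat higher volume.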

Arizona State University

2 roles

Operations Support Lead

May 2025 – Jun 2025 · 1 mo · Tempe, Arizona, United States · On-site

Operations Support Specialist

Nov 2024 – May 2025 · 6 mos · Tempe, Arizona, United States · On-site

  • 🚀 Supporting Admission Services at ASU by managing graduate and transfer student enrollment deposits and application fees, ensuring a seamless and efficient enrollment process.
  • Key Responsibilities & Achievements:
  • Case Management & Student Support: Handle high-volume student queries related to enrollment deposits and application fees via Salesforce, ensuring timely resolution and exceptional applicant experience.
  • Process Optimization: Streamlined case tracking and resolution workflows, improving operational efficiency by 15% and enhancing workflow transparency.
  • Data Management & Reporting: Developed automated data cleaning workflows in Excel (advanced formulas, pivot tables, macros) to standardize datasets, identify trends, and improve reporting accuracy.
  • Front Desk Operations: Occasionally manage the Admission Services front desk, assisting students in person and providing critical guidance on enrollment-related processes.
  • Technical Stack:
  • CRM & ERP Platforms: Salesforce, PeopleSoft
  • Data Tools: Microsoft Excel (advanced formulas, pivot tables, macros)
  • Analytics & Reporting: Data validation, cleansing, and preparation for analysis
  • 🏆 Recognized for proactive problem-solving and data accuracy, contributing to a seamless enrollment experience for students.

TransOrg Analytics

Data Analyst

Jun 2022 – Feb 2024 · 1 yr 8 mos · Gurgaon · Hybrid

  • 🚀 Leveraged data engineering and analytics to optimize cloud databases, automate forecasting pipelines, and enhance business intelligence for aviation, FMCG, and internal projects.
  • Key Achievements:
  • 1. Cloud Optimization for Indigo Airlines: Designed a star schema for 100+ tables during cloud migration, improving query performance and creating a data dictionary for seamless stakeholder communication. Presented the optimized model to leadership, driving strategic decisions.
  • 2. AI-Powered HR Chatbot Deployment: Built and deployed a GPT-3.5-based HR chatbot on Microsoft Azure using a Retrieval-Augmented Generation (RAG) pipeline. Integrated logging and automated misuse alerts, enhancing HR query efficiency.
  • 3. Power BI Performance Enhancement for Nestlé: Reduced report refresh time by 50% by optimizing SQL queries across 50+ tables, accelerating daily reporting and decision-making.
  • 4. Azure ETL Pipeline Redesign: Rebuilt Azure Data Factory pipelines for Nestlé, eliminating inefficiencies, automating data ingestion, and reducing run times for improved data accuracy.
  • 5. Automated Forecasting Pipeline: Developed an end-to-end, CI/CD-enabled forecasting pipeline on AWS, automating data preprocessing, model training, and deployment for increased accuracy and efficiency.
  • Technical Stack:
  • 1. Cloud & DevOps: Azure (Data Factory, SQL, Application Insights), AWS (Lambda, S3, RDS)
  • 2. Data Engineering & Analytics: SQL (Optimization), Power BI, Python (Pandas, NumPy)
  • 3. AI & NLP: GPT-3.5 Turbo, RAG Pipelines, OpenAI API
  • 4. CI/CD: Jenkins, GitHub Actions
  • 🏆 Recognized by leadership for driving seamless cloud migrations and delivering AI-powered HR solutions.
Cloud Optimization, Data Engineering, AI Solutions, SQL Optimization, ETL Pipelines, AI & ML

Teksands.ai

Data Science Intern

Oct 2020 – Feb 2021 · 4 mos · Remote

  • 🚀 Developed AI-driven accessibility solutions, leveraging deep learning and NLP to enhance visual content accessibility for visually impaired users.
  • Key Achievements:
  • 1. Engineered an AI-powered descriptive audio system that converts images into spoken narratives, improving real-time access to visual content.
  • 2. Built a multimodal AI framework integrating ResNet50 for image feature extraction and LSTM for text generation, optimized with GloVe embeddings, leading to fluent and contextually accurate captions.
  • 3. Designed a custom data generator to streamline model training, reducing processing time and improving neural network efficiency.
  • 4. Developed a robust text preprocessing pipeline with tokenization, stemming, stopword removal, and <start>/<end> tokens, enhancing caption generation quality and training accuracy.
  • 5. Integrated Google Text-to-Speech (gTTS) for real-time audio conversion, enabling seamless accessibility for diverse users.
  • Technical Stack:
  • 1. Machine Learning & Deep Learning: TensorFlow, Keras, ResNet50, LSTM, GloVe
  • 2. Programming & Data Processing: Python, Pandas, NumPy
  • 3. Text Processing: Tokenization, stemming, stopword removal
  • 4. Audio Processing: gTTS for real-time text-to-speech conversion
  • 5. Optimization: Custom data generators, model tuning
  • 🏆 Contributed to the development of an AI-driven accessibility platform, enhancing inclusivity for visually impaired users through real-time multimodal interactions.
Deep Learning, NLP, Accessibility Solutions, AI & ML, Data Engineering
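
The caption preprocessing steps named in this role (tokenization, stopword removal, stemming, and `<start>`/`<end>` markers) can be sketched roughly as follows. This is a minimal illustration, not the project's code: the stopword list is a tiny hypothetical one, and `naive_stem` is a crude stand-in for a real stemmer such as Porter's.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "on", "in", "and"}


def naive_stem(token: str) -> str:
    """Crude suffix stripping; a placeholder for a real stemming algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of the word remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess_caption(caption: str) -> list[str]:
    """Lowercase, tokenize, drop stopwords, stem, and add sequence markers."""
    tokens = re.findall(r"[a-z]+", caption.lower())
    kept = [naive_stem(t) for t in tokens if t not in STOPWORDS]
    return ["<start>", *kept, "<end>"]


print(preprocess_caption("A dog is running on the beach"))
```

The `<start>` and `<end>` markers give a sequence model such as an LSTM explicit signals for where a caption begins and when to stop generating. The naive stemmer produces rough stems (for example "runn" from "running"), which a proper stemmer would handle better.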

Education

Arizona State University

Master's degree

Aug 2024 – May 2026

Delhi Technological University (Formerly DCE)

Bachelor of Technology (BTech), Mechanical Engineering

Jan 2018 – Jan 2022
