Omkar Shewale — Machine Learning Engineer
I’m a Machine Learning Engineer with a Master’s in Computer Science from Illinois Tech and 3+ years of experience designing, optimizing, and deploying intelligent, scalable ML systems for real-world impact. At ServiceNow and Orion Technolab, I delivered production-grade ML pipelines, accelerated inference for large language models (LLMs), and deployed advanced AI solutions that improved performance, reduced latency, and enabled business-critical decisions.
Highlights include:
- Achieved 5.7× inference speedups and reduced latency by 40%+ by optimizing LLMs with CUDA, TensorRT, ONNX Runtime, and dynamic batching.
- Built and deployed end-to-end ML pipelines and MLOps frameworks with Kubernetes, MLflow, Docker, and FastAPI on cloud platforms (AWS, GCP).
- Delivered transformer-based NLP systems, generative AI models (GANs, VAEs), and RAG pipelines, driving measurable KPIs like improved churn prediction, ticket resolution, and robust edge-case testing.
- Deployed multi-agent systems using LangChain, AutoGen, and Hugging Face Transformers to orchestrate complex IT and analytics workflows.
I’m passionate about building fast, reliable, and scalable ML/AI systems, from optimizing GPU inference to deploying distributed AI pipelines that scale under real-world demands.
Currently, I’m seeking opportunities as a Machine Learning Engineer, Applied ML Engineer, AI Engineer, ML/AI Infrastructure Engineer, Inference Engineer, or MLOps Engineer, where I can deliver impactful and innovative AI solutions.
Core Skills: Machine Learning | Deep Learning | LLMs | Transformers | Generative AI | Inference Optimization | CUDA | TensorRT | Hugging Face | LangChain | RAG | MLOps | Kubernetes | Docker | FastAPI | PyTorch | TensorFlow | NLP | Predictive Analytics | GPU Profiling | Distributed AI Systems
Stackforce AI infers this person is a Machine Learning Engineer specializing in AI solutions for IT services and cloud computing.
Location: San Francisco, California, United States
Experience: 4 yrs 1 mo
Skills
- Machine Learning
- MLOps
- Inference Optimization
- Data Engineering
Career Highlights
- Achieved 5.7× inference speedups in ML systems.
- Reduced latency by 40%+ for real-time AI applications.
- Delivered end-to-end ML pipelines with significant efficiency gains.
Work Experience
ServiceNow
Machine Learning Engineer (1 yr 7 mos)
Orion Technolab
Machine Learning Engineer (2 yrs 6 mos)
Education
Master of Computer Science at Illinois Institute of Technology
Bachelor of Engineering (BE) at Savitribai Phule Pune University