Salman Bootwala

AI Researcher

Abu Dhabi, United Arab Emirates · 12 yrs 1 mo experience

AI Enabled · AI ML Practitioner

Key Highlights

Over 12 years of experience in Data Engineering and Generative AI.
Reduced manual engineering effort by up to 70% through automation.
Expert in building scalable data platforms and AI frameworks.

Stackforce AI infers this person is a Data Engineering expert with a strong focus on Generative AI in Fintech and SaaS industries.

Contact

salmanbo@emiratesnbd.com +919731194442 LinkedIn

Skills

Core Skills

Data Engineering · Generative AI · Microservices · Big Data

Other Skills

API development · AWS Glue · AWS Lambda · Amazon S3 · Amazon Web Services (AWS) · Apache Kafka · Apache Pig · Apache Spark · Apache Spark Streaming · Big Data Analytics · Big Data Projects · Business Analysis · Claude prompts · Clickstream Data · Cloudera

About

With over 12 years of experience in Data Engineering and Generative AI, I specialize in building scalable data platforms, agentic AI frameworks, and RAG-based analytical systems that connect enterprise data with intelligent automation. At Emirates NBD, I architected a GenAI-powered framework that uses Claude prompts and Claude sub-agents to automate the translation of legacy data integration logic into high-performance PySpark code. I also developed LLM-based SQL optimizers and text-to-SQL agents powered by Llama3 and Qwen models, accelerating query performance and reducing manual engineering effort by up to 70 percent. Earlier, at G42 Bayanat, I integrated in-house AI models (including Computer Vision, MLP, and Sentiment Analysis) into production-grade data pipelines via Python Flask microservices, deployed on Kubernetes with GitLab CI/CD and Azure Databricks. My experience spans real-time streaming, data lake architectures, and AI-powered ETL optimization across the banking, telecom, retail, and social-media domains. I work fluently across Spark, Kafka, LangChain, LangGraph, RAG, MCP, Azure Data Factory (ADF), Databricks, Docker, and Kubernetes, and I am passionate about building LLM-aware data systems that bridge traditional data engineering and modern AI workflows. Driven by curiosity and impact, I thrive in environments that challenge the boundaries between data infrastructure and intelligent automation, transforming raw data into actionable, explainable, AI-augmented insight.

Experience

Emirates NBD

Senior Data & AI Engineer

Aug 2023 – Present · 2 yrs 7 mos · Dubai, United Arab Emirates · On-site

1. Built an in-house DDI (Dynamic Data Ingestion) framework that loads data from multiple source systems (Oracle, MS SQL Server, files, etc.) into multiple target systems (Oracle, Hive, etc.) using Spark.
2. Built an API connector to load data from Hive to the cloud via API and Spark.
3. Built an in-house SQL optimizer that uses LLMs to automatically identify, rewrite, and index long-running Oracle SQL queries, reducing execution time by up to 70%.

Data Architecture · PySpark · LLM models · API development · Data Integration · Data Engineering

G42

Senior Data Engineer

Aug 2020 – Jun 2023 · 2 yrs 10 mos · Abu Dhabi Emirate, United Arab Emirates · On-site

1. Created a real-time data pipeline from Kafka to HDFS using Spark Streaming.
2. Built a graph database in Neo4j and a search database in Elasticsearch, and tuned Elasticsearch performance for faster results in the UI.
3. Developed and integrated various microservices using REST calls and asynchronous processes with distributed scheduling.
4. Designed and implemented Kubernetes cluster strategies, ensuring seamless integration with CI/CD pipelines and containerized applications.
5. Configured and set up a build pipeline in GitLab that includes linting, unit tests, coverage reports, security checks, and SonarQube analysis, along with test suite integration for the end-to-end pipeline.

Kafka · HDFS · Neo4j · Elasticsearch · Kubernetes · GitLab CI/CD

Walmart Labs

Software Engineer III

Oct 2018 – Jul 2020 · 1 yr 9 mos · Bengaluru, Karnataka, India · On-site

1. Migrated jobs from Hive to Spark with code optimization: optimized a long-running Hive job by migrating it to Spark and tuning Spark properties, reducing total completion time from 12 hours to 1.5 hours.
2. Built a real-time dashboard for business teams to analyze trends in Walmart sales, transactions, and customer visits using Kafka and Druid.
3. Created a real-time data pipeline from Kafka to HDFS using Spark Streaming.
4. Built metrics to analyze customer and visitor trends, sales, and transaction numbers for Walmart services such as Lists, Registry, and Walmart Pay using clickstream and transaction data. Created an end-to-end data pipeline to load and process clickstream data from on-prem clusters into Druid using Spark and Swift storage.

Spark · Kafka · Druid · Data Pipeline · Clickstream Data · Data Engineering