Taher Paratha

Data Engineer

Chicago, Illinois, United States · 7 yrs 2 mos experience
Most Likely To Switch · AI ML Practitioner

Key Highlights

  • Developed a generative AI chatbot for data querying.
  • Led migration of Tableau dashboards to AWS QuickSight.
  • Created a fully automated BI converter tool.
Stackforce AI infers this person is a Data Engineer specializing in AI and Big Data solutions across various industries.

Skills

Core Skills

Generative AI, AWS, Data Migration, BI Tools, Data Engineering, Big Data, Machine Learning, Teaching, Data Science, Deep Learning

Other Skills

AWS Bedrock, AWS Lambda, AWS QuickSight, AWS SageMaker, Advanced Excel, Amazon EC2, Amazon Elastic MapReduce (EMR), Amazon QuickSight, Amazon Redshift, Amazon S3, Amazon Web Services (AWS), Analytical Skills, Apache Spark, Artificial Intelligence (AI), Artificial Neural Networks

About

Data Engineer with 5+ years of experience, specializing in data analytics and machine learning within the aerospace, railroad, and quick service restaurant (QSR) domains. Expert in developing and deploying solutions using GenAI, PyTorch, TensorFlow, and scikit-learn for machine and deep learning tasks, and proficient in managing large data sets using Spark in PySpark/Databricks environments. Significant achievements include leading the design and development of a generative AI chatbot for data querying across multiple databases and applications, which drew on natural language processing and AI model integration. Spearheaded the migration of Tableau dashboards to AWS QuickSight, using a proprietary BIConverter tool to migrate data visuals and reports. Was an integral part of the team that developed the BIConverter tool to automate BI migration, applying skills in data extraction, API integration, and logic creation. Demonstrated skills in data migration and Python scripting, contributing to the seamless transition and integration of data platforms.

Experience

Wavicle Data Solutions

2 roles

Data Engineer

Jan 2023 - Present · 3 yrs 2 mos · Chicago, Illinois, United States

  • Researched and developed a generative AI sentiment analyzer that processes Google customer reviews for a given restaurant chain and its competitors. The analyzer determines the sentiment of each comment, extracts the food items mentioned and the sentiment toward each item, and identifies the topics of sub-comments in the review. The output is saved in a dataframe that is used to create a dashboard on AWS QuickSight.
  • Used AWS Bedrock to access Anthropic’s Claude v2 foundation model.
  • Used Pydantic and prompt engineering to extract the information from the LLM in a structured form.
  • Collected the LLM output as JSON and saved it in a dataframe.
  • Reduced the processing time for all comments by 90% by using asynchronous processing and parallel API calls.
  • Used the dataframe as the input for a dashboard on AWS QuickSight.
  • Deployed the script on AWS Lambda to automate the whole process.
  • Researched the best LLM and the best LangChain chain to use for conversational responses.
  • Conducted a hyperparameter search to find the best hyperparameters for the LLM.
  • Created a transformer model that takes unstructured data such as PDFs, converts it into embeddings, and stores them in a Pinecone vector database.
  • Performed prompt engineering to obtain precise and accurate responses from the LLM.
  • The chatbot could hold a conversation based on the provided context and keep chat history in memory.
  • Created a UI using Python’s Streamlit library and hosted it on an AWS EC2 server.
  • Researched and developed a fully automated BI converter tool that takes visualization workbooks from one BI tool and converts them for another. Worked on the Tableau to QuickSight converter and the Power BI to QuickSight converter.
  • Leveraged an artificial neural network to create a 21-class classification model that predicts the chart type of a given Tableau worksheet. Achieved a test accuracy of over 90% using a three-layer model with a VGG19 base.
AWS Bedrock, AWS SageMaker, Generative AI, OpenAI GPT, Pinecone, Amazon EC2, +6
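
The speed-up from asynchronous processing and parallel API calls described in the bullets above can be illustrated with a minimal sketch. It is not the project's actual code: `analyze_review` is a hypothetical stand-in that simulates one LLM request with a short sleep, and `max_concurrency` is an assumed parameter for capping the number of requests in flight.

```python
import asyncio


async def analyze_review(review: str) -> dict:
    """Hypothetical stand-in for one LLM call; the sleep simulates network latency."""
    await asyncio.sleep(0.01)
    sentiment = "positive" if "good" in review.lower() else "negative"
    return {"review": review, "sentiment": sentiment}


async def analyze_all(reviews: list[str], max_concurrency: int = 5) -> list[dict]:
    """Run all review analyses concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(review: str) -> dict:
        async with semaphore:
            return await analyze_review(review)

    # asyncio.gather returns results in the same order as the inputs.
    return await asyncio.gather(*(bounded(r) for r in reviews))


results = asyncio.run(analyze_all(["Good fries", "Cold burger"]))
```

Because the calls wait on the network rather than the CPU, running them concurrently lets total time approach that of the slowest batch rather than the sum of all requests; the semaphore keeps the request rate within whatever limit the API imposes.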

Data Engineering Intern

Oct 2022 - Dec 2022 · 2 mos · Chicago, Illinois, United States

  • Researched and developed a fully automated BI converter tool that takes visualization workbooks from one BI tool and converts them for another. Worked on the Tableau to QuickSight converter.
  • Researched the Tableau workbook XML structure to extract worksheet-level metadata using Python’s ElementTree library.
  • Built a Tableau GraphQL query that retrieves details about a Tableau server and its selected workbooks in a single API call. The query was optimized to return as much detail as possible using pagination.
  • Environment: AWS QuickSight, EC2, S3, Boto3, Tableau, REST APIs, GraphQL, XML, Python, ANN, VGG19, Pandas, BeautifulSoup, Selenium, Machine Learning, Docker, Power BI, Generative AI, OpenAI GPT, Sentence Transformer, LangChain, Pinecone, FAISS, LLM, Streamlit, AWS Bedrock, AWS Lambda
Boto3, GraphQL, REST APIs, Amazon QuickSight, Artificial Neural Networks, Docker, +2
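
The worksheet-level metadata extraction with ElementTree mentioned above might look roughly like the sketch below. The sample XML is a heavily simplified, hypothetical workbook fragment written for illustration; real Tableau workbooks contain far more elements, and the function name `worksheet_metadata` is an assumption.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical workbook XML used only for illustration.
SAMPLE_TWB = """
<workbook>
  <worksheets>
    <worksheet name="Sales by Region">
      <table><panes><pane><mark class="Bar"/></pane></panes></table>
    </worksheet>
    <worksheet name="Trend">
      <table><panes><pane><mark class="Line"/></pane></panes></table>
    </worksheet>
  </worksheets>
</workbook>
"""


def worksheet_metadata(xml_text: str) -> list[dict]:
    """Return the name and first mark class of every <worksheet> element."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for worksheet in root.iter("worksheet"):
        mark = worksheet.find(".//mark")
        rows.append({
            "name": worksheet.get("name"),
            "mark": mark.get("class") if mark is not None else None,
        })
    return rows


metadata = worksheet_metadata(SAMPLE_TWB)
```

Each returned dictionary holds one worksheet's name and mark type, a shape that could be loaded into a dataframe or passed to a later conversion step.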

Stout

Digital & Data Analytics Intern

Jun 2022 - Aug 2022 · 2 mos · Irvine, California, United States

  • Generated a big data set of a billion records using PySpark and map/reduce techniques.
  • Created a custom script to transfer the generated Snappy-compressed Parquet files from the local system to an online data lake.
  • Created a cost-efficient EMR cluster with 8 EC2 nodes to run operations on the big data.
  • Collected performance metrics from Spark APIs for comparison and further use.
  • Used forecasting and machine learning models such as SARIMAX, ARIMA, LSTM, and Prophet to create an asset valuation model for a client.
  • Searched for optimized hyperparameters using techniques like grid search. Achieved a mean absolute percentage error of less than 10% on asset value for the June forecast.
  • Created a dynamic, interactive dashboard using Dash and Plotly for the client to access their reports. The dashboard was hosted via Flask.
  • Scraped three different asset websites to collect data for the client’s asset using BeautifulSoup and Selenium. Optimized the scraping script to run 70% faster and made it dynamic.
  • Created a Python program that interacted with an RPA bot to push log files to a MongoDB database.
Machine Learning, Amazon Web Services (AWS), Big Data Analytics, Amazon Elastic MapReduce (EMR), Amazon EC2, Amazon S3, +1
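
For reference, the mean absolute percentage error (MAPE) cited in the forecasting bullet above is the average of |actual - forecast| / |actual|, expressed as a percentage. A minimal sketch follows; the numbers are made up for illustration and are not the client's data.

```python
def mape(actual: list[float], forecast: list[float]) -> float:
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Illustrative values only: each forecast is off by 10% of the actual value.
error = mape([100.0, 200.0], [110.0, 180.0])
```

A MAPE below 10% means that, on average, the forecast differed from the actual asset value by less than a tenth of that value.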

The University of Texas at Arlington

Graduate Teaching Assistant

Jan 2022 - May 2022 · 4 mos · Arlington, Texas, United States

  • Assisted in managing a graduate-level data science course of around 150 students.
  • Worked directly under the associate chair of the university's computer science and engineering department.
  • Gave weekly tutorials on Python, Pandas, NumPy, machine learning techniques, and the data ETL process to over 100 students.
  • Designed engaging programming assignments that helped students learn the fundamentals of data science.
  • Provided help, motivation, and assistance to the students of the course.
  • Helped design and grade quizzes and homework for the class.
  • Environment: Python, scikit-learn, Advanced Excel, Matplotlib, MS Teams, NumPy, Pandas.
Python, Scikit learn, Advanced Excel, matplotlib, Numpy, Pandas, +2

Indian Institute of Technology, Bombay

Research Engineer

Jan 2020 - Jun 2021 · 1 yr 5 mos · Mumbai, In

  • Developed, trained, and evaluated a deep learning model with a detection rate of 93% using YOLOv2, Labelbox, OpenCV 3.4, and PyTorch to perform video analytics and detect objects on an onboard Nvidia Jetson Nano processor using CUDA.
  • Trained and deployed a custom object detection model based on the open-source MobileNet-SSD, achieving high mAP with over 20k annotated images, to detect 5+ objects in the video feed at a 20-mile-long coal mine. Saved Rs. 60 million annually by reducing coal theft.
  • Worked on automated landing and take-off, using computer vision to detect the landing pad in the drone video feed and automate drone deployment in remote places.
  • Used 10 years of flight test data with 215+ features to build a Random Forest model that classifies the stability of a given drone flight. Achieved a test accuracy of 95% and reduced flight prep time from 3 hours to just 10 minutes, a decrease of almost 95%.
  • Created a motor database by scraping over 40 websites using Beautiful Soup. Collected over 4 GB of data and stored it in an AWS S3 data lake for further processing.
  • Reworked the codebase as part of testing a POC to migrate the department's data from a MySQL RDBMS to MongoDB, a NoSQL database.
  • Automated the creation of visuals from the motor database using Python, embedded SQL in MySQL, Matplotlib, and Seaborn to support decision making.
  • Environment: Python, MySQL, scikit-learn, Advanced Excel, OpenCV, ROS, YOLOv2, S3, NoSQL, MongoDB, Matplotlib, and Seaborn.
Python, MySQL, Scikit learn, OpenCV, S3, NoSQL, +5

Southern Electronics (B) Pvt. Ltd.

Design Optimization Engineer

Jul 2018 - Jan 2020 · 1 yr 6 mos · Bengaluru Area, India

  • Created decision-making algorithms in Python for the conceptual design parameters of a MAV requirement. Reduced the time required in the conceptual design phase from 3 months to a few weeks.
  • Created a deep learning model with 5 convolutional layers and 4 linear layers in PyTorch, using strain sensor data as input, to predict the collapse of an underground mine.
  • Processed high-volume data from 800+ sensors in real time to classify the future mine state as safe or unsafe.
  • Trained the model using CUDA to shorten training times so the model could be used on the fly. Model processing took 1300 ms.
  • Implemented and deployed the solution at an underground depth of around 500 feet.
  • Used AWS SNS and SES to notify the miners and the mining office of a potential collapse for fast and efficient evacuation.
  • Created automated reports from the project data on the MySQL servers and presented them at weekly and monthly departmental meetings.
  • Made substantial contributions to simplifying the development and maintenance of MAVs by creating SOPs and design tables.
  • Environment: Python, PyTorch, MATLAB, XFLR5, Advanced Excel, R, MathCAD, MySQL.

Aeronautical Development Agency

Graduate Student Researcher

Aug 2017 - May 2018 · 9 mos · Bangalore

  • Contributed to the conceptual design of a solar-powered electric autonomous vehicle, which had to be a low-subsonic, fixed-wing aircraft with a high aspect ratio. Also analysed the fixed wing for aeroelastic problems it may encounter due to its high aspect ratio, using MSC Nastran and the flight loads and dynamics package.

Southern Electronics (B) Pvt. Ltd.

Design Engineer

Feb 2016 - May 2016 · 3 mos · Bangalore

  • Designed and developed a 10 kg class fixed-wing aircraft, and oversaw development of a surveillance octocopter.

Quest Global

Structural Analysis Intern

Jun 2015 - Jul 2015 · 1 mo · Bengaluru Area, India

  • Analysed the In-Seat Power Module (ISPM) of the Airbus A320neo for its structural stability during flight.
  • Designed and created a finite element structural model of the ISPM using the MSC Patran tool.
  • Provided technical assistance; analysed the ISPM for its normal modes and drew conclusions from the results.

Education

The University of Texas at Arlington

Master of Science - MS — Data Science

Aug 2021 - May 2023

Defence Institute of Advanced Technology (DIAT), DU, DRDO

Master of Technology - MTech

Jan 2016 - Jan 2018

Jain (Deemed-to-be University)

Bachelor of Engineering (BE) — Aerospace Engineering

Jan 2011 - Jan 2015
