Sanmukh Sain Karri

Data Scientist

Panama City, Florida, United States · 8 yrs 1 mo experience

AI ML Practitioner · AI Enabled

Key Highlights

Expert in machine learning and data science solutions.
Proficient in big data technologies like Hadoop and Spark.
Strong communicator bridging technical and non-technical teams.

Stackforce AI infers this person is a Data Scientist with expertise in Healthcare and Data Engineering.

Contact

shanmukhsain@gmail.com LinkedIn

Skills

Core Skills

Machine Learning · Data Science · Computer Vision · Data Engineering

Other Skills

Amazon Web Services (AWS) · Anomaly Detection · Artificial Intelligence (AI) · Artificial Neural Networks · Attention to Detail · Back-End Web Development · C (Programming Language) · CNN · Cloud Computing · Convolutional Neural Networks (CNN) · Data Analysis · Data Integration · Data Loading · Data Processing · Data Quality

About

Data Scientist with years of experience in designing, developing, and deploying machine learning and deep learning models to solve complex business problems. Proven expertise in implementing and optimizing Graph Algorithms in embedded spaces for pattern recognition, recommendation systems, and network analysis. Extensive experience in large-scale data processing using Big Data technologies such as Apache Spark, Hadoop, and Databricks for scalable machine learning and data solutions. Skilled in integrating data storage solutions, including Snowflake, for efficient big data management and processing. Adept at collaborating with cross-functional teams to deliver end-to-end data science solutions, conducting exploratory data analysis, feature engineering, and ensuring model accuracy through evaluation metrics and hyperparameter tuning. Proficient in programming languages such as Python, R, and SQL, with experience in cloud platforms like AWS, Azure, and GCP. Strong problem-solving skills, with excellent communication abilities to present complex models and solutions to non-technical stakeholders.

Experience

Total Experience: 8 yrs 1 mo

Average Tenure: 2 yrs

Current Experience: 2 yrs 4 mos

Clinicom

Data Scientist

Feb 2024 – Present · 2 yrs 4 mos · United States · Remote

Managed end-to-end data science projects, utilizing Python and R for data acquisition, cleaning, engineering, and scaling.
Built and deployed machine learning pipelines for chatbot development, covering data preprocessing, feature extraction, model training, and real-time deployment, optimizing chatbot performance through continuous improvement.
Designed and optimized data models and pipelines using Hadoop, Spark, and Hive within the Big Data Ecosystem.
Integrated GenAI Foundation Models and Vector DB for efficient storage and retrieval, improving chatbot response accuracy using Retrieval-Augmented Generation (RAG) techniques.
Created interactive dashboards and reports using Tableau, Matplotlib, and R.
Collaborated with cross-functional teams (data engineers, developers, and analysts) in Agile and SCRUM environments, effectively communicating findings to both technical and non-technical stakeholders.
Developed ETL processes, optimizing SQL queries for efficient data extraction and transformation.
Applied computer vision techniques using OpenCV and deep learning frameworks for image classification, object detection, and video processing in healthcare applications, achieving high accuracy using CNN architectures and transfer learning.
Implemented computer vision pipelines for tasks such as image segmentation, facial recognition, and anomaly detection, ensuring smooth integration with clinic software.
Conducted statistical modeling and predictive analysis using tools like clustering, logistic regression, decision trees, and neural networks.
Extensive hands-on experience with the Software Development Life Cycle (SDLC) in Waterfall and Agile methodologies, contributing to all phases from requirement gathering to testing.
Handled large datasets, both structured and unstructured, performing data cleaning, statistical modeling, and visualization to provide actionable insights.

Python · R · Hadoop · Spark · Data Visualization · ETL · +3

Qq tech, inc.

Data Science Intern

Jun 2023 – Dec 2023 · 6 mos · California, United States · Hybrid

Designed, developed, and deployed ML and deep learning models to solve complex business problems.
Implemented and optimized Graph Algorithms in embedding spaces for pattern recognition and network analysis.
Handled large-scale data processing using Apache Spark, Hadoop, and Databricks.
Integrated data storage solutions, including Snowflake, for efficient big data management.
Collaborated with cross-functional teams to deliver end-to-end data science solutions.
Conducted exploratory data analysis, data cleaning, and feature engineering for model development.
Ensured model accuracy through evaluation metrics and hyperparameter tuning.
Experience with graph databases such as Neo4j and TigerGraph, and familiarity with Docker and Kubernetes.
Worked on an application that uses NLP to interpret legal documents and produce analyses that help customers understand them.
Worked on technologies: Machine Learning · Data Analysis · Data Visualization · Data Science · Python (Programming Language) · ReactJS

Machine Learning · Data Analysis · Data Visualization · Python · ReactJS · Data Science

Focus-n-fly

2 roles

Senior Software Development Engineer

Jan 2023 – May 2023 · 4 mos · Burlingame, California, United States

Applied statistical techniques, including multivariate regression, decision trees, and neural networks, for predictive modelling and customer segmentation.
Worked with big data ecosystems such as Hadoop and Spark for processing large datasets of structured and unstructured data.
Participated in all phases of the software development life cycle (SDLC), following Agile methodologies.
Collaborated with data engineers and IT teams to ensure smooth data integration and deployment of the recommendation engine into the production environment.
Conducted regular performance monitoring and model evaluation to ensure the accuracy and relevance of the recommendation engine.
Presented findings and insights to both technical and non-technical stakeholders, driving data-driven decision-making and improving customer experience.

Statistical Techniques · Hadoop · Spark · Data Integration · Data Science · Machine Learning

Senior Software Engineer

Jan 2020 – Dec 2022 · 2 yrs 11 mos · Burlingame, California, United States

Collaboratively defined and prioritized business use cases for AI/ML solutions.
Conducted extensive data analysis and mining on large-scale customer transaction data for customer segmentation.
Utilized data visualization tools (Tableau, Python Matplotlib) for creating interactive reports and dashboards to visualize customer segments and purchasing behaviors.
Implemented machine learning algorithms, including clustering techniques and collaborative filtering methods, to develop a recommendation engine for personalized product suggestions.
Collaborated with cross-functional teams (marketing, sales, product management) to gather business requirements and understand customer needs.
Developed and optimized SQL queries for data extraction and transformation to support the ETL process and data preparation.
Utilized Python and R programming languages for data manipulation, feature engineering, and model development.

Data Analysis · Machine Learning · SQL · Data Visualization · Data Science

Navtech

Software Engineer

Dec 2018 – Dec 2019 · 1 yr · Hyderabad, Telangana, India · On-site

Collaborated with a cross-functional team of data scientists, analysts, and business stakeholders to develop and implement data engineering solutions for advanced analytics.
Led data acquisition efforts, including data extraction, data cleaning, and data transformation to prepare raw data for analysis.
Developed and maintained ETL processes using Python and SQL to efficiently process and integrate large datasets from various sources.
Implemented data quality checks and data validation processes to ensure data accuracy and integrity throughout the data pipeline.
Optimized SQL queries for data extraction, transformation, and loading (ETL) processes to improve data processing efficiency and reduce processing time.
Worked closely with data scientists and analysts to understand their data requirements and provide data engineering support for their modelling and analysis needs.
Implemented data modelling and data warehousing techniques to support advanced analytics and reporting requirements.
Developed and maintained data pipelines and workflows using distributed computing technologies such as Hadoop and Spark for processing large-scale data.
Collaborated with the data visualization team to provide clean and structured data for creating visually powerful and actionable interactive reports and dashboards.
Actively participated in Agile methodology and SCRUM process, providing technical expertise in data engineering and collaborating with team members to meet project deadlines.
Worked with version control tools such as GitHub for managing codebase and ensuring code quality.
Assisted in designing and implementing machine learning models for predictive analytics, including feature engineering and model evaluation.
Implemented data profiling techniques to identify data quality issues and propose data quality improvement measures.

Data Engineering · ETL · SQL · Data Quality

Prahem technologies

Software Developer & Data Analyst

May 2017 – Nov 2018 · 1 yr 6 mos · Hyderabad, Telangana, India · On-site

Developed and maintained scalable data processing pipelines utilizing Hadoop and Spark for handling substantial volumes of structured and unstructured data.
Applied data profiling and analysis techniques to pinpoint data quality issues and propose enhancement measures.
Streamlined data processing workflows for enhanced performance, incorporating strategies like data partitioning, caching, and indexing.
Executed data visualization using Tableau and Python Matplotlib, creating interactive reports and dashboards for comprehensive data analysis.
Engaged in Agile methodology and SCRUM processes, collaborating within cross-functional teams to achieve project deadlines and deliverables.
Teamed up with data architects and database administrators to craft data models and implement data warehousing solutions, supporting advanced analytics and reporting requirements.
Implemented robust data security measures, encompassing encryption and access controls, ensuring adherence to data privacy and security standards.
Conducted performance tuning and optimization of data processing workflows, enhancing overall system performance and scalability.
Utilized version control tools like GitHub for efficient codebase management and code quality assurance.
Collected and interpreted requirements from business stakeholders, offering technical expertise in data engineering for advanced analytics solutions.
Implemented machine learning models, including linear regression, logistic regression, and decision trees, to bolster predictive modeling and analysis efforts.