Mark Conway

CTO

Toledo, Ohio, United States · 11 yrs 11 mos experience

Most Likely To Switch · AI ML Practitioner

Key Highlights

Generated millions in revenue for major clients.
Expert in deploying scalable AI models.
Strong leadership in AI strategy and implementation.

Stackforce AI infers this person is a highly skilled AI and Data Science professional with expertise in FinTech and Retail sectors.

Contact

mark.conway@aholddelhaize.com LinkedIn

Skills

Core Skills

Generative AI · Machine Learning · Natural Language Processing (NLP) · Data Science · Data Engineering · Quantitative Finance

Other Skills

A/B Testing · AI Prompting · AWS · Affinity Scoring · Agile Methodologies · Algorithms · Amazon Bedrock · Amazon Web Services (AWS) · Analytics · Anonymization · Apache Spark · Artificial Intelligence (AI) · AutoML · Azure Databricks · Backtesting

About

Principal Generative AI Scientist, Machine Learning Engineer, and Software Builder leading teams in the deployment of scalable models. These models have generated millions of dollars in revenue or savings for my clients: Walmart, Fidelity, Grindr, Deloitte, FCA, Ahold Delhaize, and several startups. I advise CEOs and CTOs on overall AI strategy and have led strategic initiatives for several recent client projects, such as semantic search, agentic workflows, and MCP deployments. I also consider myself a Full-Stack AI Scientist, not only creating models but getting them into production, too. I competed on @Kaggle for 9 years, achieving a highest rank of #183 out of 200,265. Scottfree Analytics LLC is a 100% black-owned enterprise, and I am proud to be part of our team. We are all in this together, so be a mentor to someone who never had an opportunity. For Diversity in Data Science partnerships, please inquire about our client portfolio.

Experience

11 yrs 11 mos

Total Experience

4 yrs 6 mos

Average Tenure

9 yrs 1 mo

Current Experience

Ahold Delhaize USA

Personalization Data Scientist

Sep 2022 – Jun 2024 · 1 yr 9 mos · Chicago, Illinois, United States

● Applied Large Language Models (LLMs) to Semantic Search and Substitution, using sentence transformers, FAISS, and embeddings. Implemented T5 with Seldon MLOps.
● Created Recommender Models with Neural Collaborative Filtering (NCF) in PyTorch, scaling performance 16X with Petastorm data loaders, multiple GPUs, and TorchDistributor.
● Implemented Learning-To-Rank (LTR) models for personalization with XGBoost Ranker, improving our customer recommendations.
● Built a Demand Forecast for estimating orders and units on an hourly basis across 2,000 stores. The AutoML pipeline generated predictions for various horizons, cross-validating the best time series model with the Nixtla StatsForecast, MLForecast, and NeuralForecast libraries.
● Developed an NLP Evaluation Toolkit for search and substitution by applying fuzzy matching techniques for last-mile refinement of semantic search results.

Time Series Analysis · Data Analysis · Client Services · Business Requirements · Recommender Systems · Microsoft Azure · +28

OM1, Inc.

NLP Consultant

May 2022 – Aug 2022 · 3 mos · Boston, Massachusetts, United States

Streamlined Spark pipelines to extract patient histories from semi-structured clinical notes with NLP tools for Named Entity Recognition and Summarization. Note: This contract was short because my role was to help the team meet an important deadline for multiple clients.

Natural Language Processing (NLP) · Jupyter · Software Deployment · Strategic Thinking · Azure Databricks · Presentations · +2

Deloitte

Principal AI Scientist

Sep 2021 – Jul 2022 · 10 mos

● To establish the Data Science practice at Deloitte, I defined the Business Requirements for an AI-driven retail shopping platform, with eleven separate models for causal inference, propensity scoring, affinity analysis, and product recommenders. These requirements were instrumental in securing several years of project funding.
● Implemented affinity and behavioral scoring models on Azure Databricks using synthetic data to feed the algorithms, calculating Shapley values to interpret the model output and provide visual explanations of feature importances. Wrote extensive model validation Jupyter notebooks that tested the model ensembles.

Team Leadership · Project Management · Design Documents · Business Requirements · Recommender Systems · Microsoft Azure · +27

FIS

Data Science Consultant

Jan 2020 – Jul 2021 · 1 yr 6 mos · Cincinnati Metropolitan Area

● Persuaded the executive team to sell Synthetic Data, spurring $75M in potential contract deals. Implemented algorithms to synthesize anonymized and generalized features to prevent re-identification, calculating both k-anonymity and l-diversity.
● Eliminated manual record matching with automated PySpark record linkage techniques for name, address, and merchant matching. The pipeline used tokenizers, N-grams, hashing transformers, and Locality Sensitive Hashing (LSH), a high-dimensional nearest neighbor search.
● Generated demographic predictions with a Spark random forest classifier from consumer transactional features. Extracted distribution features with vector assemblers, bucketizers, and other user-defined functions (UDFs).

Time Series Analysis · Data Analysis · Data Engineering · Business Requirements · Technical Project Management · Natural Language Processing (NLP) · +18

FCA Fiat Chrysler Automobiles

Senior Data Scientist

Oct 2017 – Oct 2019 · 2 yrs · Auburn Hills, Michigan, United States

● Spearheaded a “small but mighty team” to reduce unplanned absences at six North American auto manufacturing plants, innovating weather and event features with NLP (spaCy and NLTK). Combined challenger models (SARIMAX, XGBoost) at multiple levels of aggregation (crew and production line level), including an auto-encoder for anomaly detection to identify outliers with dynamic thresholds.
● Streamlined model and data pipelines with Palantir Foundry, a big data platform with continuous integration. The software streamed vehicle sensor data (SQDF, Witech, Vstat, and Data Logger), which was then compressed with Dynamic Time Warping to highlight potential engine or power train problems. The PySpark pipeline chained LSTMs and Chi-Square analysis to identify unique features for out-of-sample cohorts and to predict warranty repairs as well.
● Presented to the CTO the results of a production loss model comprising a non-parametric Monte Carlo simulation and a parametric negative binomial distribution (R fitdistrplus). This simulation estimated lost production units at manufacturing plants, in contrast to a traditional ARIMA time series model.
● Led our Data Science team with presentations on Shapley Additive Explanations for model interpretation; Long Short Term Memory (LSTM) Networks; Apache Spark; Model Production Pipelines; and State-Space Time Series.

Data Analysis · Data Engineering · Monte Carlo Simulation · Client Services · Natural Language Processing (NLP) · Leadership · +13

Scottfree Analytics LLC

Director of Artificial Intelligence

May 2017 – Present · 9 yrs 1 mo · Toledo, Ohio Metropolitan Area

● Private Client, Generative AI Engineer
Built a comprehensive Amazon Bedrock Agent pipeline for creating a Knowledge Base from thousands of PowerPoint, PDF, and Excel reports. Leveraged OpenSearch for the vector database, applied a Foundation Model for parsing the extracted content, and prompted Claude 3.5 Sonnet for generating LLM responses.
Developed a semantic search application with Amazon Bedrock Retrieval and Generation APIs to get relevant content from the S3 report database, including metadata for source document references.
● A.Team Client, AI Engineer
Worked on a scalable agentic system with LangChain and LangGraph on Google Cloud Platform (GCP), deploying a healthcare application with secure API endpoints for the AI Assistant (Anthropic Claude with Web Search).
Simulated application load testing with Kubernetes and Locust, identifying performance bottlenecks and ensuring the application could handle high traffic volumes, improving Firestore throughput under peak loads.
● Grindr, Generative AI Engineer, Chatbot
Developed a Retrieval-Augmented Generation (RAG) pipeline for offline knowledge curation to enhance a GenAI chatbot for the dating application. Leveraged sentence transformers, Postgres vector databases on AWS RDS, and custom prompt sources to supply the LLMs on Amazon Bedrock with additional context. Implemented similarity search and LangChain chat models to retain conversation history.
Created custom evaluators with Patronus AI for grading model output among competing LLMs such as OpenAI, Claude, and Ex-Human, with GitHub Actions and workflows to manage product releases and regression testing. Integrated PortKey's observability suite and AI gateway on both Amazon Bedrock and other custom LLMs.
Leveraged DevOps tools such as Helm, Argo, GitHub Workflows, Docker, and Kubernetes to deploy FastAPI microservices on Amazon EKS clusters. The microservices are written in Kotlin.

Amazon Bedrock · Data Engineering · Business Insights · Marketing Analytics · Statsmodels · Market Segmentation · +94

Walmart

Senior Data Scientist (TCS)

Jul 2014 – May 2017 · 2 yrs 10 mos · Bentonville, Arkansas, United States

● Led a large team of onshore and offshore developers to deliver over a dozen models to Sam’s Club on their Hadoop platform: Demand forecasting; customer segmentation and clustering (K-means); propensity scoring (multinomial); churn, renewal, and attrition (libsvm); association and basket analysis (R arules); seasonality (X-13 ARIMA); and offer assignment (dynamic bidding algorithm).
● Conducted extensive A/B experiments for assessing the efficacy of alternative member offers, measuring uplift and statistical power for member campaigns. Created a greedy algorithm in Apache Spark for offer assignment, achieving a 50X improvement in performance.
● Executed biweekly sprints on three-month epics, with customer presentations, demonstrations, and retrospectives on a monthly basis. Consulted with Sam’s Club directors for strategic planning to govern enterprise models with distributed logging, monitoring, and a central repository.

Team Leadership · Project Management · Data Engineering · Design Documents · Client Services · Technical Project Management · +28

Agora Software

Quantitative Researcher

Jun 2013 – Jul 2014 · 1 yr 1 mo · Stamford, Connecticut, United States

● Synthesized automated trading systems from machine learning models to generate predictions with a proprietary formula language for extracting technical analysis features.
● Backtested systems in R and Python, allocating assets using the Kelly Criterion and Optimal f. The R trading package SPLATR was the precursor to AlphaPy.
● Experimented with algorithms for statistical arbitrage (pairs trading), optimal portfolio rebalancing, and volatility hedging, for example, Shannon’s Demon. Analyzed runs and sequences to measure serial dependence in pricing time series, for example, the significance of streaks.

Time Series Analysis · Quantitative Finance · R · Leadership · Machine Learning · Research Skills · +8