
NANDISH KARKI

AI Researcher

Magdeburg, Saxony-Anhalt, Germany · 3 yrs experience
AI ML Practitioner · AI Enabled

Key Highlights

  • Built end-to-end ETL pipelines processing terabytes of data.
  • Designed full-stack deep learning systems for real-time translation.
  • Certified Professional Data Scientist with advanced SQL skills.
Stackforce AI infers this person is a Data Engineering and AI/ML specialist in the SaaS industry.

Skills

Core Skills

AI/ML Development · Data Engineering · DevOps

Other Skills

AI/ML · AWS Glue · AWS Lambda · AWS Polly · Advanced Git · Airflow · Amazon Elastic MapReduce (EMR) · Autoencoders · Automation · Big Data · CI/CD · CNN · ChromaDB · Data Lakes

About

Master's student in Data & Knowledge Engineering @ OVGU with 3.3+ years of professional experience @ Clarivate, where I built end-to-end ETL pipelines using AWS Glue, Redshift, PySpark, and Airflow, processing terabytes of data across staging and production environments while implementing CI/CD workflows with Jenkins. Currently working as a Research Assistant (AI/ML & Data Engineering) at Otto-von-Guericke University, designing full-stack deep learning systems for real-time speech-to-speech translation. My work bridges data infrastructure and AI systems, from SQL optimization and ETL development to LLM-based applications with RAG architecture.

🎓 Certified Professional Data Scientist (DataCamp) | SQL Advanced (HackerRank)

💼 Technical Expertise:

  • Data Engineering: ETL pipelines, AWS (Glue, Redshift, S3), BigQuery, GCP, Apache Airflow, PySpark, SQL query optimization, data quality validation
  • AI/ML Development: LLMs, RAG (LangChain, ChromaDB, Ollama), PyTorch, TensorFlow, Hugging Face transformers, model deployment, prompt engineering
  • Data Science: Python (Pandas, NumPy, Scikit-Learn), statistical analysis, data visualization (Matplotlib, Seaborn, Tableau), A/B testing, predictive modeling
  • DevOps & Tools: Docker, Jenkins CI/CD, Git, GitHub Actions, Postman, REST APIs, Agile methodologies

🔬 Recent Projects:

  → AI-Powered Learning Assistant: Full-stack RAG application with Flask backend, ChromaDB vector store, and Gradio UI for PDF/DOCX document processing
  → Audio Steganalysis Pipeline: CNN-based deep learning system for hidden message detection with Dockerized deployment and real-time inference
  → Production ETL Systems: Built and maintained data pipelines processing 5TB+ monthly across AWS infrastructure

🔍 Seeking Werkstudent opportunities (up to 20 hrs/week) in:

  • Data Engineering (SQL, ETL, cloud data platforms, data pipelines)
  • Data Science (analytics, machine learning, statistical modeling)
  • AI/ML Engineering (LLMs, deep learning, model deployment, MLOps)

📍 Location: Magdeburg, Germany | Open to on-site, hybrid & remote

📫 Let's connect if you're building data-driven or AI-powered solutions!

Experience

Total Experience: 3 yrs
Average Tenure: 3 yrs
Current Experience: --

Otto-von-Guericke University Magdeburg

Research Assistant (AI/ML)

Oct 2025 - Present · 7 mos · Magdeburg, Saxony-Anhalt, Germany

AI/ML · Data Engineering · Deep Learning · AI/ML Development

Clarivate

2 roles

Software Engineer

Promoted

Apr 2023 - Oct 2024 · 1 yr 6 mos · Bangalore · Hybrid

  • Built & operated end-to-end ETL pipelines with AWS Glue, Redshift, S3, PySpark, Airflow, spanning staging → stable → production environments.
  • Automated Glue workflows via Jenkins CI/CD + Bitbucket, eliminating manual triggers and cutting deployment time ~30%.
  • Migrated PostgreSQL → Redshift, handling schema mismatches & datatype conversions; added SQL/PySpark validation to ensure parity.
  • Created reusable data-validation framework (row counts, schema checks, checksums) for UAT vs Prod, reducing manual QA ~60%.
  • Optimized cloud spend ~25% by pruning redundant S3 data, tuning Glue job parameters, and refining Redshift query patterns.
  • Improved ETL runtime ~40% through PySpark tuning (partitioning, caching) and SQL optimization.
  • Applied Redshift best practices: compression tuning, result-set caching, sort/dist keys, and VACUUM maintenance.
  • Delivered 15+ production releases using Blue-Green (prod1/prod2) with zero downtime.
  • Managed 11+ Glue jobs via Jenkins with script reviews, static file checks, and formulary table updates.
  • Partnered with DevOps & API teams on IAM roles, data-flow alignment, and automated orchestration.
  • Contributed to Agile rituals (planning, triage, retros) with thorough JIRA documentation; mentored junior engineers.
  • Processed 5TB+ of structured data monthly across AWS Glue, Redshift, and S3 environments.
  • Reduced data refresh time by 30% through Airflow DAG optimization and parallel processing.
  • Improved data quality by implementing Jenkins CI/CD validation checks across 15+ production pipelines.
ETL pipelines · AWS Glue · Redshift · S3 · PySpark · Airflow · +5

Associate Software Engineer

Aug 2021 - Mar 2023 · 1 yr 7 mos · Bangalore · Hybrid

  • Maintained & enhanced end-to-end ETL pipelines across multiple environments, ensuring reliable, on-time refreshes.
  • Migrated & validated large Payer/Provider datasets using complex SQL (joins, aggregations) with business-rule validation for accuracy.
  • Removed 100+ GB of duplicate data, reducing DB size to 5 GB and restoring production stability.
  • Resolved SQL & ETL defects that caused duplication and failures, improving uptime and data consistency.
  • Automated monthly & annual reports for Adaptive Legacy systems via optimized SQL, cutting manual reporting effort.
  • Collaborated cross-functionally to design data-driven solutions; active in Agile ceremonies (standups, reviews, retros).
ETL pipelines · SQL · data validation · Data Engineering

Education

Otto-von-Guericke University Magdeburg

Master's · Data Science

Oct 2024 - Aug 2026

Ramaiah Institute Of Technology

Bachelor of Engineering (BE) · Computer Science

Jan 2017 - Jul 2021
