Lamhot Siagian

Co-Founder

San Francisco, CA, United States · 8 yrs 3 mos experience

AI ML Practitioner · AI Enabled

Key Highlights

Expert in building scalable AI systems and workflows.
Proven track record in LLM evaluation and testing.
Strong background in machine learning and data engineering.

Stackforce AI infers this person is a highly skilled AI Engineer specializing in machine learning and natural language processing.

Skills

Core Skills

Machine Learning · Natural Language Processing (NLP) · Retrieval-Augmented Generation (RAG) · Data Engineering · Artificial Intelligence (AI) · Software Testing

Other Skills

Convolutional Neural Networks (CNN) · Hugging Face Products · Data Warehousing · Deep Learning · Computer Vision · RAGAS · DeepEval · Fairlearn · OpenTelemetry · TruLens · LangSmith · Large Language Models (LLM) · Python · FastAPI · LangChain

About

AI Engineer / AI Evaluation Engineer with 9+ years across software engineering and ML. I build RAG + agentic systems that are measurable, safe to ship, and observable end-to-end (evaluation, regression gating, tracing, and monitoring).
Languages: Python, Java, JavaScript, TypeScript, SQL.
AI Orchestration & Agents: LangChain, LangGraph, LlamaIndex, CrewAI, Microsoft AutoGen, DSPy.
Machine Learning & NLP: PyTorch, TensorFlow, Keras, Scikit-learn, PEFT/LoRA (fine-tuning), NLTK, OpenCV, Hugging Face Transformers.
LLM Evaluation & Testing: DeepEval, Ragas, LangSmith, Arize Phoenix, Fairlearn, Selenium, BLEU/ROUGE/METEOR/BERTScore.
Data & Vector Stores: Snowflake, MongoDB, Redis, Elasticsearch, Pinecone, ChromaDB, Weaviate, Apache Spark, Hadoop.
LLMOps & Deployment: Docker, Kubernetes, Jenkins, GitLab CI, AWS, GCP, vLLM, Ollama, MLflow.
Core Competencies: Generative AI, RAG Systems, Agentic Workflows, NLP, Computer Vision, Prompt Engineering, Test Automation.
Actively seeking opportunities as an AI Engineer, AI Evaluation Engineer, Machine Learning Engineer, Applied Scientist, NLP Engineer, Generative AI Engineer, LLM Engineer, or Data Scientist.

Experience

Total Experience: 8 yrs 3 mos

Average Tenure: 1 yr 4 mos

Current Experience

University of the Cumberlands

Artificial Intelligence Researcher

May 2025 – Present · 1 yr · Sunnyvale, CA

Conduct research on RAG/agentic systems, designing test cases, Top-K retrieval evaluations, and hallucination/abstention benchmarks using RAGAS, DeepEval, and LLM-as-Judge.
Conduct research on LLM observability, using OpenTelemetry with TruLens/LangSmith to trace prompts, retrieval, tool calls, and response quality for faster debugging and regression detection.
Conducted fairness audits on ML/LLM pipelines using Fairlearn and AIF360, assessing group-level performance gaps and bias indicators.
Built and secured ML research workloads on Google Cloud Platform, leveraging IAM, VPC, Compute, and Storage, and executed model development/experimentation using Vertex AI.

Convolutional Neural Networks (CNN) · Machine Learning · Hugging Face Products · Data Warehousing · Deep Learning · Computer Vision · +1

Software Test Architect

Founder

May 2025 – Present · 1 yr · Sunnyvale, CA · Remote

HP

Software Engineer

Nov 2024 – May 2025 · 6 mos · Palo Alto, CA · On-site

Built an internal RAG customer support assistant (Python, FastAPI, LangChain, vector DB, AWS, Snowflake) that auto-escalates complex or low-confidence issues, reducing human checks by 30%.
Designed an LLM evaluation + regression testing harness (RAGAS/TruLens, custom golden set, CI checks) that blocked bad prompt/model changes in PRs; cut production incidents related to LLM responses by 40% over 2 months.
Optimized retrieval quality by implementing chunking strategy + hybrid search + reranking (BM25 + embeddings + cross-encoder); increased Recall@5 from 0.68 → 0.79 and reduced “no answer found” cases by 22%.
Productionized ML/LLM pipelines with Docker + Kubernetes, AWS and GCP cloud, and monitoring (Prometheus/Grafana + logging/traces); added drift/quality alerts and automated rollback, improving uptime to 99.9% and cutting mean time to detect issues from hours → minutes.
Deployed secure, reliable LLM solutions on GCP (IAM, VPC, Compute, Storage, GKE).
Worked with customers and operations teams to understand business requirements.

Machine Learning · Data Warehousing · Deep Learning · Large Language Models (LLM) · Retrieval-Augmented Generation (RAG)

AI Engineering Insider

Founder

Apr 2024 – Present · 2 yrs 1 mo · United States · Remote

As the founder of AI Engineering Insider, my mission is to educate engineers and everyone else about the fast-growing world of AI Engineering: LLMs, RAG, agents, and vector databases, as well as evaluation, observability, safety, guardrails, deployment, and production-ready AI systems. I help people understand not just how to use tools like LangChain, LangGraph, LlamaIndex, CrewAI, AutoGen, RAGAS, DeepEval, LangSmith, TruLens, Chroma, FastAPI, Docker, AWS, GCP, and Vertex AI, but how to turn them into scalable, secure, and testable applications that solve real business problems.

Apple

Software Engineer

Nov 2021 – Nov 2024 · 3 yrs · Sunnyvale, CA · On-site

Designed and implemented an enterprise-grade Multi-Agent RAG system using LangGraph (Planner–Executor–Verifier architecture) with hybrid retrieval (ChromaDB + BM25 + re-ranking), enabling reliable knowledge grounding and tool orchestration across SQL, APIs, and email services.
Implemented hallucination-resistant response generation with citations and confidence scoring, and built an evaluation pipeline with offline metrics and LangSmith support, deploying the system via Docker, docker-compose, and GitHub Actions CI/CD for scalable production use.
Created an evaluation + regression suite for the agent (tool selection accuracy, task success, refusal/safety checks, hallucination rate), including golden test cases for edge scenarios (warranty exceptions, activation lock, regional restrictions); prevented 10 regressions per release via CI gating.
Maintained CI/CD and DevOps for ML pipelines across AWS (FaaS), Apple Cloud infrastructure, and Kubernetes.

Machine Learning · Hugging Face Products · Continuous Integration · Data Warehousing · Deep Learning · Continuous Delivery (CD) · +2

Dexcom

Software Engineer 2

Jul 2020 – Jun 2021 · 11 mos · San Diego County, CA

SQL · Python (Programming Language) · Data Engineering · Medical Devices · Artificial Intelligence (AI) · Natural Language Processing (NLP)

Bukalapak

Senior Software Engineer

Nov 2016 – Sep 2019 · 2 yrs 10 mos · Jakarta

Machine Learning · Software Testing · Google Cloud Platform (GCP) · Java · Data Science

Funding Societies | Modalku Group

Software Engineer

May 2016 – Nov 2016 · 6 mos · Singapore

Kurio

Software Engineer

Oct 2015 – Apr 2016 · 6 mos · Jakarta Metropolitan Area · On-site

Designed and delivered a real-time news recommendation and trending pipeline using Kafka, Spark Streaming, Hadoop, Elasticsearch, Redis, and Python, processing 2M events/day to serve low-latency personalized feeds and trending stories; improved CTR by 45%.

Python (Programming Language) · Natural Language Processing (NLP)