James Nguyen

Software Engineer

Dallas, Texas, United States · 6 yrs 7 mos experience
AI ML Practitioner · AI Enabled

Key Highlights

  • Expert in building scalable ETL pipelines and cloud-native architectures.
  • Proven experience with LLM applications and generative AI workflows.
  • Strong background in AWS and data engineering best practices.
Stackforce AI infers this person is a Data Engineer with expertise in AI and cloud-native solutions across multiple industries.

Skills

Core Skills

TypeScript · Python · AWS · SQL · Automation · GCP · React.js

Other Skills

LangChain · Python (Programming Language) · LangGraph · FastAPI · Apache Airflow · Jenkins · Amazon Web Services (AWS) · PySpark · Dataiku · Spark · AWS Glue · EMR · MySQL · Terraform · Swift (Programming Language)

About

Data Engineer / Backend Engineer with strong software engineering fundamentals and experience building scalable, cloud-native data and AI systems. Specialized in Python and advanced SQL, with deep expertise in Apache Spark (PySpark) for large-scale batch and real-time processing. Proven at delivering end-to-end ETL/ELT pipelines, streaming platforms, and high-performance APIs using FastAPI, Flask, Django, and AWS serverless architectures (Lambda, API Gateway). Strong background in AWS data platforms (S3, Glue, Redshift, Athena, Kinesis, Bedrock) and Snowflake, with exposure to GCP (Vertex AI). Hands-on experience with LLM-powered systems, including prompt engineering, LangChain/LangGraph, and RAG with vector databases. Strong foundation in distributed systems (Spark, Kafka, Hadoop), data lakehouse architectures, and DevOps (Docker, Kubernetes, CI/CD). Focused on building reliable, production-grade systems that scale and drive business impact.

Experience

6 yrs 7 mos
Total Experience
2 yrs 1 mo
Average Tenure
2 mos
Current Experience

Infosys

Software Engineer

Apr 2026 – Present · 2 mos · Richardson, TX · Hybrid

Revature

AI Engineer

Jan 2026 – Mar 2026 · 2 mos · Dallas, TX · Remote

  • Engineered and deployed LLM applications using Vertex AI, FastAPI, Docker, and CI/CD pipelines on GCP.
  • Built RAG systems with vector databases (Chroma DB) and implemented semantic search using embedding models.
  • Developed agentic AI workflows with LangChain and LangGraph for multi-step, autonomous systems.
  • Applied MLOps best practices including model versioning, environment promotion, and cloud service integration (Cloud Run, IAM).
  • Built type-safe full-stack applications with TypeScript, React, and scalable API architecture.
TypeScript · React.js · LangChain · Python (Programming Language) · LangGraph · Python

Cognizant

7 roles

Data Engineer — AbbVie

Apr 2025 – Jul 2025 · 3 mos

  • Owned the stability and governance of Dataiku workflows across QA and production, supporting compliant deployment of analytics models in a regulated environment for 20+ AI/Analytics users.
  • Optimized 40+ Dataiku workflows to improve scheduling and reduce failures, automating ingestion and transformation with Python, SQL, and REST APIs; reduced average workflow runtime by 30% and cut compute costs by 20%.
  • Authored SIQ documentation for validation and compliance, cutting audit review time by 25% and reducing onboarding time for new users from 4 weeks to 2 weeks.
Python · Dataiku · MySQL · Amazon Web Services (AWS)

Data Engineer — Capital One

Promoted

May 2024 – Mar 2025 · 10 mos

  • Architected and owned mission-critical ETL pipelines processing 10–20 TB/day across AWS (EMR, Glue, S3), Snowflake, and GCP, maintaining 99.9% uptime while serving financial analytics for thousands of users.
  • Optimized Spark/Polars jobs to reduce end-to-end processing time by 40% and migrated from legacy Spark to Polars with zero data drift.
SQL · Python · Apache Airflow · Jenkins · PySpark

Software Engineer — Western Midstream

Apr 2024 – May 2024 · 1 mo

  • Owned integration of industrial control systems into Ignition to automate hardware operations and workflows across 15+ devices.
  • Built real-time dashboards and connected OT data to cloud-based reporting, improving monitoring and operational visibility.
  • Reduced manual checks by 70%, enabling faster detection of equipment issues and safer pipeline operations.
SQL · Automation

Data Engineer — Duke Energy Corporation

Mar 2023 – Apr 2024 · 1 yr 1 mo

  • Designed high-volume data migration pipelines (5–8 TB per run) and AWS cloud infrastructure (S3 → RDS/Redshift) using Spark, Glue, and Lambda.
  • Automated deployments with Terraform and Concourse, enabling multiple weekly releases with near-zero rollback risk and 50% faster deployment times.
SQL · Python · Terraform · Amazon Web Services (AWS) · PySpark · AWS

Data Engineer — Macy's

Promoted

Nov 2022 – Feb 2023 · 3 mos

  • Led data validation for cross-system GCP pipelines supporting inventory and supply chain analytics, ensuring reliability for 50+ analysts and reducing manual QA time by 10 hours/week.
  • Developed 80+ automated test cases in Python and SQL, building workflows to validate message synchronization between distributed upstream systems; reduced average job runtime by 25% and cut compute costs by 15%.
  • Achieved 95% test success rate, prevented repeated P1 data quality incidents, and increased confidence in supply chain dashboards, saving the analytics team 8 hours/week in manual validation.
SQL · GCP

Software QA Tester — Apple

Jul 2022 – Nov 2022 · 4 mos

  • Owned validation and defect analysis for device feature sets across 200 iOS/macOS devices, ensuring comprehensive coverage and accuracy while reducing average QA cycle time by 30% per release.
  • Designed test suites, reproduced issues across multiple OS versions, and performed detailed root-cause analysis, cutting manual QA effort by 15 hours/week and improving device stability.
  • Ensured 99% feature accuracy and reduced device downtime by 75%; defect reports contributed to stability improvements shipped to tens of millions of users worldwide.
Swift (Programming Language)

Data Engineer - Software Engineer

Apr 2022 – Sep 2025 · 3 yrs 5 mos

  • Delivered scalable data, cloud, and backend solutions for Fortune 500 clients across finance, healthcare, and energy industries. Specialized in building high-performance ETL pipelines, cloud-native architectures, and data-driven automation using AWS, Spark, Python, Terraform, and Airflow.
  • Key Projects & Achievements:
  • AbbVie: Managed and optimized Dataiku pipelines across QA and production; automated ingestion via Python, SQL, and REST APIs, improving deployment efficiency by 25%. Authored qualification documentation and onboarded new users for compliance.
  • Capital One: Engineered distributed Spark + Polars ETL pipelines on AWS Glue, EMR, and S3, cutting transformation time by 40% and maintaining 99.9% uptime. Automated data validation using Airflow, PyTest, and SQL for cross-engine consistency.
  • Duke Energy: Built serverless ETL APIs using Lambda, Glue, and API Gateway to migrate data across S3, Redshift, and RDS. Automated CI/CD with Terraform and Concourse, improving release reliability and scalability.
  • Macy’s: Created 80+ functional test cases in Python + SQL to validate GCP data pipelines, achieving 95% validation accuracy for real-time message sync.
  • Western Midstream: Integrated operational hardware data into cloud dashboards using Ignition and AWS, enhancing monitoring efficiency.
  • Apple: Tested and validated Apple software across 200+ devices, ensuring 99% feature accuracy and reducing downtime by 75% through effective root cause analysis and rapid issue resolution.
  • Core Skills: Python, PySpark, AWS (Glue, Redshift, Lambda, EMR, S3), SQL, Terraform, Airflow, Jenkins, CI/CD, Kafka, Dataiku, GCP, REST API, ETL, Big Data, Cloud Infrastructure, Data Modeling, Automation
Python · Apache Airflow · Jenkins · Amazon Web Services (AWS) · PySpark · AWS

Mphasis

Software Engineer

Jan 2022 – Feb 2022 · 1 mo · New York City · Remote

  • Owned development of frontend features and backend APIs for enterprise applications, ensuring smooth integration across systems.
  • Built React UI components and optimized Node.js APIs, improving data exchange and overall application responsiveness.
  • Reduced page load times, increased API performance, and improved system reliability for enterprise users through targeted optimizations.
React.js · JavaScript · Node.js

HCL Technologies

Software Engineer

Oct 2020 – Jan 2022 · 1 yr 3 mos · Frisco, TX

  • Re-engineered 230+ production ETL workflows at HCL Technologies using Python, Spark, and SQL, leveraging parameterized patterns and automation; cut runtime by 50% and improved reliability for banking applications serving thousands of daily users. Delivered workflow patterns adopted as the engineering team standard.
  • Provided technical guidance and mentoring to junior engineers on ETL best practices and Spark optimization; collaborated across 3 teams to standardize workflow templates, ensuring consistent data quality and maintainability.
SQL · HTML & CSS · JavaScript · Java · Hibernate

Essilor Group

Software Engineer

Nov 2018 – Aug 2020 · 1 yr 9 mos · United States

  • Owned development of internal analytics tools used across multiple business units, serving as the primary engineer for core data workflows used by 150+ users across 10+ teams.
  • Built New Product Studio with C#, ASP.NET, and SQL to automate product data processes, and developed Titan, a visualization tool adopted by 10+ teams; reduced turnaround time for analysis from 3 hours to 30 minutes per request, cutting reporting cycle time by 83% and accelerating time to launch new products.
  • Reduced manual data entry by 40% and improved operational response time by 30%; tools became the central analytics interface for multiple departments, supporting faster decision-making.
SQL · HTML & CSS · JavaScript · C#

Department of Bioengineering - UT Dallas

2 roles

Summer Camp Counselor

Jun 2017 – Jul 2017 · 1 mo · Richardson, Texas, United States

  • Improved students’ SAT math and reading scores by 100 points through data-driven lesson planning and progress tracking.
  • Led and managed a class of 30 students, applying analytical and communication skills to support academic and behavioral development.
  • Ensured a safe and structured learning environment by implementing effective risk-mitigation and supervision practices.

Calculus & Chemistry Tutor

Jan 2016 – Jan 2017 · 1 yr · Richardson, Texas, United States

  • Improved student performance by 25% through data-driven lesson planning and individualized instruction focused on problem-solving and analytical reasoning.
  • Applied logical thinking and quantitative analysis to simplify complex STEM concepts, enhancing student understanding and retention.
  • Delivered 1-on-1 and group tutoring sessions, leveraging communication, mentorship, and feedback systems to track learning metrics and optimize outcomes.

Education

The University of Texas at Dallas
