SYSTEM ONLINE · LOC: FRESNO, CA · ROLE: SWE II

DUC
THAN

I build high-throughput backend systems and AI retrieval pipelines — architecting services that stay fast and reliable at 10M+ requests a day.

SYS_PROFILE.LOG v2.6

handle: @ductienthan

focus: backend · ai · infra

stack: node · python · ts

uptime: 4+ yrs

p99: < 100 ms

status: open to talk

01 / About

Engineer for systems under load.

I'm a software engineer who lives in the backend — the queues, caches, and data pipelines that have to stay fast when traffic spikes and stay correct when things fail.

At Samsung Electronics America, I lead a core authentication service handling 120K+ requests a day and a pricing engine serving 10M+. Lately I've gone deep on AI infrastructure: RAG pipelines, vector search, LLM tooling, and MCP agents that do real work in production.

I care about the unglamorous parts — sub-100ms p99s, fault-tolerant ingestion, observability you can actually debug with. The kind of engineering you only notice when it's missing.

Name: Duc Than

Role: Software Engineer II

Company: Samsung Electronics America

Based: Sunnyvale, California

Degree: B.S. Computer Science · 3.88 GPA

Email: thantienduc@gmail.com

02 / Capabilities

Stack & systems.

[01]

Languages

JavaScript / Node.js · TypeScript · Python · SQL · Next.js

[02]

Backend & Data

FastAPI · Hapi · RabbitMQ · Celery · Redis · PostgreSQL · pgvector · Elasticsearch · OpenSearch · Pusher · NeonDB · Pydantic · AWS

[03]

AI / ML

RAG Pipelines · LLM Integration · MCP Agents · Vector Search · Embeddings · sentence-transformers · BM25 Hybrid Search · Cross-encoder Rerank · HNSW · tiktoken · EasyOCR · Isolation Forest

[04]

Systems & Tooling

Queue Clustering · Load Balancing · SQL Tuning · Async / Concurrency · CDN / Edge Caching · Transactional Outbox · RRF · Jenkins · Docker · Git · Kibana · Alembic · pytest

03 / Experience

Where I've shipped.

JUL 2022 — PRESENT

Software Engineer II

Samsung Electronics America — Mountain View, CA

Lead engineer for a core authentication service processing 120K+ requests/day — owned architecture, reliability, and incident response.
Architected a clustered SSO session-extension queue with load balancing and SQL tuning — eliminated CPU saturation at peak and held sub-100ms p99 latency globally.
Replaced legacy Java middleware with a Node.js + Hapi layer, cutting inter-service latency by ~300 ms across all daily traffic.
Built Isolation Forest anomaly detection on latency/throughput signals, reducing mean time to detect (MTTD) by 5–10%; shipped an LLM tool for automated Git-diff root-cause analysis.
Engineered a high-throughput shopping cart with Akamai CDN + multi-layer Redis caching to offload origin traffic under sustained peaks.
Designed a combinatorial pricing engine serving 10M+ daily requests with sub-millisecond lookups; built OpenSearch observability dashboards.
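
The multi-layer caching idea in the shopping-cart bullet can be illustrated with a minimal read-through sketch. This is not the production code: `TwoTierCache`, `load_cart`, and the one-second local TTL are hypothetical names and values, and a plain dict stands in for the shared Redis layer so the example runs on its own.

```python
import time


class TwoTierCache:
    """Read-through cache: in-process TTL layer in front of a shared layer."""

    def __init__(self, shared, load, local_ttl=1.0, clock=time.monotonic):
        self.shared = shared        # stand-in for a Redis client (here: a dict)
        self.load = load            # origin loader, called only on a full miss
        self.local_ttl = local_ttl  # seconds an entry stays in the local layer
        self.clock = clock
        self.local = {}             # key -> (expires_at, value)

    def get(self, key):
        entry = self.local.get(key)
        if entry and entry[0] > self.clock():
            return entry[1]                      # layer 1 hit: no network hop
        if key in self.shared:
            value = self.shared[key]             # layer 2 hit: shared cache
        else:
            value = self.load(key)               # full miss: go to origin
            self.shared[key] = value
        self.local[key] = (self.clock() + self.local_ttl, value)
        return value


origin_calls = []


def load_cart(key):
    """Hypothetical origin lookup; records each call so misses are visible."""
    origin_calls.append(key)
    return {"cart": key, "items": []}


cache = TwoTierCache(shared={}, load=load_cart)
first = cache.get("u1")   # full miss: origin is called once
second = cache.get("u1")  # served from the in-process layer
```

The short local TTL bounds how stale an in-process copy can get, while the shared layer absorbs misses from every instance before they reach the origin.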

Internal Platform

AI-Powered PTO Request Tool

Designed a FastAPI + MCP-agent system letting thousands of employees submit PTO via template-driven Adaptive Cards; decoupled the agent from the HR backend via async queue for resilience against UKG API timeouts.
Built a cron worker mapping employees to managers; used AI coding agents to cut delivery time by ~40%.

2021

B.S. Computer Science

University of Texas at Arlington — GPA 3.88

04 / Selected Work

Things I've built.

PRJ_01

BookRAG

AI-Powered Book Q&A System

A production-grade 5-layer RAG system with async ingestion, OCR fallback, and hierarchical chunking. Resumable embedding guarantees zero data loss on worker failure. An 8-step retrieval pipeline with cross-encoder reranking lifted answer quality by ~60%; TTL caching cut repeat-query latency by 95%+.

Python · pgvector · HNSW · BM25 (RRF) · Ollama · Docker
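
For readers unfamiliar with the "BM25 (RRF)" tag, here is a minimal sketch of Reciprocal Rank Fusion, the usual way to merge a keyword ranking with a vector ranking. It is an illustration only, not BookRAG's actual code: the chunk ids are made up, and `k=60` is the commonly used default rather than a value taken from the project.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every id it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


bm25_hits = ["c3", "c1", "c7"]    # hypothetical keyword-search order
vector_hits = ["c1", "c9", "c3"]  # hypothetical embedding-search order
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['c1', 'c3', 'c9', 'c7']
```

RRF only looks at ranks, so BM25 scores and cosine similarities never need to be put on a common scale before fusing.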

PRJ_02

Movie Recommender

Async Embedding + LLM Re-rank

Celery workers ingest movies, generate embeddings, and persist to pgvector; an LLM agent re-ranks cosine-similarity candidates for natural-language quality. A Transactional Outbox guarantees no record loss between ingestion and vector storage under worker failure.

FastAPI · Celery · pgvector · Redis · LLM Rerank
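
The Transactional Outbox pattern mentioned above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the project's code: it uses an in-memory SQLite database instead of PostgreSQL, and the `movies` and `outbox` tables and the `ingest_movie` and `relay_outbox` functions are hypothetical names.

```python
import json
import sqlite3


def ingest_movie(conn, title):
    """Write the movie row and its outbox event in one transaction."""
    with conn:  # commits on success, rolls back if either insert raises
        cur = conn.execute("INSERT INTO movies (title) VALUES (?)", (title,))
        payload = json.dumps({"movie_id": cur.lastrowid, "title": title})
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def relay_outbox(conn, publish):
    """Publish pending events; mark each sent only after publish succeeds."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for event_id, payload in rows:
        publish(json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (event_id,))
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY,
        payload TEXT NOT NULL,
        sent INTEGER NOT NULL DEFAULT 0
    );
    """
)

published = []  # stand-in for handing the event to an embedding worker
ingest_movie(conn, "Spirited Away")
sent_first = relay_outbox(conn, published.append)
sent_second = relay_outbox(conn, published.append)  # nothing left to send
```

Because an event is marked sent only after publishing, a crash between the two steps causes a re-send rather than a loss, so the consumer should be idempotent.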

PRJ_03

Automated Trading CLI

Concurrent Real-Time Engine

A hybrid threading + asyncio model parallelizes real-time price streams across instruments. Event-driven per-instrument task queues enable low-latency RSI signal processing and automated order execution — without shared-state contention.

Python · AsyncIO · Threading · IG Markets API
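
A minimal sketch of the per-instrument queue idea: each instrument gets its own `asyncio.Queue` and worker task, so price history lives in one task and nothing is shared. It covers only the asyncio half of the design, with assumptions made for illustration: the instrument names and prices are invented, the streaming thread and order execution are omitted, and `rsi` uses simple averages rather than Wilder's smoothing.

```python
import asyncio
from collections import deque


def rsi(prices, period=14):
    """Simple-average RSI over the last `period` price changes."""
    changes = [b - a for a, b in zip(prices, prices[1:])][-period:]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if gains == 0 and losses == 0:
        return 50.0   # flat window: treat as neutral
    if losses == 0:
        return 100.0  # only gains in the window
    return 100.0 - 100.0 / (1.0 + gains / losses)


async def instrument_worker(queue, signals, period):
    """Consume ticks for one instrument; the window is local to this task."""
    window = deque(maxlen=period + 1)
    while True:
        tick = await queue.get()
        if tick is None:  # sentinel: the stream for this instrument closed
            break
        window.append(tick)
        if len(window) == period + 1:
            signals.append(round(rsi(list(window), period), 2))


async def main():
    # Hypothetical feeds; a real engine would receive these from a stream.
    feeds = {"EURUSD": [1.0, 1.1, 1.2, 1.3], "GBPUSD": [2.0, 1.9, 1.8, 1.7]}
    queues = {name: asyncio.Queue() for name in feeds}
    signals = {name: [] for name in feeds}
    workers = [
        asyncio.create_task(instrument_worker(queues[n], signals[n], period=3))
        for n in feeds
    ]
    for name, ticks in feeds.items():
        for tick in ticks:
            await queues[name].put(tick)
        await queues[name].put(None)
    await asyncio.gather(*workers)
    return signals


result = asyncio.run(main())
print(result)  # {'EURUSD': [100.0], 'GBPUSD': [0.0]}
```

Routing every tick through its instrument's queue keeps ordering per instrument and removes the need for locks around the price window.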

PRJ_04 · LIVE

Zuika

Real-Time Web Application

Built with Next.js (SSR + App Router) and Pusher pub-sub for real-time bidirectional events — no polling overhead. Hybrid persistence via NeonDB + local storage, with CDN edge caching for sub-50ms global asset delivery.

Next.js · Pusher · NeonDB · CDN
