AI / ML Engineer

I build AI systems that actually ship.

Principal-level ML engineer working on LLM systems, applied machine learning in production, and self-hosted AI infrastructure. I turn research-grade ideas into reliable, maintainable software.

View projects | Work with me

Available for consulting and contract work

01 / About

Engineering, not just notebooks.

I am an ML engineer with a strong data-science foundation, now focused on the engineering side of AI: turning models into production systems that hold up under real load and real users.

My recent work centers on large language model systems, retrieval-augmented applications, fine-tuning, and the infrastructure that runs them. I keep a self-hosted GPU cluster at home for hands-on experimentation, so the things I recommend are things I have actually run end to end. I care about clean interfaces, reproducibility, and code other people can maintain.

02 / Skills

What I work with

AI / LLM Engineering

RAG
Fine-tuning
LangChain
vLLM
Embeddings
Agents
Prompt design

Machine Learning

PyTorch
scikit-learn
XGBoost
Transformers
Time series
NLP

MLOps / Infra

Docker
Linux
Proxmox
CUDA
FastAPI
CI/CD
Self-hosting

Languages & Data

Python
SQL
Bash
Pandas
Spark
DuckDB

03 / Projects

Selected work

A few things I have built. Each is something I designed, wrote, and ran.

Self-hosted LLM inference cluster

2025

A home GPU cluster serving open-weight models over an OpenAI-compatible API. vLLM for throughput, a thin FastAPI gateway for auth and routing, and Proxmox for isolation. Used daily for experimentation and fine-tuning.

vLLM
FastAPI
CUDA
Proxmox

Write-up → Code →
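A rough sketch of what the auth-and-routing layer of such a gateway does, written as plain Python functions rather than FastAPI handlers so it runs without dependencies. This is not the project's code: the key table, backend URLs, model names, and the `GatewayError`, `authenticate`, `route`, and `handle` names are all hypothetical. The only conventions taken as given are the OpenAI-style `Authorization: Bearer <key>` header and the `model` field in the request body.

```python
import hmac

# Hypothetical configuration, for illustration only.
API_KEYS = {"team-a": "secret-a"}
BACKENDS = {
    "small-model": "http://10.0.0.11:8000/v1",
    "large-model": "http://10.0.0.12:8000/v1",
}


class GatewayError(Exception):
    """Carries the HTTP status a real gateway would return."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status


def authenticate(auth_header):
    """Return the client name for a valid 'Bearer <key>' header, else raise 401."""
    scheme, _, token = (auth_header or "").partition(" ")
    if scheme != "Bearer" or not token:
        raise GatewayError(401, "missing bearer token")
    for client, key in API_KEYS.items():
        # Constant-time comparison avoids leaking key contents via timing.
        if hmac.compare_digest(token.encode(), key.encode()):
            return client
    raise GatewayError(401, "unknown API key")


def route(body):
    """Pick the backend base URL from the request's 'model' field, else raise 404."""
    model = body.get("model")
    if model not in BACKENDS:
        raise GatewayError(404, f"unknown model: {model!r}")
    return BACKENDS[model]


def handle(auth_header, body):
    """Authenticate, then return the upstream URL the request would be proxied to."""
    authenticate(auth_header)
    return route(body) + "/chat/completions"
```

In a real deployment the same two checks would sit in front of an HTTP proxy call to the chosen backend; keeping them as small pure functions makes them easy to test on their own.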

Retrieval-augmented document assistant

2025

A RAG pipeline over a private document set: chunking, embeddings, a vector store, and a reranking step that measurably cut hallucinations. Built to run fully offline on local hardware.

RAG
Embeddings
Python
Vector DB

Write-up → Code →
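A minimal sketch of the retrieve-then-rerank flow described above, using only the standard library. It is a toy, not the project's pipeline: the bag-of-words `embed` stands in for a real embedding model, a plain Python list stands in for the vector store, and the term-coverage `rerank` stands in for a learned reranker. All function names and default parameters are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphanumeric tokens; a real pipeline would use a model tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk(text, size=12, overlap=4):
    """Split text into windows of `size` words, each sharing `overlap` words with the previous one."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding: token counts (stand-in for a dense vector from a model)."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=3):
    """First stage: return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def rerank(query, candidates):
    """Second stage (toy): reorder candidates by the fraction of query terms they contain."""
    q_terms = set(tokenize(query))

    def coverage(candidate):
        return len(q_terms & set(tokenize(candidate))) / max(len(q_terms), 1)

    return sorted(candidates, key=coverage, reverse=True)
```

The two-stage shape is the point: a cheap similarity search narrows the candidate set, and a second, more careful scorer decides what actually reaches the model's context.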

Production ML scoring service

2024

An end-to-end pipeline taking a model from notebook to a containerized service with monitoring, batch and real-time scoring, and reproducible retraining.

PyTorch
Docker
FastAPI
CI/CD

Write-up →

04 / Services

How I can help

LLM & AI systems

Design and build RAG pipelines, agents, and LLM-backed features. From prototype to a service you can actually run and trust.

Applied ML in production

Take a model out of the notebook: packaging, serving, monitoring, retraining, and the glue code that keeps it reliable.

AI infrastructure & advisory

Self-hosting open-weight models, GPU and inference setup, and pragmatic build-vs-buy advice for teams getting started with AI.

05 / Contact

Let's build something.

Open to consulting, contract work, and interesting problems. The fastest way to reach me is email.

[email protected] | LinkedIn →

06 / Latest Blog Posts