AI / ML Engineer

I build AI systems that actually ship.

Principal-level ML engineer working on LLM systems, applied machine learning in production, and self-hosted AI infrastructure. I turn research-grade ideas into reliable, maintainable software.

Available for consulting and contract work

01 / About

Engineering, not just notebooks.

I am an ML engineer with a strong data-science foundation, now focused on the engineering side of AI: turning models into production systems that hold up under real load and real users.

My recent work centers on large language model systems, retrieval-augmented applications, fine-tuning, and the infrastructure that runs them. I keep a self-hosted GPU cluster at home for hands-on experimentation, so the things I recommend are things I have actually run end to end. I care about clean interfaces, reproducibility, and code other people can maintain.

02 / Skills

What I work with

AI / LLM Engineering

  • RAG
  • Fine-tuning
  • LangChain
  • vLLM
  • Embeddings
  • Agents
  • Prompt design

Machine Learning

  • PyTorch
  • scikit-learn
  • XGBoost
  • Transformers
  • Time series
  • NLP

MLOps / Infra

  • Docker
  • Linux
  • Proxmox
  • CUDA
  • FastAPI
  • CI/CD
  • Self-hosting

Languages & Data

  • Python
  • SQL
  • Bash
  • Pandas
  • Spark
  • DuckDB

03 / Projects

Selected work

A few things I have built. Each is something I designed, wrote, and ran.

Self-hosted LLM inference cluster

2025

A home GPU cluster serving open-weight models over an OpenAI-compatible API. vLLM for throughput, a thin FastAPI gateway for auth and routing, and Proxmox for isolation. Used daily for experimentation and fine-tuning.

  • vLLM
  • FastAPI
  • CUDA
  • Proxmox
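
As an illustration of the "thin gateway" idea, here is a minimal sketch of the auth and routing decision only, written in plain Python rather than FastAPI so it runs anywhere. It is not the project's actual code: the API key, model names, backend addresses, and the `resolve` helper are all hypothetical. The only detail assumed about the backends is that they expose an OpenAI-compatible `/v1/chat/completions` path.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values for illustration; none of these come from the real cluster.
API_KEYS = {"demo-key"}
BACKENDS = {
    "small-model": "http://10.0.0.11:8000/v1",
    "large-model": "http://10.0.0.12:8000/v1",
}


@dataclass(frozen=True)
class Route:
    status: int                     # HTTP status the gateway would return
    upstream: Optional[str] = None  # backend URL to forward to when status is 200
    error: Optional[str] = None     # short reason when the request is rejected


def resolve(authorization: Optional[str], model: str) -> Route:
    """Decide whether a request is allowed and which backend should serve it."""
    # Expect an OpenAI-style "Authorization: Bearer <key>" header value.
    if not authorization or not authorization.startswith("Bearer "):
        return Route(401, error="missing bearer token")
    if authorization[len("Bearer "):] not in API_KEYS:
        return Route(403, error="unknown API key")
    # Route by the model name carried in the request body.
    base = BACKENDS.get(model)
    if base is None:
        return Route(404, error=f"model {model!r} is not served")
    return Route(200, upstream=f"{base}/chat/completions")
```

Keeping the decision in a pure function like this makes it easy to unit-test separately from the web framework that would wrap it.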

Retrieval-augmented document assistant

2025

A RAG pipeline over a private document set: chunking, embeddings, a vector store, and a reranking step that measurably cut hallucinations. Built to run fully offline on local hardware.

  • RAG
  • Embeddings
  • Python
  • Vector DB
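
The stages named above (chunking, embeddings, a vector store, reranking) can be sketched in a few lines of standard-library Python. This is a toy under loud assumptions, not the project's implementation: word-count vectors stand in for a real embedding model, an in-memory list stands in for the vector database, and query-term coverage stands in for a learned reranker. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, size=12, overlap=4):
    """Split text into overlapping windows of `size` words."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    # Toy embedding: a sparse word-count vector (stand-in for a real model).
    return Counter(tokenize(text))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_index(documents, size=12, overlap=4):
    # "Vector store": a plain list of (chunk_text, vector) pairs.
    return [(c, embed(c)) for doc in documents for c in chunk(doc, size, overlap)]


def retrieve(query, index, k=3):
    # First stage: nearest chunks by cosine similarity.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def rerank(query, candidates):
    # Second stage: reorder by the fraction of query terms each chunk covers
    # (stand-in for a cross-encoder or other learned reranker).
    terms = set(tokenize(query))

    def coverage(text):
        return len(terms & set(tokenize(text))) / len(terms) if terms else 0.0

    return sorted(candidates, key=coverage, reverse=True)
```

A typical call chains the two stages, for example `rerank(query, retrieve(query, build_index(documents), k=5))`, and passes only the top few reranked chunks to the model as context.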

Production ML scoring service

2024

An end-to-end pipeline taking a model from notebook to a containerized service with monitoring, batch and real-time scoring, and reproducible retraining.

  • PyTorch
  • Docker
  • FastAPI
  • CI/CD

04 / Services

How I can help

A

LLM & AI systems

Design and build RAG pipelines, agents, and LLM-backed features, from first prototype to a service you can run and trust.

B

Applied ML in production

Take a model out of the notebook: packaging, serving, monitoring, retraining, and the glue code that keeps it reliable.

C

AI infrastructure & advisory

Self-hosting open-weight models, GPU and inference setup, and pragmatic build-vs-buy advice for teams getting started with AI.

05 / Contact

Let's build something.

Open to consulting, contract work, and interesting problems. The fastest way to reach me is email.

06 / Latest Blog Posts