Deep Dives
Deep-dive explorations of technical projects and findings.
These are not quick study guides or cheat sheets; treat them as engineering case studies. They are useful for three things: reviewing context, motivation, tradeoffs, and bottlenecks before an interview; consulting the “Risks & Mitigations” and “Gap Analysis” sections to avoid reinventing the wheel on similar projects; and measuring personal growth over time by seeing how past decisions were framed.
- AI/ML Workshop (ml, onboarding, privacy)
A curated set of practical, reproducible machine learning examples (PyTorch, Hugging Face, NumPy), with MPS-aware benchmarks and experiment hygiene for local hardware.
- Chowist (extensibility, monitoring, tooling)
A decade-spanning food discovery app migrated from Sinatra (Ruby) to Rails to Django, with lessons on incremental framework migrations and sustaining a single codebase through ecosystem shifts.
- Grit (algorithms, extensibility, performance, +2 more)
A from-scratch Git implementation in Rust, exploring content-addressable storage, plumbing/porcelain layering, and high-performance object caching.
- Mailprune (data-pipelines, monitoring, networking, +1 more)
A local-first email auditing and automated cleanup tool that identifies noisy senders and gives actionable, privacy-preserving recommendations.
- Photohaul (deduplication, extensibility, media, +1 more)
A Java-based tool for organizing and migrating large photo collections, with deduplication, automatic metadata preservation, and resumable execution.
- Ragchain (ml, privacy, retrieval)
A local RAG stack (ChromaDB + Ollama) for private, reproducible retrieval and LLM inference, focusing on hybrid retrieval strategies and index versioning.
- Rustoku (algorithms, performance, rust)
A Sudoku engine written in Rust, featuring human-like solving techniques, multi-platform support (Python, WASM), and microsecond-level performance.
- Spark Trial (data-pipelines, etl, monitoring, +1 more)
An end-to-end ETL example using Apache Spark on large-scale Parquet datasets, focusing on strict schema handling, partitioning, and reproducible aggregations.
- Streaming Frameworks (data-pipelines, monitoring, streaming)
An architectural comparison of streaming pipelines: Apache Beam's portable unified model (Java/DirectRunner) versus Apache Flink's native API for stateful processing and fault tolerance.
- Video Analysis (extensibility, feature-extraction, media, +1 more)
An exploration of multimodal video feature extraction comparing Apple-native frameworks (Vision, AVFoundation, Core Image) against cross-platform C++/Python toolchains (OpenCV, pybind11) for ML prep.
- VirtuC (algorithms, compiler, performance, +1 more)
A from-scratch compiler, written in Rust, that compiles a subset of C to LLVM IR, focusing on AST design, semantic checking, and IR verification.