Personal Projects

Research Blog

A tiny RL pass recovers cross-task answer diversity in an instruction-tuned model

Fifty RL steps restored output diversity across nine held-out tasks at no benchmark cost.

Verifier-guided prompt mining turned a zero-success 1.5B model into a transferable bug fixer with ten LoRA examples

Prompt optimization mined ten verified fixes from the model; a LoRA trained on those ten generalized.

Hidden activations know when self-report does not

Linear probes recover a claim-level correctness signal that structured confidence misses.

The diversity effect

What a panel-of-experts scaffold does to RL-trained reasoning and search.

nanochat-mlx

Train your own ChatGPT on Apple Silicon. MLX port of nanochat.

MLX
Apple Silicon
Training

SunShift

Adjust Night Shift intensity from the menu bar.

macOS
Menu Bar
Display

activation-probes-claim-correctness

Claim-level correctness probes on Llama hidden activations.

Research
Interp
Probes

multi-model

Panel-of-experts scaffold vs native thinking on Qwen3-30B-A3B.

Research
RLVR
Reasoning

society-of-thought-bench

Multi-persona debate benchmark with traces, evidence, and a live demo.

Research
Reasoning
Debate

hypothesis-forge

RL environment for novel, evidence-grounded, falsifiable ideas.

Research
RL
Novelty

adaptive-rag-rlm

Verifiers RLM environment for adaptive recursive search over long corpora.

Research
RAG
RL

TextDrop

Turn pasted text into files with one click.

macOS
Files
Utility

ttt-discover-autoresearch-mlx

Local-first MLX version of TTT-Discover AutoResearch.

Research
MLX
Agents

TabPilot

Safari tab command center powered by Codex App Server.

macOS
Safari
Tabs

SafariMarkdown

Convert any Safari tab to clean Markdown in one click.

macOS
Safari
Markdown

Proofgrade

Reproducible LLM proof grading benchmark and API for math.

Research
Math
API

GhostLabel

Ghostty tab renamer powered by Codex App Server.

macOS
Ghostty
Automation

PasteForge

Clipboard text transformer with case, encode, format, hash, and stats tools.

macOS
Clipboard
Utility

ClipDrop

Read clipboard text, edit it, and save to any file format.

macOS
Clipboard
Files

DiskPulse

Per-volume disk space monitoring with color-coded usage bars.

macOS
Storage
Monitor

PortSentry

See all listening TCP ports with one-click kill.

macOS
Network
Utility

ProcessBeacon

Monitor long-running processes and get notified on completion.

macOS
Processes
Notify

BrewPilot

Manage Homebrew services. Start, stop, and restart with one click.

macOS
Homebrew
Services

gemma4-m4-pro

Gemma 4 on a 24GB MacBook: measured recipes, runtimes, and fallback paths.

Research
Gemma
Apple Silicon

train-gemma4-sudoku-on-your-macbook

One-notebook Gemma 4 RL on Apple Silicon.

Research
RL
Notebook

autoresearch-evo

Novelty-search-inspired autonomous search with run memory and agentic review.

Research
Agents
Search

Collaborators

Open to research replication and OSS collaboration. These are personal projects.

GitHub LinkedIn