Blog

ConceptSeg-R1: teaching a segmenter to induce the rule, not just solve the instance
Paper notes

Paper notes on ConceptSeg-R1 — RL-style reasoning over SAM 3's prompt space, using Meta-GRPO to infer transferable segmentation rules from a few visual demonstrations.

paper-notes segmentation reinforcement-learning sam

LocateAnything: decoding bounding boxes in parallel instead of one token at a time
Paper notes

Paper notes on NVIDIA's LocateAnything — a 3B vision-language grounding model that treats boxes as atomic units via Parallel Box Decoding, hitting 10× the throughput of comparable VLMs.

paper-notes vision-language grounding nvidia

MiniMind-O: training a from-scratch omni model over a weekend
Weekend build

A hands-on plan for MiniMind-O — a ~0.1B end-to-end omni model that takes text/audio/image in and emits text + streaming speech, with the mini pipeline running in ~2 hours on one RTX 3090.

omni-model multimodal train-from-scratch weekend-project

UltraData: OpenBMB's tiered data stack, from raw web to deep-thinking SFT
Dataset notes

Notes on OpenBMB's UltraData collection: an L0–L4 data pyramid (Ultra-FineWeb, UltraData-Math, UltraData-SFT), battle-tested on MiniCPM5-1B and openly released.

datasets pretraining open-source llm

ML Sharp in ComfyUI: one image to 3D splat in under a second
Workflow note

A workflow note for running Apple's ml-sharp monocular 3D Gaussian Splatting model inside ComfyUI via the ComfyUI-Sharp custom node. Setup, graph, gotchas.

comfyui 3d-gaussian-splatting apple-ml

QuCo-RAG: deciding when to retrieve by asking the pre-training corpus
Paper notes

Paper notes on QuCo-RAG — using entity frequency and co-occurrence in a 4T-token corpus as a retrieval trigger, instead of the model's own unreliable uncertainty.

rag paper-notes llm uncertainty

WBTI: MBTI for wage workers
Project writeup

A 30-question workplace personality test for the Chinese market, with 28 public types plus 2 hidden branches, hosted entirely on Cloudflare with edge-rendered share posters. Live at wbtilab.xyz.

side-project cloudflare react quiz
