Training a 7B LLM From Scratch
No fine-tuning. No LoRA. No base model. Training a 7-billion-parameter language model from random weights on rented H100s. General-purpose first, then specialized for code through SFT and RL on real agent trajectories.