- Published on
'I trained three generations of ColQwen3.5, each with more sophisticated optimization than the last. The most optimized version barely beat the previous one on the primary benchmark (+0.0011 nDCG@5). Individual tasks reshuffled substantially, with per-task swings an order of magnitude larger than the aggregate gain.'
