Tencent's Penguin-VL Ditches CLIP and Beats Every Rival VLM Under 10B Parameters
Tencent AI Lab's Penguin-VL replaces CLIP vision encoders with LLM-initialized encoders, setting new SOTA on doc understanding and video benchmarks at 2B and 8B scale.