Of course models purely made for image stuff will completely wipe it out. The vision language models are useful for their generalist capabilities
Of course models purely made for image stuff will completely wipe it out. The vision language models are useful for their generalist capabilities