> The Nemotron-4 340B family includes base, instruct and reward models that form a pipeline to generate synthetic data used for training and refining LLMs.
I feel like everyone is missing this from the announcement. They are explicitly releasing this to help generate synthetic training data. Most big models and APIs have terms that ban their use to improve other models. Sure, it may be able to compete with other big commercial models on normal tasks, but this would be a huge opportunity for ML labs and startups to expand the training data of smaller models.
Nvidia must see a limit to the growth of new models (and of new demand for training on their GPUs) based on the availability of training data, so they're providing a tool to bypass that bottleneck; a sketch of how such a pipeline might work is below.
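To make that concrete, here is a rough sketch of what such a pipeline could look like: the instruct model drafts responses, the reward model scores them, and only high-scoring pairs are kept as training data for a smaller model. Everything here is hypothetical; the endpoint URL, model names, scoring format, and threshold are all placeholders, assuming the models are served behind an OpenAI-compatible API, and this is not Nvidia's actual tooling.

```python
# Hypothetical generate -> score -> filter loop for synthetic training data.
# Endpoint URL, model names, and scoring format are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

PROMPTS = [
    "Explain the difference between a mutex and a semaphore.",
    "Write a Python function that merges two sorted lists.",
]

def generate(prompt: str) -> str:
    # The instruct model produces a candidate response.
    resp = client.chat.completions.create(
        model="nemotron-4-340b-instruct",  # placeholder name
        messages=[{"role": "user", "content": prompt}],
        temperature=0.8,  # some sampling diversity for synthetic data
    )
    return resp.choices[0].message.content

def score(prompt: str, response: str) -> float:
    # The reward model rates the (prompt, response) pair; here we pretend
    # it is exposed as a chat endpoint that returns a numeric score.
    resp = client.chat.completions.create(
        model="nemotron-4-340b-reward",  # placeholder name
        messages=[
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": response},
        ],
    )
    return float(resp.choices[0].message.content)

# Keep only high-scoring pairs as training data for a smaller model.
dataset = []
for prompt in PROMPTS:
    response = generate(prompt)
    if score(prompt, response) >= 3.5:  # arbitrary quality threshold
        dataset.append({"prompt": prompt, "response": response})
```

The reward-model filter is the interesting part: raw generations tend to be noisy, and scoring plus thresholding is what turns unfiltered model output into usable training data.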
> Most big models and APIs have clauses that ban its use to improve other models.
I will never get over the gall of deeming anything and everything fair game to use as training data for a model, while forbidding anyone from using the output of a model to train their own model without permission, as if model output carried some kind of exclusive super-copyright.
It's likely unenforceable: there is no copyright in the output, and passing it to someone who isn't a party to the contract trivially bypasses the restriction. Still hypocritical nonsense, though.
> They explicitly are releasing this to help generate synthetic training data
Synthetic training data is basically free money for Nvidia: there's only a fixed amount of high-quality original data around, but there's potential for essentially infinite synthetic data, and more data means more training hours, which means more GPU demand.
All for the low price of 2x A100s...