
Btw, samplers do in fact help with this. Random tokens deep in your output context come from accumulated sampling error caused by lossy heuristic samplers like top_p and top_k with temperature.

Use a fully distribution-aware sampler like p-less decoding, top-H, or top-n sigma, and this goes away.
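For intuition, here's a minimal sketch of one of these, top-n sigma: it thresholds the raw logits at n standard deviations below the max logit instead of cutting the sorted probability mass the way top_p does, so the cutoff adapts to how peaked the distribution actually is. This is my own paraphrase of the idea, not the reference implementation, and the parameter names are mine.

```python
import numpy as np

def top_n_sigma_sample(logits, n=1.0, temperature=1.0, rng=None):
    """Sketch of top-n sigma sampling (hypothetical helper, not the
    authors' code): keep only tokens whose logit lies within n standard
    deviations of the maximum logit, then sample from the renormalized
    softmax over the survivors."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    # Threshold is set relative to the logit distribution itself,
    # so a flat distribution keeps many tokens and a peaked one keeps few.
    threshold = logits.max() - n * logits.std()
    masked = np.where(logits >= threshold, logits, -np.inf)
    # Stable softmax over the surviving tokens.
    probs = np.exp(masked - masked.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```

With a strongly peaked distribution (say logits [10, 0, 0, 0] and n=1), only the top token survives the cutoff, whereas a near-uniform distribution leaves essentially the whole vocabulary in play.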

Yes, the paper on this will be under review at NeurIPS this year.
