In many cases each word maps to a single token, but some words get broken up into several tokens, and the same happens with longer numbers.
You can try out their tokenizer at the URL below; it gives you an idea of why GPT is not very good at dealing with numbers.
https://beta.openai.com/tokenizer
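
For illustration, here is a minimal sketch using OpenAI's tiktoken library (an assumption on my part; the web tool above needs no code, and the exact splits depend on which encoding you pick). It shows how a common word stays one token while a rarer word or a long number gets chopped into several pieces:

```python
# Sketch: inspect how a GPT-style tokenizer splits words and numbers.
# Assumes the tiktoken package is installed; "r50k_base" is the encoding
# used by older GPT-3 models and is an assumption here.
import tiktoken

enc = tiktoken.get_encoding("r50k_base")

for text in ["hello", "tokenization", "42", "1234567890"]:
    ids = enc.encode(text)                      # token ids for the string
    pieces = [enc.decode([i]) for i in ids]     # decode each id separately
    print(f"{text!r} -> {len(ids)} token(s): {pieces}")
```

Running something like this, a short word typically comes back as one token, while "tokenization" and a ten-digit number each come back as several, which is why the model never "sees" a long number as a single unit.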