I have a similar project that generates random SQL rows, and yes, getting everything consistent is a huge pain. Mine generates city/country pairs that match, and has a limited set of countries that it will also generate matching phone numbers for. It ignores area/operator code differences.
For email addresses, it uses first.last@$RANDOM.com, with nothing country-specific.
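To make the consistency idea concrete, here is a minimal Python sketch, not the actual project code. The lookup table, the name lists, the `make_row` function, and the digit counts are all made up for illustration. The point is only that city, country, and phone prefix are drawn from the same record, so they cannot disagree, while the email domain is just random letters.

```python
import random
import string

# Hypothetical lookup table: each country carries its own cities and dialing
# code, so a row can never pair a city with the wrong country.
# The digit counts are illustrative and ignore area/operator code differences.
COUNTRIES = {
    "France": {"cities": ["Paris", "Lyon"], "dial_code": "+33", "digits": 9},
    "Germany": {"cities": ["Berlin", "Hamburg"], "dial_code": "+49", "digits": 10},
    "Japan": {"cities": ["Tokyo", "Osaka"], "dial_code": "+81", "digits": 10},
}
FIRST_NAMES = ["alice", "bob", "carol"]   # placeholder names
LAST_NAMES = ["smith", "jones", "garcia"]


def make_row(rng):
    """Build one row whose city, country, and phone prefix agree."""
    country = rng.choice(sorted(COUNTRIES))
    info = COUNTRIES[country]
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    # Random 8-letter domain, standing in for $RANDOM; not country-specific.
    domain = "".join(rng.choices(string.ascii_lowercase, k=8))
    subscriber = "".join(rng.choices(string.digits, k=info["digits"]))
    return {
        "first_name": first,
        "last_name": last,
        "city": rng.choice(info["cities"]),
        "country": country,
        "phone": info["dial_code"] + subscriber,
        "email": f"{first}.{last}@{domain}.com",
    }


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so a run can be reproduced
    for _ in range(3):
        print(make_row(rng))
```

Passing in a seeded `random.Random` rather than using the module-level functions keeps runs reproducible, which helps when a generated row breaks something and you need to regenerate it.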
The other struggle I have is that to properly test a DB, you need millions of rows, not thousands. Mine does quite well up through a few million rows, but then starts struggling. I need to overhaul the generator functions to use threading. I hadn’t done that initially because I assumed the CPU would be too busy for context switching to help, but then I tried a smaller example and found I was wrong: the speedup was massive.
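A rough sketch of what the threaded version could look like in Python, under some assumptions: `generate_chunk` and `generate_rows` are hypothetical names, the row contents are placeholders, and the chunk size and worker count are arbitrary. Work is split into chunks, and each chunk gets its own seeded RNG so workers never share state. One caveat: in CPython, threads only speed things up when the work releases the GIL (for example, while waiting on the database), so for purely CPU-bound generation the usual swap is `ProcessPoolExecutor`, which has the same `map` interface.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def generate_chunk(seed, count):
    """Generate `count` placeholder rows from a private, seeded RNG."""
    # Each chunk gets its own Random instance so workers never share RNG state.
    rng = random.Random(seed)
    return [(rng.randrange(1_000_000), rng.random()) for _ in range(count)]


def generate_rows(total, chunk_size=100_000, workers=4):
    """Yield `total` rows, produced chunk by chunk on a thread pool."""
    # Split the total into (seed, count) pairs; the last chunk may be smaller.
    chunks = [
        (index, min(chunk_size, total - start))
        for index, start in enumerate(range(0, total, chunk_size))
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, so the output does not
        # depend on which worker finishes first.
        for rows in pool.map(lambda chunk: generate_chunk(*chunk), chunks):
            yield from rows


if __name__ == "__main__":
    row_count = sum(1 for _ in generate_rows(1_000_000))
    print(row_count)
```

Seeding by chunk index means the same `total` and `chunk_size` always produce the same rows, regardless of the number of workers.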