Ask HN: How do you test your ML datasets?
2 points by ThePhysicist on July 17, 2019
To all data scientists and machine learning researchers & practitioners:

- What do you do first when you get a new dataset for machine learning?

- How do you analyze your data to find relevant features?

- How do you identify data quality problems?

- Which statistical tests do you perform on the dataset?

- Which visualization techniques do you use to investigate the data?

I'm working on a library (not yet ready to share or publish) that helps people find potential problems in datasets and ML models, so I'd love to get some feedback on what you consider best practices for preparing and validating datasets for ML.
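
To make the questions concrete, here is a minimal sketch of the kind of first-pass check I have in mind. It is only an illustration in plain Python, not code from the library mentioned above: the function names `summarize_columns` and `count_duplicate_rows`, the sample rows, and the choice to treat `None` and empty strings as missing are all assumptions.

```python
from collections import Counter


def summarize_columns(rows):
    """Per-column missing count, distinct count, and constant-column flag.

    `rows` is a list of dicts; None and "" are treated as missing (an assumption).
    """
    columns = sorted({key for row in rows for key in row})
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None and v != ""]
        distinct = len(set(present))
        summary[col] = {
            "missing": len(values) - len(present),
            "distinct": distinct,
            # A column with at most one observed value carries no signal.
            "constant": distinct <= 1,
        }
    return summary


def count_duplicate_rows(rows):
    """Number of rows that exactly repeat an earlier row."""
    counts = Counter(tuple(sorted(row.items())) for row in rows)
    return sum(n - 1 for n in counts.values() if n > 1)


# Hypothetical sample data, for illustration only.
rows = [
    {"age": 34, "country": "DE", "label": 1},
    {"age": None, "country": "DE", "label": 0},
    {"age": 34, "country": "DE", "label": 1},
]

summary = summarize_columns(rows)
duplicates = count_duplicate_rows(rows)
```

On this sample, `age` has one missing value, `country` is flagged as constant, and one row is an exact duplicate. I'm curious which checks beyond these you consider essential.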