This is a general and significant problem in machine learning: training data is usually filtered, cleansed, or sanitized, and has to comply with all kinds of restrictions. Training data is one of the crucial components of machine learning, so if potentially important data is eliminated before training, the resulting models may seriously underperform.
"Machine-learning algorithms that can predict reaction yields have remained elusive because chemists tend to bury low-yielding reactions in their lab notebooks instead of publishing them, researchers say. ‘We have this image that failed experiments are bad experiments ..."