Strategic data selection and curation practices significantly reduce annotation costs and drive development productivity.
Marwitz et al. demonstrate the use of large language models to build semantic concept graphs from materials science abstracts and train a machine learning model to predict emerging topic combinations ...