There are two basic approaches to creating AI datasets. In the first, which is typical of the case we have been studying, a pool of open works is purposefully chosen to ensure license compliance. The second creates the dataset by scraping the “raw internet” and relying on copyright exceptions. LAION, a dataset of 400 million image-text pa...