Accurate, Focused Research on Law, Technology and Knowledge Discovery Since 2002

Internet Archive – One Million Audio Cover Images for Research

Internet Archive – “Culled from various sources, this collection includes over one million JPG, PNG and GIF album covers. The resolution ranges from ‘thumbnail’ through to very large sizes. Filenames are variant in usefulness, although a good number indicate at least the name of the original album. This dataset is for experimentation and image processing research only. At 148 GB, the collection is large but not unmanageable (there is a torrent available) and allows a developer or artist to work with the material through various means. The differences in resolution, filename structure and arrangement encourage machine learning or visual recognition algorithms to be used.
Some possible experiments or outcomes that might be worth pursuing:

  • Album recognition software that links to album collections, allowing a user to aim a phone at their album covers and see if they match any services or known digitized albums.
  • Facial/Text Recognition that gives additional metadata about the images related to the content on them.
  • Color/Palette analysis of the album covers to find themes or preferred colors – combined with the album recognition above, it is possible to find general ‘genre rules’ for album covers.

Play around! Have fun! Please bear in mind, you must be respectful of the original creators of these materials.”
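
As a rough illustration of the color/palette idea in the list above (this sketch is ours, not part of the Internet Archive announcement), the snippet below groups pixels into coarse color buckets and reports the most common ones. The function name `dominant_colors`, the `levels` and `top` parameters, and the four-pixel sample are all hypothetical. It works on plain (R, G, B) tuples so that it runs with the standard library alone; reading pixels from an actual cover file would need an image library such as Pillow, which is assumed here rather than shown.

```python
from collections import Counter


def dominant_colors(pixels, levels=4, top=3):
    """Return the `top` most common quantized colors as (rgb, share) pairs.

    `pixels` is any iterable of (r, g, b) tuples with 0-255 channels.
    `levels` is the number of buckets per channel; this sketch assumes
    a power of two (2, 4, 8, ...) so that 256 divides evenly.
    """
    step = 256 // levels
    counts = Counter()
    for r, g, b in pixels:
        # Snap each channel to the centre of its bucket so that
        # near-identical shades are counted as one color.
        bucket = tuple((c // step) * step + step // 2 for c in (r, g, b))
        counts[bucket] += 1
    total = sum(counts.values())
    if total == 0:
        return []
    return [(color, n / total) for color, n in counts.most_common(top)]


# Hypothetical four-pixel "cover": three reddish pixels and one blue one.
sample = [(250, 10, 10), (240, 20, 5), (255, 0, 0), (10, 10, 250)]
palette = dominant_colors(sample)
print(palette)
```

Fixed-grid bucketing is only one simple choice; clustering methods such as k-means usually give better palettes but need more code. Comparing the resulting palettes across many covers would be one starting point for the genre-level patterns the announcement mentions.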
