Abstract: High-quality metadata has always been essential for the use of collections in GLAMs (galleries, libraries, archives, and museums). Machine learning methods open up numerous new possibilities in this context, but they also pose new challenges for cultural heritage institutions. The Image Archive of the ETH Library in Zurich addresses this situation by generating and curating not just one but multiple metadata layers. The basis is formed by classic manual archive and library cataloguing. In addition, a group of external volunteers assists the Image Archive (https://ba.e-pics.ethz.ch) in enriching sparse or missing metadata (approx. 20,000 items per year). Each year, 35,000 images are georeferenced on a three-dimensional globe using the crowdsourcing platform sMapshot (https://smapshot.heig-vd.ch/owner/ethz). In recent months, the Image Archive has started to include machine learning tools in its portfolio to broaden its experience in this field. First, key metadata fields were translated into English using automatic translation. In addition, commercial software was used to recognize and label the nearly one million digitized images. The resulting keywords were integrated into the search interface as a public pilot, so far without manual post-processing. In our paper, we take a closer look at the work with these different layers, analyze the resulting challenges, and provide an outlook on future activities. In doing so, we would like to contribute to the question of how archives can implement the idea of collections as data in the context of visual AI. We are particularly interested in how GLAM institutions can overcome data silos and create a data cycle that follows the FAIR principles and ideally generates added value for all stakeholders involved.