The application of artificial intelligence methods plays an important role in certain areas at the ZBW. To continuously access and structure information according to certain requirements, to analyse information in its context, and to make it searchable with intelligent tools, the ZBW takes up the latest findings, methods, and tools of artificial intelligence, tests their practical viability, and either implements them in its own applications or uses them for further development.

In particular, the ZBW is engaged in applied research in the field of machine learning and focuses on the following research and development topics:

Subject indexing of literature

Subject indexing, i.e. the annotation of literature resources with semantic information, is a core task of the ZBW in order to facilitate the discovery of relevant literature in the collection. Given the flood of digital publications, it is hardly possible to annotate all resources intellectually, that is, by human indexers, so automation strategies have to be considered. From the perspective of machine learning, subject indexing is a multi-label classification problem: each resource can be assigned several subject terms at once. With the rise of artificial intelligence in recent years, more and more methods, including open-source software, have become available for this task.
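
To illustrate what multi-label classification means in this setting, the following is a minimal sketch in Python. It is not the ZBW's method: it assumes a simple nearest-centroid rule applied independently per subject term (often called binary relevance), and the titles, subject terms, function names, and threshold are all hypothetical.

```python
from collections import Counter, defaultdict
import math


def tokenize(text):
    """Lower-case a title and split it on whitespace."""
    return text.lower().split()


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(examples):
    """Build one term-frequency centroid per subject term."""
    centroids = defaultdict(Counter)
    for title, labels in examples:
        tokens = Counter(tokenize(title))
        for label in labels:
            centroids[label].update(tokens)
    return dict(centroids)


def predict(centroids, title, threshold=0.3):
    """Return every subject term whose centroid is similar enough.

    Each term is decided independently, so a title can receive
    zero, one, or several terms (the multi-label property).
    """
    vec = Counter(tokenize(title))
    return {label for label, c in centroids.items()
            if cosine(vec, c) >= threshold}


# Hypothetical training data: (title, set of subject terms).
TRAINING = [
    ("monetary policy and inflation targeting",
     {"Monetary policy", "Inflation"}),
    ("inflation expectations of households", {"Inflation"}),
    ("labour market effects of minimum wages",
     {"Labour market", "Minimum wage"}),
    ("central bank monetary policy rules", {"Monetary policy"}),
]

model = train(TRAINING)
predicted = predict(model, "monetary policy and inflation")
print(sorted(predicted))
```

In practice the bag-of-words centroids would be replaced by trained models, and the subject terms would come from a controlled vocabulary rather than free-text labels.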

Deep learning methods in particular (i.e. neural networks with multiple layers) are currently attracting a lot of attention. A study on automatic subject indexing at the ZBW showed that, under the circumstances investigated, neural networks were superior to all classical methods available at the time (such as nearest-neighbour classifiers or support vector machines), and that the quality of the results remained at a comparable level when only the title was used for subject indexing instead of the full text. Furthermore, the study showed that, under certain circumstances, neural networks using only title data even provide better results than full-text models.

Results from applied research at the ZBW have also shown that a combination of several methods produces better results, because the strengths of the individual methods are retained while their weaknesses carry less weight. Machine learning methods can also be used to estimate the quality of the expected output of individual methods for different inputs and to decide accordingly which method should be applied to which resource. New research results on statistical and lexical-semantic methods as well as on neural networks are constantly flowing into the applications.
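
One simple way to combine several methods is to average their per-term scores and keep the terms that pass a threshold. The sketch below assumes exactly that. The fusion rule, the three method outputs, the weights, and the threshold are hypothetical and are not taken from the ZBW's system.

```python
def fuse(predictions, weights=None, threshold=0.5):
    """Weighted average of per-term scores from several methods.

    predictions: list of dicts mapping subject term -> score in [0, 1].
    A term missing from one method's output counts as score 0.0 there.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    labels = set().union(*predictions)
    fused = {}
    for label in labels:
        score = sum(w * p.get(label, 0.0)
                    for w, p in zip(weights, predictions)) / total
        if score >= threshold:
            fused[label] = score
    return fused


# Hypothetical outputs of three methods for one document.
lexical = {"Inflation": 0.9, "Exchange rate": 0.6}
statistical = {"Inflation": 0.7, "Monetary policy": 0.9}
neural = {"Inflation": 0.8, "Monetary policy": 0.7, "Exchange rate": 0.1}

fused = fuse([lexical, statistical, neural])
print(sorted(fused))
```

Here "Exchange rate" is dropped because only one method supports it strongly, while "Monetary policy" survives even though the lexical method missed it. Per-method weights could in principle come from a quality estimate for the input at hand.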

The challenge of a practical application is to adapt the AI methods to the library context of the ZBW, to its controlled vocabulary and metadata, and to find sufficient training data for these methods. In the AutoSE (Automatic Subject Indexing) project, the ZBW is tackling the question of how machine learning solutions for automated subject indexing that were developed in-house can be integrated sustainably into the library indexing process as a productive procedure, and how they can be developed further during ongoing operations.

Pointing out relations and context

In addition to automated subject indexing, the methods of artificial intelligence also offer development opportunities for innovative downstream applications. With the help of Natural Language Processing, for example, it is possible to highlight similarities and differences between an existing collection of literature and a single document.

Users of the ZBW usually use the search portal EconBiz for their literature search. They formulate search queries and receive a sorted list of documents that match their query. The individual knowledge of the user is not taken into account.

Using the latest insights in computational linguistics, prototypes that can meet this challenge are currently in development at the ZBW. Natural language processing and understanding methods can be used to show how a new text fits in with the documents a user has already read. This helps users to assess a search result, for example to decide whether the content is completely new or whether it overlaps with documents that have already been read. The algorithms used here derive contexts and facts from single words and sentences in the texts and from their relationships. Meaningful and appropriate keywords are identified, and associated concepts are recognised.
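
The idea of comparing a new text with documents already read can be sketched in a strongly simplified form. The example below only compares keyword sets. It does not attempt the concept recognition described above, and the stop-word list, function names, and example titles are hypothetical.

```python
# Hypothetical, deliberately tiny stop-word list.
STOPWORDS = {"the", "of", "and", "in", "a", "on", "for", "to"}


def keywords(text):
    """Lower-cased words without surrounding punctuation or stop words."""
    words = {w.strip(".,;:!?").lower() for w in text.split()}
    return words - STOPWORDS - {""}


def compare_with_read(new_text, read_texts):
    """Split the new text's keywords into already-seen and novel ones."""
    new_terms = keywords(new_text)
    known_terms = set().union(*(keywords(t) for t in read_texts))
    overlap = new_terms & known_terms
    novel = new_terms - known_terms
    share = len(overlap) / len(new_terms) if new_terms else 0.0
    return {"overlap": overlap, "novel": novel, "share_known": share}


already_read = [
    "Inflation targeting in small open economies",
    "Exchange rate pass-through and inflation",
]
result = compare_with_read(
    "Inflation expectations and exchange rate volatility", already_read
)
print(sorted(result["novel"]), result["share_known"])
```

A prototype along the lines described above would work with recognised concepts and their relationships rather than raw word overlap.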

Learning from text and graph data

Currently, the ZBW is also researching the use of neural networks for further building blocks of literature search. In particular, the research focuses on possible applications and the factors influencing the developed models (e.g. network structure or the title of a publication). Recent developments for literature recommendation systems can, for example, point to possible missing citations based on the works cited or recommend further descriptors based on descriptors already indexed.

Another research field deals with the use of word vectors and word matrices for literature search. Word vectors allow the detection of semantic similarities and connections between different words. With the help of machine learning, they can be generated from large amounts of unstructured text data. By taking the order of the words into account, which is usually neglected in classical word vectors, they can be extended to word matrices. Both concepts will be explored with regard to their potential to improve the response to search engine queries.
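
The difference between the two representations can be shown with toy numbers. The sketch below assumes that word vectors are combined by addition and word matrices by matrix multiplication. This is one common way to make word order matter, and whether it matches the models studied at the ZBW is an assumption. All vectors and matrices are hand-picked, hypothetical values and not learned embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equally long vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical word vectors: similar words point in similar directions.
VECTORS = {
    "bank": [0.9, 0.1, 0.3],
    "lender": [0.8, 0.2, 0.4],
    "harvest": [0.1, 0.9, 0.2],
}

# Hypothetical word matrices (integers keep the arithmetic exact).
MATRICES = {
    "tax": [[1, 1], [0, 1]],
    "reform": [[1, 0], [1, 1]],
}


def compose_vectors(words, table):
    """Sum of word vectors: the result ignores word order."""
    dim = len(next(iter(table.values())))
    total = [0.0] * dim
    for w in words:
        total = [t + x for t, x in zip(total, table[w])]
    return total


def compose_matrices(words, table):
    """Product of word matrices: the result depends on word order."""
    result = table[words[0]]
    for w in words[1:]:
        result = matmul(result, table[w])
    return result


print(cosine(VECTORS["bank"], VECTORS["lender"]) >
      cosine(VECTORS["bank"], VECTORS["harvest"]))
print(compose_matrices(["tax", "reform"], MATRICES))
print(compose_matrices(["reform", "tax"], MATRICES))
```

Summing vectors yields the same result for "tax reform" and "reform tax", whereas the two matrix products differ, so the matrix representation retains word order.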

Analysis of scientific innovation processes

The existing metadata resources at the ZBW and in partner institutions are also well suited for research on machine learning processes. In the project Q-Aktiv, the ZBW is studying the learning of representations in dynamic networks from bibliographic metadata. The objective of the Q-Aktiv project is to learn a representation for the concepts of a controlled vocabulary. The network structure between research papers, authors, concepts, journals, and institutions serves as the data basis. So far, different techniques for learning representations of concepts have been compared. Currently, this methodology is being extended to dynamic networks that change over time, for example in order to analyse and predict scientific dynamics.

By adapting existing AI procedures to the library context, the ZBW transfers research into practice. The research and its concrete implementation in existing ZBW products and services provide valuable insights and lead to even better access to relevant information in economics. While working on these topics, we maintain an exchange with other libraries and research institutions. The ZBW is keen to contribute the application-oriented experience and findings it has gained, especially in machine learning, to the information science community and to current discourses and debates on information retrieval and artificial intelligence.