English-language extension of the TIB AV Portal

Facts

Management

Margret Plank

Team

Dr. Sven Strobel

Duration

1 December 2013 – 31 March 2014

The aim of the project was to obtain English technical vocabulary for tagging English-language videos in the TIB AV Portal by mapping GND terms to DBpedia and other authority datasets.

 

Cooperation

Hasso-Plattner-Institut für Softwaresystemtechnik, Potsdam

Description

Videos in the TIB AV Portal are automatically tagged using a total of 63,356 GND terms from science and technology. In addition to German-language videos, the portal contains numerous English-language videos. However, the GND provides English labels for only very few of the terms used in the TIB AV Portal knowledge base, so there is a lack of English indexing vocabulary that could be used to tag the English-language videos automatically. The project addressed this by mapping GND terms to other datasets that contain English translations of the terms. The mapping strategies drew on DBpedia, LCSH, MACS and the WTI thesaurus. At least one English label was ultimately identified for 35,025 GND terms. These English labels can be used directly to tag English-language videos automatically. Although 11,694 GND terms could not be ‘translated’ into English, they were at least associated with a hypernym (broader term) for which an English translation exists. This association helps to expand the search results.
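The two-stage idea described above (a direct English label where a mapping exists, otherwise a fallback to a hypernym that has one) can be illustrated with a minimal Python sketch. This is not the project's implementation: the function names, the dictionary layout and the sample terms are hypothetical, and following the chain of broader terms upwards is an assumption made for illustration only.

```python
# Hypothetical toy data. The real project drew on DBpedia, LCSH, MACS and the
# WTI thesaurus; the entries below are invented for illustration only.
SOURCES = {
    "dbpedia": {"Windenergie": ["Wind power"]},
    "lcsh": {"Windenergie": ["Wind power"], "Energie": ["Energy"]},
}

# Hypothetical hypernym relation: GND term -> broader GND term.
BROADER = {"Rotorblatt": "Windkraftanlage", "Windkraftanlage": "Energie"}


def direct_labels(term, sources):
    """Collect the distinct English labels for a term across all sources."""
    labels = []
    for mapping in sources.values():
        for label in mapping.get(term, []):
            if label not in labels:
                labels.append(label)
    return labels


def resolve(term, sources, broader):
    """Return English labels for a term, falling back to a hypernym.

    'via' is None when the labels belong to the term itself, otherwise it
    names the broader term that supplied them.
    """
    labels = direct_labels(term, sources)
    if labels:
        return {"term": term, "labels": labels, "via": None}

    # Assumption: walk up the chain of broader terms until one has a label.
    seen = {term}
    current = broader.get(term)
    while current is not None and current not in seen:
        labels = direct_labels(current, sources)
        if labels:
            return {"term": term, "labels": labels, "via": current}
        seen.add(current)
        current = broader.get(current)

    # Neither the term nor any of its hypernyms has an English label.
    return {"term": term, "labels": [], "via": None}


if __name__ == "__main__":
    for gnd_term in ("Windenergie", "Rotorblatt", "Unbekannt"):
        print(resolve(gnd_term, SOURCES, BROADER))
```

In this sketch, a term resolved directly could be used for tagging, while a term resolved through `via` would only contribute to search expansion, mirroring the distinction drawn in the description.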

 
