2007
-
Presentations
-
-
Visualizing endangered indigenous languages of French Polynesia with LEXUS
-
This paper reports on the first results of the DOBES project ‘Towards a multimedia dictionary of the Marquesan and Tuamotuan languages of French Polynesia’. Within the framework of this project we are building a digital multimedia encyclopedic lexicon of the endangered Marquesan and Tuamotuan languages using a new tool, LEXUS. LEXUS is a web-based lexicon tool targeted at linguists involved in language documentation. LEXUS makes it possible to visualize language: it provides functionality for including audio, video and still images in the lexical entries of the dictionary, as well as relational linking for the creation of a semantic network knowledge base. Further activities aim at the development of (1) an improved user interface, in close cooperation with the speech community, and (2) a collaborative workspace functionality that will allow the speech community to participate actively in the creation of lexica.
-
Language Archiving Technology at the MPI
-
Poster
-
Creating multimedia dictionaries of endangered languages using LEXUS
-
This paper reports on the development of a flexible web-based lexicon tool, LEXUS. LEXUS is targeted at linguists involved in language documentation (of endangered languages). It allows the creation of lexica within the structure of the proposed ISO LMF standard and uses the proposed concept naming conventions from the ISO data categories, thus enabling interoperability, search and merging. LEXUS also makes it possible to visualize language, since it provides functionality for including audio, video and still images in the lexicon. With LEXUS it is possible to create semantic network knowledge bases using typed relations. The LEXUS tool is free to use.
-
Language Archives Newsletter (LAN)
-
Today’s language archives, especially those involved with documentation and endangered languages, exist in a world where there is constant change not only in the technologies we use but also in the methodologies of the fields we serve. As we respond to these changes in technologies and the expectations of us, new opportunities and new problems arise.
The current issue of LAN works within this theme: Dieter van Uytvanck et al. discuss the potential for map-based access to language resources to enhance our audience and our effectiveness. Bernard Howard reviews an audio recorder that continues the trend towards compactness while maintaining quality and flexibility. On the other hand, David Nathan draws attention to the increasing use of video and raises doubts about how it is currently handled and resourced.
The next issue of LAN will publish further discussion on the subject of video, so please send any responses to that – or any other – matter to us.
David Nathan, Paul Trilsbeek, Marcus Uneson
-
Disambiguating automatic semantic annotation based on a thesaurus structure
-
The use/use for relationship in a thesaurus is usually more complex than the (para-)synonymy recommended in the ISO-2788 standard describing the content of these controlled vocabularies. The fact that a non-preferred term can refer to multiple preferred terms (only the latter are relevant in controlled indexing) makes this relationship difficult to use in automatic annotation applications: it generates cases of ambiguity. In this paper, we present the CARROT algorithm, meant to rank the output of our Information Extraction pipeline, and show how this algorithm can be used to select the relevant preferred term out of several possibilities. This selection is based on the structure of the annotators’ thesaurus and is meant to suggest keywords to human annotators, in order to ease and speed up their daily work. We achieve a 95% success rate, and discuss these results along with perspectives for this experiment.
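The ambiguity described above can be illustrated with a minimal sketch. This is not the CARROT algorithm, whose details are not given in the abstract; it is a simple stand-in that ranks the candidate preferred terms for a non-preferred term by how many of their thesaurus neighbours also occur in the same document. The toy thesaurus, the names `USE`, `RELATED` and `rank_candidates`, and the scoring rule are all assumptions made for illustration.

```python
# Hypothetical toy thesaurus; none of these entries come from the paper.
# Non-preferred term -> candidate preferred terms (the "use" relationship).
USE = {
    "mining": ["coal mining", "data mining"],
}

# Preferred term -> preferred terms linked to it in the thesaurus structure
# (broader, narrower or related terms; the link type is ignored here).
RELATED = {
    "coal mining": {"energy", "industry"},
    "data mining": {"computer science", "statistics"},
}


def rank_candidates(term, context_terms):
    """Rank the preferred terms a non-preferred term may refer to.

    Each candidate is scored by the number of its thesaurus neighbours
    that appear among the other preferred terms found in the document.
    This scoring rule is an assumption, not the method of the paper.
    """
    context = set(context_terms)
    scored = [
        (len(RELATED.get(candidate, set()) & context), candidate)
        for candidate in USE.get(term, [])
    ]
    # Highest score first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidate for _, candidate in scored]


if __name__ == "__main__":
    print(rank_candidates("mining", ["statistics", "computer science"]))
```

In an annotation setting the top-ranked candidate would be offered to the human annotator as a suggestion rather than applied automatically.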
-
The value of usage scenarios for thesaurus alignment in Cultural Heritage context
-
Thesaurus alignment is important for efficient access to heterogeneous Cultural Heritage data. Current ontology alignment techniques provide solutions, but with limited value in practice, because the requirements from usage scenarios are rarely taken into account. In this paper, we start from particular requirements for book re-indexing and investigate possible ways of developing, deploying and evaluating thesaurus alignment techniques in this context. We then compare different aspects of this scenario with others from a more general perspective.
-
A Federation of Language Archives Enabling Future eHumanities Scenarios
-
This paper describes the need for new infrastructures for future eScience scenarios in the humanities. Three projects working on different aspects of these infrastructures are examined in detail. The first project is trying to achieve a federation of archives, developing an integration layer at the level of locating, accessing and referring to an archive’s raw data objects. The other two try to achieve interoperability at the level of semantic interpretation of linguistic data types and tagging systems. The projects’ different approaches to this problem show the trade-off between flexibility and the user’s workload. All three approaches give an impression of the steps necessary to arrive at an eHumanities scenario.
-
Anchoring Dutch Cultural Heritage Thesauri to WordNet: two case studies
-
In this paper, we argue for the value of anchoring Dutch Cultural Heritage controlled vocabularies to WordNet, and demonstrate a reusable methodology for achieving this anchoring. We test it on two controlled vocabularies, namely the GTAA thesaurus, used at the Netherlands Institute for Sound and Vision (the Dutch radio and television archives), and the GTT thesaurus, used to index books of the Dutch National Library. We evaluate the two anchorings with a concrete use case in mind, namely generic alignment scenarios where concepts from one thesaurus must be aligned to concepts from the other.