Archive for March 2011

 
 

Some news from the AVATecH project

by Przemek Lenkiewicz

The AVATecH project is an interesting initiative of the Max Planck Gesellschaft and the Fraunhofer Gesellschaft. It aims to develop solutions for automatically annotating media recorded by linguistic researchers. Such tools are highly desired, and expectations are high.

The project has recently passed two very important milestones. The first came in November, when the AVATecH Expert Workshop took place. For two days the project participants interacted with each other and with potential users of their solutions, presenting the status of development and integration of their work and gathering feedback and further suggestions from the linguists. Experts from different fields (audio/video processing, gesture and sign language research, field research) were also present to see the status of the work and to get an idea of what may soon be available for their purposes. Naturally, they contributed numerous valuable comments.

After the status of the work had been presented and suggestions gathered, the project participants continued working on their solutions and reached the second milestone: delivering the first automated annotation functionality to the ELAN tool and making it available to Max Planck researchers. This functionality covers the following initial possibilities:

  • The audio part provides functionality that is needed in a large share of annotation work: detecting how many people are speaking in an audio recording and creating the appropriate number of tiers; detecting who is speaking when and creating annotations at the corresponding parts of the recording; and aligning the recording with a transcription from a text file.
  • The video part provides the following functionality: detecting shots and subshots in the recording; creating representative keyframes for the given shots and subshots; estimating the color ranges that represent human skin in the recording; and tracing the position of the speaker’s hands and head. Further functionality will be built on top of the last-mentioned recognizer: the positions of the hands and head, together with time information, will serve to estimate the speed of hand movements, the relation of the hands to each other and to the speaker’s body, etc.
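To illustrate the idea of deriving movement speed from tracked positions and time information, here is a minimal sketch in Python. It is not the AVATecH recognizer: the input format (a list of `(time, x, y)` samples per hand), the function names `hand_speeds` and `hand_distance`, and the sample values are all assumptions made for this example.

```python
import math

def hand_speeds(track):
    """Return (time, speed) pairs from a list of (t_seconds, x, y) samples.

    Speed is the Euclidean distance between consecutive positions divided
    by the elapsed time, reported at the later timestamp of each pair.
    """
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order samples
        speeds.append((t1, math.hypot(x1 - x0, y1 - y0) / dt))
    return speeds

def hand_distance(left, right):
    """Distance between the two hands at each shared sample index."""
    return [
        math.hypot(lx - rx, ly - ry)
        for (_, lx, ly), (_, rx, ry) in zip(left, right)
    ]

# Hypothetical tracking output: (time in seconds, x, y) in pixels.
right_hand = [(0.0, 100.0, 200.0), (0.5, 103.0, 204.0), (1.0, 103.0, 204.0)]
left_hand = [(0.0, 40.0, 200.0), (0.5, 43.0, 204.0), (1.0, 43.0, 124.0)]

print(hand_speeds(right_hand))            # speed in pixels per second
print(hand_distance(left_hand, right_hand))
```

In this sketch speed is expressed in pixels per second; a real recognizer would presumably also need to handle missing detections and smooth noisy positions.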

The MPI team is currently working on integrating these features with ELAN and providing manuals for researchers on how to use them.

New release of ELAN – Version 4.0.0

by Aarthy Somasundaram

Toward the end of last year a new version of ELAN was released. It contains many new features and improved functionality, a new media player solution for Windows, and fixes for a number of issues and bugs in previous versions.

A first implementation of interaction with LEXUS, the MPI-developed web-based tool for creating and editing lexical databases, has been added. A new lexicon viewer allows the user to look up values in an online lexicon and to apply a value to the selected annotation.

ELAN has been facing many codec-related problems, especially with MPEG-1 and MPEG-2 files. To eliminate some of them, a new player for Windows has been developed based on DirectShow (JDS, Java-DirectShow). To use this player, it is necessary to select it first in the Platform/OS tab of the “Edit Preferences” window.

This version extends the support for controlled vocabularies with externally defined closed controlled vocabularies (located e.g. on the web). The list of supported file formats for importing controlled vocabularies has been extended with .txt and .csv. Externally defined closed controlled vocabularies are stored in .ecv files, a format that is close to .eaf.

To make life easier for ELAN users and to speed up their work, several improvements let them get things done with fewer steps and clicks. A few tier-based operations, such as removing multiple annotations or annotation values from selected tiers, or creating depending annotations recursively on all depending tiers, can now be performed much faster and more easily. It is now also possible to create depending annotations automatically when an annotation is created on a tier that has depending tiers. The merge transcriptions function has been extended with options for appending one file to the other, making the merging process more versatile.

Further support for audio and video recognizers, such as those developed in the AVATecH project, has been implemented. To learn more about this project, visit the AVATecH website.

You can download the new version at the ELAN web site, where you will also find the updated manual detailing how to use the new functionality.

The VLO – Faceted Browser

by Patrick Duin

The Virtual Language Observatory (VLO) is an alternative way of browsing and searching different archives all over the world. We are happy to announce that the faceted browser, the part of the VLO used for browsing resources, has recently been updated and improved.

A faceted browser is a way to browse and search the data in the various archives based on the facets available in the data. Facets are searchable aspects of the data. For instance, figure 1 shows two facets, Country and Language. It tells us that there are 15,548 records that have “Netherlands” as the value for the country name.

Figure 1: Different Facets

When the user clicks on a facet value, the user interface updates the other facets accordingly. For example, if we click “Netherlands”, we get figure 2.

Figure 2: Country = "Netherlands"

The language facet is updated and now only shows the languages of records that have “Netherlands” as the value for Country.
Selecting further facet values narrows the number of records down even more.
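The behaviour described above, counting records per facet value and recounting after a selection, can be sketched in a few lines of Python. This is only an illustration of the general faceted-browsing idea, not the VLO implementation: the record structure, the function names `facet_counts` and `select`, and the toy records are invented for this example.

```python
from collections import Counter

def facet_counts(records, facet):
    """Count how many records carry each value of the given facet."""
    return Counter(r[facet] for r in records if facet in r)

def select(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        r for r in records
        if all(r.get(facet) == value for facet, value in selected.items())
    ]

# Toy records, invented for illustration only.
records = [
    {"country": "Netherlands", "language": "Dutch"},
    {"country": "Netherlands", "language": "Dutch"},
    {"country": "Netherlands", "language": "Frisian"},
    {"country": "Germany", "language": "German"},
]

# Initial view: counts for the Country facet over all records.
print(facet_counts(records, "country"))

# After clicking "Netherlands": the Language facet is recomputed
# over the remaining records only.
dutch_records = select(records, country="Netherlands")
print(facet_counts(dutch_records, "language"))
```

Each additional keyword passed to `select` corresponds to one more facet value being clicked, so the result set can only stay the same size or shrink.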

Selecting a result record gives a more detailed view of the data and, where possible, a link to the original context (archive) and links to the resources associated with that record (see figure 3).

Figure 3: Result View

The result view offers direct access to the resources. The link goes directly to the archive providing the resource, so authentication and authorization may be required.