Embeddable Annex

by Thomas Koller

The MPI developers recently released a new Annex feature that lets users embed a smaller, customised version of Annex in any web page. Researchers inside and outside our institute have warmly welcomed it as an easy way to show research results to outsiders.

The embeddable version of Annex only supports access to freely accessible annotation resource bundles, i.e. resource bundles which can be accessed from the IMDI browser without user login. This restriction helps to avoid authentication issues and effectively protects resources with restricted access.

This new feature can be accessed directly from Annex by clicking ‘embed’ in the menu. A small dialog then pops up in which the user can customise the HTML snippet before copying it to the clipboard and pasting it into a web page. This works much like the similar YouTube feature that users may already be familiar with. The following options are available:

  • Show border around embedded Annex application: the creator can select a border width, a border color and a border type (solid, dotted or dashed)
  • Size of the embedded Annex application: 4 predefined sizes are available. The user can also set a custom size directly in the HTML markup. It should be noted, however, that the layout and component sizes of the embedded Annex application have been optimised for the 4 predefined sizes, so a custom size set in the HTML snippet can lead to an Annex instance that does not look optimal.
  • Default view: text or subtitle. Any other default view (such as timeline or grid) will be ignored, and the ‘text’ view will be set instead.
  • Tier text font: This setting may be helpful if the user wants the embedded Annex to display an annotation resource with special characters which may not be contained in a standard font on the user’s computer. If the ‘Tier text font’ parameter is set with a font name which is not available on the user’s computer, then the embedded Annex application will automatically fall back to a standard font. The end user also has the option to change the tier text font and the font size at any time via a dropdown list.
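As a rough illustration of how these options combine, the following Python sketch builds an embed snippet of this kind. It is not the actual Annex markup: the URL, the query parameter names, the use of an iframe element and the four size values are hypothetical placeholders, since the real snippet is generated by the Annex dialog. Only the rules taken from the list above (border types, the fallback to the ‘text’ view, the optional font) reflect the description.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical placeholders: the real Annex embed URL, parameter names
# and predefined sizes are not given in this article.
ANNEX_EMBED_URL = "https://example.org/annex/embed"
PREDEFINED_SIZES = {
    "small": (400, 300),
    "medium": (560, 420),
    "large": (720, 540),
    "xlarge": (960, 720),
}
ALLOWED_VIEWS = {"text", "subtitle"}
BORDER_TYPES = {"solid", "dotted", "dashed"}


def resolve_default_view(requested):
    """Return the requested view, or 'text' if it is not supported when embedded."""
    return requested if requested in ALLOWED_VIEWS else "text"


def build_embed_snippet(resource_id, size="medium", view="text", font=None,
                        border_width=0, border_color="#000000",
                        border_type="solid"):
    """Build an HTML snippet for an embedded viewer (illustrative only)."""
    if border_type not in BORDER_TYPES:
        raise ValueError("border type must be solid, dotted or dashed")
    width, height = PREDEFINED_SIZES[size]

    params = {"resource": resource_id, "view": resolve_default_view(view)}
    if font:
        # If the font is missing on the end user's computer, the viewer
        # falls back to a standard font; nothing to do here.
        params["font"] = font
    src = ANNEX_EMBED_URL + "?" + urlencode(params)

    if border_width > 0:
        style = f"border: {border_width}px {border_type} {border_color};"
    else:
        style = "border: none;"

    return (f'<iframe src="{escape(src, quote=True)}" '
            f'width="{width}" height="{height}" '
            f'style="{escape(style, quote=True)}"></iframe>')


if __name__ == "__main__":
    # 'timeline' is not a valid default view, so 'text' is used instead.
    print(build_embed_snippet("demo-resource", size="small", view="timeline",
                              border_width=2, border_type="dashed"))
```

Keeping the predefined sizes in one table mirrors the advice above: a custom width and height can be written into the markup by hand, but the layout is only tuned for the predefined values.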

The embedded Annex application has a Start Full ANNEX button in its top right corner. When the end user clicks this button, a new browser tab will open the full Annex version showing the same annotation resource.

ANNEX and ELAN – A Comparison

by Thomas Koller and Han Sloetjes

ANNEX and ELAN are two closely related applications designed for handling digital media files and associated annotation files. While ELAN is a desktop application used for the creation of rich annotations on audio and video recordings, ANNEX is a web-based viewer which allows users to study annotated resources once they have been properly stored on the archive server.

This short article highlights which features the two tools have in common and which features are unique to each tool.

ELAN is a local tool (desktop application) for the creation of annotations on audio and/or video recordings. It combines a media player with viewer and editor components for annotations. The annotation documents are stored in the XML-based ELAN Annotation Format (EAF). ELAN is written in the Java programming language and is available for Windows, Mac OS X and Linux. On Windows and Mac, media playback is delegated to an available high-performance native media framework: DirectX/DirectShow on Windows and QuickTime on Mac. On Linux, JMF is used. The list of supported file types depends on the available media player frameworks.

ELAN main window

Although there is limited support for streaming media via the RTSP protocol, most commonly the media files are accessed directly on a local hard drive or the local network. This guarantees high accuracy in media playback, especially in (repeated) playback of fragments of the media, which is usually a basic step in the process of segmenting the media. The annotation boundaries can be determined with millisecond precision. ELAN supports simultaneous, synchronized playback of up to 4 video files. The annotation documents are stored locally as well. The variant of the TROVA search engine that is distributed with ELAN can query the contents of physical directory structures. To that end it creates temporary in-memory indexes for the content of selected folders and files. The search is limited to EAF files. The ELAN window offers several customizable views on the annotation data, all synchronized with the media player. All viewers are editors at the same time. Many operations are provided for manipulating tiers and annotations.
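The idea of a temporary in-memory index over EAF files can be sketched in a few lines of Python. This is only an illustration, not the TROVA implementation: the function names and the index layout are assumptions, and the sample document contains only the EAF elements needed here (TIER, ANNOTATION, ALIGNABLE_ANNOTATION and ANNOTATION_VALUE).

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# A minimal EAF-like document; real EAF files carry more metadata
# (header, time order, linguistic types) that is omitted here.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="speaker1">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello world</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>good morning</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def index_eaf(name, eaf_xml, index):
    """Add every annotation value of one EAF document to the in-memory index."""
    root = ET.fromstring(eaf_xml)
    for tier in root.iter("TIER"):
        tier_id = tier.get("TIER_ID", "")
        for value in tier.iter("ANNOTATION_VALUE"):
            index[name].append((tier_id, value.text or ""))


def search(index, pattern):
    """Return (file, tier, text) for every annotation matching the regex."""
    regex = re.compile(pattern)
    return [(name, tier_id, text)
            for name, entries in index.items()
            for tier_id, text in entries
            if regex.search(text)]


# The index lives only in memory and is rebuilt for each selection of files.
index = defaultdict(list)
index_eaf("sample.eaf", SAMPLE_EAF, index)
hits = search(index, r"hel+o")
```

To index a folder on disk, the same function could be called for each file found with, for example, pathlib's Path.rglob("*.eaf"), passing the file contents as the second argument.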

ANNEX is written as an ELAN-compliant browser-based tool (web application) that supports media playback via HTTP pseudostreaming and the Flash Player browser plugin. For freely accessible language resources, ANNEX can also be embedded in any web page by pasting a simple HTML snippet into the page (comparable to the way YouTube supports embedding of videos into web pages). Alongside the media player it contains several customizable viewer components for annotations. By default both the media files and the annotation files are streamed from the MPI online archive; there is no need to download files in order to view their contents. ANNEX is seamlessly integrated with the archive access management tools and interacts with available web services, for example the ones exposed by the lexicon tool LEXUS. Other tools in turn can make parameterized calls to ANNEX.

ANNEX works with the online version of TROVA, which creates an index for a whole LAMUS archive using the Postgres database system. This version of TROVA supports not only EAF but also Shoebox, CHAT and generic XML, HTML and text files.

Comparison Matrix
Feature | ANNEX | ELAN
--- | --- | ---
Number of synchronized videos | 1 | 4
Media file types | MPG for video files, WAV for audio files | Depending on the media framework of the particular platform
Waveform for audio | .wav only | .wav only
Media playback precision | Depends on keyframe rate | milliseconds
Streaming media support | Pseudostreaming for audio and video files | Limited, via rtsp
Annotation formats | EAF, Toolbox/Shoebox, Chat. Will be converted to a single XML format for transfer. | EAF, import of Toolbox, Chat, Praat, Transcriber, CSV
Annotation editing | No | Yes
Number of tiers | Unlimited | Unlimited
Font usage | Any font available on the system | Any font available on the system supported by Java
Search options | TROVA search engine, search in entire (accessible part of) archive | Single file search and multiple file search (TROVA) in local corpus
Technology | Flash, XML, Quicktime (temporarily for resources with master audio file) | Java, XML
Tool interaction, API | Support for parameterized calls to ANNEX | Extension mechanism for particular parts of the application
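
The ‘Media playback precision’ row can be made concrete with a small calculation. The Python sketch below assumes that a streamed video can only start playing at a keyframe, that keyframes are evenly spaced, and that a seek lands on the last keyframe at or before the requested time. These are simplifying assumptions for illustration; the function name and the example interval are hypothetical and not taken from ANNEX.

```python
def keyframe_seek_ms(requested_ms, keyframe_interval_ms):
    """Return the time of the last keyframe at or before the requested time.

    Assumes evenly spaced keyframes starting at 0 ms (an illustrative
    simplification, not a description of any particular encoder).
    """
    if keyframe_interval_ms <= 0:
        raise ValueError("keyframe interval must be positive")
    return (requested_ms // keyframe_interval_ms) * keyframe_interval_ms


# With one keyframe every 500 ms, a request for 12.345 s starts at 12.000 s,
# so playback begins 345 ms before the annotation boundary.
start = keyframe_seek_ms(12345, 500)
offset = 12345 - start
```

Under these assumptions the offset is always smaller than the keyframe interval, which is why a higher keyframe rate gives more precise playback of short fragments, whereas local playback in ELAN can address boundaries in milliseconds.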

New ANNEX and TROVA user interfaces

by Thomas Koller

ANNEX is a web-based annotation exploration tool to display and play back annotated resource bundles (incl. video, audio and annotated text) stored on local or remote language archive servers. TROVA is a web-based search tool to search for simple or complex annotations (incl. regular expressions) on resources residing on local or remote language archive servers.

In autumn 2008 we started to redesign the user interfaces of ANNEX and TROVA to make them more usable and to allow more functionality to be added at later stages. We decided to use a programming technology called Flex for the redesign. A web application developed with Flex runs in the Flash Player browser plugin, which is already installed in the vast majority of web browsers because popular web sites such as YouTube use it.

Changing the Order of Tiers

Using the Flash Player as the delivery technology for ANNEX and TROVA has a number of advantages for ANNEX users. First of all, video-based resource bundles can now be played back not only on Windows and Mac computers but also on Linux systems. Linux support for audio-based resource bundles requires some additional changes on the server side and will be available in the near future.

With Flex-based ANNEX and TROVA, we can now provide a more homogeneous look & feel. HTML-based web pages often look somewhat different in different browsers because each browser renders HTML in its own way. This can also cause as yet unknown problems over time, as a single browser may change the way it renders HTML. Flex-/Flash Player-based applications, in contrast, look and work the same way across browsers and operating systems.

Trova Single Layer Search

Using Flash Player as the delivery technology noticeably improves the ‘timeline’, ‘waveform’ and ‘combined’ data views in ANNEX. In the previous ANNEX version, these data views were displayed as static image files, which had two disadvantages. First, to change the order of tiers in the ‘timeline’ or ‘combined’ view, the user had to select appropriate values in two dropdown lists located in another part of the ANNEX screen. In the new ANNEX version, the user can change the order of tiers directly via drag & drop. Second, the static data view images were not updated when the video or audio playhead reached the time displayed on the right margin of the current data view (i.e. playback went on but the data view stayed the same). In the new ANNEX version, the data view is updated automatically when the playhead reaches that time. The user is therefore always presented with an up-to-date data view when playing back resources of any length.
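
The automatic update of the data view can be pictured as a simple paging rule. The following Python sketch is an assumption about the logic, not the Flex code used in ANNEX: the function names and the fixed window length are hypothetical, and the view is assumed to show one fixed-length time window at a time.

```python
def window_start_for(playhead_ms, window_ms):
    """Start time of the fixed-length window that contains the playhead."""
    return (playhead_ms // window_ms) * window_ms


def update_view(current_start_ms, playhead_ms, window_ms):
    """Return the window start to display for the current playhead position.

    The window changes only when the playhead reaches the right margin
    (or jumps outside the window, e.g. after the user seeks).
    """
    right_margin_ms = current_start_ms + window_ms
    if playhead_ms >= right_margin_ms or playhead_ms < current_start_ms:
        return window_start_for(playhead_ms, window_ms)
    return current_start_ms


# A 10-second view: nothing changes until the playhead reaches 10 s.
still_first = update_view(0, 9999, 10000)
next_page = update_view(0, 10000, 10000)
after_seek = update_view(10000, 25000, 10000)
```

Calling such a function on every playhead update keeps the displayed window in step with playback, regardless of how long the resource is.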

The new ANNEX version provides context sensitive help for different parts of the graphical user interface (such as the ‘Video display’ and ‘Media information’ panels). To access the help content for a panel, the user can either press the H key on their keyboard (this will display the help content for the panel below the mouse cursor) or they can drag the question mark (located at the top of the screen) to a user interface panel. The help content for this panel will then be displayed as soon as the question mark has been dropped.

Annex Help Texts

Another important improvement in ANNEX and TROVA is the addition of a font chooser dropdown list. The user is now able to apply any font installed on their computer to the annotation text of the currently selected resource(s). This is particularly useful for the display of languages with uncommon fonts. A newly selected font will immediately be applied to currently displayed annotation text without having to reload ANNEX or TROVA.

Soon, standalone and embedded versions of ANNEX will also be available. The embedded version will allow a smaller ANNEX version with a preselected resource to be embedded in any web page (similar to the way YouTube videos can be embedded in other web pages). This YouTube-like feature helps resource authors to showcase their work without making readers leave their web site. Instead of pointing directly to the IMDI browser on a language archive server, authors can then describe their resource on their own web page in any way they like. Only freely available language resources can be used with the embedded ANNEX version.

Choosing Fonts in Annex

A standalone desktop-based version of ANNEX will make it possible to use ANNEX when, for example, an Internet connection is not available. This can be useful while travelling on planes or trains, or as a fallback strategy for the presentation of language resources at workshops or conferences. It can also be useful for working with data-rich language resource bundles which could otherwise prove too demanding for proper display in ANNEX when used over the web.