Importing .txt files and tokenizing them

Posted by pclark at 2010-01-19 19:32  

I am encountering issues with imported .txt files when I begin adjusting the timing of tokenized units in a child tier. I did this successfully with a 45-second video. The second video is about 5 minutes long, and I've started over several times (we are familiarizing ourselves with ELAN's functionality as applied to our databases). The first couple of times, each sign was successfully time-coded in ELAN up to the last 4 to 6 sentences. At that point the signs in the child tier for a particular sentence would disappear (the parent tier was unaffected). The signs in the child tier for the following sentence (sometimes 2 sentences) would remain intact, and then another sentence would be missing in the child tier. In the most recent attempt to set the timing of signs in the child tier, the text for both the parent and child tiers dropped out in the third sentence of the text, while the following sentence remained intact.

Is there an issue with importing .txt files or with the process that I have been using?

Let me describe the process I used and issues that may be influencing the dropped text:

- Opened a .mov file
- Imported a .txt file using the CSV/tab-delimited option (this file has both beginning and ending time codes at the sentence level)
- Imported the tiers from a template I created from the first 45-second text (successfully). This included the child tier with the appropriate parent-child relationship
- Shifted all annotations to the appropriate beginning point of the video, approximately 8000 ms
- Timing for each imported sentence was not aligned correctly, so I adjusted the timing of the parent tier to align with the video
- Began to align each sign in the child tier with the video

I see the issue arising from several possible points:
1. Timing elements in the .txt file are based on a Web timing system (hh:mm:ss:ff). ELAN's is different
2. The resulting need to Shift all annotations approx. 8000 ms
3. The need to re-align the parent tier – changing the internal time codes somehow
4. Importing the child tier from a template – internal coding issues that I am unaware of
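
For point 1, the conversion from an hh:mm:ss:ff time code to millisecond values (the unit used for the 8000 ms shift above) can be sketched as below. This is only an illustration: the function name `timecode_to_ms`, the 30 fps frame rate, and the optional shift argument are assumptions, not details taken from the .txt file or from ELAN itself.

```python
def timecode_to_ms(timecode: str, fps: float = 30.0, shift_ms: int = 0) -> int:
    """Convert an 'hh:mm:ss:ff' time code to milliseconds.

    fps is an assumed frame rate (hypothetical default of 30);
    shift_ms is an optional offset added to the result.
    """
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    whole_seconds = hours * 3600 + minutes * 60 + seconds
    # The frame field counts frames within the current second.
    frame_ms = round(frames * 1000 / fps)
    return whole_seconds * 1000 + frame_ms + shift_ms


if __name__ == "__main__":
    # Half a second past the first second, shifted by 8000 ms.
    print(timecode_to_ms("00:00:01:15", fps=30.0, shift_ms=8000))
```

If the frame rate assumed here differs from the one the time codes were written with, every converted value will drift slightly, which is one way sentence timings could end up misaligned after import.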

When I look at the file in a text editor, the time codes are definitely out of sync, but how they got that way is a mystery to me.
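
A rough way to pin down which time codes are out of sync is to scan the begin and end values in order and flag any interval that is inverted or that starts before the previous one ended. The sketch below assumes the values have already been read into (begin_ms, end_ms, text) tuples; the function name `find_timing_problems` and the two checks are hypothetical, and the thread does not confirm that either condition is what makes annotations disappear.

```python
def find_timing_problems(rows):
    """Return (row_index, description) pairs for suspicious intervals.

    rows is a list of (begin_ms, end_ms, text) tuples in file order.
    """
    problems = []
    previous_end = None
    for index, (begin, end, _text) in enumerate(rows):
        if end <= begin:
            problems.append((index, "end is not after begin"))
        if previous_end is not None and begin < previous_end:
            problems.append((index, "starts before the previous annotation ends"))
        previous_end = end
    return problems


if __name__ == "__main__":
    sample = [
        (0, 1000, "first sentence"),
        (900, 2000, "second sentence"),   # starts 100 ms too early
        (2000, 1500, "third sentence"),   # end precedes begin
    ]
    for index, description in find_timing_problems(sample):
        print(index, description)
```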

Can anyone help?

Re: Importing .txt files and tokenizing them

Posted by hasloe at 2010-01-20 14:12  

Which ELAN version is this? The situation of disappearing parent and child annotations resembles a bug that has been fixed in the latest version.
Otherwise, when all the importing etc. has been successful, I see no reason why annotations should disappear (at the n-th sentence).

Han

Re: Re: Importing .txt files and tokenizing them

Posted by pclark at 2010-01-21 20:16  

Mac version 3.7.2-1 is the version I used.

Re: Re: Re: Importing .txt files and tokenizing them

Posted by hasloe at 2010-01-27 09:49  

Could you check if the same error still exists in ELAN 3.8.1? Just to be sure...

-Han

Re: Re: Re: Re: Importing .txt files and tokenizing them

Posted by pclark at 2010-02-05 16:06  

Well, I would check in 3.8.1 if I could, but now I can't even import my .txt file to try it. I wanted to start over and began a new video file, .eaf, went to import the .txt file as a CSV/tab-delimited file and it opens in a separate window. I can't get the annotations into the .eaf file and I've tried the export function, import tiers function....the .txt is not being imported into the .eaf file.

patty

Re: Re: Re: Re: Importing .txt files and tokenizing them

Posted by pclark at 2010-02-05 16:08  

Well, I would check in 3.8.1 if I could, but now I can't even import my .txt file to try it in 3.8.1. I wanted to start over and began a new video file, .eaf, went to import the .txt file as a CSV/tab-delimited file and it opens in a separate window. I can't get the annotations into the .eaf file and I've tried the export function, import tiers function....the .txt is not being imported into the .eaf file. I also went back to 3.7.2-1 to work on it in that version, but the same problem - it won't import the .txt file that it successfully imported before.

patty

Re: Re: Re: Re: Re: Importing .txt files and tokenizing them

Posted by hasloe at 2010-02-10 13:05  

If the import of the text file was successful, though in a separate window, you could add the video to the window your text is in, via Edit -> Linked Files.
You could also use File -> Merge Transcriptions to merge the two files.
If the import was not successful, you might want to send the text file to me so that I can do some testing (han.sloetjes AT mpi.nl).

-Han

Re: Re: Re: Re: Re: Re: Importing .txt files and tokenizing them

Posted by pclark at 2010-02-10 21:19  

I had linked the media & .txt files before, but couldn't remember how. Thanks, your recommendation worked.

patty

 
