word count in elan
We often need to know the total number of words in a given corpus, or collection of eaf files. It would be great to have an easy way to do this in ELAN, maybe within "search in multiple eafs", since that is also where a corpus (domain) is defined.
Re: word count in elan
Hmm, I think there are two options that come close to what you want.
The multiple file export "List of Words" (File->Export Multiple Files As->List of Words), with the option "count occurrences" selected, creates a two-column, tab-delimited text file with all unique words and the number of occurrences of each word. When the file is opened in a spreadsheet application, the number of rows is the number of unique words, and the sum of the numeric values in the second column should be the total number of words.
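If a spreadsheet is not at hand, the same sum can be computed with a few lines of Python. This is only a sketch: it assumes the export really has one word per row, with the word in the first column and an integer count in the second, separated by a tab. The function name `summarize_word_list` and the file name in the usage note are made up for illustration.

```python
def summarize_word_list(lines):
    """Return (unique_words, total_words) from 'word<TAB>count' rows.

    Assumption: column 1 is the word, column 2 is its number of occurrences.
    Rows that do not fit this shape (blank lines, a header) are skipped.
    """
    unique = 0
    total = 0
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2 or not fields[1].strip().isdigit():
            continue
        unique += 1                      # one row per unique word
        total += int(fields[1].strip())  # add this word's occurrences
    return unique, total


# Small in-memory sample standing in for an exported file.
sample = ["the\t3\n", "house\t1\n", "runs\t2\n"]
unique, total = summarize_word_list(sample)
print(unique, "unique words,", total, "words in total")
```

For a real export, something like `summarize_word_list(open("wordlist.txt", encoding="utf-8"))` should work, where `wordlist.txt` is a placeholder for whatever file name was chosen in the export dialog; the encoding may need to be adjusted to match the one selected there.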
A regular expression (e.g. one matching word boundaries) in the "Structured Search in Multiple eaf's" could also be used to get the word count, I believe.
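To illustrate the regular expression idea outside of ELAN, here is a rough Python sketch that counts word matches in the annotation values of an eaf file. It is not how ELAN's search works internally. It assumes that a "word" is whatever `\w+` matches (so "don't" would count as two), and it relies on the eaf XML layout where annotations sit inside `TIER` elements and their text is in `ANNOTATION_VALUE` elements. The function name `count_words_in_eaf` and the sample tiers are invented for the example.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: a "word" is a run of letters, digits or underscores.
WORD = re.compile(r"\w+")


def count_words_in_eaf(xml_text, tier_id=None):
    """Count regex word matches in ANNOTATION_VALUE elements.

    If tier_id is given, only the tier with that TIER_ID is counted;
    otherwise every tier is counted, which may double count if the
    same text appears on several tiers (e.g. a translation tier).
    """
    root = ET.fromstring(xml_text)
    total = 0
    for tier in root.iter("TIER"):
        if tier_id is not None and tier.get("TIER_ID") != tier_id:
            continue
        for value in tier.iter("ANNOTATION_VALUE"):
            total += len(WORD.findall(value.text or ""))
    return total


# Heavily trimmed, made-up document; a real eaf has more elements and attributes.
sample = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="words">
    <ANNOTATION><ALIGNABLE_ANNOTATION><ANNOTATION_VALUE>the house runs</ANNOTATION_VALUE></ALIGNABLE_ANNOTATION></ANNOTATION>
    <ANNOTATION><ALIGNABLE_ANNOTATION><ANNOTATION_VALUE>the end</ANNOTATION_VALUE></ALIGNABLE_ANNOTATION></ANNOTATION>
  </TIER>
  <TIER TIER_ID="translation">
    <ANNOTATION><REF_ANNOTATION><ANNOTATION_VALUE>das Haus</ANNOTATION_VALUE></REF_ANNOTATION></ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

print(count_words_in_eaf(sample, tier_id="words"))
print(count_words_in_eaf(sample))
```

Summing the result over all files in a folder would give a corpus total, but the number depends entirely on the chosen pattern and tier, so it may not match the count from the "List of Words" export.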
-Han