Advanced search over multiple files
Quoting the manual at http://www.lat-mpi.eu/tools/elan/manual/ch06s04.html (Searching through multiple annotation files): "It is not possible to restrict the search results to a certain tier or to specify extra structural or temporal constraints."
Unfortunately, this is exactly the kind of functionality I need: searching multiple files (100+) for specific annotations on specific tiers which are followed by ... and so on. The files in question were created with EXMARaLDA and converted to the ELAN format.
As EXMARaLDA doesn't support this, I have to find out whether other tools can perform this task before I go on and write my own program. A colleague told me about the advanced search capabilities of ELAN, which are awesome, but they seem to be restricted to one file.
So my first question is: Did I miss something? Are these search capabilities hidden somewhere I didn't look?
Second, if not, is the code for the single-file search modular enough that I could dig into the source, copy it and use it somewhere else?
Third, are there any other tools out there which could possibly be used for this task?
I would appreciate any sort of hint. Even if you only have a vague idea about the source, other tools or anything else, just give me the information you have and I will start digging from there.
Thanks!
Stefan
Re: Advanced search over multiple files
Yes, the kind of search you are referring to is possible. The search window shown on the manual page you are quoting contains three tabs: Substring Search, Single Layer Search and Multiple Layer Search. The last tab is the one you need. It offers a user interface for specifying search patterns within tiers (annotations followed by ...), between tiers, and any combination thereof.
The relevant manual page for that tab is at:
http://www.lat-mpi.eu/tools/elan/manual/ch06s05s03.html
I hope this helps.
-Han
Re: Re: Advanced search over multiple files
Follow-up to what Han said: Please also have a look at the NEXT page,
http://www.lat-mpi.eu/tools/elan/manual/ch06s06.html which explains how to set up the search for searching in multiple files at once. Some forms of the search functionality in ELAN are based on the Annex / Trova annotation content search engine, so in general you should be able to do searches of the same complexity in both. Please let us know if there are problems with either ELAN multiple file complex search or Trova.
Note that some of the more powerful searches take some time to get used to. In general, you have substring, exact and regular expression search, in global, per-tier (or tier group, based on constraints such as participant-is-Joe) and matrix context. Only the per-tier search also allows N-gram search (with # as a wildcard for any one word), while the matrix search uses drop-down lists for constraints between the different fields in the matrix. If a constraint involves a numerical variable (such as a time distance of at least X), a pop-up will let you fill in that variable when you select the constraint from the drop-down list.
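To make the N-gram wildcard idea concrete, here is a minimal Python sketch of how a pattern with # as "any one word" could be translated into a regular expression. This is only an illustration and not how ELAN or Trova implement it; the function name ngram_to_regex is hypothetical, and it assumes words are separated by whitespace.

```python
import re

def ngram_to_regex(pattern):
    """Compile an N-gram pattern in which '#' stands for any one word.

    Assumption: a "word" is any run of non-whitespace characters.
    """
    parts = []
    for token in pattern.split():
        # '#' matches exactly one word; every other token is matched literally.
        parts.append(r"\S+" if token == "#" else re.escape(token))
    # The lookarounds keep the match aligned to whole words.
    return re.compile(r"(?<!\S)" + r"\s+".join(parts) + r"(?!\S)")

match = ngram_to_regex("I # you").search("well I love you too")
print(match.group())  # prints "I love you"
```

The pattern "I # you" does not match "I you", because # has to consume exactly one word.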
Re: Re: Re: Advanced search over multiple files
Thank you both! It looks nice but sophisticated :-). I must have skipped these sections of the manual, maybe because I had trouble following the structure of the many search tools.
I'll need some time to get used to it and find out whether it fits my needs - but it seems that it does.
Stefan
Re: Re: Re: Re: Advanced search over multiple files
OK. Now we managed to "develop" some queries and it looks like they'll do the trick. But I have one more question:
There are some cases where a pattern is matched several times in one file (and we can't refine the query to exclude them) and the result list only shows the matches without the name of the file they were found in. It's hard to scroll through tens or hundreds of matches to see whether they belong to different files. We only want to have one entry for possibly many hits in one file.
Maybe you can tell me in which Java class this result view is implemented, so that I can write a patch or something similar to show the source file name. The file name must already be available, because the file opens when I click on a match, so it shouldn't be hard to implement.
Thanks!
Stefan
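Until the result view shows file names, one workaround for the "one entry per file" need is to count hits per file outside ELAN. The following Python sketch does that under stated assumptions: the .eaf files are XML documents in which each TIER element carries a TIER_ID attribute and the annotation text sits in ANNOTATION_VALUE elements. The function name count_hits_per_file, the folder "corpus" and the tier name "words" are hypothetical. It covers only a regular expression match on a single tier, not the structural or temporal constraints of the Multiple Layer Search.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path

def count_hits_per_file(folder, tier_id, pattern):
    """Map each .eaf file name in folder to its number of matching annotations.

    Assumption: annotation text is stored in ANNOTATION_VALUE elements
    nested inside a TIER element whose TIER_ID equals tier_id.
    """
    regex = re.compile(pattern)
    hits = Counter()
    for path in sorted(Path(folder).glob("*.eaf")):
        root = ET.parse(str(path)).getroot()
        for tier in root.iter("TIER"):
            if tier.get("TIER_ID") != tier_id:
                continue
            for value in tier.iter("ANNOTATION_VALUE"):
                # Empty annotations have text None, so fall back to "".
                if regex.search(value.text or ""):
                    hits[path.name] += 1
    return hits

if __name__ == "__main__":
    # One line per file, regardless of how many hits it contains.
    for name, count in sorted(count_hits_per_file("corpus", "words", r"\bwell\b").items()):
        print(name, count)
```

Files without any hit do not appear in the result, so the output is already the list of files worth opening in ELAN.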