ANNIS - Search and Visualization in Multilevel Linguistic Corpora

ANNIS is an open-source, cross-platform (Linux, Mac, Windows), web browser-based search and visualization architecture for complex multilayer linguistic corpora with diverse types of annotation. ANNIS, which stands for ANNotation of Information Structure, was originally designed to provide access to the data of the SFB 632 - "Information Structure: The Linguistic Means for Structuring Utterances, Sentences and Texts". It has since been extended to a large number of projects annotating a variety of phenomena. Because complex linguistic phenomena such as information structure interact on many levels, ANNIS addresses the need to concurrently annotate, query and visualize data from such varied areas as syntax, semantics, morphology, prosody, referentiality, lexis and more. Projects working with spoken language additionally require support for audio/video annotations.

A number of projects collect and annotate data according to the common SFB Annotation Standard, but many other annotation schemes have also been implemented in ANNIS. Data is often annotated using both automatic taggers/parsers and a growing set of manual annotation tools (EXMARaLDA, ELAN, annotate/Synpathy, MMAX, RSTTool, Arborator, WebAnno). The results can be mapped onto an expressive format such as PAULA (Potsdamer Austauschformat Linguistischer Annotationen / Potsdam Interchange Format for Linguistic Annotations, developed in the SFB), a stand-off multilayer XML format that serves as the basis for further processing. ANNIS provides the means for visualizing and retrieving this data. The diagram below illustrates the data flow from multiple annotation tools via SaltNPepper into the ANNIS application.

[Figure: Data workflow]
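
To make the idea of stand-off multilayer annotation more concrete, here is a minimal Python sketch. It is only an illustration of the general principle (the primary text is stored once, and each annotation layer points into it by character offsets); it does not reproduce the PAULA XML schema, the Salt model, or any ANNIS API. The names Span, PRIMARY_TEXT, ANNOTATIONS, query and layers_over, as well as the example sentence and labels, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One stand-off annotation: it points into the text instead of wrapping it."""
    start: int   # inclusive character offset into the primary text
    end: int     # exclusive character offset into the primary text
    layer: str   # hypothetical layer name, e.g. "pos" or "syntax"
    value: str   # the label assigned on that layer


# The primary text is stored exactly once and is never modified.
PRIMARY_TEXT = "The cat sleeps"

# Independent layers refer to the same text; spans on different layers may overlap.
ANNOTATIONS = [
    Span(0, 3, "pos", "DET"),
    Span(4, 7, "pos", "NOUN"),
    Span(8, 14, "pos", "VERB"),
    Span(0, 7, "syntax", "NP"),
    Span(0, 7, "infstruct", "topic"),
]


def covered_text(span, text=PRIMARY_TEXT):
    """Resolve a span back to the stretch of primary text it annotates."""
    return text[span.start:span.end]


def query(layer, value, annotations=ANNOTATIONS):
    """Return the text of every span carrying the given label on the given layer."""
    return [covered_text(s) for s in annotations
            if s.layer == layer and s.value == value]


def layers_over(start, end, annotations=ANNOTATIONS):
    """Return (layer, value) pairs for every span that fully covers [start, end)."""
    return [(s.layer, s.value) for s in annotations
            if s.start <= start and end <= s.end]


if __name__ == "__main__":
    print(query("syntax", "NP"))
    print(layers_over(4, 7))
```

Because every layer refers to the shared text rather than embedding it, a new layer (for example prosody) can be added without touching existing ones, and a single lookup can combine information from several layers at once.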

For more detailed information, see the ANNIS website.