Annobase is a tool, implemented in Java, that provides a light-weight but comprehensive representation for text annotation. Its main goal is to offer a human-readable representation that is easy to parse and extend, together with a predefined set of linguistically motivated annotations and operations on them. The following example shows how to use Annobase.

String text = "This is a sentence. This is another sentence";
AnnotationBase annBase = new AnnotationBase(text);
// Run your own sentence segmenter and tokenizer, then store the resulting
// sentences and tokens in the annotation base instance.
List<Sentence> sents = annBase.getSentences();
Sentence firstSent = sents.get(0);
Token firstToken = firstSent.getFirstToken();
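
The comment in the example above leaves sentence segmentation and tokenization to the caller. As a rough illustration of what such a step might produce, the sketch below computes sentence and token spans as character offsets over the same text. It does not use Annobase's API: the StandoffSketch class, its Span type, and both naive splitting methods are hypothetical names introduced only for this illustration. How the resulting spans would be stored in an AnnotationBase instance is not shown, because that part of the API is not described here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names belong to Annobase.
public class StandoffSketch {

    /** A half-open character span [begin, end) over the original text. */
    static final class Span {
        final int begin;
        final int end;

        Span(int begin, int end) {
            this.begin = begin;
            this.end = end;
        }

        /** Returns the substring of the text that this span covers. */
        String coveredText(String text) {
            return text.substring(begin, end);
        }
    }

    /** Naive segmenter: a sentence ends at '.', '!' or '?', or at the end of the text. */
    static List<Span> segmentSentences(String text) {
        List<Span> sentences = new ArrayList<>();
        int begin = -1; // start of the current sentence, or -1 if none is open
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (begin < 0 && !Character.isWhitespace(c)) {
                begin = i;
            }
            if (begin >= 0 && (c == '.' || c == '!' || c == '?')) {
                sentences.add(new Span(begin, i + 1));
                begin = -1;
            }
        }
        if (begin >= 0) {
            sentences.add(new Span(begin, text.length()));
        }
        return sentences;
    }

    /** Naive tokenizer: whitespace-delimited tokens inside one sentence span. */
    static List<Span> tokenize(String text, Span sentence) {
        List<Span> tokens = new ArrayList<>();
        int begin = -1; // start of the current token, or -1 if none is open
        for (int i = sentence.begin; i < sentence.end; i++) {
            if (Character.isWhitespace(text.charAt(i))) {
                if (begin >= 0) {
                    tokens.add(new Span(begin, i));
                    begin = -1;
                }
            } else if (begin < 0) {
                begin = i;
            }
        }
        if (begin >= 0) {
            tokens.add(new Span(begin, sentence.end));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "This is a sentence. This is another sentence";
        List<Span> sentences = segmentSentences(text);
        Span firstSentence = sentences.get(0);
        Span firstToken = tokenize(text, firstSentence).get(0);
        System.out.println(sentences.size());                // 2
        System.out.println(firstSentence.coveredText(text)); // This is a sentence.
        System.out.println(firstToken.coveredText(text));    // This
    }
}
```

Keeping annotations as offsets into the unmodified text, rather than as copied strings, is one common way to keep a representation small and easy to serialize. Note that this whitespace tokenizer leaves punctuation attached to the preceding word, so a real tokenizer would normally replace it.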


You need the following software to run Annobase.


The latest version is 1.0.1. Annobase is available for download and is licensed under the GNU General Public License (version 2 or later).



If you are using Annobase, please cite it as follows.

Jun Araki. 2015. Annobase: A Light-weight Representation for Text Annotation.

Here is the corresponding BibTeX entry for the citation.

  author       = {Jun Araki},
  title        = {{Annobase}: {A} Light-weight Representation for Text Annotation},
  howpublished = {\url{}},
  year         = {2015},

Change log

Annobase 1.0.1 (2015-05-13):

Annobase 1.0.0 (2015-05-11):