

Structuring the unstructured



Annot8 is about processing Items. An Item is a holder for Content.

An item could be anything, but typically it is something of significance:

Through processing, items can be broken down into sub-items:

The original item can then be discarded, or it can continue to be processed.
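
The relationship between an item and its sub-items can be sketched as a small tree. This is a minimal conceptual model in Python, not the Annot8 API: the `Item` class, its `create_child` method, and the e-mail example are all hypothetical names chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Item:
    """Hypothetical stand-in for an item: a named holder that may have a parent."""

    name: str
    parent: Optional[Item] = None
    children: list[Item] = field(default_factory=list)

    def create_child(self, name: str) -> Item:
        """Break part of this item out into a sub-item that remembers its parent."""
        child = Item(name=name, parent=self)
        self.children.append(child)
        return child


# Assumed example: an e-mail item is broken down into one sub-item per attachment.
email = Item("email")
attachments = [email.create_child(n) for n in ("report.pdf", "photo.jpg")]
```

In this sketch the parent stays reachable from each sub-item, so a later step could either drop the original or keep processing it alongside its children.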


An Item can have multiple pieces of content.

Content is a representation of data in the item.

If we have an item which represents a file, it could have:

Through processing, content can be added to or deleted from an item.

Suppose the file is written in French. A processor might produce a translated version:

We can set and use the properties of content to determine which content is which.
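
As a rough illustration of the translation case, the sketch below models an item holding several pieces of content and uses properties to tell them apart. It is a conceptual Python model rather than the Annot8 API; the `Content` and `Item` classes, the `language` property key, and the sample text are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Content:
    """Hypothetical content: one representation of the item's data, plus properties."""

    name: str
    data: Any
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Item:
    """Hypothetical item holding any number of content objects, keyed by name."""

    contents: dict[str, Content] = field(default_factory=dict)

    def add_content(self, content: Content) -> None:
        self.contents[content.name] = content

    def remove_content(self, name: str) -> None:
        self.contents.pop(name, None)

    def find_by_property(self, key: str, value: Any) -> list[Content]:
        """Return every content whose properties contain key == value."""
        return [c for c in self.contents.values() if c.properties.get(key) == value]


item = Item()
item.add_content(Content("original", "Bonjour le monde", {"language": "fr"}))

# A "processor" (here just a lookup table, purely for illustration) adds a translation.
translations = {"Bonjour le monde": "Hello world"}
source = item.contents["original"]
item.add_content(Content("translated", translations[source.data], {"language": "en"}))

# Properties let later steps pick out the content they care about.
english = item.find_by_property("language", "en")
```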


We want to identify aspects of content which are of interest.

Aspects of interest could be anything:

We create annotations on content to identify these aspects. Annotations live in the AnnotationStore on content.
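
To make this concrete, here is a minimal sketch of content carrying its own store of annotations. The class and method names are illustrative and are not taken from the Annot8 API; reducing an annotation to a type label and a note is an assumed simplification.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation: marks an aspect of interest with a type label."""

    annotation_type: str
    note: str


@dataclass
class AnnotationStore:
    """Hypothetical store that belongs to one piece of content."""

    _annotations: list[Annotation] = field(default_factory=list)

    def create(self, annotation_type: str, note: str) -> Annotation:
        annotation = Annotation(annotation_type, note)
        self._annotations.append(annotation)
        return annotation

    def get_by_type(self, annotation_type: str) -> list[Annotation]:
        return [a for a in self._annotations if a.annotation_type == annotation_type]


@dataclass
class Content:
    """Hypothetical content that owns its annotation store."""

    data: str
    annotations: AnnotationStore = field(default_factory=AnnotationStore)


caption = Content("Bob and Jane lived in London")
caption.annotations.create("person", "Bob")
caption.annotations.create("person", "Jane")
caption.annotations.create("location", "London")

people = caption.annotations.get_by_type("person")
```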


Annotations have bounds. The bounds of an annotation define which part of the content it applies to.

Examples of bounds are:
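
As a hypothetical illustration (not a list of the bounds types Annot8 itself provides), two kinds of bounds might be a character span for text content and a rectangle for image content. All class names and fields below are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanBounds:
    """Assumed bounds for text: a half-open character range [begin, end)."""

    begin: int
    end: int

    def covered_text(self, text: str) -> str:
        return text[self.begin:self.end]


@dataclass(frozen=True)
class RectangleBounds:
    """Assumed bounds for images: a pixel rectangle given by two corners."""

    left: int
    top: int
    right: int
    bottom: int

    def area(self) -> int:
        return (self.right - self.left) * (self.bottom - self.top)


text = "Bob and Jane lived in London"
london = SpanBounds(text.index("London"), len(text))
face = RectangleBounds(left=40, top=10, right=90, bottom=70)
```

The point of the sketch is only that the shape of the bounds depends on the kind of content: offsets make sense for text, coordinates for an image.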


Often we have an association between Annotations:

We can store this information in Groups.

Within a group, each annotation has a role. For example, from the "lives in" relation example above:

You can have many annotations with the same role (for example, "Bob and Jane lived in London"), or have roles which are missing (for example, "Bob and Jane lived in London in 2000").

Groups can link annotations across different content within an item. We could have a group which captures that an annotated face in the image relates to the name in the caption.

Groups live in the GroupStore on Item.
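
The sketch below models a group as a mapping from role names to annotations, so that one role can hold several annotations and other roles can simply be absent. It is a conceptual model with hypothetical names (`Group`, `add`, the role labels, and the group types `livesIn` and `sameAs`), not the Annot8 API.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation, reduced to the content it sits on and a label."""

    content_name: str
    label: str


@dataclass
class Group:
    """Hypothetical group: a typed association of annotations, each with a role."""

    group_type: str
    members: dict[str, list[Annotation]] = field(default_factory=dict)

    def add(self, role: str, annotation: Annotation) -> None:
        self.members.setdefault(role, []).append(annotation)

    def roles(self) -> set[str]:
        return set(self.members)


bob = Annotation("caption", "Bob")
jane = Annotation("caption", "Jane")
london = Annotation("caption", "London")

# "Bob and Jane lived in London": two annotations share the "person" role,
# and no "date" role is present.
lived_in = Group("livesIn")
lived_in.add("person", bob)
lived_in.add("person", jane)
lived_in.add("location", london)

# A group can also span content: a face in the image and a name in the caption.
face = Annotation("image", "face region")
same_person = Group("sameAs")
same_person.add("face", face)
same_person.add("name", bob)
```

Keeping groups at the item level, rather than on a single piece of content, is what allows `same_person` here to refer to annotations on both the image and the caption.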

Common elements: types, properties

Annotations and groups have types. These allow us to distinguish what the annotation or group is.

Annotations, groups, and other elements have properties. These are a collection of additional data relating to the element. For example, if we annotate an area of an image because it contains a particular object, we can use the properties to store information about that object.
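
As a small sketch of how types and properties might work together, the example below filters annotations by type and then reads details from their properties. The names used (`Annotation`, the `object` and `face` types, and the `label` and `confidence` keys) are illustrative assumptions, not Annot8 identifiers.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Annotation:
    """Hypothetical annotation with a type and a free-form property map."""

    annotation_type: str
    properties: dict[str, Any] = field(default_factory=dict)


annotations = [
    Annotation("object", {"label": "car", "confidence": 0.92}),
    Annotation("object", {"label": "tree", "confidence": 0.58}),
    Annotation("face"),
]

# The type distinguishes what each annotation is...
objects = [a for a in annotations if a.annotation_type == "object"]

# ...while properties carry extra data about the annotated element.
confident_labels = [
    a.properties["label"] for a in objects if a.properties.get("confidence", 0.0) >= 0.9
]
```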