# Benchmark & Consensus Export

## Overview

The Benchmark & Consensus Export plugin measures the agreement between multiple annotators and generates a comprehensive CSV report summarizing the results.

It analyzes annotations across annotators, compares them using selected metrics, and produces CSV reports to support quality assessment, benchmarking, and adjudication workflows.

<table data-card-size="large" data-view="cards"><thead><tr><th align="center"></th><th data-hidden data-card-cover data-type="image">Cover image</th></tr></thead><tbody><tr><td align="center">Benchmark &#x26; Consensus Export</td><td data-object-fit="contain"><a href="https://angohub-docs-assets.s3.eu-central-1.amazonaws.com/plugin-icons-banners/consensus-calculation.png">https://angohub-docs-assets.s3.eu-central-1.amazonaws.com/plugin-icons-banners/consensus-calculation.png</a></td></tr></tbody></table>

## Plugin Functionality

The Benchmark & Consensus Export plugin analyzes annotations produced by multiple annotators and generates structured CSV outputs for quality evaluation and benchmarking. Its behavior is configured through calculation modes and text-comparison metrics.

* Processes tasks in one of two calculation modes. In consensus mode, annotations from multiple annotators are compared to identify agreement and derive a consensus result; in benchmark mode, annotator outputs are evaluated against ground-truth annotations to measure performance.
* Compares text-based annotations using a configurable evaluation metric. Supported metrics are exact match, case-insensitive match, and BLEU score.
* Allows specific annotation classes (schemas) to be excluded from consensus and benchmark calculations. Any schema IDs listed in ignored\_schema\_ids are skipped during metric computation and report generation.
* Generates detailed CSV reports summarizing annotator agreement, consensus results, and benchmark metrics.
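To make the consensus idea concrete, agreement on a single classification can be pictured as a majority vote among annotators. The sketch below is illustrative only and does not reflect the plugin's internal implementation:

```python
from collections import Counter

def majority_consensus(answers):
    """Return the most common answer and the share of annotators who gave it.

    Illustrative sketch only -- the plugin's real consensus logic
    is not documented here.
    """
    counts = Counter(answers)
    value, votes = counts.most_common(1)[0]
    return value, votes / len(answers)
```

For instance, three annotators answering `"cat"`, `"cat"`, `"dog"` would yield the consensus value `"cat"` with an agreement score of 2/3.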

### Supported Data Types

* Compatible with all data types available in AngoHub.

### Supported Annotation Tools <a href="#supported-annotation-tools" id="supported-annotation-tools"></a>

* Classifications
  * Radio
  * Checkbox
  * Single-Dropdown
  * Multi-Dropdown
  * Tree Dropdown
  * Text
  * Nested Classifications
  * Multiple Classifications
* Tools
  * Bounding-Box
  * Rotated Bounding-Box
  * Polygon
  * Polyline
  * Segmentation
  * Entity
  * Point
  * Brush
  * Voxel Brush

{% hint style="danger" %}
The following tools are not supported by the plugin:

* PDF, Message, Angle
* Nested classifications within tools
* Relations
{% endhint %}

## Plugin Configuration

From the **Stage Filter** field, select the stages whose tasks you'd like to export. Similarly, from the **Batch Filter** field, you may select one or more batches whose tasks will be exported. By default, tasks from all batches are exported.

If you wish to receive an email when the export is complete, toggle **Send Email** on.

You can adjust a number of export settings from the **Config JSON** field. Each option is detailed below:

```json
{
  "mode": "consensus",
  "text_metric": "exact_match",
  "ignored_schema_ids": [],
  "include_key_frames_only": false,
  "logging_frequency": 0
}
```

* **"mode":** Determines what types of tasks the plugin will process.
  * Options:
    * <kbd>"consensus"</kbd>
    * <kbd>"benchmark"</kbd>

* **"text\_metric":** Defines the metric used to compare text-based annotations.
  * Options:
    * <kbd>"exact\_match"</kbd>
    * <kbd>"case\_insensitive\_match"</kbd>
    * <kbd>"bleu\_score"</kbd>
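To illustrate how these metrics differ, the sketch below scores a candidate string against a reference. It is a simplified approximation, not the plugin's actual implementation; in particular, real BLEU also uses higher-order n-grams and a brevity penalty:

```python
from collections import Counter

def exact_match(candidate: str, reference: str) -> float:
    # 1.0 only if the strings are byte-for-byte identical.
    return 1.0 if candidate == reference else 0.0

def case_insensitive_match(candidate: str, reference: str) -> float:
    # Same as exact match, but ignoring letter case.
    return 1.0 if candidate.casefold() == reference.casefold() else 0.0

def unigram_bleu(candidate: str, reference: str) -> float:
    # Clipped unigram precision -- a simplified stand-in for full BLEU.
    cand_tokens = candidate.split()
    if not cand_tokens:
        return 0.0
    ref_counts = Counter(reference.split())
    cand_counts = Counter(cand_tokens)
    matched = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    return matched / len(cand_tokens)
```

For example, `"The cat"` vs. `"the cat"` scores 0.0 under exact match but 1.0 under case-insensitive match, while the unigram-BLEU sketch gives partial credit for partially overlapping texts.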

* **"ignored\_schema\_ids":** Specifies which classes should be excluded from the consensus and benchmark calculations.
  * Example:
    * <kbd>"ignored\_schema\_ids": \["12345", "12346"]</kbd>

* **"include\_key\_frames\_only":** Specifies whether only key frames should be included in the metrics calculation (video assets only).
  * Example:
    * <kbd>"include\_key\_frames\_only": true</kbd>
    * <kbd>"include\_key\_frames\_only": false</kbd>

* **"logging\_frequency":** Defines how frequently progress logs are displayed; setting this value to 0 disables logging entirely, while any positive integer enables logging at the specified interval. For more information on how to view plugin logs, [see here](https://docs.imerit.net/plugins/introduction-to-plugins/monitoring-plugin-progress).
  * Example:
    * <kbd>"logging\_frequency": 0</kbd>
    * <kbd>"logging\_frequency": 100</kbd>
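Putting the options together, a benchmark-mode export that compares text case-insensitively, excludes two schemas, and logs progress every 50 tasks could be configured as follows. The schema IDs are placeholders; substitute the IDs from your own project ontology:

```json
{
  "mode": "benchmark",
  "text_metric": "case_insensitive_match",
  "ignored_schema_ids": ["12345", "12346"],
  "include_key_frames_only": false,
  "logging_frequency": 50
}
```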
