Benchmark & Consensus Export
Overview
The Benchmark & Consensus Export plugin is an export plugin that measures agreement between multiple annotators and generates comprehensive CSV reports summarizing the results.
It analyzes annotations across annotators, compares them using the selected metrics, and produces reports that support quality assessment, benchmarking, and adjudication workflows.
Plugin Functionality
The Benchmark & Consensus Export plugin analyzes annotations produced by multiple annotators and generates structured CSV outputs for quality evaluation and benchmarking. The plugin is configurable through its calculation modes and text comparison metrics.
The plugin processes tasks for consensus generation or benchmark evaluation. In consensus mode, annotations from multiple annotators are compared to identify agreement and derive a consensus result. In benchmark mode, annotator outputs are evaluated against ground truth annotations to measure performance.
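For example, a minimal Config JSON selecting the processing mode could look like the following sketch (any option not set falls back to its default):

{
  "mode": "consensus"
}

Setting "mode" to "benchmark" instead evaluates annotator outputs against ground truth annotations.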
Compares text-based annotations using a configurable evaluation metric. Supported metrics include exact match, case-insensitive match, and BLEU score. For instance, under case-insensitive matching, the annotations "Cat" and "cat" count as agreeing.
Allows specific annotation classes (schemas) to be excluded from consensus and benchmark calculations. Any schema IDs listed in ignored_schema_ids are ignored during metric computation and report generation.
Generates detailed CSV reports summarizing annotator agreement, consensus results, and benchmark metrics.
Supported Data Types
Compatible with all data types available in AngoHub.
Supported Annotation Tools
Classifications
Radio
Checkbox
Single-Dropdown
Multi-Dropdown
Tree Dropdown
Text
Nested Classifications
Multiple Classifications
Tools
Bounding-Box
Rotated Bounding-Box
Polygon
Polyline
Segmentation
Entity
Point
Brush
Voxel Brush
The following tools and features are not supported by the plugin:
PDF, Message, Angle
Nested classifications within tools
Relations
Plugin Configuration
From the Stage Filter field, select the stages containing the tasks you would like to export. Similarly, from the Batch Filter field, you may select one or more batches whose tasks will be exported. By default, all tasks are exported from all batches.
If you wish to receive an email when the export is complete, toggle Send Email on.
You can adjust a number of export settings from the Config JSON field. Each option is detailed below, and a complete example configuration follows the list:
"mode": Determines what types of tasks the plugin will process.
Options:
"consensus"
"benchmark"
"text_metric": Defines the metric used to compare text-based annotations.
Options:
"exact_match"
"case_insensitive_match"
"bleu_score"
"ignored_schema_ids": Specifies which classes should be excluded from the consensus calculation.
Example:
"ignored_schema_ids": ["12345", "12346"]
"include_key_frames_only": Specifies whether only key frames should be included in the metrics calculation. (For video assets only)
Example:
"include_key_frames_only": true
"include_key_frames_only": false
"logging_frequency": Defines how frequently progress logs are displayed; setting this value to 0 disables logging entirely, while any positive integer enables logging at the specified interval. For more information on how to view plugin logs, see here.
Example:
"logging_frequency": 0
"logging_frequency": 100