You can’t improve something if you don’t measure it. Developing an AI model is no exception – without benchmarks, you’re shooting in the dark. It's critical to choose the right metrics that will provide clear indicators of progress and areas for enhancement.
We’ve introduced new reporting features that will help you understand your datasets and your annotators better.
Our new reporting capabilities offer a range of benefits to streamline and enhance your data annotation and model training processes. You can access metrics regarding the quality of your training data, download CSV reports, and generate custom reports via the API.
Key benefits of data annotation reports:
In V7, you can track metrics and KPIs related to your annotation projects using several different methods. Each offers a unique set of information. Some methods provide insights at the dataset item level, while others allow you to collect metrics based on individual annotators or time frames.
You can find metrics related to your project in:
As you can see, there are multiple places where you can find the appropriate data, depending on your needs. If you require information about the completion status and the class population, a quick glance at the quality tab may suffice. Conversely, if you're interested in the individual performance of your annotators, a custom monthly report may offer a deeper understanding of the time your workforce spends on completing specific tasks.
While some reports include visualizations and charts within V7, using the API to generate custom reports can unlock additional insights and assist in building your own KPI dashboards. This enables you to fetch detailed information with varying levels of granularity and within specific time frames.
Now, let's explore how to collect information with the updated API features for reporting.
The steps are as follows:
Let’s discuss each of them in detail.
Before we start, generate a new API key and save it. We’ll need it for authorisation.
You also need the slugified name of your team. To find it, open the Settings tab of any dataset: dataset paths follow the “/slugified name/dataset name” structure.
Query parameters include:
3. Date and granularity
4. Grouping by:
Let’s assume that we want to retrieve the total number of annotations per user per month, along with the average annotation time.
Our request looks like this:
All of the information is included in URL parameters. The complete link in this case is:
The parameters after the question mark define the query. For example, metrics=time_per_item specifies which metric we want to fetch.
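To make the role of the query string concrete, here is a minimal Python sketch of how such a link could be assembled. Only metrics=time_per_item comes from the example above. The base URL is a placeholder, and the granularity and group_by names and values are assumptions made for illustration, so check the API reference for the real endpoint and parameter names.

```python
from urllib.parse import urlencode

# Placeholder endpoint, NOT the real V7 URL. Replace it with the one
# from the API reference, including your team's slugified name.
BASE_URL = "https://example.com/api/teams/my-team/reports"


def build_report_url(base_url, params):
    """Append URL-encoded query parameters to base_url."""
    return f"{base_url}?{urlencode(params, doseq=True)}"


params = {
    "metrics": "time_per_item",  # taken from the example above
    "granularity": "month",      # assumed parameter name and value
    "group_by": "user",          # assumed parameter name and value
}

url = build_report_url(BASE_URL, params)
print(url)
```

Keeping the parameters in a dictionary makes it easy to change the time frame or grouping without editing the link by hand.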
Here is the response:
Note that the results include actor_id instead of the names of specific team members. This approach is used to protect the identity of annotators. However, if you wish to obtain information about unique users' IDs, you can use a different API request as outlined here. Once you determine who is who, you can replace IDs with real names during the response parsing in the next step.
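The ID-to-name replacement can be sketched in a few lines of Python. The row shape, the sample IDs, and the names below are invented for illustration. Only the actor_id and time_per_item field names appear in this article, and the real response may be structured differently.

```python
def attach_names(rows, names_by_id, fallback="Unknown annotator"):
    """Return copies of rows with a 'name' field looked up by actor_id."""
    return [
        {**row, "name": names_by_id.get(row["actor_id"], fallback)}
        for row in rows
    ]


# Illustrative data only; these are not real API results.
rows = [
    {"actor_id": 101, "time_per_item": 42.5},
    {"actor_id": 102, "time_per_item": 37.0},
    {"actor_id": 999, "time_per_item": 12.0},
]

# Mapping built once from the separate user lookup request.
names_by_id = {101: "Alice", 102: "Bob"}

named_rows = attach_names(rows, names_by_id)
for row in named_rows:
    print(row["name"], row["time_per_item"])
```

Using a fallback label instead of raising an error keeps the dashboard working when a new annotator joins before the mapping is refreshed.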
As you can see, extracting specific metrics is straightforward, and you can generate a ready-made request snippet without writing any code yourself. However, creating an interactive dashboard may require some additional effort.
Create a new Google Sheets file and go to Extensions > Apps Script to add code. Keep in mind that the request generated in step one is just a fragment of the full script. Different implementations may require tweaking certain parts and configuring additional functions. However, you can try to modify the logic and generate the correct snippet using ChatGPT.
We can set up an automation that triggers the function on a time-based schedule. For example, the data can be fetched every few hours and either updated in existing cells or saved in a new row of our spreadsheet. If we don’t want to overpopulate the sheet, we can also use additional parsing logic and convert specific values into more general names.
Once you are done, check the execution logs and the document itself to see if the fetching script is triggered correctly. Note that in the example provided, values are updated hourly, yet the time frame and granularity are set for the full month, as the focus is on November. This way we can avoid items being counted twice.
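The idea of updating existing values rather than appending them can be illustrated with a small sketch. It is written in Python for readability, so it would need porting to Apps Script for use in the sheet. The sheet is modelled as a list of dictionaries, and the actor_id, month, and total_annotations field names are assumptions made for this example.

```python
def upsert_rows(sheet, fetched, key_fields=("actor_id", "month")):
    """Overwrite rows that share a key with a fetched row; append the rest."""
    index = {tuple(row[f] for f in key_fields): i for i, row in enumerate(sheet)}
    for row in fetched:
        key = tuple(row[f] for f in key_fields)
        if key in index:
            sheet[index[key]] = dict(row)   # same user and month: replace
        else:
            index[key] = len(sheet)
            sheet.append(dict(row))         # first time we see this key
    return sheet


sheet = []

# First hourly run (illustrative numbers, not real results).
upsert_rows(sheet, [
    {"actor_id": 101, "month": "2023-11", "total_annotations": 120},
    {"actor_id": 102, "month": "2023-11", "total_annotations": 80},
])

# A later run returns larger month-to-date totals for the same keys.
upsert_rows(sheet, [
    {"actor_id": 101, "month": "2023-11", "total_annotations": 150},
    {"actor_id": 102, "month": "2023-11", "total_annotations": 95},
])

print(len(sheet))
```

Because each run returns the full monthly total, replacing the row keeps a single up-to-date figure per user, whereas appending would leave several partial totals that a chart might sum together.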
Once our automatic request mechanism is configured, we can turn the metrics into charts. In this example, we’ll use Looker Studio and connect it with the Google Sheet file. We can add the file as our data source and map specific columns onto specific elements of our dashboard.
Additionally, we can add custom filters, which will allow us to filter the results by additional parameters, such as user or dataset IDs.
Each metric offers a unique perspective on different aspects of your project, from the speed and accuracy of annotations to the overall flow of the annotation process. Understanding how to measure and interpret these metrics can greatly enhance your ability to manage your training data annotation project successfully.
Here are some essential parameters and dashboard ideas that can help you:
Use a combination of bar charts to represent annotation and review times. This dual approach provides insights into not only the speed of annotation but also the time taken for quality checks and reviews. Tracking these metrics helps in balancing the trade-off between speed and accuracy. This translates into high-quality data annotation without compromising on efficiency.
Create a scatter plot to correlate the total number of annotations with the time taken per item. This visualization offers a clear picture of annotators' productivity and the complexity of tasks. It's particularly useful for identifying whether certain types of data or specific annotators require more time, indicating a need for additional training or resources.
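If you want a single number to accompany the scatter plot, one option is the Pearson correlation coefficient between the two columns. The sketch below is a plain Python implementation with made-up sample values. It is one possible way to quantify the relationship, not something the reporting API provides.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (spread_x * spread_y)


# Illustrative values only: one point per annotator.
total_annotations = [120, 340, 90, 410, 260]
seconds_per_item = [55.0, 31.0, 62.0, 28.0, 40.0]

r = pearson(total_annotations, seconds_per_item)
print(round(r, 2))
```

A value close to -1 would suggest that annotators who complete more annotations also spend less time per item, while a value near 0 would suggest that volume and speed are unrelated in your data.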
Use stage transition data to build a funnel visualization. This chart can depict how data moves through different stages of the annotation process (e.g., initial annotation, review, final approval). Tracking stage transitions helps in pinpointing stages where delays or quality issues may occur, allowing for timely intervention and process optimization.
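The numbers behind such a funnel can be prepared with a short helper. This Python sketch assumes you already have a count of items that reached each stage. The stage names follow the example above, and the counts are invented for illustration.

```python
def funnel(stage_counts):
    """Return (stage, count, share_of_previous_stage) for ordered stage counts."""
    result = []
    previous = None
    for stage, count in stage_counts:
        # The first stage has no predecessor; also avoid dividing by zero.
        share = None if previous in (None, 0) else count / previous
        result.append((stage, count, share))
        previous = count
    return result


# Illustrative counts only.
stages = [
    ("initial annotation", 200),
    ("review", 150),
    ("final approval", 120),
]

for stage, count, share in funnel(stages):
    label = "n/a" if share is None else f"{share:.0%}"
    print(f"{stage}: {count} items ({label} of previous stage)")
```

A sharp drop in the share between two stages points to the step where items get stuck or rejected most often.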