Improve Training Data Quality With the Logic Stage

Get an in-depth understanding of Logic Stages. Learn how to use the Logic Stage module to identify undesirable data points and accelerate the development of your AI models.
April 3, 2023
5 min read

What Is a Logic Stage and How to Use It?

In most cases, the complexity of the annotation process grows with the volume of your training data. To address this challenge, we created Logic Stages that will help streamline and automate your annotation workflows.

Let's consider a situation where a suspected soft-tissue mass is noticed on an ultrasound scan image. The file needs to be routed to a specialist for an in-depth review. Instead of assigning tasks manually, we can automate the process.

In this scenario, we can add a logic stage that checks if the image contains any annotations of the Abnormality class or relevant tags.

Depending on the annotations it contains, the image is then routed automatically to either General Review or Sonographer Review.

Logic stages allow you to create additional rules for your data annotation flows. You can use them to set up your own custom if/else conditions.

These are very handy for managing your data workflows or filtering out unwanted data.

The benefits of implementing logic stages in your ML pipeline include:

  • Higher degree of workflow customization
  • Increased annotation efficiency
  • Improved accuracy and QA
  • Cleaner datasets (outlier detection)

Here is another example of a Logic Stage in action.

In the preceding annotation stage, a labeler can tag an image with one of three custom tags. Afterwards, they can send the annotated image to the logic stage. Based on the tags, the image will be routed to different steps.

If an image has a Class A tag, it moves immediately to Complete status. A Class B tag sends it to Review, and a Class C tag discards the image and moves it to the archive. If no class was assigned, we can set up a webhook-based notification to inform us about the missing tag.
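The routing described above can be sketched in plain Python. This is only an illustration of the rule set, not V7's API: the function name `route_by_tag`, the stage name "Webhook", and the payload fields are assumptions made for the example.

```python
import json

# Hypothetical mapping from custom tag to the next workflow stage.
TAG_ROUTES = {
    "Class A": "Complete",
    "Class B": "Review",
    "Class C": "Archive",
}


def route_by_tag(image_name, tags):
    """Return (next_stage, webhook_payload).

    webhook_payload is None unless no known tag was found, in which case
    it is a JSON string that a notification webhook could receive.
    """
    for tag, stage in TAG_ROUTES.items():
        if tag in tags:
            return stage, None
    # Fallback branch: none of the three tags is present.
    payload = json.dumps({"image": image_name, "issue": "missing class tag"})
    return "Webhook", payload


stage, payload = route_by_tag("scan_01.png", ["Class B"])
print(stage)  # Review
```

Keeping the fallback as an explicit last branch means an untagged image is reported rather than silently stuck in the workflow.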

The logic rules are always based on the presence of a specific class.

This means that once your data reaches the logic stage, the system will check if the image or video contains the given class. It makes the feature a great tool for verifying if specific annotations were completed.

To illustrate how multiple Logic Stages can be used to verify specific annotation tasks, let's consider an example. Suppose we have a dataset of images that show tennis players holding tennis rackets. In this scenario, we can assign the task of creating polygon annotations of the players to Jack and the task of labeling the rackets to John.

We want to ensure that all annotations are completed correctly. To accomplish that, we can use two logic stages as the final part of our V7 workflow. The first stage checks whether the player annotations are present, and the second stage checks whether the racket labels are present.

If any Tennis Player or Tennis Racket annotations are missing, the image is sent back for annotation and routed to the appropriate person (Jack or John). This process ensures that all images are fully annotated before being used for model training.
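As a rough sketch, the two chained checks behave like the functions below. The stage names and function names are hypothetical and only model the decision logic; they are not calls into V7.

```python
def player_check(classes):
    """First logic stage: is there at least one Tennis Player annotation?"""
    if "Tennis Player" in classes:
        return "Racket check"
    return "Annotate players (Jack)"


def racket_check(classes):
    """Second logic stage: is there at least one Tennis Racket annotation?"""
    if "Tennis Racket" in classes:
        return "Complete"
    return "Annotate rackets (John)"


def final_checks(classes):
    """Run both logic stages in order and return where the image ends up."""
    next_stage = player_check(classes)
    if next_stage != "Racket check":
        return next_stage  # sent back to Jack before rackets are checked
    return racket_check(classes)


print(final_checks({"Tennis Player"}))  # Annotate rackets (John)
```

Note that an image missing both classes goes back to Jack first; the racket check only runs once the player annotation exists.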

You can define your own set of tags and use them to send your data through independent routes or even whole parallel flows.

Note that a logic stage passes the image to the next step as soon as the first condition is met. Keep this in mind while designing your workflows: a list of multiple conditions is in fact an if/else if/[…]/else chain.

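For example, suppose the first rule checks for a “tape measure” class and a later rule checks for “wrench”. Even if there are “wrench” objects in the picture, an image that also contains a tape measure follows the tape measure route, because that is the first condition that is true. The highest rule in the list wins, so the remaining conditions are ignored.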
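The first-match behavior can be illustrated with a small Python sketch. The rule list, route names, and the `first_match` helper are made up for this example and are not part of V7.

```python
def first_match(rules, classes, default="Fallback"):
    """Return the route of the first rule whose class is present.

    rules is an ordered list of (required_class, route) pairs, evaluated
    top to bottom like an if / else if / else chain.
    """
    for required_class, route in rules:
        if required_class in classes:
            return route
    return default


rules = [
    ("Tape measure", "Tape measure route"),
    ("Wrench", "Wrench route"),
]
image_classes = {"Wrench", "Tape measure"}

print(first_match(rules, image_classes))                  # Tape measure route
print(first_match(list(reversed(rules)), image_classes))  # Wrench route
```

The same image takes a different route once the rules are reordered, which is why the order of conditions deserves as much attention as the conditions themselves.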

👉 It is good practice to share information about your flow logic with all annotation team members to ensure consistency and avoid duplicated effort. Use descriptive labels, clear annotation instructions, and comments to make it easier to understand how each Logic Stage works.

Logic Stages are not only useful for managing annotation workflows but can also be used to identify outliers in your data.

The most common use case is to have annotators apply custom tags like Blurry or Poor quality, so that a logic stage can automatically discard the tagged images.

It is also possible to interconnect AI models and logic stages to find data points that deviate significantly from the rest of the dataset. For example, instead of tagging your images manually, you can train an auxiliary model that detects blurry images.

Once your model identifies an image as an outlier, you can use a logic stage to automatically move it to the archive. This prevents the outlier from being used in your model training, and the image can be analyzed further to understand why it was flagged.

You can also connect public models available in V7 to logic stages to filter your datasets.

Suppose we have a database consisting of thousands of frames from video footage. Our task is to prepare detailed keypoint skeleton annotations of people appearing in the video but the majority of our footage has no people in it at all.

We can easily set up a workflow using a Model stage and a Logic stage that identifies the frames with people and passes only those frames on for annotation.

This workflow runs a COCO model on the images before sending them to human annotators. The model recognizes many common objects and can determine whether there is a person in the frame. Pre-qualifying your data in this way before manual annotation starts can save significant time and resources.
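The Model stage plus Logic stage combination can be approximated in a few lines of Python. In this sketch, `detect_classes` is a stand-in for whatever model produces class predictions, and the frame names and predictions are invented for the example; none of it is V7's API.

```python
def split_frames_by_person(frames, detect_classes):
    """Split frames into (with_people, without_people).

    detect_classes is any callable that takes a frame and returns the set
    of class names predicted for it (the "Model stage"). The membership
    test on "person" plays the role of the "Logic stage".
    """
    with_people, without_people = [], []
    for frame in frames:
        if "person" in detect_classes(frame):
            with_people.append(frame)
        else:
            without_people.append(frame)
    return with_people, without_people


# Invented predictions standing in for real model output.
fake_predictions = {
    "frame_001.jpg": {"person", "bicycle"},
    "frame_002.jpg": {"car"},
    "frame_003.jpg": set(),
}

to_annotate, to_archive = split_frames_by_person(
    list(fake_predictions), fake_predictions.get
)
print(to_annotate)  # ['frame_001.jpg']
```

Only the frames in `to_annotate` would reach the keypoint skeleton annotators; the rest can be archived without human review.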

Limitations and future developments

While Logic Stages offer many benefits, it's important to note that they have some limitations, such as the current inability to filter data based on attributes other than classes.

However, we’re developing new functionality that will allow you to filter data based on other attributes, such as model confidence scores.

As of March 2023, the Logic Stage functionality is hidden behind a feature flag. If you would like to use it, please contact us to enable the feature.

⚠️ Logic stages can send data further down your annotation pipeline or move it back to previous stages. Make sure to avoid creating infinite loops. If an image keeps jumping between stages because it is stuck in a logic loop, go to the dataset and change its status manually.
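One way to reason about such loops before building a workflow is to model each logic stage as a function and follow the automatic routing. The sketch below is a hypothetical design-time check written in plain Python; the stage names and the `find_logic_loop` helper are assumptions, not a V7 feature.

```python
def find_logic_loop(start, logic_stages, classes):
    """Follow automatic routing from start and report a loop, if any.

    logic_stages maps a logic stage name to a function that takes the
    image's classes and returns the next stage name. Any stage name not
    in the mapping is treated as a human or terminal stage, where
    automatic routing stops. Because the classes cannot change between
    logic stages, visiting the same logic stage twice means the image
    would cycle forever.
    """
    path = []
    stage = start
    while stage in logic_stages:
        if stage in path:
            return path[path.index(stage):] + [stage]
        path.append(stage)
        stage = logic_stages[stage](classes)
    return None


stages = {
    "Check A": lambda c: "Check B" if "Dog" in c else "Review",
    "Check B": lambda c: "Check A",  # always routes back: a design mistake
}

print(find_logic_loop("Check A", stages, {"Dog"}))  # ['Check A', 'Check B', 'Check A']
print(find_logic_loop("Check A", stages, set()))    # None
```

A route that passes through a human stage such as annotation or review is not flagged, since a person can change the annotations there and break the cycle.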
