biologit MLM-AI 1.1

Duplicate Detection

Duplicate articles are often found when screening the scientific literature. Identifying them manually is time-consuming, and becomes more burdensome as more data sources are added.

MLM-AI natively performs duplicate detection using different strategies, simplifying screening workflows and removing manual effort.

This page describes how MLM-AI performs duplicate detection, and how users can enable automated screening of duplicates in their results.

Design Principles

The MLM-AI duplicate engine is designed for high precision: to minimize the risk of an erroneous duplicate match, the engine only marks articles as duplicates with high confidence, employing multiple verification approaches.

Because the engine emphasizes safe duplicate detections, certain duplicate articles may not be flagged in all cases, although such instances are expected to be infrequent.

The methods and verification approaches used in MLM-AI are described in the following sections.

How it Works? Duplicate Detection in the MLM-AI Database

MLM-AI identifies duplicate articles according to these criteria:

  • Duplicate articles are only detected for results from the same Monitor.

  • To be marked as a duplicate, the article must:

    • Have the same ID and the same source database, or

    • Have the same Digital Object Identifier (DOI), or

    • Have similar abstract content (with high confidence)

  • Candidate duplicate articles must also pass an article title similarity check
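
The criteria above can be read as a two-step rule: first find a candidate match by ID, DOI, or abstract similarity, then confirm it with a title check. The Python sketch below is illustrative only. The `Article` fields, the `abstract_similar` and `titles_similar` callables, and the reason labels are assumptions made for this example, not MLM-AI's actual data model or API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Article:
    # Hypothetical record; field names are assumptions for this sketch.
    monitor: str
    source: str
    source_id: str
    title: str
    doi: Optional[str] = None
    abstract: Optional[str] = None


def duplicate_reason(
    a: Article,
    b: Article,
    abstract_similar: Callable[[str, str], bool],
    titles_similar: Callable[[str, str], bool],
) -> Optional[str]:
    """Return why `a` and `b` are duplicates, or None if they are not."""
    # Only results from the same Monitor are compared.
    if a.monitor != b.monitor:
        return None

    # Step 1: find a candidate match by ID, DOI, or abstract similarity.
    if a.source == b.source and a.source_id == b.source_id:
        reason = "same-id"
    elif a.doi and b.doi and a.doi == b.doi:
        reason = "same-doi"
    elif a.abstract and b.abstract and abstract_similar(a.abstract, b.abstract):
        reason = "similar-content"
    else:
        return None

    # Step 2: every candidate must also pass the title similarity check.
    return reason if titles_similar(a.title, b.title) else None
```

Passing the two similarity checks in as functions keeps the decision rule separate from how similarity is measured, which the source does not specify in detail.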

Same ID and Source Database

In MLM-AI, each article receives a unique identifier generated at the source database (in PubMed, for example, this is the article PMID). Articles with the same ID from the same source database are marked as duplicates.

This method is straightforward and useful when Reviews return results for overlapping date ranges.
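
As a rough illustration of the overlapping date range case, the sketch below keys each result by source database and ID and reports repeats. The function name, the tuple layout, and the sample IDs are made up for the example, and the title verification step is left out for brevity.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (source database, article ID at that source)


def split_by_source_id(
    seen: Iterable[Key], incoming: Iterable[Key]
) -> Dict[str, List[Key]]:
    """Separate incoming results into new articles and ID-based duplicates."""
    known = set(seen)
    result: Dict[str, List[Key]] = {"new": [], "duplicates": []}
    for key in incoming:
        bucket = "duplicates" if key in known else "new"
        result[bucket].append(key)
        known.add(key)  # a repeat within the same batch also counts
    return result


# Two Reviews whose date ranges overlap (made-up IDs).
review_1 = [("pubmed", "1001"), ("pubmed", "1002")]
review_2 = [("pubmed", "1002"), ("pubmed", "1003"), ("embase", "1002")]
print(split_by_source_id(review_1, review_2))
```

In this example only `("pubmed", "1002")` is reported as a duplicate. `("embase", "1002")` shares the number but comes from a different source database, so it is treated as new.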

Same Digital Object Identifier

When an article DOI is available, MLM-AI presents it in the Details tab, linking the article directly to its authoritative source.

If the source journal has published valid DOI information, MLM-AI can use it to detect duplicates.
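
Comparing DOIs reliably usually means normalizing them first, since the same DOI can be written with a resolver URL prefix or in different letter case. The sketch below shows one way this could be done. The prefix list and the shape check are assumptions for the example, not the validation MLM-AI applies.

```python
import re
from typing import Optional

# Loose shape check for a DOI ("10.<registrant>/<suffix>"); an assumption
# for this sketch, not the validation rule used by MLM-AI.
_DOI_SHAPE = re.compile(r"^10\.\d{4,9}/\S+$")
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: Optional[str]) -> Optional[str]:
    """Return a comparable DOI string, or None when no valid DOI is present."""
    if not raw:
        return None
    doi = raw.strip().lower()  # DOI names are case-insensitive
    for prefix in _PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    return doi if _DOI_SHAPE.match(doi) else None


def same_doi(a: Optional[str], b: Optional[str]) -> bool:
    """True only when both inputs carry a valid DOI and the DOIs are equal."""
    left, right = normalize_doi(a), normalize_doi(b)
    return left is not None and left == right
```

Returning `None` for missing or malformed values means two articles without usable DOI information are never matched on DOI alone.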

Similar Content

Finally, MLM-AI can also detect duplicates using content similarity. This happens in two stages:

  • Select articles with similar content from the MLM-AI database, based on title and abstract

    • Only articles with a valid abstract are eligible for comparison

    • Duplicate abstracts do not need to match exactly, but must be highly similar
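
The candidate selection described above can be approximated with a simple similarity score over normalized abstracts. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in. The source does not document the actual similarity method or threshold, so both the technique and the `ABSTRACT_THRESHOLD` value are assumptions.

```python
import re
from difflib import SequenceMatcher
from typing import Optional

# Assumed cut-off for "highly similar"; the real threshold is not documented.
ABSTRACT_THRESHOLD = 0.9


def _normalize(text: str) -> str:
    """Lowercase the text and drop punctuation and extra whitespace."""
    return " ".join(re.findall(r"\w+", text.lower()))


def abstract_similarity(a: Optional[str], b: Optional[str]) -> Optional[float]:
    """Similarity in [0, 1], or None if either abstract is missing or empty."""
    if not a or not b:
        return None
    left, right = _normalize(a), _normalize(b)
    if not left or not right:
        return None
    # Compare word sequences; autojunk is disabled so long abstracts are
    # scored on all of their words.
    matcher = SequenceMatcher(None, left.split(), right.split(), autojunk=False)
    return matcher.ratio()


def is_content_candidate(a: Optional[str], b: Optional[str]) -> bool:
    """True when both abstracts exist and score at or above the threshold."""
    score = abstract_similarity(a, b)
    return score is not None and score >= ABSTRACT_THRESHOLD
```

Articles without an abstract yield no score at all, which mirrors the rule that only articles with a valid abstract are eligible for comparison.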

Title Match

  • For all types of duplicate, perform a second verification step on the article titles, to prevent detection of false positives
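
A title check of this kind could, for instance, compare the sets of words in the two titles. The sketch below uses Jaccard overlap with an assumed threshold. The source does not state how MLM-AI measures title similarity, so treat the method and the `TITLE_THRESHOLD` value as placeholders.

```python
import re
from typing import Set

TITLE_THRESHOLD = 0.8  # assumed value, not documented by MLM-AI


def title_tokens(title: str) -> Set[str]:
    """Lowercased word set of a title, ignoring punctuation."""
    return set(re.findall(r"\w+", title.lower()))


def titles_similar(a: str, b: str, threshold: float = TITLE_THRESHOLD) -> bool:
    """True when the titles share enough words (Jaccard overlap)."""
    left, right = title_tokens(a), title_tokens(b)
    if not left or not right:
        return False  # an empty title can never confirm a duplicate
    jaccard = len(left & right) / len(left | right)
    return jaccard >= threshold
```

A candidate pair that matches on ID, DOI, or abstract but fails this check would not be marked as a duplicate, which is how the second step keeps false positives out.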

Enabling Automated Duplicate Detection

Automated duplicate screening is enabled by default. When enabled, duplicate articles are pre-screened and appear in the "Duplicates" tab of the Review results.

Inspecting Duplicate Articles

Any article where a duplicate has been detected will also contain a "Duplicates" tab, displaying the duplicate reason and linking to the duplicate article. This tab is always available for convenient inspection of duplicates, irrespective of whether pre-screening of duplicates is enabled for your Monitor.

Limitations

  • Duplicate detection works only on results from the same Monitor. This is by design, to prevent results from different Monitors from interfering with one another.

  • Duplicate detection by content similarity is only available for results generated from the MLM-AI database. However, duplicate detection by ID and DOI is also available for uploaded results.

Learn More

Duplicate detection by DOI relies on the Digital Object Identifier (DOI) of the article. The DOI is a unique ID assigned to each publication, so the same article keeps the same DOI even when it appears in different databases.

Duplicate detection by content similarity is only available for results obtained from the MLM-AI database. This excludes uploads from external sources.

MLM-AI can perform automated screening of duplicate articles. This can be enabled when configuring Monitors.

Learn more about the techniques used in automated duplicate detection in: Feature spotlight: Duplicate Detection in Biologit MLM-AI