Duplicate Detection
Duplicate articles are often found when screening the scientific literature. Identifying them correctly is time-consuming when done manually, and becomes even more burdensome as more data sources are added.
MLM-AI natively performs duplicate detection using different strategies, simplifying screening workflows and removing manual effort.
This page describes how MLM-AI performs duplicate detection, and how users can enable automated screening of duplicates in their results.
MLM-AI identifies duplicate articles according to these criteria:
- To be marked as a duplicate, the article must:
  - Have the same ID and the same source database, or
  - Have the same Digital Object Identifier (DOI), or
  - Have similar abstract content (with high confidence) and the same title
In MLM-AI, each article carries the unique identifier assigned by its source database (in PubMed, for example, this is the article's PMID). Articles with the same ID from the same source database are marked as duplicates.
This method is straightforward and useful when Reviews return results for overlapping date ranges.
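As a rough illustration of ID-based matching, the sketch below flags incoming articles whose source database and ID have already been seen. It is not MLM-AI's implementation; the `Article` type, its field names, and `find_id_duplicates` are hypothetical names chosen for this example.

```python
from typing import NamedTuple


class Article(NamedTuple):
    source: str      # source database, e.g. "pubmed" (hypothetical label)
    source_id: str   # ID assigned by that database, e.g. a PMID


def find_id_duplicates(existing, incoming):
    """Return incoming articles whose (source, source_id) pair was already seen."""
    seen = {(a.source, a.source_id) for a in existing}
    duplicates = []
    for article in incoming:
        key = (article.source, article.source_id)
        if key in seen:
            duplicates.append(article)
        else:
            seen.add(key)
    return duplicates


# Two Reviews with overlapping date ranges return one article in common.
first_review = [Article("pubmed", "1001"), Article("pubmed", "1002")]
second_review = [Article("pubmed", "1002"), Article("pubmed", "1003")]
id_duplicates = find_id_duplicates(first_review, second_review)
```

Keying on the pair rather than on the ID alone means that two unrelated articles which happen to share a number in different databases are not confused.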
This approach relies on the Digital Object Identifier (DOI) of the article. The DOI is a unique ID assigned to each publication, so the same article keeps the same DOI even when it appears in different databases.
When an article DOI is available, MLM-AI will present it in the Details tab, linking the article directly to its authoritative source:

Article DOI visible from the Details tab
If the source journal has published valid DOI information, MLM-AI can use it to detect duplicates.
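Databases often record the same DOI in different forms (as a URL, with a `doi:` prefix, or in different letter case), so a comparison usually works on a normalized value. The sketch below shows one possible normalization. It assumes nothing about how MLM-AI validates DOIs internally; `normalize_doi` and `same_doi` are hypothetical helpers.

```python
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw):
    """Return a canonical form of a DOI, or None if it does not look like one."""
    if not raw:
        return None
    doi = raw.strip().lower()  # DOI names are case-insensitive
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):].strip()
            break
    # A DOI has the form "10.<registrant>/<suffix>".
    if not doi.startswith("10.") or "/" not in doi:
        return None
    return doi


def same_doi(doi_a, doi_b):
    """True only when both values are valid DOIs and normalize to the same string."""
    a, b = normalize_doi(doi_a), normalize_doi(doi_b)
    return a is not None and a == b
```

Missing or malformed values never match each other, which mirrors the requirement that valid DOI information be available before it is used for detection.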
Finally, MLM-AI can also detect duplicates using content similarity. This happens in two stages:
- Select articles with similar content from the MLM-AI database, based on title and abstract
  - Only articles with a valid abstract are eligible for comparison
  - Duplicate abstracts do not need to match exactly, but must be highly similar
- Verify that titles match exactly
  - If the abstracts are similar, perform a second verification step on the article titles to prevent false positives
This option is available only for results obtained from the MLM-AI database; results uploaded from external sources are excluded.
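The two-stage check described above can be sketched as follows. The similarity measure MLM-AI uses and its confidence threshold are not documented here, so this example substitutes Python's standard `difflib.SequenceMatcher` and a hypothetical cutoff of 0.9; the dictionary keys are also assumptions.

```python
from difflib import SequenceMatcher

ABSTRACT_THRESHOLD = 0.9  # hypothetical cutoff; the real confidence level is not documented


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(text.lower().split())


def is_content_duplicate(a, b, threshold=ABSTRACT_THRESHOLD):
    """a and b are dicts with "title" and "abstract" keys (assumed structure)."""
    abstract_a = normalize(a.get("abstract") or "")
    abstract_b = normalize(b.get("abstract") or "")
    # Only articles with a valid abstract are eligible for comparison.
    if not abstract_a or not abstract_b:
        return False
    # Stage 1: abstracts must be highly similar, but need not match exactly.
    if SequenceMatcher(None, abstract_a, abstract_b).ratio() < threshold:
        return False
    # Stage 2: titles must match exactly, which guards against false positives.
    title_a = a.get("title")
    return bool(title_a) and title_a == b.get("title")
```

Running the cheap exact title comparison only after the abstracts pass keeps the order described above; swapping the two checks would give the same result.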
MLM-AI can perform automated screening of duplicate articles. This can be enabled when configuring Monitors:

Enabling Duplicate Screening
This option is enabled by default. When enabled, duplicate articles are pre-screened and appear in the "Duplicates" tab of the Review results:

Duplicates tab
Any article for which a duplicate has been detected also has a "Duplicates" tab, which displays the reason it was flagged and links to the duplicate article. This tab is always available for inspecting duplicates, whether or not pre-screening of duplicates is enabled for your Monitor.

- Duplicate detection works only on results from the same Monitor. This is by design, to prevent results from different Monitors from interfering with one another.
- Duplicate detection by content similarity is available only for results generated from the MLM-AI database.
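To illustrate the first limitation, the sketch below keeps a separate set of seen articles for each Monitor, so the same article appearing under two Monitors is never flagged. This is only an assumed model of the scoping rule; `duplicates_per_monitor`, the monitor IDs, and the article keys are hypothetical.

```python
from collections import defaultdict


def duplicates_per_monitor(results):
    """results: iterable of (monitor_id, article_key) pairs, in arrival order.

    Returns {monitor_id: [article_key, ...]} listing keys that reappeared
    within the same Monitor. Keys are never compared across Monitors.
    """
    seen = defaultdict(set)
    duplicates = defaultdict(list)
    for monitor_id, article_key in results:
        if article_key in seen[monitor_id]:
            duplicates[monitor_id].append(article_key)
        else:
            seen[monitor_id].add(article_key)
    return dict(duplicates)


results = [
    ("monitor-1", "pmid:1"),
    ("monitor-2", "pmid:1"),  # same article, different Monitor: not a duplicate
    ("monitor-1", "pmid:1"),  # same article, same Monitor: duplicate
    ("monitor-2", "pmid:2"),
]
scoped = duplicates_per_monitor(results)
```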