Factiva deduplication explained
Jinfo Blog

12th August 2008


Deduplication, the ability to remove identical or similar stories from search results, is a useful tool in any information product. Factiva has put control of the process into the hands of information professionals by offering three ways of doing it, at least for search results in English (http://www.factiva.com/factiva/search/deduplication/).

Firstly, you can set a default for all searches, which removes duplicate articles from your search results automatically. This setting is found in the ‘Preferences’ section.

Secondly, you can have deduplication ‘on demand’, where you apply the process only to the current search. This has clear advantages if you are researching a subject with a lot of press coverage. The option is available in the Search Builder screen: under the Duplicate Articles heading, a drop-down box lets you set Identify Duplicates to On or Off. If Off, no articles are identified as duplicates in your search results. If On, you can choose between Virtually Identical, where the system identifies the lowest number of duplicate articles, and Similar, where it identifies the highest number.

The third option is to apply deduplication to a completed search, which is useful if you did not realise how much had been written about your topic until the search had run. Clicking the Duplicates button on the results page offers the same three options as above: Off, Virtually Identical, and Similar.

Sometimes, even with the best-crafted query in the world, search results simply give you too many stories to cope with. This new tool lets you manage the results in a more controlled, numerical way, without compromising the integrity of the underlying search statement.
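To make the difference between the three settings concrete, here is a minimal Python sketch of threshold-based duplicate identification. It is an illustration only, not Factiva's method: the similarity measure (Python's standard `difflib.SequenceMatcher`), the threshold values and the `identify_duplicates` function are all assumptions chosen to show why a strict setting flags the fewest duplicates and a loose one flags the most.

```python
from difflib import SequenceMatcher

# Hypothetical thresholds: Factiva's actual matching rules are not described
# in this post. A higher threshold is stricter, so it flags fewer duplicates.
THRESHOLDS = {
    "Off": None,
    "Virtually Identical": 0.95,
    "Similar": 0.70,
}


def identify_duplicates(articles, mode="Off"):
    """Split article texts into (kept, duplicates) for the chosen mode.

    An article counts as a duplicate if its similarity to any article
    already kept meets the threshold for the mode.
    """
    threshold = THRESHOLDS[mode]
    if threshold is None:
        # "Off": nothing is identified as a duplicate.
        return list(articles), []

    kept, duplicates = [], []
    for text in articles:
        is_duplicate = any(
            SequenceMatcher(None, text.lower(), other.lower()).ratio() >= threshold
            for other in kept
        )
        if is_duplicate:
            duplicates.append(text)
        else:
            kept.append(text)
    return kept, duplicates


# Invented sample headlines, for illustration only.
base = "Acme Corp reports record quarterly profits on strong sales"
articles = [
    base,
    base + ".",            # same story with trivial punctuation change
    base + " in Europe",   # same story, lightly rewritten
    "Heavy rain floods village",
]

for mode in THRESHOLDS:
    kept, duplicates = identify_duplicates(articles, mode)
    print(f"{mode}: {len(duplicates)} duplicate(s) identified")
```

With these invented headlines, Off flags nothing, Virtually Identical flags only the copy that differs by a full stop, and Similar also flags the lightly rewritten version. The search statement itself is never changed; only the presentation of its results is.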
