Removing duplicate records is one of the basic cleansing operations that Salesforce administrators use to keep data quality high.
This probably isn’t news, but if you’ve just started using DemandTools, you may be wondering about the best way to dedupe a Salesforce database.
There are many different types of Salesforce objects—Accounts, Contacts, Leads, Opportunities, Tasks, Assets, Custom Objects, etc. In what order should you dedupe them? And what measures can you take to feel confident that you’ve found most of the duplicates?
In this article, we’ll give you an overview of how to go about finding duplicate objects in your Salesforce database using the DemandTools Single Table Dedupe module.
Start with Accounts and Work Your Way Down
We usually recommend deduping in this order: Account, Contact, Lead to Lead, Lead to Contact, Lead to Account, Opportunity, and Custom Objects. If you clean up parent objects first and then move on to the children, you can use the parent IDs to improve matching of child objects.
For example, if there’s only one “Example Inc.” in the database, it’s easier to identify child Contacts “J. Smith” and “John Smith” as duplicates. If these Contacts were attached to separate (duplicate) Accounts, establishing a match would be more difficult.
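
To illustrate why a shared parent helps, here is a minimal Python sketch of matching child Contacts only within the same Account. It is not how DemandTools works internally; the field names (`id`, `account_id`, `first`, `last`) and the "first initial plus last name" key are assumptions made for this example.

```python
from collections import defaultdict

def name_key(first, last):
    """Reduce a name to (first initial, lowercase last name). Hypothetical rule."""
    initial = first.strip().rstrip(".")[:1].lower()
    return (initial, last.strip().lower())

def candidate_duplicates(contacts):
    """Group Contact IDs that share both a parent Account ID and a name key."""
    groups = defaultdict(list)
    for c in contacts:
        key = (c["account_id"], name_key(c["first"], c["last"]))
        groups[key].append(c["id"])
    # Only groups with more than one Contact are duplicate candidates.
    return [ids for ids in groups.values() if len(ids) > 1]

# Illustrative data: two Contacts under the same (already deduped) Account.
contacts = [
    {"id": "C1", "account_id": "A1", "first": "J.", "last": "Smith"},
    {"id": "C2", "account_id": "A1", "first": "John", "last": "Smith"},
    {"id": "C3", "account_id": "A2", "first": "Jane", "last": "Smith"},
]

print(candidate_duplicates(contacts))  # [['C1', 'C2']]
```

If C1 and C2 were still attached to two separate duplicate Accounts, their keys would differ and the pair would not be grouped, which is why the parent cleanup comes first.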
Develop a Multi-pass Dedupe Process
For each type of object, we recommend that you start with rigid matching criteria (i.e., exact field matches) and then dedupe several times with progressively looser matching criteria (e.g., fuzzy matching options, fewer fields). This will enable you to find duplicates faster and build confidence in data quality, knowing that you’ve properly accounted for typos, different abbreviations, and other data variations. DemandTools ships with a variety of prebuilt scenarios to assist you with this multi-pass approach.
To keep you from being overwhelmed by the quantity of search results, we also recommend that you place limits on what data will be scrutinized in each deduping pass. For example, you can break each pass up by state, creation date, or Account name (i.e., Account names beginning with a range of letters). Of course, if you’re only concerned with a certain subset of the database, such as the data generated at a recent tradeshow, then creating a limited deduping pass is a no-brainer.
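
The idea of running strict passes first, then looser ones, over a limited slice of the data can be sketched in Python as follows. This is an illustration only, not a DemandTools scenario: the three rules, the 0.85 similarity threshold, the field names, and the sample records are all assumptions, and the fuzzy comparison uses the standard library's `difflib.SequenceMatcher` rather than any DemandTools matching option.

```python
from difflib import SequenceMatcher

def _norm(text):
    """Lowercase, drop periods and commas, collapse whitespace."""
    return " ".join(text.lower().replace(".", "").replace(",", "").split())

def exact(a, b):
    """Pass 1: rigid criteria, exact match on name and city."""
    return a["name"] == b["name"] and a["city"] == b["city"]

def normalized(a, b):
    """Pass 2: same fields, ignoring case and punctuation."""
    return _norm(a["name"]) == _norm(b["name"]) and _norm(a["city"]) == _norm(b["city"])

def fuzzy_name(a, b, threshold=0.85):
    """Pass 3: fewer fields (name only) with a fuzzy comparison."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold

PASSES = [("exact", exact), ("normalized", normalized), ("fuzzy name", fuzzy_name)]

def run_passes(records, state):
    """Run each pass in order on one state's records; return (pass, keep_id, dupe_id)."""
    subset = [r for r in records if r["state"] == state]  # limit the pass
    flagged = set()  # IDs already flagged as duplicates in an earlier pass
    results = []
    for label, rule in PASSES:
        for i, a in enumerate(subset):
            for b in subset[i + 1:]:
                if a["id"] in flagged or b["id"] in flagged:
                    continue
                if rule(a, b):
                    results.append((label, a["id"], b["id"]))
                    flagged.add(b["id"])
    return results

# Illustrative Account records.
accounts = [
    {"id": "A1", "name": "Example Inc.", "city": "Austin", "state": "TX"},
    {"id": "A2", "name": "Example Inc.", "city": "Austin", "state": "TX"},
    {"id": "A3", "name": "example inc", "city": "austin", "state": "TX"},
    {"id": "A4", "name": "Exampel Inc.", "city": "Dallas", "state": "TX"},
    {"id": "A5", "name": "Example Inc.", "city": "Boston", "state": "MA"},
]

for row in run_passes(accounts, "TX"):
    print(row)
```

Each later pass catches variations the earlier ones missed, and the looser the rule, the more the results deserve a manual review before anything is merged.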
Since there is no “undo” once duplicates have been merged, you should also view multi-pass deduping as a sensible precaution. Smaller, limited passes also reduce the chance of running into problems with Salesforce API call limits, available computer memory, or the time it takes to apply a Master Rule.
Schedule Automatic Deduping Runs
After you’ve developed some effective deduping scenarios, we recommend that you schedule them to run on a regular basis, and thus maintain data quality. Over time, you can incrementally improve data quality by modifying these scenarios and adding new ones to the schedule.