Match Categories
- All - All Categories
- Alpha - Techniques for text
- Contact - Techniques specifically for dealing with Contact names
- CRM - Techniques for Customer Relationship Management Categories
- Geo - Techniques for address information
- Internet - Techniques for domain information
- Numeric - Techniques for dealing with number-based matches
Acronym: Creates an acronym for a Company/Account Name by using the first letter of every word in the name. Will show when the category is All or CRM.
Note: If the Company/Account Name contains only 1 word it will be considered to be an acronym already.
Example:
Input: International Business Machines
Result: IBM
Input: I.B.M.
Result: IBM
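The acronym rule can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the DemandTools implementation; the function name `acronym` and the choice to drop periods before splitting are assumptions.

```python
import re

def acronym(name: str) -> str:
    """Build a match key from the first letter of every word (illustrative only)."""
    # Assumption: periods are dropped first, so "I.B.M." becomes the single word "IBM".
    words = re.sub(r"\.", "", name).split()
    if len(words) == 1:
        # A one-word name is treated as already being an acronym.
        return words[0].upper()
    return "".join(word[0] for word in words).upper()

print(acronym("International Business Machines"))  # IBM
print(acronym("I.B.M."))                           # IBM
```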
Acronym (Cleaned Name): Creates an acronym for a Company/Account Name AFTER applying the Account Cleaning List. See the "Cleaned Account Name" mapping type below for more details regarding the Account Cleaning List. Will show when the category is All or CRM.
Example:
The Hewlett Packard Company -> H.P. -> HP -> Hewlett Packard Limited
Cleaned Account Name: Uses the built-in Account Name Cleaning List. To access the cleaning list, select the list icon in the upper left-hand corner of Step 2 - Mapping. Will show when the category is All or CRM.
This list is saved locally on the user's PC as an .xml file in the \DemandToolsData\ReplaceList\ directory (replacelist.xml). To share this list across multiple DemandTools users, refer to the following Validity Community Solution: Sharing Cleaned Account & Nickname List Among Users
The cleaning list can be used to match "similar" names based on common punctuation, abbreviations, and common business prefixes and suffixes. These lists are customizable for language(s) and/or line of business.
We recommend that customers review these lists and customize as needed for better matching.
1. Default values included in the list when DemandTools is installed
2. Values manually added in this example
3. Click here to type in more customizations
4. Click "Save" to save any changes
Three tabs are provided for customization:
- Replace: Anywhere it finds the "existing value", replace with the "new value" for matching purposes only
- Used to ignore common punctuation and match common abbreviations to the corresponding long form
- The "long" form of a word should always be listed first to avoid replacing a "portion" of a word with the new value
- e.g. If "ctr -> center" is specified, the word "electric" becomes "elecenteric" and will not match correctly
- An existing value will be ignored if the new value is left blank (e.g. for common punctuation)
- Suffix removal: Ignore this suffix at the END of the name
- Will ignore multiples, e.g. Co., Inc.
- Prefix removal: Ignore this prefix at the BEGINNING of the name (not shown above)
- Existing values include "The" and "Dept of"
Note: These lists are NOT case sensitive. Punctuation SHOULD NOT be included in replacement, suffix, or prefix values, since common punctuation is already ignored.
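The "electric" pitfall on the Replace tab is easy to reproduce. The sketch below assumes the replacements behave like plain, case-insensitive substring substitutions applied in list order; `apply_replacements` is a hypothetical helper, not part of DemandTools.

```python
def apply_replacements(name: str, rules: list[tuple[str, str]]) -> str:
    """Apply each (existing value, new value) pair in order as a substring replace."""
    key = name.lower()  # the lists are not case sensitive
    for existing, new in rules:
        key = key.replace(existing, new)
    return key

# Short form listed first: "ctr" is found inside "electric" and corrupts it.
print(apply_replacements("Electric", [("ctr", "center")]))         # elecenteric

# Long form listed first: both spellings reduce to the same key.
print(apply_replacements("Electric Center", [("center", "ctr")]))  # electric ctr
print(apply_replacements("Electric Ctr", [("center", "ctr")]))     # electric ctr
```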
Country Match: Standardizes field values for the recognized countries of the world. It will recognize the long name of a country, the 2-letter ISO short form, the 3-letter ISO short form, and the numeric ISO country value as possible matches of each other. Will show when the category is All or Geo.
Date: Allows dates with different formats to be matched. Allows matching date to date/time fields (ignores the time). Great for matching dates in Find/Report ID’s without having to manipulate the dates in the input file to match Salesforce date formats. Will show when the category is All or CRM.
Domain: Allows for the independent analysis of the domain information contained within a website URL or email address. For email addresses it uses any information to the right of the @ sign. For web pages it returns everything AFTER the http:// and/or www. (e.g. the XXXXX.com and beyond portion). Can also be used to match an email to a website. Will show when the category is All or Internet.
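As a rough sketch of the Domain rule, assuming an `https://` prefix is treated the same as `http://` (the text above names only `http://`), the extraction could look like this. `domain_key` is a hypothetical name used for illustration.

```python
import re

def domain_key(value: str) -> str:
    """Return the domain portion of an email address or website URL."""
    value = value.strip().lower()
    if "@" in value:
        # Email: keep everything to the right of the @ sign.
        return value.split("@", 1)[1]
    # Website: drop a leading scheme and/or "www." and keep the rest, path included.
    return re.sub(r"^(https?://)?(www\.)?", "", value)

print(domain_key("john.sample@validity.com"))    # validity.com
print(domain_key("http://www.validity.com"))     # validity.com
print(domain_key("www.validity.com/downloads"))  # validity.com/downloads
```

Because the email and the website both reduce to "validity.com", the same key can be used to match a Lead's email to an Account's website.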
Exact: 100% match of every character, assuming no additional mapping options apply. Will show when the category is All, Alpha, Numeric.
First Name: Uses the built-in Nickname List to match long forms of a first name to their common abbreviations. To access the Nickname list, select the list icon in the upper left-hand corner of Step 2 - Mapping.
The Nickname list will see Bill, William and Billy as potential duplicates/matches of each other. This list is also customizable, for localization or for non-contact substitution on any field by replacing the nickname list with synonyms. Will show when the category is All, Contact.
First XX Letters: Compares the first XX characters in a field. When selected a pop up box will appear in order to specify the number of characters on which to match. Applicable to text fields only. Will show when the category is All, Alpha.
First XX Words: Compares the first XX words in a field. When selected a pop up box will appear in order to specify the number of words on which to match. Applicable to text fields only. Will show when the category is All, Alpha.
Numeric: Compares only the numeric values in a field (0-9, decimal points, and a single dash if it is at the beginning of the field). All other characters, such as spaces or punctuation, will be ignored. A field with a value of "Apt # 31" is seen as the numeric characters "31", ignoring "Apt #". Numeric is also commonly used with phone numbers when you want to ignore punctuation and all alpha characters and look only at the full series of numbers.
Single dashes at the beginning of a field can be used to indicate a negative number, e.g. -54, therefore they are not ignored with the Numeric mapping type. To ignore decimal points (periods) or dashes check "alpha-clean" as an additional matching option. Will show when the category is All, Numeric.
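The Numeric rule, including the effect of adding alpha-clean, can be approximated as follows. This is a sketch under the assumptions stated in the comments; `numeric_key` and its `alpha_clean` parameter are hypothetical names.

```python
import re

def numeric_key(value: str, alpha_clean: bool = False) -> str:
    """Keep digits, decimal points, and a single leading dash (illustrative only)."""
    value = value.strip()
    # A single dash at the very beginning is kept as a minus sign.
    sign = "-" if value.startswith("-") and not value.startswith("--") else ""
    digits = re.sub(r"[^0-9.]", "", value)
    if alpha_clean:
        # Assumption: with alpha-clean checked, periods and the dash are dropped too.
        return digits.replace(".", "")
    return sign + digits

print(numeric_key("Apt # 31"))                          # 31
print(numeric_key("-54"))                               # -54
print(numeric_key("(879) 555-1212"))                    # 8795551212
print(numeric_key("1.879.555.1212", alpha_clean=True))  # 18795551212
```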
Regular Expression: Creates custom mapping types for any text field. Knowledge of regular expressions is required to use this mapping type and customers are responsible for building their own expressions. There are several free regex tutorial websites online that users may want to reference to learn more. Will show when the category is All, Alpha.
Note: DemandTools uses the .NET regex engine.
Regular Expression mapping will use the first match located in the string:
Example:
Regular Expression: [0-9]+
Input data: 123 S Main St
Result: 123
Input data: 45 E. Center
Result: 45
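The "first match" behavior can be tried out before entering an expression in DemandTools. The sketch below uses Python's `re` module rather than the .NET engine, so it is only a stand-in: simple patterns such as `[0-9]+` behave the same, but more advanced syntax may differ between the two engines. `regex_key` is a hypothetical helper.

```python
import re

def regex_key(pattern: str, value: str) -> str:
    """Return the first match of pattern in value, or a blank if nothing matches."""
    match = re.search(pattern, value)
    return match.group(0) if match else ""

print(regex_key(r"[0-9]+", "123 S Main St"))  # 123
print(regex_key(r"[0-9]+", "45 E. Center"))   # 45
```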
Regular Expression Y/N: Returns a Yes or No value if the data passes the Regular Expression vs. returning the matching value or a blank like the standard Regular Expression mapping type. Will show when the category is All, Alpha.
Example: a Regular Expression could be created to check whether a field is 5 digits, blank, or 5 characters:
23232 = Yes
232323 = No
Aabbc = Yes
Bob = No
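One expression that produces the Yes/No results listed above is shown below. The pattern itself is an assumption (the article does not give one), and "5 chars" is read here as five letters. Python's `re` module stands in for the .NET engine, and `regex_yes_no` is a hypothetical helper.

```python
import re

# Blank, exactly 5 digits, or exactly 5 letters (assumed reading of the example).
FIVE_OR_BLANK = r"^(?:[0-9]{5}|[A-Za-z]{5})?$"

def regex_yes_no(pattern: str, value: str) -> str:
    """Return "Yes" if the value passes the expression, otherwise "No"."""
    return "Yes" if re.search(pattern, value) else "No"

for sample in ["23232", "232323", "Aabbc", "Bob", ""]:
    print(repr(sample), "=", regex_yes_no(FIVE_OR_BLANK, sample))
```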
Relaxed Address Match: Parses the street address to the lowest common denominator. Although it is based on North American standards, it has also proved effective with most country address formats. Will show when the category is All, CRM.
Example:
123 NW Pavillion Ave
123 Pavillion St, Suite 400
123 Pavillion Avenue, Fl 4
123 Pavillion Rd.
Are all seen as "123 Pavillion"
Note: If there are no street designators (e.g. St, Rd, Ave etc.), then the address will not be parsed into its individual components. In this case the street address is returned with no changes for matching.
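The actual parsing rules are not documented here, but the examples above are consistent with the following simplified sketch: keep what comes before the first street designator, drop directionals, and leave the address untouched when no designator is found. The word lists are tiny samples and every name in the block is hypothetical.

```python
import re

DESIGNATORS = {"st", "street", "ave", "avenue", "rd", "road"}  # sample list only
DIRECTIONALS = {"n", "s", "e", "w", "ne", "nw", "se", "sw"}

def relaxed_address_key(address: str) -> str:
    """Reduce a street address to house number and street name (illustrative only)."""
    tokens = re.findall(r"[A-Za-z0-9]+", address)
    for i, token in enumerate(tokens):
        if token.lower() in DESIGNATORS:
            # Keep everything before the designator, minus directionals.
            kept = [t for t in tokens[:i] if t.lower() not in DIRECTIONALS]
            return " ".join(kept)
    # No street designator: return the address unchanged.
    return address

for sample in ["123 NW Pavillion Ave", "123 Pavillion St, Suite 400",
               "123 Pavillion Avenue, Fl 4", "123 Pavillion Rd."]:
    print(relaxed_address_key(sample))  # 123 Pavillion (all four)
```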
Relaxed Domain: Matches top level domains for websites by parsing the word before the .com, .co, .org, .edu, etc. Helpful for email to website matches. Will show when the category is All, Internet.
Example:
www.validity.com
validity.com
http://www.validity.com
validity.co.uk
www.validity.com/downloads
john.sample@validity.com
Are all seen as "validity"
Note: If there are two words in the main body, the matching word will be the first word.
Example: www.shop.validity.com will match on the word "shop"
https://www.live.validity.com will match on the word "live"
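The examples above, including the "first word" caveat, are consistent with taking the first label of the host name once the email local part, scheme, "www." and path have been removed. The sketch below makes that assumption explicit; `relaxed_domain_key` is a hypothetical name.

```python
import re

def relaxed_domain_key(value: str) -> str:
    """Return the first word of the domain (illustrative only)."""
    value = value.strip().lower()
    if "@" in value:
        value = value.split("@", 1)[1]                   # email: right of the @ sign
    value = re.sub(r"^(https?://)?(www\.)?", "", value)  # drop scheme and "www."
    host = value.split("/", 1)[0]                        # drop any path
    return host.split(".", 1)[0]                         # first word before a dot

print(relaxed_domain_key("validity.co.uk"))                 # validity
print(relaxed_domain_key("www.validity.com/downloads"))     # validity
print(relaxed_domain_key("john.sample@validity.com"))       # validity
print(relaxed_domain_key("www.shop.validity.com"))          # shop
print(relaxed_domain_key("https://www.live.validity.com"))  # live
```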
Relaxed NA Phone Match: Removes all non-numeric characters and spaces, leading 0's and 1's, area codes and extensions on the back end, returning the 7 primary digits of the phone number. If just 7 digits are left use those 7 digits, else just return digits 4 - 10. It will not match the "Phone-word" values and will trim off the "SPOT" in the phone number and only look at the numeric portion. Will show when the category is All, CRM.
Example:
+1 879 555 1212 ext 500
(879) 555 1212
1.879.555.1212 ext 408
Are all seen as "5551212"
This is based on a 10-digit North American phone number. Although it can be used with international phone numbers (will just return 7 digits in the middle), the Numeric mapping type is the recommended mapping type to be used with international phone numbers.
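The steps described above translate almost directly into code. The sketch below follows the stated rule (strip non-digits, strip leading 0's and 1's, then use the 7 remaining digits or digits 4 - 10) and does not attempt the "Phone-word" handling; `relaxed_na_phone_key` is a hypothetical name.

```python
import re

def relaxed_na_phone_key(phone: str) -> str:
    """Return the 7 primary digits of a North American phone number."""
    digits = re.sub(r"[^0-9]", "", phone)  # remove non-numeric characters and spaces
    digits = digits.lstrip("01")           # remove leading 0's and 1's
    if len(digits) == 7:
        return digits
    return digits[3:10]                    # digits 4 - 10: skips area code and extension

print(relaxed_na_phone_key("+1 879 555 1212 ext 500"))  # 5551212
print(relaxed_na_phone_key("(879) 555 1212"))           # 5551212
print(relaxed_na_phone_key("1.879.555.1212 ext 408"))   # 5551212
```

Two numbers with different area codes produce the same key, which is why this mapping type is best combined with at least one other match field.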
Salesforce.com ID Match: Matches any Salesforce.com object's 15-character ID to its equivalent 18-character ID and vice versa. Used primarily with Discovery -> Find/Report ID's when the input file contains the 15-character ID. Will show when the category is All, CRM.
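For reference, the 18-character form is the 15-character ID plus a 3-character suffix that encodes which characters are uppercase. The sketch below shows that standard Salesforce conversion; how DemandTools performs the comparison internally is not stated, and the function names are hypothetical.

```python
SUFFIX_CHARS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345"

def to_18_char_id(sf_id: str) -> str:
    """Convert a 15-character Salesforce ID to its 18-character equivalent."""
    if len(sf_id) == 18:
        return sf_id
    if len(sf_id) != 15:
        raise ValueError("expected a 15- or 18-character Salesforce ID")
    suffix = ""
    for start in (0, 5, 10):
        chunk = sf_id[start:start + 5]
        # Bit i is set when the i-th character of the chunk is an uppercase letter.
        bits = sum(1 << i for i, ch in enumerate(chunk) if "A" <= ch <= "Z")
        suffix += SUFFIX_CHARS[bits]
    return sf_id + suffix

def same_record(id_a: str, id_b: str) -> bool:
    """Compare two IDs regardless of whether each is in 15- or 18-character form."""
    return to_18_char_id(id_a) == to_18_char_id(id_b)

print(to_18_char_id("001D000000IqhSL"))                      # 001D000000IqhSLIAZ
print(same_record("001D000000IqhSL", "001D000000IqhSLIAZ"))  # True
```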
State Match: Matches US state and Canadian province abbreviations to their long names. Can be customized to match state abbreviations for other countries by updating the StateMatchOverrides.txt file in the My docs\DemandToolsData\ReplaceList directory. Will show when the category is All, Geo.
Street Address Match: Applies slightly more rigid criteria than the Relaxed Address Match mapping type. It ignores the differences in street type short forms such as crescent - cres, road - rd, street - st, etc. and matches all standard abbreviations in the street address to their long forms (e.g. South -> S, Floor -> Fl, etc.). In order for an address to match, though, all the components of the address need to be in both records (e.g. if one has "North" the other needs at least "N"; if one has "Suite 100" the other needs at least "Ste 100"). Will show when the category is All, CRM.
Zip 5 and 9: Matches USPS 5 and 9 digit zip codes without the need to standardize them to a common number of digits. Will show when the category is All, CRM.
Additional Mapping Options
Fuzzy: Phonetics engine capable of analyzing words for how they sound when pronounced. Through a technique of removing vowels and analyzing the remaining consonants the fuzzy engine works well for matching fields with spelling mistakes.
Fuzzy will IGNORE numbers when matching, as numbers do not have a phonetic equivalent, therefore, Girl Scout Troop 100 and Girl Scout Troop 780 will be considered a match if fuzzy is checked. DO NOT USE for matching phone numbers or other fields that contain numeric characters that you do not wish to ignore (e.g. street addresses).
- Full description of the fuzzy algorithm in use (Metaphone) can be found here: http://en.wikipedia.org/wiki/Metaphone
Applicable Mapping Types: Cleaned Account Name, Exact, FirstName, First XX Words
Transpose: The transpositional engine allows for fields to appear as duplicates even when there are differences in the word order. For example, "Jones, Smith and Jackson" will appear to be a duplicate of "Jackson, Smith and Jones". Applicable Mapping Types: Cleaned Account Name, Exact, FirstName, First XX Words, Street Address Match
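One simple way to make a comparison insensitive to word order is to sort the words before comparing. The sketch below illustrates the idea only; it is an assumption about how such an engine could work, and `transpose_key` is a hypothetical name.

```python
import re

def transpose_key(value: str) -> str:
    """Build a key that is the same regardless of word order (illustrative only)."""
    words = re.findall(r"[a-z0-9&]+", value.lower())
    return " ".join(sorted(words))

print(transpose_key("Jones, Smith and Jackson"))  # and jackson jones smith
print(transpose_key("Jackson, Smith and Jones"))  # and jackson jones smith
```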
Alpha-Clean: The alpha clean engine extends some of the capabilities of the account name cleaner to other fields for easier matching. Alpha-Clean is used for ASCII (North American) data to ensure the only characters that are analyzed are the 26 letters of the English alphabet, the numbers 0-9, space and &. Any other character that the field may contain will be ignored by the deduplication matching algorithms.
Note: Spaces can be ignored by checking the "Fuzzy" option.
Applicable Mapping Types: Cleaned Account Name, Exact, FirstName, First XX Words, Numeric, Street Address Match, Zip 5 and 9
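The character filter described for Alpha-Clean can be written as a single substitution. In this sketch "ignored" is modeled as simply deleting the character, which is an assumption; `alpha_clean` is a hypothetical name.

```python
import re

def alpha_clean(value: str) -> str:
    """Keep only English letters, digits 0-9, space and & (illustrative only)."""
    return re.sub(r"[^A-Za-z0-9 &]", "", value)

print(alpha_clean("St. Charles"))  # St Charles
print(alpha_clean("AT&T, Inc."))   # AT&T Inc
```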
Match Blank Value: When this check box is selected for a field that has been chosen as a match condition, the deduper will allow records with a blank value in that field to be matched to other records that are also blank in that field.
Advanced Mapping Techniques to Find Additional Duplicates
Exact matching is not the only way to match! Please review ALL the mapping types and options available within DemandTools to help identify similar Account/Company names, nicknames for first names, similar addresses, and similar phone numbers, and to perform phonetic matches.
Looser techniques are best used when more than one field is being matched on.
Here are some tips when matching:
- First Name matches
- Use the FirstName mapping type to match nicknames, e.g. "Mike" to "Michael"
- The “middle initial” dilemma: Try matching on just the first letter or first word (mapping type First XX Letters or First XX Words) and Last Name
- Last Name matches
- Add Alpha-Clean to match “Smith-Jones” -> “Smith Jones”
- Add Fuzzy to catch spelling errors
- Company/Account Name Matches
- Use the Cleaned Account Name mapping type to find similar names
- Can match abbreviations to long forms, e.g. Saint -> St
- Can ignore common suffixes and prefixes, e.g. match "The Hewlett Packard Company" -> "Hewlett Packard Inc"
REVIEW/UPDATE the Cleaned Account Name list to include abbreviations, suffixes, prefixes specific to YOUR industry!
- Add Transpose to catch cases where the order of the words in the name is different but the words themselves match, e.g. University of North Carolina -> North Carolina, University of
- Add Alpha-Clean to catch slight differences in punctuation
- Add Fuzzy to catch spelling errors
- NOT RECOMMENDED IF ALL THAT IS BEING MATCHED IN A STEP IS THE COMPANY OR ACCOUNT NAME AND CLEANED ACCOUNT NAME IS THE MAPPING TYPE
- Street Address Matching
- Use the Street Address Match mapping type to match abbreviations to their long forms (e.g. Street -> St)
- Use the Relaxed Address Match mapping type to additionally match one street address with a Suite # to another without (e.g. 100 Main Street -> 100 Main St Suite 234)
- City matches
- Use Alpha-Clean to match “St. Charles” with “St Charles”
- Add Fuzzy to catch spelling errors
- Phone Number matches
- For North American phone numbers use Relaxed NA Phone Match to ignore punctuation, leading 1's, area codes and extensions
- match +1 (781) 458-9999 to 6174589999 x123
- Since this mapping type IGNORES AREA CODES and returns just the "555-1212" portion of the phone number IT SHOULD NOT BE USED BY ITSELF (only use when additionally matching on other fields to avoid updating the wrong record).
- Email to Website matches
- Use the Domain or Relaxed Domain mapping type to match an email address to a website in Lead to Account Matches