Mapping types determine how a field is evaluated for a match. Various options are available for matching similar field values (e.g. similar Account/Company names, street addresses, phone numbers, first names, etc.).
Mapping Types
Acronym
Creates an acronym for a Company/Account Name by using the first letter of every word in the name.
Note: If the Company/Account Name contains only one word, it is considered to be an acronym already.
Example: International Business Machines -> I.B.M. -> IBM
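As a rough illustration of the rule above, the following Python sketch builds an acronym from the first letter of every word. The function name `acronym` and the decision to strip punctuation before counting words are assumptions made for this example; it is not DupeBlocker's actual implementation.

```python
import re


def acronym(name: str) -> str:
    """Hypothetical helper: first letter of every word, upper-cased."""
    # Strip punctuation from each word so "I.B.M." is seen as the word "IBM".
    words = [re.sub(r"[^A-Za-z0-9]", "", w) for w in name.split()]
    words = [w for w in words if w]
    if len(words) == 1:
        # A one-word name is treated as an acronym already.
        return words[0].upper()
    return "".join(w[0] for w in words).upper()


full = acronym("International Business Machines")  # "IBM"
dotted = acronym("I.B.M.")                          # "IBM"
```

Because both values reduce to the same key, the two names would be compared as equal under this sketch.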
Acronym (Cleaned Name)
Creates an acronym for a Company/Account Name AFTER applying the Account Cleaning List. See the "Cleaned Account Name" mapping type below for more details regarding the Account Cleaning List.
Example: The Hewlett Packard Company -> H.P. -> HP -> Hewlett Packard Limited
Cleaned Account Name
Applies "cleaning" of the Account/Company Name based on rules established in the Account Name Filtering list in the DupeBlocker Settings Tab. The cleaning list can be used to match "similar" names based on common punctuation, abbreviations, and common business prefixes and suffixes. These lists are customizable for language(s) and/or line of business.
We recommend that customers review these lists and customize as needed for better matching.
1. Select Cleaned Account Name Settings and one of the sub tabs
   - Three tabs are provided for customization:
      - Replacements: Anywhere it finds the "existing value", replace with the "new value" for matching purposes only
         - Used to ignore common punctuation and match common abbreviations to the corresponding long form
         - The "long" form of a word should always be listed first to avoid replacing a "portion" of a word with the new value
            - e.g. If "ctr -> center" is specified the word "electric" becomes "elecenteric" and will not match correctly
         - An existing value will be ignored if the new value is left blank (e.g. for common punctuation)
      - Prefixes: Ignore prefix at the BEGINNING of the name
      - Suffixes: Ignore suffix at the END of the name. Multiple values will be ignored (e.g. Company Inc)
2. Click New Entry to customize
- Existing entries can also be edited or deleted
3. Enter desired values
4. Save
The lists are NOT case sensitive and punctuation SHOULD NOT be included in abbreviations (replacements), suffix or prefix lists (since common punctuation is already ignored in the replace list).
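The interaction of the three lists can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function `clean_account_name`, the sample list entries, and the rule that only one prefix is dropped are all hypothetical and chosen for illustration; the real cleaning engine may differ.

```python
def clean_account_name(name, replacements, prefixes, suffixes):
    """Return a cleaned, lower-cased form of an account name (illustrative only)."""
    text = name.lower()  # the lists are not case sensitive
    # Replacements are applied anywhere in the text; a blank new value
    # simply removes the existing value (e.g. punctuation).
    for existing, new in replacements:
        text = text.replace(existing, new)
    words = text.split()
    # Ignore one prefix at the beginning of the name (assumption: only one).
    if words and words[0] in prefixes:
        words = words[1:]
    # Ignore any number of suffixes at the end of the name.
    while words and words[-1] in suffixes:
        words.pop()
    return " ".join(words)


# Hypothetical list entries, not the shipped defaults.
REPLACEMENTS = [(".", ""), (",", ""), ("&", "and")]
PREFIXES = {"the"}
SUFFIXES = {"company", "inc", "limited"}

a = clean_account_name("The Hewlett Packard Company, Inc.", REPLACEMENTS, PREFIXES, SUFFIXES)
b = clean_account_name("Hewlett Packard Limited", REPLACEMENTS, PREFIXES, SUFFIXES)
# Both a and b are "hewlett packard".

# The partial-word pitfall: "ctr" also occurs inside "electric".
c = clean_account_name("Electric Ctr", [("ctr", "center")], set(), set())
# c is "elecenteric center".
```

The last call reproduces the "elecenteric" problem described in the Replacements notes, which is why short abbreviations deserve extra care when they are added to the list.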
Country Match
Standardizes field values for the recognized countries of the world based on rules established in the Country Filtering list in DupeBlocker Settings. It will recognize the long name of a country, the 2-letter ISO code, the 3-letter ISO code, and the numeric ISO country code as possible matches of each other. This list is customizable in the DupeBlocker Settings Tab.
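Conceptually, this kind of standardization behaves like a lookup table in which every known form of a country points to one canonical value. The Python sketch below assumes a tiny three-row table and a hypothetical function `standardize_country`; the actual Country Filtering list is much larger and is maintained in DupeBlocker Settings.

```python
# Each row: long name, 2-letter ISO code, 3-letter ISO code, numeric ISO code.
COUNTRIES = [
    ("United States", "US", "USA", "840"),
    ("Canada", "CA", "CAN", "124"),
    ("Germany", "DE", "DEU", "276"),
]

# Map every form (lower-cased) to the long name of its row.
LOOKUP = {form.lower(): row[0] for row in COUNTRIES for form in row}


def standardize_country(value: str) -> str:
    """Return the long name for a known form, else the trimmed input unchanged."""
    cleaned = value.strip()
    return LOOKUP.get(cleaned.lower(), cleaned)


same = standardize_country("usa") == standardize_country("840")  # True
```

Values that are not in the table pass through unchanged, so they would only match an identical value.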
Domain
Allows for the independent analysis of the domain information contained within a website URL or email address. For email addresses it uses any information to the right of the @ sign. For web pages it parses the XXXXX.com portion. Can also be used to match an email to a website. To JUST return the XXXXX.com portion (and nothing beyond) use the Relaxed Domain mapping type.
Exact
100% match of every character.
First Name
Applies the Nickname Filtering list in DupeBlocker Settings. The Nickname list will see Bill, William and Billy as potential duplicates of each other. This list is customizable in the DupeBlocker Settings Tab.
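A nickname list can be pictured as groups of names that are treated as equivalent. The following Python sketch is only an illustration: `NICKNAME_GROUPS` and `first_names_match` are hypothetical names, and the Robert/Bob group is an invented sample entry rather than a statement about the shipped list.

```python
# Hypothetical nickname groups; the real list lives in DupeBlocker Settings.
NICKNAME_GROUPS = [
    {"william", "bill", "billy"},
    {"robert", "bob", "rob"},
]


def first_names_match(a: str, b: str) -> bool:
    """True when both names are identical or belong to the same nickname group."""
    a, b = a.strip().lower(), b.strip().lower()
    if a == b:
        return True
    return any(a in group and b in group for group in NICKNAME_GROUPS)


bill_vs_william = first_names_match("Bill", "William")  # True
bill_vs_bob = first_names_match("Bill", "Bob")          # False
```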
First XX Letters
Compares the first XX letters in a field. When selected, the number of letters to be matched needs to be specified in the scenario rule.
Commonly used as a looser technique for matching Account/Company Names and to match just on the first initial of First Name.
First XX Words
Compares the first XX words in a field. When selected, the number of words to be matched needs to be specified in the scenario rule.
Commonly used as a looser technique for matching Account/Company names with non-standard suffixes and to match First Names that contain middle names or initials.
Examples (First 1 Word):
Huffy Bicycle -> Huffy Manufacturing -> Huffy Mfg - Boston (matches just the word "Huffy")
John B. -> John -> John James (matches just the word "John")
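Both First XX Letters and First XX Words amount to truncating a value before comparing it. A minimal Python sketch, assuming case-insensitive comparison and whitespace-separated words (neither is confirmed by the product documentation), with hypothetical function names:

```python
def first_n_letters(value: str, n: int) -> str:
    """Key made of the first n characters, lower-cased (assumed case-insensitive)."""
    return value.strip()[:n].lower()


def first_n_words(value: str, n: int) -> str:
    """Key made of the first n whitespace-separated words, lower-cased."""
    return " ".join(value.split()[:n]).lower()


k1 = first_n_words("Huffy Bicycle", 1)       # "huffy"
k2 = first_n_words("Huffy Mfg - Boston", 1)  # "huffy"
initial = first_n_letters("William", 1)      # "w"
```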
Numeric
Compares only the numeric values in a field (0-9 and decimal point). All other characters, such as spaces or punctuation, will be ignored by the deduper. Commonly used with phone numbers to just look at the series of numbers regardless of punctuation.
Note: When used on a text field that contains numbers, leading zeros will be ignored. If leading zeros should NOT be ignored, then use EXACT as the mapping type and check "alpha-clean" as an additional matching option.
To ignore decimal points (periods) check "alpha-clean" as an additional matching option.
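The behavior described above can be approximated in Python as shown below. The function `numeric_key` and its `alpha_clean` flag are hypothetical; the flag is only a stand-in for checking the "alpha-clean" matching option, and the real engine may treat edge cases differently.

```python
def numeric_key(value: str, alpha_clean: bool = False) -> str:
    """Keep digits (and decimal points unless alpha_clean), drop leading zeros."""
    allowed = "0123456789" if alpha_clean else "0123456789."
    kept = "".join(ch for ch in value if ch in allowed)
    return kept.lstrip("0")  # leading zeros are ignored


phone = numeric_key("(879) 555-1212")                   # "8795551212"
padded = numeric_key("00123")                           # "123"
dotted = numeric_key("1.879.555.1212")                  # "1.879.555.1212"
undotted = numeric_key("1.879.555.1212", alpha_clean=True)  # "18795551212"
```

The last two lines show why phone numbers written with periods only line up with other formats once decimal points are ignored as well.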
Regular Expression
Creates custom mapping types for any text field. Knowledge of regular expressions is required to use this mapping type and customers are responsible for building their own expressions. There are several free regex tutorial websites online that users may want to reference to learn more.
Regular Expression mapping will use the first match located in the string:
Example: Regular Expression: [0-9]+
Input data: 123 S Main St
Filtered result: 123
Input data: 45 E. Center
Filtered result: 45
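The "first match" behavior in the example can be reproduced with Python's standard `re` module. This is a sketch for experimenting with expressions before using them in a scenario; the function name `regex_filter` is hypothetical, and regex dialects can differ between Python and the platform that runs DupeBlocker.

```python
import re


def regex_filter(pattern: str, value: str) -> str:
    """Return the first match of pattern in value, or a blank if there is none."""
    match = re.search(pattern, value)
    return match.group(0) if match else ""


r1 = regex_filter(r"[0-9]+", "123 S Main St")  # "123"
r2 = regex_filter(r"[0-9]+", "45 E. Center")   # "45"
r3 = regex_filter(r"[0-9]+", "Main St")        # ""
```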
Regular Expression Y/N
Returns a Yes or No value indicating whether the data matches the Regular Expression, rather than returning the matching value (or a blank) as the standard Regular Expression mapping type does.
Example: A Regular Expression could check whether a field is 5 digits, blank, or 5 characters:
23232 = Yes
232323 = No
(blank) = Yes
Aabbc = Yes
Bob = No
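One expression that produces the results listed above is sketched here in Python. The pattern is an assumption (the article does not give the expression that was used), it interprets "5 chars" as five letters, and `regex_yes_no` is a hypothetical helper name.

```python
import re

# Assumed pattern: exactly five digits, exactly five letters, or nothing at all.
FIVE_OR_BLANK = r"^([0-9]{5}|[A-Za-z]{5})?$"


def regex_yes_no(pattern: str, value: str) -> str:
    """Return "Yes" when the value passes the expression, otherwise "No"."""
    return "Yes" if re.search(pattern, value) else "No"


results = {v: regex_yes_no(FIVE_OR_BLANK, v) for v in ["23232", "232323", "", "Aabbc", "Bob"]}
# {"23232": "Yes", "232323": "No", "": "Yes", "Aabbc": "Yes", "Bob": "No"}
```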
Relaxed NA Phone Match
Designed specifically for 10-digit North American phone numbers. Ignores all non-numeric characters and spaces, leading 0s or 1s, area codes, and extensions.
Example:
+1 879 555 1212 ext 500
(879) 555 1212
1.879.555.1212 ext 408
Are all seen as "5551212"
This is based on a 10-digit North American phone number. Although it can be used with international phone numbers (will just return 7 digits in the middle), the Numeric mapping type is the recommended mapping type to be used with international phone numbers.
Since this mapping type IGNORES AREA CODES and returns just the "555-1212" portion of the phone number IT SHOULD NOT BE USED BY ITSELF (only use when additionally matching on other fields to avoid updating the wrong record).
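The reduction to a 7-digit key can be sketched in Python as follows. The function `relaxed_na_phone`, the assumption that extensions are introduced by "ext" or "x", and the fallback for short numbers are all choices made for this illustration rather than documented behavior.

```python
import re


def relaxed_na_phone(value: str) -> str:
    """Reduce a North American phone number to its 7-digit local part."""
    # Cut off a trailing extension such as "ext 500" or "x500" (assumed markers).
    main = re.split(r"\s*(?:ext\.?|x)\s*\d+\s*$", value, flags=re.IGNORECASE)[0]
    # Keep digits only, then ignore leading 0s and 1s.
    digits = re.sub(r"[^0-9]", "", main).lstrip("01")
    if len(digits) >= 10:
        return digits[3:10]  # skip the 3-digit area code
    return digits[-7:]       # shorter input: keep what is there (assumption)


p1 = relaxed_na_phone("+1 879 555 1212 ext 500")  # "5551212"
p2 = relaxed_na_phone("(879) 555 1212")           # "5551212"
p3 = relaxed_na_phone("1.879.555.1212 ext 408")   # "5551212"
```

Two unrelated numbers in different area codes can produce the same key, which is the reason this mapping type should be combined with matching on other fields.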
Relaxed Domain
Matches the main domain name of a website by parsing the word before the .com, .co, .org, .edu, etc. Helpful for email to website matches.
Example:
www.validity.com
validity.com
http://www.validity.com
validity.co.uk
www.validity.com/dupeblocker/downloads
john.sample@validity.com
Are all seen as "validity"
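The parsing shown in the example can be approximated with the Python sketch below. It assumes a small, hypothetical set of domain endings (`SUFFIX_LABELS`) and a hypothetical function `relaxed_domain`; a production parser would need a far more complete list of endings.

```python
import re

# Hypothetical, deliberately short list of endings to strip.
SUFFIX_LABELS = {"com", "co", "org", "edu", "net", "uk"}


def relaxed_domain(value: str) -> str:
    """Reduce a URL or email address to the word before .com, .co.uk, etc."""
    text = value.strip().lower()
    if "@" in text:
        text = text.rsplit("@", 1)[1]  # email: keep what follows the @ sign
    text = re.sub(r"^[a-z][a-z0-9+.-]*://", "", text)  # drop "http://" etc.
    host = text.split("/", 1)[0]       # drop any path
    labels = [part for part in host.split(".") if part]
    if labels and labels[0] == "www":
        labels = labels[1:]
    while len(labels) > 1 and labels[-1] in SUFFIX_LABELS:
        labels.pop()
    return labels[-1] if labels else ""


samples = [
    "www.validity.com",
    "validity.com",
    "http://www.validity.com",
    "validity.co.uk",
    "www.validity.com/dupeblocker/downloads",
    "john.sample@validity.com",
]
keys = {relaxed_domain(s) for s in samples}  # {"validity"}
```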
Relaxed Street Address Match
Parses the street address to the lowest common denominator: the street number and name. Although based on North American standards, it has also proved effective with most country address formats.
Example:
123 NW Pavillion Ave
123 Pavillion St, Suite 400
123 Pavillion Avenue, Fl 4
123 Pavillion
123 Pavillion Rd.
Are all seen as "123 Pavillion"
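A very rough Python approximation of this reduction is shown below. It is a heuristic written for this article only: the word lists `DIRECTIONS` and `STREET_TYPES`, the rule of discarding everything after the first comma, and the function name `relaxed_street` are assumptions, and the real parser is certainly more thorough.

```python
# Hypothetical word lists; real address data needs much longer ones.
DIRECTIONS = {"n", "s", "e", "w", "ne", "nw", "se", "sw",
              "north", "south", "east", "west"}
STREET_TYPES = {"st", "street", "ave", "avenue", "rd", "road",
                "blvd", "boulevard", "dr", "drive", "ln", "lane"}


def relaxed_street(value: str) -> str:
    """Keep only the street number and name, lower-cased."""
    first_part = value.split(",", 1)[0]  # drop "Suite 400", "Fl 4", ...
    words = [w.strip(".").lower() for w in first_part.split()]
    kept = [w for w in words if w and w not in DIRECTIONS and w not in STREET_TYPES]
    return " ".join(kept)


addresses = [
    "123 NW Pavillion Ave",
    "123 Pavillion St, Suite 400",
    "123 Pavillion Avenue, Fl 4",
    "123 Pavillion",
    "123 Pavillion Rd.",
]
street_keys = {relaxed_street(a) for a in addresses}  # {"123 pavillion"}
```

A street whose name is itself a direction or street type (for example "123 North St") would lose its name under this simple filter, which is one reason it should be read as a sketch only.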
State Match
Matches US State and Canadian province abbreviations to their long names. Can be customized in the DupeBlocker Settings Tab to accommodate international state matching.
Example:
MA -> Massachusetts
CA -> California
ON -> Ontario
Street Address Match
Slightly more rigid than Relaxed Street Address Match. Matches a street abbreviation with its long equivalent. For example, crescent -> cres, road -> rd, street -> st, etc. Will also match other common abbreviations in street addresses. For example, North -> N, Suite -> Ste, etc.
Zip 5 and 9
Matches a 5-digit USPS ZIP code to its 9-digit equivalent.
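One way to picture this is that both forms are reduced to the first five digits before comparison. The Python sketch below assumes exactly that; the function name `zip5` and the choice to return a blank for values that are neither 5 nor 9 digits long are hypothetical.

```python
def zip5(value: str) -> str:
    """Return the 5-digit part of a 5- or 9-digit ZIP code, else a blank."""
    digits = "".join(ch for ch in value if ch in "0123456789")
    return digits[:5] if len(digits) in (5, 9) else ""


short_form = zip5("02110")      # "02110"
long_form = zip5("02110-1234")  # "02110"
```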
DupeBlocker Settings
More information on customizing DupeBlocker Settings for use with the Cleaned Account Name, First Name, State Match, and Country Match mapping types can be found HERE.
Mapping Options
Additional mapping options (English Fuzzy, Transpose, and AlphaClean) can be added to further refine the match.
English Fuzzy: Phonetics engine capable of analyzing words for how they sound when pronounced. By removing vowels and analyzing the remaining consonants, the fuzzy engine works well for matching fields with spelling mistakes.
Fuzzy will IGNORE numbers when matching, as numbers do not have a phonetic equivalent. Therefore, Girl Scout Troop 100 and Girl Scout Troop 780 will be considered a match if Fuzzy is checked. DO NOT USE for matching phone numbers or other fields that contain numeric characters that you do not wish to ignore (e.g. street addresses).
Full description of the fuzzy algorithm in use (Metaphone) can be found here: http://en.wikipedia.org/wiki/Metaphone
Transpose: Matches all words in a field regardless of their order. For example, "Smith and Jones" will match to "Jones and Smith".
AlphaClean: The AlphaClean engine extends some of the capabilities of the account name cleaner to other fields for easier matching. AlphaClean is used for ASCII (North American) data to ensure the only characters that are analyzed are the 26 letters of the English alphabet, numbers 0-9, space, and &. Any other character that the field may contain will be ignored by the deduplication matching algorithms.
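The Transpose and AlphaClean options described above can each be sketched in a few lines of Python. The function names `transpose_key` and `alpha_clean`, and the lower-casing in `transpose_key`, are assumptions made for this illustration.

```python
import re


def transpose_key(value: str) -> tuple:
    """Order-independent key: the words of the value, lower-cased and sorted."""
    return tuple(sorted(value.lower().split()))


def alpha_clean(value: str) -> str:
    """Keep only A-Z, a-z, 0-9, space and &; drop every other character."""
    return re.sub(r"[^A-Za-z0-9 &]", "", value)


same_words = transpose_key("Smith and Jones") == transpose_key("Jones and Smith")  # True
cleaned = alpha_clean("AT&T, Inc.")  # "AT&T Inc"
```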
Match Blank
By default, blank fields are not matched against each other. If this option is checked, then two blank fields will be considered a match.
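The effect of the option can be summarized with a small Python sketch. The function `fields_match` and its `match_blank` parameter are hypothetical, and treating whitespace-only values as blank is an assumption.

```python
def fields_match(a: str, b: str, match_blank: bool = False) -> bool:
    """Exact comparison in which two blanks match only when match_blank is set."""
    a, b = a.strip(), b.strip()
    if not a and not b:
        return match_blank  # both blank: depends on the option
    if not a or not b:
        return False        # one blank, one filled: never a match
    return a == b


default_behavior = fields_match("", "")                 # False
with_option = fields_match("", "", match_blank=True)    # True
```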