Adding support for non-Latin character sets makes Gmail a more useful service, but it also gives spammers a new way to try to invade your inbox: homoglyphs, characters that can be mistaken for standard letters.
As Google explains on its official enterprise blog, while supporting domain names using those characters is helpful, it also requires checking for potentially suspicious combinations:
Scammers can exploit the fact that ဝ, ૦, and ο look nearly identical to the letter o, and by mixing and matching them, they can hoodwink unsuspecting victims. Can you imagine the risk of clicking “ShဝppingSite” vs. “ShoppingSite” or “MyBank” vs. “MyBɑnk”?
The Unicode Consortium has developed a list of “suspicious combinations”, which are character sequences that would be unlikely to appear in the relevant native language but which could be used to create deceptive domain names. Hit the link to read more.
Protecting Gmail in a global world [Google Enterprise Blog]
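To make the idea of a "suspicious combination" concrete, here is a rough Python sketch that flags a name whose letters come from more than one script. It is not Google's filter or the Unicode Consortium's actual rules: the function names are made up for illustration, and the script is only approximated from the first word of each character's Unicode name, since Python's standard library does not expose the script property directly.

```python
import unicodedata


def script_of(ch):
    """Approximate a character's script by the first word of its Unicode name.

    This is a heuristic (an assumption of this sketch), not the real
    Unicode Script property.
    """
    try:
        return unicodedata.name(ch).split()[0]
    except ValueError:
        # Character has no name in the Unicode database.
        return "UNKNOWN"


def scripts_in(label):
    """Return the set of approximate scripts used by the letters in label."""
    return {script_of(ch) for ch in label if ch.isalpha()}


def looks_mixed(label):
    """True if the letters in label appear to come from more than one script."""
    return len(scripts_in(label)) > 1


# "\u101d" is MYANMAR LETTER WA, the lookalike "o" from the quoted example.
print(looks_mixed("ShoppingSite"))
print(looks_mixed("Sh\u101dppingSite"))
```

A check this crude would also flag legitimate text in languages that routinely mix scripts, such as Japanese, and it would miss lookalikes that live inside the Latin block itself, such as the ɑ in "MyBɑnk". That is part of why a curated list of unlikely combinations is needed rather than a blanket rule.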
Comments
One response to “How Gmail Is Fighting The Unicode Homoglyphs”
Sounds like a very complicated solution to a simple problem. Couldn’t they have used a table like the one above to parse incoming messages, search for known Unicode homoglyphs and replace them with their Latin ‘equivalent’, then run that message through their existing spam filter along with the original?
Unicode domain names are hardly a new problem… how does Microsoft Outlook handle it? http://www.unicode.org/faq/idn.html
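For reference, the table-based idea raised in the comment above can be sketched in a few lines of Python. The four-entry table and the function names are hypothetical and purely illustrative; a usable table would need to be far larger, and nothing here reflects how Gmail or Outlook actually work.

```python
# Hypothetical, deliberately tiny lookalike table (an assumption of this sketch).
LOOKALIKES = {
    "\u101d": "o",  # MYANMAR LETTER WA
    "\u0ae6": "o",  # GUJARATI DIGIT ZERO
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
    "\u0251": "a",  # LATIN SMALL LETTER ALPHA
}
FOLD_TABLE = str.maketrans(LOOKALIKES)


def fold(text):
    """Replace each known lookalike with its Latin 'equivalent'."""
    return text.translate(FOLD_TABLE)


def variants_to_scan(text):
    """Return the original text plus its folded form when the two differ.

    Both would then be handed to an existing spam filter, which is not
    modeled here.
    """
    folded = fold(text)
    return [text] if folded == text else [text, folded]


print(variants_to_scan("MyB\u0251nk"))
print(variants_to_scan("MyBank"))
```

The catch is that folding on its own cannot tell an attack from ordinary use: a genuine Greek-language message is full of omicrons, so the folded copy of legitimate mail would be gibberish. That may be one reason a filter would look at which characters appear together rather than rewriting them outright.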