Brad Fitzpatrick (brad) wrote,

Canonicalizing locations from user input

I've been working on improving (redoing) the LJ directory, which hasn't gotten love in ages. The new design is wonderful and the subject of a future post, but I wanted to share this table first:

Location              Count  Canonical
RU--Москва            76190  Moscow
RU--Moscow            70213  Moscow
RU--                  70188
RU--Санкт-Петербург   14703  Saint Petersburg
RU--Saint-Petersburg   4844  Saint Petersburg
RU--Питер              4614  Saint Petersburg
RU--SPb                4209  Saint Petersburg
RU--москва             3743  Moscow
RU--Новосибирск        2887
RU--Екатеринбург       2429
RU--Novosibirsk        2345
RU--Moskow             2232  Moscow
RU--СПб                2170  Saint Petersburg
RU--Msk                2012  Moscow
RU--St.Petersburg      1866  Saint Petersburg
RU--St. Petersburg     1533  Saint Petersburg
RU--Нижний Новгород    1503
RU--Samara             1497
RU--Самара             1349
RU--Ростов-на-Дону     1214
RU--Челябинск          1201
RU--Казань             1150
RU--Уфа                1057
RU-Moscow-Moscow       1055  Moscow
RU--Иркутск            1036
RU-Москва-Москва       1033  Moscow
RU--Воронеж            1028
RU--Калининград         999
RU--Kazan               965
RU--Ufa                 956
RU--Петербург           954  Saint Petersburg
RU--Красноярск          950
RU--Vladivostok         936
RU--Краснодар           935
RU--Kaliningrad         932
RU--Владивосток         923
RU--Пермь               913
RU--Perm                866
RU--Omsk                820

The data for Australia is also bad, but nowhere near this bad.

Clearly some canonicalization is in order! (Don't worry, I'll never change what appears on profile pages, just how searches are grouped.)
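
Here is a rough sketch of what that search-time grouping could look like. It is written in Python purely for illustration and is not the directory's actual code. The alias table, the function names, and the decision to ignore the state field when grouping are all assumptions. The aliases cover only the Moscow and Saint Petersburg spellings annotated in the table above.

```python
import re
from collections import Counter

# Hypothetical alias table: normalized spelling -> canonical city name.
# Only the variants annotated in the table are listed here.
ALIASES = {
    "москва": "Moscow",
    "moscow": "Moscow",
    "moskow": "Moscow",
    "msk": "Moscow",
    "санкт петербург": "Saint Petersburg",
    "saint petersburg": "Saint Petersburg",
    "st petersburg": "Saint Petersburg",
    "петербург": "Saint Petersburg",
    "питер": "Saint Petersburg",
    "spb": "Saint Petersburg",
    "спб": "Saint Petersburg",
}


def normalize(city):
    """Case-fold and replace punctuation/whitespace runs with one space."""
    return " ".join(re.findall(r"\w+", city.casefold()))


def canonicalize(city):
    """Return the canonical city for known aliases, else the trimmed input."""
    return ALIASES.get(normalize(city), city.strip())


def group_counts(rows):
    """Sum counts per (country, canonical city).

    rows holds (country, state, city, count) tuples. The state field is
    ignored here (an assumption) so that RU-Moscow-Moscow and RU--Moscow
    end up in the same bucket.
    """
    totals = Counter()
    for country, _state, city, count in rows:
        totals[(country, canonicalize(city))] += count
    return totals


if __name__ == "__main__":
    sample = [
        ("RU", "", "Москва", 76190),
        ("RU", "", "Moscow", 70213),
        ("RU", "", "москва", 3743),
        ("RU", "Moscow", "Moscow", 1055),
        ("RU", "", "St.Petersburg", 1866),
        ("RU", "", "St. Petersburg", 1533),
        ("RU", "", "Omsk", 820),
    ]
    for key, total in group_counts(sample).most_common():
        print(key, total)
```

The stored profile value is never touched; only the grouping key changes. Note that this does not merge transliterated pairs such as Samara and Самара unless they are added to the alias table, and typos like "Moskow" have to be listed by hand.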
Tags: tech, work