brad's life - Naming twins in Python & Perl [entries|archive|friends|userinfo]
Brad Fitzpatrick

[ website | bradfitz.com ]
[ userinfo | livejournal userinfo ]
[ archive | journal archive ]

Naming twins in Python & Perl [Jan. 6th, 2008|02:16 pm]
Previous Entry Add to Memories Share Next Entry
[Tags|, , , ]

Last night at Beau's party, one of Beau's guests mentioned he's expecting twins shortly, which is why is wife wasn't at the party.

I drunkenly suggested he name his kids two names that were anagrams of each other. I then wandered off downstairs to find such suitable names.

Because I'm supposed to be be working in Python these days, not Perl, I gathered what little Python knowledge I had and wrote out:
#!/usr/bin/python

by_anagram = {}

names_file = open("dist.male.first")
for line in names_file:
    # lines in file look like:
    # WILLIAM        2.451 14.812      5
    # we want just the first field.
    name = (line.split())[0]
    letters = [letter for letter in name]
    letters.sort()
    sorted_name = "".join(letters)
    if not sorted_name in by_anagram:
        by_anagram[sorted_name] = []
    by_anagram[sorted_name].append(name)

for sorted_name in by_anagram:
    if len(by_anagram[sorted_name]) < 2:
        continue

    print by_anagram[sorted_name]
Not so happy with it, but it worked, and I printed out the results and brought them up to the guy:
['TROY', 'TORY']
['CLAY', 'LACY']
['JEFFREY', 'JEFFERY']
['ANGEL', 'GALEN']
['FOREST', 'FOSTER']
['ANDRE', 'DAREN', 'ARDEN']
['LEROY', 'ELROY']
['LEONARD', 'RENALDO', 'LEANDRO']
['ARIEL', 'ARLIE']
['BRENDAN', 'BRANDEN']
['WARREN', 'WARNER']
['DEAN', 'DANE']
['CHRISTOPER', 'CRISTOPHER']
['COLE', 'CLEO']
['CARLO', 'CAROL']
['ELMER', 'MERLE']
['REUBEN', 'RUEBEN']
['JOSEPH', 'JOESPH', 'JOSPEH']
['COREY', 'ROYCE']
['JASON', 'JONAS']
['RAMON', 'ROMAN']
['JAMIE', 'JAIME']
['CARMELO', 'MARCELO']
['BYRON', 'BRYON']
['LEON', 'NOEL', 'OLEN']
['NEAL', 'LANE']
['MICHAEL', 'MICHEAL', 'MICHALE']
['KEITH', 'KIETH']
['BERT', 'BRET']
['BRIAN', 'BRAIN']
['OLIN', 'LINO']
['DION', 'DINO']
['DANA', 'ADAN']
['RONALD', 'ROLAND', 'ARNOLD']
['ISRAEL', 'ISREAL']
['DARNELL', 'RANDELL']
['ANTOINE', 'ANTIONE']
['ORLANDO', 'ROLANDO', 'ARNOLDO']
Just now, I was wondering the equivalent in Perl, and wrote:
#!/usr/bin/perl

use strict;
open (my $fh, "dist.male.first") or die;
my %by_anagram;
while (<$fh>) {
    chomp;
    s/\s.*//;
    my $name = $_;
    my $sorted_name = join('', sort split //, $name);
    push @{$by_anagram{$sorted_name}}, $name;
}

foreach my $sn (grep { @{$by_anagram{$_}} > 1 } keys %by_anagram) {
    print "@{$by_anagram{$sn}}\n";
}
In particular, I like about Python that errors-are-exceptions is the norm. But I like that regexps are built into Perl. (also hate Python's general hating on functional programming, unrelated to this post) I'm sure my Python could be way shorter, too. Anybody want to post either their short Python version, or their more-idiomatic Python version?

Also interesting: (fastest of a 3 consecutive runs each)
sammy$ time ./ananames.pl > /dev/null

real	0m0.026s
user	0m0.024s
sys	0m0.004s

sammy$ time ./ananames.py > /dev/null

real	0m0.043s
user	0m0.036s
sys	0m0.008s
LinkReply

Comments:
From: (Anonymous)
2008-01-07 02:40 pm (UTC)

(Link)

hm... wouldn't it be faster when using tuples instead of a string for the dict key?
by_anagram[tuple(sorted(name))].append(name)
instead of
by_anagram["".join(sorted(name))].append(name)
[User Picture]From: ciphergoth
2008-01-15 05:22 pm (UTC)

(Link)

This doesn't need to be fast. "".join(sorted(name)) would probably be easier if we had to debug this, but tuple is shorter and perhaps clearer, so I think it's a coin toss...