brad's life
Brad Fitzpatrick

[ website | bradfitz.com ]

Tonight [Apr. 5th, 2008|01:32 am]

I had two options for this Friday evening:
  • go to Tahoe & ski
  • go to Monterey & dive
I did neither.

Skiing would've required both days, I'd already committed to diving when my friends did their certification check-out dives, and I kinda hurt my knee the other day playing Ultimate.

But scuba: I couldn't find a scuba partner for tomorrow morning, and everybody else will be in classes, sitting on the bottom of the ocean tossing their regulators and finding them again.

So I sat at home, cleaned up my music collection some more, and did this hack: smart MP3-aware file chunking for Brackup. Now I can retag music at will and incremental backups in the future won't re-backup the music bytes, just the ID3 tags. MP3 files, when smart chunking is enabled, now have 1, 2, or 3 chunks, depending on number/type of ID3 tags.
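A rough sketch of how such a split could work. This is not Brackup's actual code: it's Python 3, and `mp3_chunk_ranges` is a made-up name. It assumes an ID3v2 tag, if present, sits at the very start of the file with the standard 10-byte header, and an ID3v1 tag, if present, is the last 128 bytes and starts with "TAG":

```python
def mp3_chunk_ranges(data):
    """Return (label, start, end) byte ranges: optional ID3v2, audio, optional ID3v1."""
    start, end = 0, len(data)
    ranges = []

    # ID3v2 header: "ID3", 2 version bytes, 1 flags byte, 4-byte synchsafe size.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            size = (size << 7) | (byte & 0x7F)   # only 7 bits of each byte count
        start = 10 + size
        if data[5] & 0x10:                       # footer-present flag: 10 more bytes
            start += 10
        start = min(start, end)                  # guard against a bogus size field
        ranges.append(("id3v2", 0, start))

    # ID3v1: fixed 128-byte trailer beginning with "TAG".
    trailer = None
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        trailer = ("id3v1", end - 128, end)
        end -= 128

    ranges.append(("audio", start, end))
    if trailer:
        ranges.append(trailer)
    return ranges
```

Each range would then be backed up as its own chunk, so retagging changes the tag chunks while the bytes of the audio chunk stay the same.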

As for fun stuff: I'll drive to Monterey tomorrow and just pay for one night's hotel, party it up in the hotel bar/pool, then dive Sunday morning with Erin, Whitaker, Tessa, Dan, Julie, and not Ojan.

Update: (oh, I also got iSCSI working from OS X to Linux. what a frickin' eventful evening, apparently.)

Steadycam for my laptop [Jan. 29th, 2008|07:53 am]

My laptop has a camera, pointing at my face, and an accelerometer.

The shuttle is bumpy (well, in the city).

Between it doing some sort of head-to-screen relativity with the camera and feeling bumps with the accelerometer, I want it to rapidly pan the screen around, compensating for the bumps, making the text I'm always reading stable.

Too much to ask?

Naming twins in Python & Perl [Jan. 6th, 2008|02:16 pm]

Last night at Beau's party, one of Beau's guests mentioned he's expecting twins shortly, which is why his wife wasn't at the party.

I drunkenly suggested he name his kids two names that were anagrams of each other. I then wandered off downstairs to find such suitable names.

Because I'm supposed to be working in Python these days, not Perl, I gathered what little Python knowledge I had and wrote out:
#!/usr/bin/python

by_anagram = {}

names_file = open("dist.male.first")
for line in names_file:
    # lines in file look like:
    # WILLIAM        2.451 14.812      5
    # we want just the first field.
    name = (line.split())[0]
    letters = [letter for letter in name]
    letters.sort()
    sorted_name = "".join(letters)
    if not sorted_name in by_anagram:
        by_anagram[sorted_name] = []
    by_anagram[sorted_name].append(name)

for sorted_name in by_anagram:
    if len(by_anagram[sorted_name]) < 2:
        continue

    print by_anagram[sorted_name]
Not so happy with it, but it worked, and I printed out the results and brought them up to the guy:
['TROY', 'TORY']
['CLAY', 'LACY']
['JEFFREY', 'JEFFERY']
['ANGEL', 'GALEN']
['FOREST', 'FOSTER']
['ANDRE', 'DAREN', 'ARDEN']
['LEROY', 'ELROY']
['LEONARD', 'RENALDO', 'LEANDRO']
['ARIEL', 'ARLIE']
['BRENDAN', 'BRANDEN']
['WARREN', 'WARNER']
['DEAN', 'DANE']
['CHRISTOPER', 'CRISTOPHER']
['COLE', 'CLEO']
['CARLO', 'CAROL']
['ELMER', 'MERLE']
['REUBEN', 'RUEBEN']
['JOSEPH', 'JOESPH', 'JOSPEH']
['COREY', 'ROYCE']
['JASON', 'JONAS']
['RAMON', 'ROMAN']
['JAMIE', 'JAIME']
['CARMELO', 'MARCELO']
['BYRON', 'BRYON']
['LEON', 'NOEL', 'OLEN']
['NEAL', 'LANE']
['MICHAEL', 'MICHEAL', 'MICHALE']
['KEITH', 'KIETH']
['BERT', 'BRET']
['BRIAN', 'BRAIN']
['OLIN', 'LINO']
['DION', 'DINO']
['DANA', 'ADAN']
['RONALD', 'ROLAND', 'ARNOLD']
['ISRAEL', 'ISREAL']
['DARNELL', 'RANDELL']
['ANTOINE', 'ANTIONE']
['ORLANDO', 'ROLANDO', 'ARNOLDO']
Just now, I was wondering about the equivalent in Perl, and wrote:
#!/usr/bin/perl

use strict;
open (my $fh, "dist.male.first") or die;
my %by_anagram;
while (<$fh>) {
    chomp;
    s/\s.*//;
    my $name = $_;
    my $sorted_name = join('', sort split //, $name);
    push @{$by_anagram{$sorted_name}}, $name;
}

foreach my $sn (grep { @{$by_anagram{$_}} > 1 } keys %by_anagram) {
    print "@{$by_anagram{$sn}}\n";
}
What I like about Python in particular is that errors-are-exceptions is the norm. But I like that regexps are built into Perl. (I also hate Python's general hating on functional programming, but that's unrelated to this post.) I'm sure my Python could be way shorter, too. Anybody want to post either their short Python version, or their more-idiomatic Python version?

Also interesting: (fastest of 3 consecutive runs each)
sammy$ time ./ananames.pl > /dev/null

real	0m0.026s
user	0m0.024s
sys	0m0.004s

sammy$ time ./ananames.py > /dev/null

real	0m0.043s
user	0m0.036s
sys	0m0.008s
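
Going back to the request for a more idiomatic Python version: one possible take, as a sketch only. It's Python 3 rather than the Python 2 above, it is not what was timed here, and `anagram_groups` is just a name picked for illustration. It leans on `collections.defaultdict` and `sorted()` to do the grouping:

```python
from collections import defaultdict

def anagram_groups(lines):
    """Group first-field names by their sorted letters; keep groups of 2 or more."""
    by_anagram = defaultdict(list)
    for line in lines:
        fields = line.split()
        if not fields:
            continue                      # skip blank lines
        name = fields[0]                  # e.g. "WILLIAM" from the census line
        by_anagram["".join(sorted(name))].append(name)
    return [names for names in by_anagram.values() if len(names) > 1]
```

Usage would be along the lines of `for names in anagram_groups(open("dist.male.first")): print(names)`.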

Treearrange: a complement to rsync [Aug. 27th, 2006|10:59 pm]

My quick hack of the evening is treearrange, which rearranges a directory tree based on a description of a directory tree, which the tool also generates.

What problem does this solve? Here's my typical photo-uploading workflow:

-- bring GBs of unorganized photos to work
-- upload GBs of photos from work to my personal server at 100 Mbps.
-- go home
-- rsync down from my server all photos at 6 Mbps. still pretty fast.
-- rearrange/rename. instead of DCIM/nnnCANON/, I rearrange into, say, "Day5-Paris/".

Now, how do I get my photos online? Two choices:

1) upload them from home.
2) upload them from my server (not my home server)

But the problems with the above are, respectively:

1) slow upstream. Not 100 Mbps. More like 1. GBs would take forever.
2) the files aren't in the right places on the server. I only rearranged them locally.

Rsync won't do. Rsync doesn't deal with files changing directories.

Enter treearrange:

Here's my server, where it all begins:
bradfitz@personal_web:~/honeymoon_pics$ find -type d
.
./DCIM
./DCIM/179CANON
./DCIM/180CANON
./DCIM/181CANON
./DCIM/182CANON
./DCIM/183CANON
./DCIM/184CANON
./DCIM/185CANON
./DCIM/186CANON
./Elph
./Elph/DCIM
./Elph/DCIM/135CANON
./Elph/DCIM/136CANON
./Elph/DCIM/CANONMSC
I rsync them down to my house (pretty fast), and rearrange them:
sammy:Sorted $ find -type d
.
./Barcelona_Airport_Hell
./Barcelona-1
./Barcelona-2
./Boat
./Lisa
./Malta
./Marseille
./Midnight_Buffet
./Naples-Vesuvius-Pompeii
./Palma_de_Mallorca
./Rome
./Stockholm-1
Now, using treearrange, I snapshot where the files are supposed to live:
sammy:Sorted $ ./treearrange --to=arrange.dat

$ head arrange.dat 
945fc334853b4c5edfca34c9908258eacfc86823        Barcelona_Airport_Hell/IMG_8675.JPG
fe5551ad173e425c1c8f40c4f06e72389df7c2ab        Barcelona_Airport_Hell/IMG_8676.JPG
c9f0589a24a8de4a65e8670b8bbb4f570a4452ca        Barcelona_Airport_Hell/IMG_8677.JPG
b244692481c84857d2e7824ec310ca074eee5e6c        Barcelona_Airport_Hell/IMG_8678.JPG
20c6dd346021689b32702c28ec62cde6a2c3a7be        Barcelona_Airport_Hell/IMG_8679.JPG
f1fdd495d10aee11a1cb96019b7b6c0a11e5465f        Barcelona_Airport_Hell/IMG_8680.JPG
f429010fe9a906c8bf513016e03e371ee711f3f6        Barcelona_Airport_Hell/IMG_8681.JPG
7885c7b71cd21a28c11985a591e69e81a12ee316        Barcelona_Airport_Hell/IMG_8683.JPG
Next I upload the arrange.dat and treearrange to my server, and do the opposite:
bradfitz@personal_web:~/honeymoon_pics$  ./treearrange --from=arrange.dat
file 1 / 738...
file 2 / 738...
  DCIM/179CANON/IMG_7977.JPG -> Barcelona-1/IMG_7977.JPG
file 3 / 738...
  DCIM/179CANON/IMG_7978.JPG -> Barcelona-1/IMG_7978.JPG
file 4 / 738...
  DCIM/179CANON/IMG_7979.JPG -> Barcelona-1/IMG_7979.JPG
file 5 / 738...
  DCIM/179CANON/IMG_7980.JPG -> Barcelona-1/IMG_7980.JPG
file 6 / 738...
  DCIM/179CANON/IMG_7981.JPG -> Barcelona-1/IMG_7981.JPG
.....

bradfitz@personal_web:~/honeymoon_pics$ find -type d
.
./Barcelona-1
./Boat
./Marseille
./Lisa
./Rome
./Naples-Vesuvius-Pompeii
./Malta
./Midnight_Buffet
./Palma_de_Mallorca
./Barcelona-2
./Barcelona_Airport_Hell
./Stockholm-1
Tada!

(then I can rsync and get any rotations/adjustments/etc that I did locally which weren't just a directory move...)
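
For the curious, the idea boils down to something like the following. This is a Python 3 sketch and not the real treearrange: the function names are invented here, only the "SHA1, whitespace, relative path" line format is taken from the arrange.dat output above, and it assumes no two files have identical contents.

```python
import hashlib
import os

def file_sha1(path):
    """Hex SHA1 of a file's contents, read in 1 MB blocks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()

def walk_files(root):
    """Yield the path of every file under root, relative to root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            yield os.path.relpath(os.path.join(dirpath, filename), root)

def snapshot(root):
    """Describe the tree as 'sha1<TAB>relative/path' lines (the --to side)."""
    return ["%s\t%s" % (file_sha1(os.path.join(root, rel)), rel)
            for rel in sorted(walk_files(root))]

def rearrange(root, lines):
    """Move files under root to where the snapshot says they live (the --from side)."""
    current = {file_sha1(os.path.join(root, rel)): rel
               for rel in list(walk_files(root))}
    moves = []
    for line in lines:
        sha1, wanted = line.rstrip("\n").split(None, 1)
        have = current.get(sha1)
        if have is None or have == wanted:
            continue                      # content not here, or already in place
        target = os.path.join(root, wanted)
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        os.rename(os.path.join(root, have), target)
        moves.append((have, wanted))
    return moves
```

Unlike the run shown above, this sketch leaves the emptied source directories behind, and it would need extra care for duplicate files.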

How should I hack? [May. 23rd, 2006|11:09 pm]

Poll #735092 Hacking focus
Open to: All, results viewable to: All

How should I spend my copious free hacking time?

Brackup (progress bars, purging old files from old snapshots, amazon restore support,....)
22 (14.4%)

Perlbal (parallel gzip files, non-blocking SSL support)
3 (2.0%)

MogileFS (test suite, pluggable replication policies)
6 (3.9%)

DJabberd (finish XMPP compliance, add clustering support)
25 (16.3%)

Memcached (merge billion patches from community)
46 (30.1%)

OpenID (work with JanRain, Verisign, community)
36 (23.5%)

other (write-in)
15 (9.8%)


wsbackup -- encrypted, over-the-net, multi-versioned backup [Mar. 19th, 2006|12:50 pm]

There are lots of ways to store files on the net lately:

-- Amazon S3 is the most interesting,
-- Google's rumored GDrive is surely coming soon
-- Apple has .Mac

I want to back up to them. And more than one. So first off, abstract out net-wide storage.... my backup tool (wsbackup) isn't targeting just one. They're all just providers.

Also, don't trust sending my data in cleartext, and having it stored in cleartext, so public key encryption is a must. Then I can run automated backups from many hosts, without much fear of keys being compromised.

Don't want people being able to do size-analysis, and huge files are a pain anyway, so big files are cut into chunks.

Files stored on Amazon/Google are of form:

-- meta files: backup_rootname-yyyymmddnn.meta, encrypted (YAML?) file mapping relative paths from backup directory root to the stat() information, original SHA1, and array of chunk keys (SHA1s of encrypted chunks) that comprise the file.

-- [sha1ofencryptedchunk].chunk -- content being a <=, say, 20MB chunk of encrypted data.
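
To make the layout concrete, here's a minimal Python 3 sketch of backing up one file. It's an illustration under assumptions, not wsbackup itself: `backup_file` is an invented name, `encrypt` is any bytes-to-bytes callable standing in for the public key encryption, and `store` is a plain dict standing in for a storage provider.

```python
import hashlib

CHUNK_SIZE = 20 * 1024 * 1024   # the "<=, say, 20MB" chunk size from above

def backup_file(path, encrypt, store, chunk_size=CHUNK_SIZE):
    """Chunk, encrypt and store one file; return its entry for the meta file.

    encrypt: callable bytes -> bytes (placeholder for real encryption)
    store:   dict-like object keyed by "<sha1>.chunk" (placeholder provider)
    """
    original = hashlib.sha1()             # SHA1 of the plaintext file
    chunk_keys = []
    with open(path, "rb") as f:
        while True:
            plain = f.read(chunk_size)
            if not plain:
                break
            original.update(plain)
            encrypted = encrypt(plain)
            key = hashlib.sha1(encrypted).hexdigest()   # SHA1 of the encrypted chunk
            store.setdefault(key + ".chunk", encrypted) # don't re-upload a known chunk
            chunk_keys.append(key)
    return {"sha1": original.hexdigest(), "chunks": chunk_keys}
```

The returned dict, plus the stat() information, is roughly what the meta file would map each relative path to. One caveat on the dedup line: it only helps if encrypting the same bytes twice gives the same output, which real public key encryption usually doesn't.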

Then every night different hosts/laptops recurse directory trees, consult a stat() cache (on, say, inode number, mtime, size, whatever) and do SHA1 calculations on changed files, look up the rest from the cache, and build the metafile, upload any new chunks, encrypt the metafile, upload the metafile.
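
The stat() cache part could look something like this in Python 3. Again a sketch: `cached_sha1` is a hypothetical helper, the cache is an in-memory dict (persisting it between nightly runs is left out), and the signature of inode, mtime and size is just the "whatever" from above made concrete.

```python
import hashlib
import os

def cached_sha1(path, cache):
    """Return the file's SHA1, rehashing only when its stat() signature changed.

    cache: dict mapping path -> (signature, sha1 hex digest)
    """
    st = os.stat(path)
    signature = (st.st_ino, st.st_mtime, st.st_size)
    hit = cache.get(path)
    if hit is not None and hit[0] == signature:
        return hit[1]                     # unchanged: reuse the cached digest
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            digest.update(block)
    cache[path] = (signature, digest.hexdigest())
    return cache[path][1]
```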

Result:

-- I can restore any host from any point in time, with Amazon/Google storing all my data, and only paying $0.15/GB-month.

Nice.

I'm partway through writing it. Will open source it soon. Ideally tonight.
