brad's life
Brad Fitzpatrick


Treearrange: a complement to rsync [Aug. 27th, 2006|10:59 pm]

My quick hack of the evening is treearrange, which rearranges a directory tree to match a description of how another copy of that tree is laid out. The tool generates that description, too.

What problem does this solve? Here's my typical photo-uploading workflow:

-- bring GBs of unorganized photos to work
-- upload GBs of photos from work to my personal server at 100 Mbps.
-- go home
-- rsync down from my server all photos at 6 Mbps. still pretty fast.
-- rearrange/rename. instead of DCIM/nnnCANON/, I rearrange into, say, "Day5-Paris/".

Now, how do I get my photos online? Two choices:

1) upload them from home.
2) upload them from my server (not my home server)

But the problems with the above are, respectively:

1) slow upstream. Not 100 Mbps. More like 1. GBs would take forever.
2) the files aren't in the right places on the server. I only rearranged them locally.
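
To put rough numbers on "forever", here is a back-of-the-envelope calculation. The 4 GB batch size is purely a hypothetical stand-in for "GBs", and protocol overhead is ignored; only the link speeds come from the post.

```python
def transfer_hours(size_gb, link_mbps):
    """Hours needed to move size_gb (decimal gigabytes) over a link_mbps link."""
    bits = size_gb * 8 * 1000**3             # gigabytes -> bits
    seconds = bits / (link_mbps * 1000**2)   # megabits/s -> bits/s
    return seconds / 3600


SIZE_GB = 4  # hypothetical batch size, not a figure from the post

for label, mbps in [("work uplink", 100), ("home downlink", 6), ("home uplink", 1)]:
    print(f"{label:14s} {mbps:4d} Mbps  {transfer_hours(SIZE_GB, mbps):5.2f} h")
```

Under those assumptions the same batch takes a few minutes from work, about an hour and a half to pull down at home, and most of a working day to push back up from home.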

Rsync won't do. Rsync doesn't deal with files changing directories.

Enter treearrange:

Here's my server, where it all begins:
bradfitz@personal_web:~/honeymoon_pics$ find -type d
.
./DCIM
./DCIM/179CANON
./DCIM/180CANON
./DCIM/181CANON
./DCIM/182CANON
./DCIM/183CANON
./DCIM/184CANON
./DCIM/185CANON
./DCIM/186CANON
./Elph
./Elph/DCIM
./Elph/DCIM/135CANON
./Elph/DCIM/136CANON
./Elph/DCIM/CANONMSC
I rsync them down to my house (pretty fast), and rearrange them:
sammy:Sorted $ find -type d
.
./Barcelona_Airport_Hell
./Barcelona-1
./Barcelona-2
./Boat
./Lisa
./Malta
./Marseille
./Midnight_Buffet
./Naples-Vesuvius-Pompeii
./Palma_de_Mallorca
./Rome
./Stockholm-1
Now, using treearrange, I snapshot where the files are supposed to live:
sammy:Sorted $ ./treearrange --to=arrange.dat

$ head arrange.dat 
945fc334853b4c5edfca34c9908258eacfc86823        Barcelona_Airport_Hell/IMG_8675.JPG
fe5551ad173e425c1c8f40c4f06e72389df7c2ab        Barcelona_Airport_Hell/IMG_8676.JPG
c9f0589a24a8de4a65e8670b8bbb4f570a4452ca        Barcelona_Airport_Hell/IMG_8677.JPG
b244692481c84857d2e7824ec310ca074eee5e6c        Barcelona_Airport_Hell/IMG_8678.JPG
20c6dd346021689b32702c28ec62cde6a2c3a7be        Barcelona_Airport_Hell/IMG_8679.JPG
f1fdd495d10aee11a1cb96019b7b6c0a11e5465f        Barcelona_Airport_Hell/IMG_8680.JPG
f429010fe9a906c8bf513016e03e371ee711f3f6        Barcelona_Airport_Hell/IMG_8681.JPG
7885c7b71cd21a28c11985a591e69e81a12ee316        Barcelona_Airport_Hell/IMG_8683.JPG
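
The snapshot step can be sketched roughly as follows. This is not the treearrange source: the 40-hex-digit values look like SHA-1 digests of file contents, and that, the tab separator, and the function names `sha1_of_file` and `snapshot` are all assumptions made for illustration.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root, out_path):
    """Write one 'digest<TAB>relative/path' line per file under root.

    Assumed format: modelled on the arrange.dat excerpt, not taken
    from the real tool.
    """
    out_abs = os.path.abspath(out_path)
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk subdirectories in a stable order
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            if os.path.abspath(full) == out_abs:
                continue  # don't list the snapshot file itself
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            lines.append(f"{sha1_of_file(full)}\t{rel}")
    with open(out_path, "w") as out:
        out.write("".join(line + "\n" for line in lines))
    return len(lines)
```

Keying each line on a content digest rather than a filename is what lets the other side recognize a file no matter which directory it currently sits in.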
Next I upload the arrange.dat and treearrange to my server, and do the opposite:
bradfitz@personal_web:~/honeymoon_pics$  ./treearrange --from=arrange.dat
file 1 / 738...
file 2 / 738...
  DCIM/179CANON/IMG_7977.JPG -> Barcelona-1/IMG_7977.JPG
file 3 / 738...
  DCIM/179CANON/IMG_7978.JPG -> Barcelona-1/IMG_7978.JPG
file 4 / 738...
  DCIM/179CANON/IMG_7979.JPG -> Barcelona-1/IMG_7979.JPG
file 5 / 738...
  DCIM/179CANON/IMG_7980.JPG -> Barcelona-1/IMG_7980.JPG
file 6 / 738...
  DCIM/179CANON/IMG_7981.JPG -> Barcelona-1/IMG_7981.JPG
.....

bradfitz@personal_web:~/honeymoon_pics$ find -type d
.
./Barcelona-1
./Boat
./Marseille
./Lisa
./Rome
./Naples-Vesuvius-Pompeii
./Malta
./Midnight_Buffet
./Palma_de_Mallorca
./Barcelona-2
./Barcelona_Airport_Hell
./Stockholm-1
Tada!

(then I can rsync and get any rotations/adjustments/etc that I did locally which weren't just a directory move...)
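
For the curious, the --from direction might look something like the sketch below. Again, this is a guess at the approach, not the actual treearrange code: it assumes arrange.dat holds a SHA-1 content digest and a target path per line, and the names `load_snapshot` and `rearrange` are made up. It prunes emptied directories because the "after" listing above no longer shows DCIM; how the real tool handles that is not stated.

```python
import hashlib
import os
import shutil


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_snapshot(snapshot_path):
    """Map digest -> desired relative path (last entry wins on duplicates)."""
    wanted = {}
    with open(snapshot_path) as fh:
        for line in fh:
            line = line.rstrip("\n")
            if line:
                digest, rel = line.split(None, 1)  # tab or run of spaces
                wanted[digest] = rel
    return wanted


def rearrange(root, snapshot_path):
    """Move files under root to the paths the snapshot assigns to their content."""
    wanted = load_snapshot(snapshot_path)
    snap_abs = os.path.abspath(snapshot_path)

    # Collect the file list first so moves don't disturb the directory walk.
    current = []
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.abspath(full) != snap_abs:
                current.append(full)

    moved = []
    for src in sorted(current):
        target_rel = wanted.get(sha1_of_file(src))
        if target_rel is None:
            continue  # content not in the snapshot: leave it alone
        dst = os.path.join(root, *target_rel.split("/"))
        if os.path.abspath(src) == os.path.abspath(dst) or os.path.exists(dst):
            continue  # already in place, or target taken: never overwrite
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.move(src, dst)
        moved.append((os.path.relpath(src, root), target_rel))

    # Remove directories left empty by the moves, deepest first.
    for dirpath, _dirs, _files in os.walk(root, topdown=False):
        if dirpath != root and not os.listdir(dirpath):
            os.rmdir(dirpath)
    return moved
```

Files whose contents changed locally (rotations, adjustments) won't match any digest on the server, so a sketch like this leaves them where they are and the follow-up rsync picks them up.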

Comments:
From: gaal
2006-08-28 07:24 am (UTC)
But then how can resume work?

I start syncing by pulling a new file from remotehost to my localhost. Then the download is interrupted, and I resume it. What identifies the partial file on localhost?

From: gaal
2006-08-28 07:26 am (UTC)
Hm, maybe the syncer should pre-tag all files with their own fingerprint and make sure that gets transmitted early?