Brad Fitzpatrick

Treearrange: a complement to rsync [Aug. 27th, 2006|10:59 pm]

My quick hack of the evening is treearrange, which rearranges a directory tree to match a description of another directory tree. The tool generates that description, too.

What problem does this solve? Here's my typical photo-uploading workflow:

-- bring GBs of unorganized photos to work
-- upload GBs of photos from work to my personal server at 100 Mbps.
-- go home
-- rsync down from my server all photos at 6 Mbps. still pretty fast.
-- rearrange/rename. instead of DCIM/nnnCANON/, I rearrange into, say, "Day5-Paris/".

Now, how do I get my photos online? Two choices:

1) upload them from home.
2) upload them from my server (not my home server)

But the problems with the above are, respectively:

1) slow upstream. Not 100 Mbps. More like 1. GBs would take forever.
2) the files aren't in the right places on the server. I only rearranged them locally.
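
To put rough numbers on those link speeds, here is a back-of-the-envelope sketch. The 4 GB batch size is an assumption for illustration, not a figure from the post, and the calculation ignores protocol overhead.

```python
def transfer_seconds(size_gb: float, link_mbps: float) -> float:
    """Seconds to move size_gb gigabytes over a link_mbps link.

    Uses decimal units (1 GB = 8 * 10**9 bits, 1 Mbps = 10**6 bits/s)
    and ignores protocol overhead.
    """
    bits = size_gb * 8 * 1000**3
    return bits / (link_mbps * 1000**2)


if __name__ == "__main__":
    size_gb = 4.0  # assumed batch size, purely illustrative
    links = [("work -> server", 100), ("server -> home", 6), ("home -> server", 1)]
    for label, mbps in links:
        hours = transfer_seconds(size_gb, mbps) / 3600
        print(f"{label:15s} {mbps:4d} Mbps  {hours:6.2f} h")
```

A single gigabyte at 1 Mbps works out to 8,000 seconds, a bit over two hours, versus 80 seconds at 100 Mbps.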

Rsync won't do. Rsync doesn't deal with files changing directories: it would see each moved file as a brand-new file and upload it all over again.

Enter treearrange:

Here's my server, where it all begins:
bradfitz@personal_web:~/honeymoon_pics$ find -type d
I rsync them down to my house (pretty fast), and rearrange them:
sammy:Sorted $ find -type d
Now, using treearrange, I snapshot where the files are supposed to live:
sammy:Sorted $ ./treearrange --to=arrange.dat

$ head arrange.dat 
945fc334853b4c5edfca34c9908258eacfc86823        Barcelona_Airport_Hell/IMG_8675.JPG
fe5551ad173e425c1c8f40c4f06e72389df7c2ab        Barcelona_Airport_Hell/IMG_8676.JPG
c9f0589a24a8de4a65e8670b8bbb4f570a4452ca        Barcelona_Airport_Hell/IMG_8677.JPG
b244692481c84857d2e7824ec310ca074eee5e6c        Barcelona_Airport_Hell/IMG_8678.JPG
20c6dd346021689b32702c28ec62cde6a2c3a7be        Barcelona_Airport_Hell/IMG_8679.JPG
f1fdd495d10aee11a1cb96019b7b6c0a11e5465f        Barcelona_Airport_Hell/IMG_8680.JPG
f429010fe9a906c8bf513016e03e371ee711f3f6        Barcelona_Airport_Hell/IMG_8681.JPG
7885c7b71cd21a28c11985a591e69e81a12ee316        Barcelona_Airport_Hell/IMG_8683.JPG
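
The post doesn't show treearrange's source, so the following is only a sketch of how a snapshot like this could be produced. It assumes the 40-hex-digit values are SHA-1 digests of the file contents and that the columns are tab-separated. The function names `sha1_of_file` and `write_manifest` are made up for illustration.

```python
import hashlib
import os


def sha1_of_file(path, chunk_size=1 << 20):
    """Hex SHA-1 of a file's contents, read in chunks to keep memory flat."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def write_manifest(root, manifest_path):
    """Write one 'digest<TAB>relative/path' line per file under root.

    Assumed format, modelled on the arrange.dat excerpt above.
    Returns the number of files recorded.
    """
    manifest_abs = os.path.abspath(manifest_path)
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.abspath(full) == manifest_abs:
                continue  # don't record the manifest itself
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            entries.append((rel, sha1_of_file(full)))
    entries.sort()  # stable, path-ordered output
    with open(manifest_path, "w", encoding="utf-8") as out:
        for rel, digest in entries:
            out.write(f"{digest}\t{rel}\n")
    return len(entries)
```

Calling `write_manifest(".", "arrange.dat")` from inside the sorted tree would play the role of `--to=arrange.dat` here. Keying on content rather than path is what lets the other side recognise a file no matter which directory it currently sits in.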
Next I upload the arrange.dat and treearrange to my server, and do the opposite:
bradfitz@personal_web:~/honeymoon_pics$ ./treearrange --from=arrange.dat
file 1 / 738...
file 2 / 738...
  DCIM/179CANON/IMG_7977.JPG -> Barcelona-1/IMG_7977.JPG
file 3 / 738...
  DCIM/179CANON/IMG_7978.JPG -> Barcelona-1/IMG_7978.JPG
file 4 / 738...
  DCIM/179CANON/IMG_7979.JPG -> Barcelona-1/IMG_7979.JPG
file 5 / 738...
  DCIM/179CANON/IMG_7980.JPG -> Barcelona-1/IMG_7980.JPG
file 6 / 738...
  DCIM/179CANON/IMG_7981.JPG -> Barcelona-1/IMG_7981.JPG
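
The reverse step can be sketched the same way. This is not treearrange's actual code, just a minimal illustration under the same assumptions (SHA-1 digests, one `digest path` pair per line). `read_manifest` and `apply_manifest` are hypothetical names.

```python
import hashlib
import os
import shutil


def sha1_of_file(path, chunk_size=1 << 20):
    """Hex SHA-1 of a file's contents, read in chunks."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def read_manifest(manifest_path):
    """Return {digest: wanted_relative_path} from 'digest path' lines."""
    wanted = {}
    with open(manifest_path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip("\n").split(None, 1)
            if len(parts) == 2:
                wanted[parts[0]] = parts[1]
    return wanted


def apply_manifest(root, manifest_path):
    """Move files under root to the paths the manifest assigns to their digests.

    Files with unknown content are left alone and existing targets are
    never overwritten. Returns a list of (old_rel, new_rel) moves.
    """
    wanted = read_manifest(manifest_path)
    manifest_abs = os.path.abspath(manifest_path)
    # Collect paths first so moving files cannot disturb the walk.
    current = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.abspath(full) != manifest_abs:
                current.append(full)
    moves = []
    for full in sorted(current):
        target_rel = wanted.get(sha1_of_file(full))
        if target_rel is None:
            continue  # content not in the manifest
        target = os.path.join(root, target_rel)
        if os.path.abspath(target) == os.path.abspath(full):
            continue  # already in place
        if os.path.exists(target):
            continue  # never clobber an existing file
        os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
        shutil.move(full, target)
        moves.append((os.path.relpath(full, root), target_rel))
    return moves
```

Only the small manifest crosses the slow uplink; every byte of photo data stays where it already is on the server. Two caveats in this sketch: byte-identical duplicates collapse to a single manifest entry, and emptied directories such as `DCIM/179CANON/` are left behind.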

bradfitz@personal_web:~/honeymoon_pics$ find -type d

(then I can rsync and get any rotations/adjustments/etc that I did locally which weren't just a directory move...)

From: yuval_kogman
2006-08-28 09:13 am (UTC)


unison can do this, albeit not intelligently.

It doesn't have smart tracking of mv's for 3-way merging, but it does have the xferbycopying feature which will check if a remote file with the same checksum exists.

From: edm
2006-08-28 09:35 pm (UTC)

Re: unison

Excellent. I was going to suggest that this feature should really be part of something like rsync (which already walks the directories, takes checksums, compares them, etc). Having a separate script is useful, but it'd be even nicer if it "just worked" in the face of running something like rsync (i.e., moved the files that had moved and uploaded any new ones).

From: awwaiid
2006-08-29 06:05 am (UTC)

Re: unison

Yes, I adore unison... use it to sync my home directory between about six machines, many of them using ssh rsa keys to do it unattended. Work on a project on my machine here tonight, tomorrow I go in to work and it's there waiting.