| Comments: |
This is a dangerous thing to create, but it's the only way to store information.
Would the server enforce a quota on size as well as bandwidth? Is that adjustable by the client?
![[User Picture]](http://l-userpic.livejournal.com/54541970/2) | From: brad 2005-04-23 05:51 am (UTC)
| (Link)
|
Owners of client machines would be able to choose how much disk quota is allocated to each user. I figure friends would give each other equal disk quota on each other's machines.
So only one client/server relationship?
![[User Picture]](http://l-userpic.livejournal.com/54541970/2) | From: brad 2005-04-23 07:49 am (UTC)
| (Link)
|
Of course not.
Aren't the storage requirements pretty intimidating? If every user wants to back up 10GB, they also need to donate more than 10GB to the distributed pool, right? 20GB if you want every backup to live on two separate machines. Actually more, unless you can prevent new members from uploading their backups before their donated storage has been used by others.
It would be really cool if there was some way to detect common file hashes (i.e. common binaries) so the network wouldn't need to store additional copies, but as far as I can tell, that's incompatible with a fully encrypted system.
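To put rough numbers on the storage arithmetic above, here is a minimal Python sketch. The function name and the plain "replicas" multiplier are assumptions made for illustration; nothing in the proposal specifies how donations would actually be accounted for.

```python
def required_donation_gb(backup_gb: float, replicas: int) -> float:
    """Break-even disk donation for one user (hypothetical accounting model).

    Every byte a user backs up has to live on `replicas` other machines,
    so the pool only balances if that user donates at least this much.
    """
    if backup_gb < 0:
        raise ValueError("backup_gb must be non-negative")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return backup_gb * replicas
```

With these assumptions, `required_donation_gb(10, 1)` gives 10 and `required_donation_gb(10, 2)` gives 20, matching the figures in the comment. This is only the break-even minimum; any slack for new members who upload before their own donation is in use comes on top.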
![[User Picture]](http://l-userpic.livejournal.com/54541970/2) | From: brad 2005-04-23 05:52 am (UTC)
| (Link)
|
It would be really cool if there was some way to detect common file hashes (i.e. common binaries) so the network wouldn't need to store additional copies, but as far as I can tell, that's incompatible with a fully encrypted system.
I figure this is for backing up /home/, not /usr/, so dups between separate users won't really happen. And if they did, you certainly wouldn't want to know about it.
Now, if you have dups in /home/, you'll only be storing it once on your friends' machines.
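Storing duplicates within one user's own files only once could be done by hashing each file before it is encrypted and uploaded. This is a rough sketch only: SHA-256 and the names `file_digest` and `plan_uploads` are assumptions for illustration, not something the design above specifies.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, block_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in blocks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def plan_uploads(root: Path) -> dict[str, list[Path]]:
    """Group files under `root` by content digest.

    Each key needs to be uploaded once; the list of paths is what the
    local index would record so every copy can be restored later.
    """
    by_digest: dict[str, list[Path]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest.setdefault(file_digest(path), []).append(path)
    return by_digest
```

Because the hashing happens on the owner's machine before encryption, nobody else learns which files are duplicates.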
![[User Picture]](http://l-userpic.livejournal.com/36951816/24078) | From: scsi 2005-04-23 05:47 am (UTC)
Question/Comment | (Link)
|
I assume this is only for backing up select files, not like entire machines (sorta like backuppc)..
Since the client has no keys, it means the public/private pair is on the server... Wouldn't that defeat the purpose of encryption if someone gets into the server and just decrypts the hashes? Not unless the server accepts the files, encrypts them, and splatters them back out (distributed) to all of the clients participating..
If this is distributed, I would feel much safer if each client held the private key, and encrypted everything on the fly, calculated the hashes and sent it to the server for distribution.... If the client doesn't hold any of the keys, you are trusting everyone's files to a central location, and if (worst case) the box gets nabbed by feds, it would be very easy for them to see exactly what everyone has uploaded.. Of course this blows deniability out of the water since you are the sole owner of the public key...
Or I could be completely missing the point, which is mostly the case.
![[User Picture]](http://l-userpic.livejournal.com/8731982/626373) | From: maxvt 2005-04-23 07:01 am (UTC)
Re: Question/Comment | (Link)
|
How about client holds own pair of keys, server gets the content already encrypted?
From: legolas 2005-04-27 11:21 pm (UTC)
Re: Question/Comment | (Link)
|
I would feel much safer if each client held the private key
Wouldn't that defeat the whole purpose? If your client goes kaput, bye bye data if the key can't be recovered?
Then again, how does the server know if the requesting client is who he says he is (esp. after the original client machine has, say, burned (literally))?
![[User Picture]](http://l-userpic.livejournal.com/54541970/2) | From: brad 2005-04-23 07:50 am (UTC)
| (Link)
|
Has anybody built it?
Apparently not. Cringely got distracted and forgot to ever talk about it again:
"Following last week's column about Baxter, my idea for a distributed kinda sorta peer-to-peer Internet data back-up scheme, I expected this week to write about all the problems readers found with the idea, and all the existing Baxter-like services none of us had heard about. Well, things change, and I'll be doing that column next week."
The slashdot kids (ignoring the ones who went on and on about freenet) suggested Distributed Internet Backup System and The OceanStore Project. Those both appear to be one-man efforts, sort of half-baked.
![[User Picture]](http://l-userpic.livejournal.com/1838084/552426) | From: eqe 2005-04-23 06:36 am (UTC)
| (Link)
|
![[User Picture]](http://l-userpic.livejournal.com/54541970/2) | From: brad 2005-04-23 07:51 am (UTC)
| (Link)
|
Thanks for the link. I'll look into it.
![[User Picture]](http://l-userpic.livejournal.com/15584071/1397) | From: greg 2005-04-23 07:53 am (UTC)
| (Link)
|
Back in '99, I worked at undoo, which turned into avamar and this is sort of what we were working on. At the time we used hashing to break down the big files and look for commonality to avoid redundancy.
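For anyone who hasn't seen the hashing approach, the general idea looks something like the sketch below. To be clear, this is not how undoo or avamar did it; the fixed chunk size, SHA-256, and the in-memory dict standing in for storage are all assumptions chosen to keep the example short.

```python
import hashlib


def store_chunks(data: bytes, store: dict[str, bytes],
                 chunk_size: int = 4096) -> list[str]:
    """Split `data` into fixed-size chunks and store each unique chunk once.

    Returns the "recipe": the ordered list of chunk digests needed to
    rebuild the original data.
    """
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        key = hashlib.sha256(chunk).hexdigest()
        store.setdefault(key, chunk)  # already-seen chunks cost nothing extra
        recipe.append(key)
    return recipe


def restore(recipe: list[str], store: dict[str, bytes]) -> bytes:
    """Rebuild the original data from its recipe."""
    return b"".join(store[key] for key in recipe)
```

Two big files that share most of their chunks then cost little more than one. Fixed-size chunks stop matching as soon as bytes are inserted near the start of a file, so a real system would probably want smarter chunk boundaries.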
From: node 2005-04-23 07:55 am (UTC)
First off, does this exist? | (Link)
|
I described a similar way to do it a couple of days ago, with links to a company that wants to sell such a system for intranets.
...client machines (friend/family's computers) would throw the backup client into their "Scheduled Tasks" on Windows, or cron on Unix...
Pycron works great for cron on Windows. The "Scheduled Tasks" functionality never works quite right for me for kicking off scripts, etc. What I do to back up my Windows machines at work is to just rsync everything over to a *nix machine using a really really long include / deny file, and then have the backup server comb through changes and pack it into an encrypted history file for that day. However, looking at duplicity (as posted above), it looks like I can now retire my custom scripts and just use that, as it is doing exactly what I rolled myself (but with more utilities, obviously).
![[User Picture]](http://l-userpic.livejournal.com/5887295/515656) | From: jwz 2005-04-23 08:33 am (UTC)
| (Link)
|
You also don't want to expose the number/sizes of files; e.g., you don't want to be able to look at it and say "this is mp3s, and this is a maildir" just by size/grouping statistics. So I think really you want one big file, or a bunch of files of equal size.
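A rough sketch of the "bunch of files of equal size" option, assuming the payload is an already-encrypted archive. The 8-byte length prefix, the random padding, and the function names are assumptions made for this example, not part of any existing tool.

```python
import os
import struct

BLOCK_SIZE = 1 << 20  # 1 MiB per stored block (arbitrary choice)


def to_equal_blocks(payload: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Frame `payload` with its length, pad it, and cut it into equal blocks."""
    framed = struct.pack(">Q", len(payload)) + payload
    padding = -len(framed) % block_size  # bytes needed to fill the last block
    framed += os.urandom(padding)
    return [framed[i:i + block_size] for i in range(0, len(framed), block_size)]


def from_equal_blocks(blocks: list[bytes]) -> bytes:
    """Reassemble the blocks and strip the padding using the length prefix."""
    framed = b"".join(blocks)
    (length,) = struct.unpack(">Q", framed[:8])
    return framed[8:8 + length]
```

Every stored block then has the same size, so file counts and individual file sizes are hidden. The number of blocks still reveals the total size to within one block, and the length prefix is only hidden if the blocks are encrypted after framing.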
![[User Picture]](http://l-userpic.livejournal.com/92611566/3171) | From: mart 2005-04-23 09:06 am (UTC)
| (Link)
|
I like this idea. It sounds a bit like a miniature Freenet. I'd be a little concerned, though, about losing whatever indexes the server is retaining and being unable to recover the data. The indexes need to be backed up as well, but how do you back them up? It's a bit chicken-and-egg.
It might also be interesting to have proxies which act like clients but which hand off their storage to other clients, thus merging a bunch of different datasets together. The use I have in mind for this is (say) a family all individually backing up to each other, but also handing off to people outside the family as one big lump where it's not obvious who created what. Of course, more degrees of separation between you and your data increases the possibility that you won't be able to get it back from that source again later. A hybrid, though, where a client essentially backs up its own backups could cause your files to end up on the disks of people you've never even met.
![[User Picture]](http://l-userpic.livejournal.com/54541970/2) | From: brad 2005-04-23 05:15 pm (UTC)
| (Link)
|
The indexes need to be backed up as well, but how do you back them up? It's a bit chicken-and-egg.
I was thinking every backup client gets a full (encrypted) copy of the global index. Compressed, it's not that big. (and I have a fair number of files)
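As a sketch of how small a compressed index can be made, the snippet below serializes a path-to-digest mapping and compresses it. The JSON layout and function names are assumptions for illustration, and the encryption step is deliberately left out because the Python standard library has no cipher; a real client would encrypt the blob before handing it to anyone.

```python
import json
import zlib


def pack_index(index: dict[str, str]) -> bytes:
    """Serialize and compress an index mapping file paths to content digests."""
    raw = json.dumps(index, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, 9)


def unpack_index(blob: bytes) -> dict[str, str]:
    """Inverse of pack_index."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))
```

Paths under the same directory share long prefixes, which is the kind of repetition zlib handles well.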
![[User Picture]](http://l-userpic.livejournal.com/44434/44630) | From: youngoat 2005-04-23 05:10 pm (UTC)
Distributed Internet Backup System | (Link)
|
DIBS (Distributed Internet Backup System) seems pretty similar to what you describe. Unfortunately, it doesn't seem to have a very friendly UI. Everything is command-line. It would probably need to be wrapped in a friendly GUI in order to be usable by average users... I haven't actually used it... I just browsed the FAQ and Documentation.
![[User Picture]](http://l-userpic.livejournal.com/54541970/2) | From: brad 2005-04-23 05:17 pm (UTC)
Re: Distributed Internet Backup System | (Link)
|
That looks damn close if not exactly what I want. Thanks! I'll read up on it more before I ramble about this whole idea any more.
just curious, but why would you want to use Git in this? Despite the fact that the backup is distributed, modify 1K of data in a 1M file and you get a 2M resulting storage file.
what aspect of Git were you planning on using?
From: matt_trout 2005-05-06 01:51 am (UTC)
Have you looked at Steve Traugott's ideas for ISFS? | (Link)
|
the infrastructures mailing list (http://www.infrastructures.org/) has been discussing a reliable peer-to-peer filesystem for some time; I suggested MogileFS a couple times and was told "yes, but it needs some extra stuff" (more accurate description better gathered from list archive). Might be worth popping up on there, maybe you can get some of the features you'd need for the storage layer from the infrastructures posters. I'm certainly a potential contributor for Bad and Wrong reasons of my own.
![[User Picture]](http://l-userpic.livejournal.com/22253211/1271974) | From: gadlen 2005-05-08 08:24 am (UTC)
Boxbackup | (Link)
|
Boxbackup is an interesting functional implementation of a similar idea. It doesn't have a distributed component but does use encryption and blobs well.
![[User Picture]](http://l-userpic.livejournal.com/54541970/2) | From: brad 2005-05-08 10:48 am (UTC)
Re: Boxbackup | (Link)
|
Thanks for the link! That looks interesting. | |