Fun with Unix [Dec. 21st, 2006|09:00 am]
Brad Fitzpatrick

What do you do if you have a tarball (Windows users: "zip file") that's taking up most of the filesystem ("disk space"), not leaving you enough room to uncompress it?

Easy:

-- cut the archive file into bite-sized chunks (with, say, dd(1)), starting at the back of the file, and calling truncate on the original file as you go backwards. Or maybe you have enough space, so you don't need to truncate. Easier. Then just delete the full-sized archive at the end of your chunking.

-- pipe all the chunks (in forward order!) into tar -zxvf - ("uncompress"), unlinking each chunk as you move on to the next.

Sure, this doesn't let you resume later if there's a failure, since you killed your original chunks. But I'm assuming you can get the archive back if you need it.
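
A rough sketch of those two steps, assuming GNU dd/stat, a truncate(1) utility (or equivalent), and a hypothetical big.tar.gz; not the exact commands, just the idea:

ARCHIVE=big.tar.gz
CHUNK=$((64 * 1024 * 1024))      # 64MB bite-sized pieces
SIZE=$(stat -c %s "$ARCHIVE")
NCHUNKS=$(( (SIZE + CHUNK - 1) / CHUNK ))

# Step 1: peel chunks off the back of the archive, truncating it as you go
# so the freed blocks go back to the filesystem immediately.
i=$NCHUNKS
while [ "$i" -gt 0 ]; do
    i=$((i - 1))
    dd if="$ARCHIVE" bs=$CHUNK skip=$i count=1 of="$(printf 'chunk.%06d' "$i")"
    truncate -s $((i * CHUNK)) "$ARCHIVE"
done
rm -f "$ARCHIVE"                 # nothing left in it by now

# Step 2: stream the chunks, in forward order, into tar, unlinking each one
# once it's been handed off.
for f in chunk.*; do
    cat "$f"
    rm -f "$f"
done | tar -zxvf -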

Comments:
From: niallm
2006-12-21 05:11 pm (UTC)
Will truncate actually deallocate blocks upon completion of the call, or do you need to fsync() [or equivalent] immediately afterwards?
From: gaal
2006-12-21 05:16 pm (UTC)
Nice hack! Though unless I had to do this often (and'd automate it with a script), I'd usually move the file over to another box and nc -l | tar -...
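
A minimal sketch of that approach (flag syntax varies between netcat flavors; port 2000 and the hostnames are just placeholders):

# on the cramped box, where the extracted tree should end up:
nc -l -p 2000 | tar -zxvf -

# on the roomier box you moved the tarball to:
nc cramped-box 2000 < big.tar.gz
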
From: gaal
2006-12-21 05:18 pm (UTC)
    (and'd automate it with a script)
Then I remembered there's a split command. Then I read the manpage and saw there was no in-place option. Time for a patch, I guess?
From: kunzite1
2006-12-21 05:22 pm (UTC)
you may or may not have noticed, but, this is your 8888th entry in this journal.
From: jmason
2006-12-21 05:46 pm (UTC)

not with zip

Windows users are still screwed, as always; ZIP files contain the table of contents at EOF, so they'd need to be able to seek to there before they can extract anything.
From: ghewgill
2006-12-21 06:12 pm (UTC)

Re: not with zip

Since the zip format stores some redundant directory information at the beginning of each file, it's possible to extract a zip file from a stream. You might need a custom unzip program to do so, though.
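
For instance, libarchive's bsdtar can usually extract a zip from a pipe by relying on those per-file local headers (a hedged sketch; whether a given archive extracts cleanly depends on how it was written, and the filename here is hypothetical):

cat bigfile.zip | bsdtar -xvf -
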
From: mart
2006-12-21 05:47 pm (UTC)

Assuming this tarball is coming from somewhere it can be streamed from (say, an HTTP server?), just skip the tarball file altogether:

wget -O - http://www.example.com/radfile-0.9.8d.tar.gz | gzip --uncompress -c - | tar xvf -

Unix is magic.

From: hughe
2006-12-21 06:05 pm (UTC)
or one i use every day for 25gig video....

ssh user@10.0.0.1 cat blueray.tgz | tar -xvzf -

From: adamthebastard
2006-12-21 10:17 pm (UTC)
This is a great way to restore from a backup created with:

ssh user@${host} 'tar c /home/' > backup_${host}_${date}.tar

Or from the other side.

tar c /home/ | ssh user@${other_host} 'cat - > backup_${host}_${date}.tar'
(Reply) (Parent) (Thread)
[User Picture]From: andrewducker
2006-12-21 06:47 pm (UTC)
If the compressed version takes up more than 50% of the filesystem, that would tend to indicate that the uncompressed version would take up more than the entire filesystem?

Unless you're sending over video or other already compressed data, I suppose.
From: scosol
2006-12-22 01:25 am (UTC)
    What do you do if you have a tarball (Windows users: "zip file") that's taking up most of the filesystem ("disk space"), not leaving you enough room to uncompress it?

what is this- 1988?
From: brad
2006-12-22 01:27 am (UTC)
Data gets bigger over time too, yo. And I was working on a virtual machine without much disk space.
From: mart
2006-12-22 01:53 am (UTC)

Virtual machines always seem to be cramped. Did I ever tell you that I used to do my LJ development in a virtual machine on my Pentium II 350MHz box which itself only had 512MB RAM and 32GB disk? The VM had about 8MB RAM and 512MB disk.

The disk space was never really an issue, but often during update-db.pl -p it'd run out of memory and kill the process during the part where it compiles a bajillion S2 layers and caches all the S2::Checker objects in memory. Fun times.

From: brad
2006-12-23 04:48 am (UTC)
Yeah, update-db.pl -p s2 compiling keeps killing our stuff too. So it is the checkers that aren't being released?
From: reizar
2006-12-23 01:22 am (UTC)
I know that in Windows, if you know what you're doing, you can go into DOS and recover anything that's been corrupted, damaged or deleted. Not sure how it would work in Unix though.