DVD ripping - brad's life
Brad Fitzpatrick

DVD ripping [Jan. 29th, 2006|10:32 pm]
Brad Fitzpatrick

I realized having all my DVDs (~250) ripped and available on the network from any computer/TV is very feasible.

I'm using vobcopy with --mirror, no transcoding. Then I can still have bonus features and menus and all that. VideoLAN is cool in that it can play a VIDEO_TS directory directly, as if it were a raw DVD.

I'm starting to run out of space on my 250GB RAID-1. I have room for another two disks in my current machine, so probably 800GB. Plus I need another SATA card, since this motherboard only has two ports.

If I want to do this right, it's $3573.00 for another machine with eight 400GB disks, 3.2TB. But I think ripping just our most-watched movies will be cool enough for now.

My current machine has room for 5 disks. Currently it's:

IDE (boot)
250GB SATA (raid 1)
250GB SATA (raid 1)

Assuming I don't want to mess with my current RAID-1 (my $HOME, etc), I have two choices:

1) 2x 400GB SATA RAID-1.
2) replace the IDE disk with another 400GB SATA disk and do a RAID-5 over three disks. Then I'd have 800GB new, not 400GB. Just gotta look into current state of grub/lilo and software RAIDs.
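
As a rough check on the two options above, here is a minimal Python sketch of the usual usable-capacity rules for equal-sized disks: RAID-1 keeps one disk's worth, RAID-5 gives up one disk to parity. The function name `usable_gb` is made up for illustration and is not part of any RAID tool.

```python
def usable_gb(level: str, disks: int, disk_gb: int) -> int:
    """Usable capacity in GB for an array of equal-sized disks.

    Hypothetical helper: applies the textbook capacity rules only and
    ignores filesystem and metadata overhead.
    """
    if level == "raid1":
        if disks < 2:
            raise ValueError("RAID-1 needs at least 2 disks")
        return disk_gb  # every disk holds the same data
    if level == "raid5":
        if disks < 3:
            raise ValueError("RAID-5 needs at least 3 disks")
        return (disks - 1) * disk_gb  # one disk's worth goes to parity
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    print("option 1, 2x 400GB RAID-1:", usable_gb("raid1", 2, 400), "GB")
    print("option 2, 3x 400GB RAID-5:", usable_gb("raid5", 3, 400), "GB")
```

Option 2 doubles the new space for the price of one more disk, which is the 800GB vs. 400GB difference mentioned above.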

Total logical capacity needed, I'm guessing:

250 discs * average 5.5 GB = 1375 GB (~1.4T)

But if only two-thirds (~66%) are worth ripping = 916 GB ...

So with 800 GB new, that's just 116 GB on my RAID-1 I'd need to use, which is fine. I'm not using the whole 250 GB anyway. (kept large chunks unpartitioned for testing stuff)
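
The back-of-envelope numbers above can be reproduced with a few lines of Python. This is only a sketch of the estimate: the 5.5 GB average is the guess from the post, and the "worth ripping" share is taken as two-thirds, since that is what yields 916 GB (a literal 66% would give about 907 GB). All variable names are made up for illustration.

```python
DISCS = 250              # size of the collection, from the post
AVG_GB_PER_DISC = 5.5    # guessed average rip size, from the post
WORTH_RIPPING = 2 / 3    # assumption: "66%" read as two-thirds
NEW_ARRAY_GB = 800       # the planned 3x 400GB RAID-5

total_gb = DISCS * AVG_GB_PER_DISC        # everything ripped
needed_gb = total_gb * WORTH_RIPPING      # only the discs worth ripping
overflow_gb = needed_gb - NEW_ARRAY_GB    # what spills onto the old RAID-1

print(f"total:    {total_gb:.0f} GB")
print(f"needed:   {needed_gb:.0f} GB")
print(f"overflow: {overflow_gb:.0f} GB")
```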

If I wanted to watch my movies remotely (say, from a hotel room with net access...), how much CPU/bandwidth would I need to do real-time transcoding?
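
For the bandwidth half of that question, a rough Python sketch of the average bitrate of an untranscoded rip helps set the scale. The 120-minute runtime is an assumption (the post gives no runtimes), GB is taken as 10^9 bytes, and `average_mbps` is a made-up helper name. It says nothing about the CPU cost of transcoding.

```python
def average_mbps(size_gb: float, runtime_minutes: float) -> float:
    """Average bitrate in Mbit/s of a file of size_gb played over runtime_minutes."""
    bits = size_gb * 1e9 * 8          # decimal gigabytes to bits
    seconds = runtime_minutes * 60
    return bits / seconds / 1e6


if __name__ == "__main__":
    # 5.5 GB is the average disc size guessed in the post;
    # 120 minutes is an assumed runtime, not a figure from the post.
    print(f"{average_mbps(5.5, 120):.1f} Mbit/s")
```

A remote link would have to sustain roughly that rate to play the raw rip, so real-time transcoding only pays off if it brings the stream well below it.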

From: matthew
2006-01-30 07:05 am (UTC)
you can boot with your OS on a RAID5, but you'll need to leave your kernel/initrd on a RAID1 /boot. 100M is big enough for that though.

Also, if you plan on cramming any more disks in that machine be sure to mount a fan (probably 80mm) in front of them if you can. They tend to get rather toasty if kept too close together.
(Reply) (Thread)
From: dorkmatt
2006-01-30 07:34 am (UTC)

re: watch from hotel room

isn't this what slingmedia is for, with players that rework the codec/bitrate/frame size for PDAs, etc.?
(Reply) (Thread)
From: strand
2006-01-30 07:42 am (UTC)
I know it may be sacrilege to suggest, but you could compress your video and get a vast improvement in total capacity. It would make streaming a better possibility as well.
(Reply) (Thread)
From: brad
2006-01-30 08:14 am (UTC)
Eh, menus and no artifacts are worth it. Disks will just keep getting cheaper. *shrug*

Plus I don't want to slow down the ripping process with all that transcoding! And I don't want to have to regret choosing format X when format Y comes out in 6 months.
(Reply) (Parent) (Thread)
From: nothings
2006-01-31 03:56 pm (UTC)
As a point of reference (since it's Windows only, but I wouldn't be surprised if somebody was trying to recreate it as open source, and the author claims he'll eventually open source it, but haha), there's ratDVD, which is a custom video format that preserves everything from the DVD format (menus, subtitles, etc.) while transcoding to a new video format. The transcoder is direct rather than requiring decoding then coding, so it's moderately but not insanely fast (about 1/2 real time on my old 2GHz P4; AFAIUI the slow part of video compression is motion estimation so I guess it just uses the existing block correspondences from the old file rather than searching from scratch, which is fairly clever if true). The video format is supposed to be moderately similar to H.264 so it's not significantly lossy with a 3x-ish savings (the author actually designed it for, reading between the lines, distributing as file sharing and then reconverting back to DVD without noticeable loss).

Of course, I found it a little too crashy, and it's not an option outside Windows right now, but I think it's a reasonable thing to consider if there's ever a stable Linux version or version-alike. You don't have to worry about regretting format X when format Y comes out if format X preserves all functionality; the only advantage to format Y would be smaller size, and disks will keep getting cheaper. Of course "if ever" is not now, so it's irrelevant to your decisions right now.
(Reply) (Parent) (Thread)
From: mart
2006-01-30 08:30 am (UTC)

Strangely enough, I tend to consider the loss of the “menus” to be a benefit of ripping a DVD. Each to his own, I guess! ;)

(Reply) (Thread)
From: way2tired
2006-01-30 12:24 pm (UTC)
I'm with you. It's somewhat convenient to rip out all the crap I don't care about... like french/spanish subtitles, "bonus features", etc., and just get the movie to play when put in the player.
(Reply) (Parent) (Thread)
From: erik
2006-01-30 09:53 am (UTC)
I'm sure you've posted about this at some point, but what are you using for audio and video output from the PC? I'm assuming component from the video card and optical from the sound card?
(Reply) (Thread)
(Deleted comment)
From: (Anonymous)
2006-01-30 12:44 pm (UTC)
ditto, i'm afraid currently it would be impossible to remotely view full dvd's unless your hotel AND your house had a DC3 connection or fiber.

i have ~200 full length movies on a 450 gig raid 5 array in the back of the house, a computer sitting next to the entertainment center can play anything on the fileserver straight to the tv, controlled through VNC, but most of the movies are divx rips, ~700 meg to 1.4 gig apiece. you sacrifice some, including menus and extras, but i never watch those anyway plus my tv is crappy so i can't even notice the difference. (not hdtv)

added benefit tho is that i also have many many tv shows on the raid, all the robot chickens, south parks, futurama's, babylon 5, yadda, and they're all accessible at any time.

i need more storage however, and my goal is a SATA (current raid card is pata) hardware raid card for a raid 5 array of 8 500 gig disks giving 3.5 terabyte, but unfortunately i haven't sold any online blogging communities lately so it'll have to wait a bit ;)

(Reply) (Parent) (Thread)
From: valiskeogh
2006-01-30 12:44 pm (UTC)
grrrrr... last post was me on the 3.5 terabyte thingamajigger

(Reply) (Parent) (Thread)
From: slonick
2006-01-30 01:56 pm (UTC)

look, man

i dunno how many people have already told you this, but: thanks a lot for this unbelievable kind of communication - lj.
(Reply) (Thread)
From: epaulson
2006-01-30 03:51 pm (UTC)

transcode in parallel

I only transcoded one DVD, and it took 12 hours on a 2.4GHz P4, so I might be doing it wrong or it might really be that slow.

But - you can transcode in parallel if you need to. Have a second array of your DVDs in the LJ machine room, and borrow some of your memcache nodes on-demand and encode it in short order.

Or pre-transcode it at home - if you're talking 1TB to store the raw data, why not make it 1.2TB for the raw data and a transcoded version. Then, in 6 months when there's a different format, just dump your transcoded versions and re-encode in the new format.
(Reply) (Thread)
From: iamjosh
2006-01-30 04:14 pm (UTC)
Buy a Slingbox, and buy the D-Link MediaLounge Wireless HD Media Player DSM-520. Use the Slingbox to encode the video/audio on the fly.

I currently have the media lounge hooked to just over 2T of DVDs, TV Shows, & Misc Video.

(Reply) (Thread)
From: hoankiem
2006-01-30 04:29 pm (UTC)
Hey Brad, I'd say a dual core Athlon64 X2 3800+ should be more than enough. Plus you'd be getting two physical cores, which helps a lot if the codec/encoder/decoder/transcoder supports it. Also it's a safety net since all new CPUs will be pretty much either dual core (AMD Athlon64 X2, Intel Core Duo) or multi-chip packages (Intel Pentium D).

About the RAID-5; if you're gonna go on the cheap (My Areca PCI-E 16-port card isn't exactly *cheap* :o !), I'd recommend an nVidia nForce4 Ultra/SLi based solution. You'd get 4x SATA-300 ports (2-channel), and 2 IDE channels that support 2 devices each. True, you'd have to use 4x IDE disks, but disks are dirt cheap anyway. The nVidia RAID can go across both SATA/PATA controllers, for an 8-drive RAID-5.

Another choice would be getting let's say a cheap Intel i915 board, but then you'd only have 4x SATA ports. The Intel integrated disk controllers are the BEST performance-wise. You'd be limited to 4x disks on RAID-5 though.

I concur with the comments about making sure the case airflow is top notch. This needs to be especially true if you're going to use Western Digital or Maxtor drives. Seagate, Samsung, and Hitachi drives run a fair bit cooler, so it's not as huge a problem. I have 10 drives in my system right now (all Seagates), 2x 160GB, 8x 500GB RAID-5, and both halves of the RAID are in independent enclosures with 92mm fans, while the disk heat dumped into the system gets pumped out via 2 120mm fans. If you're gonna get an Intel Pentium4 or Pentium-D or Xeon4 solution, make sure you factor the high heat output of those as well.. my dual core AMD Opteron 175 has no problems with heat :)
(Reply) (Thread)
From: brad
2006-01-30 04:42 pm (UTC)
You run a RAID over IDE channels with two drives on a channel? You're not afraid of a double drive failure?
(Reply) (Parent) (Thread)
From: hoankiem
2006-01-30 05:40 pm (UTC)
Nope, I don't run my RAID-5 that way; I run it through an ARC-1230 (http://www.areca.com.tw/products/html/pciE-sata.htm).

I'm just saying that the nForce4 (AMD/Intel) solution is one of the best bang-for-buck ones you can have. If the IDE/SATA channel gets knocked out, it shouldn't *kill* your drive, unless there was something really crazy going on. Just get another nForce4 Ultra board, plug your drives back in the same order they were before, and bam! you're back in business! Just make sure it's an nForce4 Ultra/SLi chipset motherboard, and it's all good. (A note though: the nForce4's SATA ports are actually based on a 2-channel SATA controller, with port multipliers to double the ports. The Intel stuff is that way too, so it's pretty much normal among integrated stuff *shrug*).

Controller failure is something that endangers every type of setup, methinks. You just have to prepare for it by backing up; I don't think there is any other surefire way to protect one's self from any computer calamity.

I suggested the nForce4 solution though, since you're just gonna be streaming stuff most of the time and won't need the high throughput I need for, let's say, my workstation stuff. In all honesty, 2 cheap 4-port Silicon Image 3114 based PCI/PCI-E cards with *NIX software RAID-5 would still be plenty fast for streaming.
(Reply) (Parent) (Thread)
From: ckd
2006-01-30 05:28 pm (UTC)
Third choice: 800GB RAID-0 and just accept that if you have a drive failure, you have a lot of re-ripping to do. (It's an option.)
(Reply) (Thread)
From: brad
2006-01-30 05:33 pm (UTC)
I thought of that and immediately discarded the idea. :-)

Plus once you have 800GB at your disposal, it'd become too tempting to put non-re-rippable content there, then a drive failure would really suck.
(Reply) (Parent) (Thread)
From: ydna
2006-01-30 06:26 pm (UTC)
I'm working on a similar project (starting at about 1.3TB with room for 7TB) for my video collection. But I'm planning to ditch the menus, etc. as fast as I can. I want to be able to jump right into the opening scene of an episode or film and not go through the bullshit warnings and forced previews found on some titles. Of course, this will mean building my own index into the VOBs to get the right playback. But once it's set, I can write some simple randomizer that will pick something I like that I haven't seen in a while. "Computer, entertain me, for I am bored." And then Flash Gordon will start. Fer zample. But yeah, ripping without transcoding is definitely the way to go.
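
The "simple randomizer" idea could look something like the Python sketch below. It is only an illustration: the library layout (title mapped to a liked flag and a last-watched date), the 90-day threshold, and the name `pick_title` are all assumptions, not anything described above.

```python
import random
from datetime import date


def pick_title(library, today, min_days=90, rng=random):
    """Pick a random liked title not watched in the last min_days days.

    library maps title -> (liked, last_watched), where last_watched is a
    date or None if never watched. This layout is hypothetical.
    Returns None when nothing qualifies.
    """
    candidates = [
        title
        for title, (liked, last_watched) in library.items()
        if liked and (last_watched is None or (today - last_watched).days >= min_days)
    ]
    if not candidates:
        return None
    return rng.choice(candidates)


if __name__ == "__main__":
    library = {
        "Flash Gordon": (True, date(2005, 6, 1)),    # liked, not seen in a while
        "Seen Last Week": (True, date(2006, 1, 23)), # liked, but too recent
        "Not My Thing": (False, None),               # never watched, not liked
    }
    print(pick_title(library, today=date(2006, 1, 30)))
```

The index into the VOBs for jumping straight to the opening scene would still have to be built separately; this only covers choosing what to play.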
(Reply) (Thread)
From: tijuanacartel
2006-02-02 07:42 am (UTC)
VLC supports SAP/RTP, allowing you to multicast stream your DVDs as they play to anything on your LAN. Clients can flick between streams like they are flicking through channels. I've seen this working, and it's uber cool. http://www.videolan.org/streaming/

As for your raid configuration, I recommend plain old JBOD, and clever use of mount --bind. RAID 5 for a bunch of DVDs is pretty unnecessary. A JBOD means you get full use of all the disks and if one of them dies, you only lose the data on that drive.

Of course, data is valued by the cost of recreating it, which is a lot of feeding DVDs in for ripping. So if you value not having to re-rip anything, RAID is an advantage. I don't store movies/tv episodes on RAID. My music collection however, which has taken years of accretion to build, lives on a raid 5 and gets backed up to dvd yearly :)

As to the current state of Linux software raid: on Serial ATA, for all but the very latest 3ware chipset (9550SX), you will get better performance configuring the raid controller as a JBOD and using software raid. I have bonnie benchmarks at work somewhere that prove this. It seems that almost all SATA raid controllers up till this point have been built using a SCSI controller and an intermediary chip allowing it to talk SATA. The new 3ware chipset is the first I know of that was designed from the ground up to do SATA.

Performance aside, I greatly prefer software RAID as it gives me the flexibility to swap out the controller as necessary. I don't know if you've ever had the misfortune of dealing with an Adaptec SCSI controller failure, but it's pretty heinous: you need the exact card, down to the hardware revision, and the exact firmware to even have a chance of recovering the array. 3ware is a little better: any card can import an array created by any other card, provided they are from the same family. But softraid trumps them all: you can rip the disks out of your freshly cooked Raidcore card and throw them at a shitty SiL 3114, and your data will still be readable.

(Reply) (Thread)
From: brad
2006-02-02 08:27 am (UTC)
Heh, you don't have to get me started on vendor "hardware" RAID. If LJ had tags historically you could probably go find a "FUCK $VENDOR" post for every one of them.

I'm all about the software RAID.

And I do value my ripping time! Maybe if I had an auto-disc changer I could preload when time came to re-rip, but I don't.
(Reply) (Parent) (Thread)
From: tijuanacartel
2006-02-02 08:52 am (UTC)
I once asked myself: Realistically: how often am I going to watch this in the next year? Nothing scored more than one. Not even How to operate your brain scored 2.

This revelation led me to never again waste space storing movies and tv series on raid. The aforementioned rule of data valuation applies. If you REALLY get the urge to watch something, you can get it on Netflix.

Of course I don't have 250 dvds. If you haven't watched them yet, then having them network accessible is an advantage, and not having to rerip is a definite advantage. Personally, only stuff that is incredibly rare and cool lives on my raid, e.g. Colossus and Be somebody.
(Reply) (Parent) (Thread)
From: ucc_journal
2006-02-02 04:15 pm (UTC)
So why aren't you using LVM? pvmove is awesome.
(Reply) (Parent) (Thread)
From: brad
2006-02-02 04:35 pm (UTC)
Where did I say I didn't? Why can't my LVM PVs be MD devices?
(Reply) (Parent) (Thread)