December 21st, 2003


too much, too boring

Dina got me a stuffed animal Rhino. His name is Mr. Rhino and he's cute.

Outside of that, lots going on, but nothing too interesting.

Yesterday, after letting octal borrow my car to get his license, I totally plunged into C#, spending most of the day reading and tinkering with my C# reverse proxy load balancer. So much nicer than Java. I was going to write a really long post about all its cool language features (cool compared to Java, at least; nothing in it is actually new), and some of my open questions, but I don't really feel like it.

I have a stack of unopened mail here. Bills... Christmas cards.... I'd open it all, but I don't really feel like it.

Still can't find my phone. Finally got around to cancelling it, though.

Today I put User Mode Linux on my Debian Stable mail server, running Debian Unstable inside the UML virtual machine (for the newest SpamAssassin). I got it about 90% working, but I'm losing interest now. There's very little left: open the virtual machine's spamd up, listening on some accessible IP address, add some iptables rules, and change the real machine's spamc to connect to the virtual machine's spamd... and maybe some hostfs stuff so the virtual machine can get at users' SpamAssassin conf files. *yawn*

I have my Slimp3 Squeezebox sitting here, out of the box, but haven't plugged it in or configured it or installed the server software or anything. Just lost interest.

Having trouble staying interested in anything, actually. Even TV is boring.

I think this is when I should go exercise to make my body wake up and care.

Phone found!

Motivated by promises of hundreds of kisses, Dina found my phone. It had fallen out of my coat pocket the other night at the bar, down in between the wall and the bench of the booth we were in.

But I have 30 days to reactivate it... maybe I'll leave it in "suspended" state for a little while longer. It's been nice.

Aight, off to bike/run now. Energy will come back, damnit.

C#: Strings without encodings? Working with buffers.

So far I'm really liking most of C#. Here's something I can't figure out, though:

Question for HTTP wizards
Writing an HTTP header parser in C# while still being 8-bit-clean looks to be a bitch. Or are headers always ASCII? RFC 2616's grammar says octet all over, which I can only assume means all 8 bits in "oct". Or is that just the ideal, and in the real world so many servers suck that clients only send 7-bit headers? Or is there an explicit encoding in HTTP? UTF-8?

The Mono XSP webserver isn't 8-bit clean. It assumes all is ASCII.

Question for C# wizards
Is there any way to conveniently operate on a buffer of bytes, perhaps as a String, without knowing its encoding? If I have a buffer with bytes over 127, Encoding.ASCII converts them to question marks (which I verified by going from byte buffer to String and back again). This is what Mono's XSP does.

And you can only do regular expressions on a string, not a buffer?

If I have anonymous 8-bit data, I should still be able to split on it, search for substrings (read: other byte arrays), and run regexps on it.
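One workaround that gets suggested for this, sketched here under assumptions rather than offered as the answer: decode the buffer as ISO-8859-1. That encoding maps byte N to the char with code point N for all 256 values, so the String round-trips back to the same bytes, and Split, IndexOf and Regex all become usable in between. The class and method names below (ByteStrings, ToOpaqueString, HeaderValue) are made up for illustration.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of .NET: treats bytes as opaque 8-bit chars.
public static class ByteStrings
{
    // ISO-8859-1 maps each byte 0..255 to the char with the same value.
    private static readonly Encoding Latin1 = Encoding.GetEncoding("iso-8859-1");

    public static string ToOpaqueString(byte[] data)
    {
        return Latin1.GetString(data);
    }

    public static byte[] ToBytes(string s)
    {
        return Latin1.GetBytes(s);
    }

    // Pulls the value out of a "Name: value\r\n" line without losing high bytes.
    // Returns null if the line does not look like a header.
    public static byte[] HeaderValue(byte[] line)
    {
        string s = ToOpaqueString(line);
        // [ \t] rather than \s: \s also matches chars such as U+00A0,
        // which here would mean the raw byte 0xA0.
        Match m = Regex.Match(s, @"^[^:]+:[ \t]*(.*)\r\n$");
        return m.Success ? ToBytes(m.Groups[1].Value) : null;
    }
}
```

The cost is a copy in each direction, and the regex classes still think in Unicode terms (\s, \w, case-insensitive matching), so patterns need to stick to explicit character sets to stay byte-exact.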

Now, I think it's nice that Strings have known encoding, but I think C# should have a more powerful System.Buffer class that allows for more than just getting/setting bytes. I should be able to do searches from another byte array. And the RegExp library should allow matching on byte buffers.
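For what the missing buffer operations could look like, here is a minimal sketch of searching one byte array for another and splitting on a byte-array separator. ByteSearch, IndexOf and Split are hypothetical names, not anything in System.Buffer, and the scan is the naive one, not a tuned algorithm.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: substring-style operations on raw byte arrays.
public static class ByteSearch
{
    // Index of the first occurrence of needle in haystack at or after start,
    // or -1 if there is none. Naive O(n*m) scan.
    public static int IndexOf(byte[] haystack, byte[] needle, int start)
    {
        if (start < 0) start = 0;
        if (needle.Length == 0) return start <= haystack.Length ? start : -1;
        for (int i = start; i <= haystack.Length - needle.Length; i++)
        {
            int j = 0;
            while (j < needle.Length && haystack[i + j] == needle[j]) j++;
            if (j == needle.Length) return i;
        }
        return -1;
    }

    // Splits data on every occurrence of separator, keeping empty pieces,
    // much like String.Split does for strings.
    public static List<byte[]> Split(byte[] data, byte[] separator)
    {
        if (separator.Length == 0)
            throw new ArgumentException("separator must not be empty");
        var parts = new List<byte[]>();
        int pos = 0;
        while (true)
        {
            int hit = IndexOf(data, separator, pos);
            int end = hit < 0 ? data.Length : hit;
            byte[] part = new byte[end - pos];
            Array.Copy(data, pos, part, 0, part.Length);
            parts.Add(part);
            if (hit < 0) return parts;
            pos = hit + separator.Length;
        }
    }
}
```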

class UsefulStream { ... }

jwz comments:
The fact that I wasn't able to use the various String classes, and had to write my own "network buffer" class to do protocol-ish stuff was one of my earliest gripes with Java, too.
And sure enough, that turned out to be my solution in C# as well.

Made a class "UsefulStream" which has an async method "BeginGetLine" taking the .NET-standard AsyncCallback delegate. The class then sees if it has a line. If not, it begins an async read with an internal callback. It keeps checking for lines or EOF, reading more as necessary. Eventually it calls back into the caller's callback with the useless IAsyncResult value (I used null). Then the caller invokes EndGetLine() on its UsefulStream instance, which returns the String, or null if EOF.
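The flow described above can be sketched roughly like this. It is a reconstruction from the description, not the actual UsefulStream source: the class name LineStreamSketch, the 4096-byte chunk size, the one-request-at-a-time restriction, and returning an unterminated final line at EOF are all assumptions, and error handling is left out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical stand-in for the class described above; not the original code.
public class LineStreamSketch
{
    private readonly Stream stream;
    private readonly List<byte> buffered = new List<byte>();
    private readonly byte[] chunk = new byte[4096];
    private AsyncCallback callerCallback;
    private string line;   // result handed out by EndGetLine
    private bool eof;

    public LineStreamSketch(Stream stream)
    {
        this.stream = stream;
    }

    // Starts looking for a line; the callback fires once a line or EOF is known.
    // Only one BeginGetLine may be outstanding at a time.
    public void BeginGetLine(AsyncCallback callback)
    {
        callerCallback = callback;
        CheckForLine();
    }

    // Returns the line without its terminator, or null at EOF.
    public string EndGetLine()
    {
        return line;
    }

    private void CheckForLine()
    {
        int newline = buffered.IndexOf((byte)'\n');
        if (newline < 0 && !eof)
        {
            // No complete line yet: read more, then check again in OnRead.
            stream.BeginRead(chunk, 0, chunk.Length, OnRead, null);
            return;
        }
        if (newline >= 0)
        {
            byte[] raw = buffered.GetRange(0, newline).ToArray();
            buffered.RemoveRange(0, newline + 1);
            line = Encoding.ASCII.GetString(raw).TrimEnd('\r');
        }
        else if (buffered.Count > 0)
        {
            // EOF with an unterminated final line: hand it out as-is.
            line = Encoding.ASCII.GetString(buffered.ToArray());
            buffered.Clear();
        }
        else
        {
            line = null;
        }
        callerCallback(null); // the IAsyncResult would carry nothing useful
    }

    private void OnRead(IAsyncResult result)
    {
        int n = stream.EndRead(result);
        if (n == 0) eof = true;
        for (int i = 0; i < n; i++) buffered.Add(chunk[i]);
        CheckForLine();
    }
}
```

A caller passes a callback to BeginGetLine and, inside it, calls EndGetLine() on the same instance to pick up the line.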

This actually makes things pretty easy now. I can centralize timeouts inside that class. I'll probably mimic Apache with both soft and hard timeouts (time since last read, time since first read).
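A small sketch of how the two timeouts might be tracked, going only by the definitions in the parenthetical above (soft = time since last read, hard = time since first read). ReadDeadlines and its members are hypothetical names, and the clock is passed in so the logic stays easy to test.

```csharp
using System;

// Hypothetical helper: tracks a soft and a hard read timeout.
public class ReadDeadlines
{
    private readonly TimeSpan softLimit;  // longest allowed gap since the last read
    private readonly TimeSpan hardLimit;  // longest allowed total since the first read
    private readonly DateTime firstRead;
    private DateTime lastRead;

    public ReadDeadlines(TimeSpan softLimit, TimeSpan hardLimit, DateTime now)
    {
        this.softLimit = softLimit;
        this.hardLimit = hardLimit;
        firstRead = now;
        lastRead = now;
    }

    // Call whenever a read returns data; this resets only the soft timer.
    public void NoteRead(DateTime now)
    {
        lastRead = now;
    }

    // True once either limit has been exceeded.
    public bool IsExpired(DateTime now)
    {
        return now - lastRead > softLimit || now - firstRead > hardLimit;
    }
}
```

With both limits in one object, a slow client that trickles a byte every few seconds keeps the soft timer happy but still runs into the hard one.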

UsefulStream always assumes lines are ASCII. (as protocol headers tend to be... except memcached's text protocol, which supports object key names with 8-bit data, I believe... feh)

And next I plan to add normal "BeginRead/EndRead" to UsefulStream, which pretty much wrap the normal NetworkStream, but also return the left-over crap that was read in while we were reading lines. (it already works the other way around: switching from getting byte chunks to reading lines...)
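The left-over handling could look something like the following. This is a synchronous sketch to keep it short, on the assumption that a BeginRead/EndRead pair would wrap the same logic; LeftoverRead and the List<byte> buffer are hypothetical, not the real UsefulStream internals.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: a Read that drains already-buffered bytes first.
public static class LeftoverRead
{
    // Serves bytes that were over-read during line parsing before touching
    // the stream again. Returns the number of bytes copied into dst;
    // 0 means EOF, same as Stream.Read.
    public static int Read(Stream stream, List<byte> leftover,
                           byte[] dst, int offset, int count)
    {
        if (leftover.Count > 0)
        {
            int n = Math.Min(count, leftover.Count);
            leftover.CopyTo(0, dst, offset, n);
            leftover.RemoveRange(0, n);
            return n;
        }
        return stream.Read(dst, offset, count);
    }
}
```

Like a normal stream read, this may return fewer bytes than asked for, so a caller that needs exactly N bytes still has to loop.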

But yeah, hacky.

And of course this is all 2 fucking lines in Perl:

$line = <F>;                             # read one line from filehandle F
read(F, $dst_scalar, $n_bytes_to_read);  # read raw bytes into $dst_scalar

(and you can even change the line separator with $/ !)

I love you Perl.