grumblegrumble - brad's life - LiveJournal
Brad Fitzpatrick


grumblegrumble [Dec. 6th, 2001|03:13 am]
Brad Fitzpatrick
Nothing sucks more than going to bed nice and early, only to be awoken in the middle of the deepest part of your sleep because your fucking website broke.

I want to cry.
Give me back my sleep.
:-(

Comments:
[User Picture]From: patrick
2001-12-06 03:26 am (UTC)
stupid website.
i hate it.
(Reply) (Thread)
[User Picture]From: starbelle
2001-12-06 03:55 am (UTC)
eep --;

Hope you get some sleep.
(Reply) (Thread)
[User Picture]From: chris
2001-12-06 04:19 am (UTC)
Except when that happens to you about once a week, on average. I couldn't guess how many hours of sleep computers have stolen from me.
(Reply) (Thread)
[User Picture]From: tanthalas
2001-12-06 05:01 am (UTC)
Hi, sorry to bother you, but I have tried to contact the Support Team and my question got an id=0 (?), so I figured there must be some problem there too. My problem is that I received the following letter from markkraft@livejournal.com:

tanthalas,

Greetings from LiveJournal!

First off, I wanted to thank you for being a member. I also wanted to
let you know that your membership expires on 1999-04-15.

You can renew any time: new time you purchase simply extends your
membership, so you don't "lose" your current remaining time. You can
renew your membership immediately by visiting LiveJournal's online
payment site at http://www.livejournal.com/paidaccounts/

While there are a lot of nice LiveJournal features available only to
members that make membership worthwhile (with more "members only"
features in the works...), we think the biggest reason to renew your
membership is because you are helping to support a community that you
care about and that puts its users first.

LiveJournal prides itself on being supported and developed by the
people who use it. We are run primarily by dedicated volunteers from
around the world. We don't want the uglier, more commercial aspects
of the Internet to intrude on things as personal as someone's
journal. That's why LiveJournal is completely banner free! We intend
on keeping it that way, but that means that we must rely on your
support to keep us going.

Please take the time to visit http://www.livejournal.com/paidaccounts/
and renew your membership. There are membership levels as inexpensive
as $5 for two months, all the way up to $25 for a yearly
membership. As always, the membership dollars go to paying the
day-to-day expenses of running LiveJournal.

As an example of how expensive it can be to run LiveJournal, there are
over a dozen business-class servers powering LiveJournal. It also
costs LiveJournal thousands of dollars a year to co-locate the servers
and pay for network access. Frankly, since we double in size every
three months, we regularly need additional servers to keep up with
growth.

Our goal is for LiveJournal to pull ahead of the constant need for
server upgrades, adding additional servers to increase performance and
reliability. LiveJournal users will benefit from a much faster
service able to support a wide variety of great new features that will
improve upon what is already one of the best online communities in the
world!

I hope I've convinced you that LiveJournal membership is something
worth having, but if you're not entirely convinced yet, I'd like to
hear from you and find out what would convince you to become a
LiveJournal member again. Please feel free to email me at
markkraft@livejournal.com if I can be of any assistance.
Many thanks,

Mark Kraft
LiveJournal.com

The problem is that my membership expires on April 15th, 2002!!!!! I tried replying to Mark Kraft, but the email didn't get through (bad address). How a member who paid for his FIRST YEAR of membership on February 23rd, 2001 can have a membership that expired in 1999 is beyond me. Maybe a database error somewhere? The email does look like a standard reply, after all. But I just hope that my LJ does not get deleted.
(Reply) (Thread)
[User Picture]From: mart
2001-12-06 05:12 am (UTC)

This issue is what woke Brad from his slumber. It should be fixed now.

You're the second person I've seen report that problem with support. I wonder what's causing it.

(Reply) (Parent) (Thread)
[User Picture]From: tanthalas
2001-12-06 05:21 am (UTC)
As a pro Beta tester, I clicked on the id=0 hyperlink, and "Bad User Name" came out. I guess there's a query somewhere referring to a database that has probably been moved or renamed.
(Reply) (Parent) (Thread)
[User Picture]From: dottey
2001-12-06 05:26 am (UTC)
No offense is meant by this, so please take it only as a tip for future reference and not like me yelling. brad tends to like to keep his personal journal separate from LiveJournal-related stuff, which he posts with bradfitz. This is just to avoid having LJ-related comments (such as the above) piling up in his personal journal.

Well, hope that makes sense and hope I wasn't stepping on any toes. I think you'll be happy to read in both paidmembers and lj_maintenance that the problem has been resolved.
(Reply) (Parent) (Thread)
[User Picture]From: tanthalas
2001-12-06 05:33 am (UTC)

Ashes on my head

I know, I'm usually the first one to refer to both communities or the support team, but I was so scared that the email was an automated reply generated from the server (meaning immediate cut down of the service, since the expiry date was 1999). Plus, my reply to Mark Kraft came back with an invalid address error, and when I asked the support team it wouldn't work... I guess I just panicked.
(Reply) (Parent) (Thread)
[User Picture]From: chuck
2001-12-06 10:06 am (UTC)

Re: Ashes on my head

Panicked? What's the worst possible scenario that could happen by the info in that email? You'd have non-member status for a couple of days (until the problem was fixed)... that's it. Beware Brad's wrath, MUHAHAHAH!
(Reply) (Parent) (Thread)
[User Picture]From: tanthalas
2001-12-07 02:46 am (UTC)

Re: Ashes on my head

What about my (fucked up) custom settings, which took me days to (almost) get working? Where would they have ended up?
(Reply) (Parent) (Thread)
[User Picture]From: symian
2001-12-06 05:18 am (UTC)
Sigh. I know that feeling. But then, at least ONE of my websites never breaks down thanks to a talented programmer(*wink* Brad).

Nothing worse? Nah. Getting woken up to bail someone out of jail is worse. The time, the waiting, the machine coffee, and the SMELL is just too much.

Getting up to drive to a family member's house at 3am to resolve a drunken issue is far worse too. The time, the listening, the stupidity, no coffee, mayhap fending off the cops from taking someone to jail, is indeed worse.

At least with a broken website you accomplish something... real... when you put effort in. The other things I mentioned are useless wastes of time and age.

(Reply) (Thread)
[User Picture]From: percolator
2001-12-06 06:27 am (UTC)
BYAH!! HA! HA! I AM THE GOD OF STOLEN SLEEP. I steal from you puny mortals to give me more power with the ladies, and well because that's my job. It is sooooo easy and fun to steal sleep from sysadmins and website maintainers, because they all think this will be the last time I'll visit, BYAH! HA! HA! So keep thinking your network and servers are running just fine, fools, and don't forget to carry your beepers, my fingers are too big to use a phone, BYAH! HA! HA! AGAIN!
(Reply) (Thread)
[User Picture]From: calliste
2001-12-06 07:47 am (UTC)

Das Sandmännchen

*gives you back your sleep*

Hey, when I was a child, I used to watch a show every night called das Sandmännchen. It must be something East German (amazingly enough, it still exists though). At the end of the show, the Sandmännchen sprinkles Schlafsand for the children so they fall asleep. Every time I watched it, I got really tired afterwards. I still can't watch it without getting tired. This is conditioning à la Pavlov, I suppose.



Let's see if that image will show up correctly. I'm so bad with inserting images, especially as I always mess up width and height (no, no special reason for that, but I also mess up left and right all the time) and it comes out wrong.

*knocks on wood*
(Reply) (Thread)
From: ex_debgirl0
2001-12-06 09:31 am (UTC)
(Reply) (Thread)
[User Picture]From: chuck
2001-12-06 10:08 am (UTC)
damn girl, who you riding in that picture? =P
(Reply) (Parent) (Thread)
From: unsuffer
2001-12-06 09:55 am (UTC)

monitor your stuff

you ever think of setting up monitoring on your stuff? you could look not only at server statistics via SNMP, but you could also do some simulation database queries, application performance tests, etc., and use your cellphone's web-based paging interface to send you a page when there might be a potential problem. i could give you a hand in setting up such a system - it's what i do for a living.
(Reply) (Thread)
[User Picture]From: brad
2001-12-06 09:59 am (UTC)

Re: monitor your stuff

Blah. It is monitored. We do have test machines.

But when shit does get mad fucked, monitoring doesn't magically fix it.
(Reply) (Parent) (Thread)
From: unsuffer
2001-12-06 11:59 am (UTC)

Re: monitor your stuff

depending on what's broken, you could create a self-healing system... for example, looking at the system log and seeing apache (or foo process) spewing could say to you, "gee - i should restart it" - and so you do a little restart on the bugger. or, said process is consuming x amount of memory or y amount of cpu utilization, so we should take some sort of retroactive steps to correct that by either killing the process or whatever else. you know your application better than i do on the systems side, so i'm not entirely certain what sort of set-up you've got going that could give you an idea of potentially bad stuff. i remember a while back that i saw one of your apache config files and it had ErrorLog /dev/null in it, so you're basically shooting yourself in the foot in trying to a) debug "live" issues b) determine when stuff's going bad. your perl scripts could die "DB 123" or some error code if they can't connect to the database. you could then restart mysql or whatever if you encounter such a code in the log file that you're streaming.

just some thoughts. you should seriously take some time and find ways to automate your system administration. i created a few self-healing applications for my own servers and i honestly don't touch them - they fix themselves if a process goes nuts and decides to use a ton of cpu or memory or disk space or even bandwidth.
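
The log-driven restart idea described above could be sketched roughly as follows. This is only an illustration, not anything LiveJournal ran: the rule table, the `watch` function, and the mapping of the "DB 123" code to a service called `mysql` are assumptions made up for the example, and the actual restart is left to a callback supplied by the caller.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical rules: an error code seen in the log maps to a service to restart.
# "DB 123" is the example code from the comment; "mysql" is an assumed target.
RESTART_RULES = {
    re.compile(r"\bDB 123\b"): "mysql",
}


def watch(
    lines: Iterable[str],
    restart: Callable[[str], None],
    log: Callable[[str], None] = print,
) -> List[str]:
    """Scan log lines and restart each matching service at most once.

    Returns the names of the services that were restarted, in order.
    """
    restarted: List[str] = []
    for line in lines:
        for pattern, service in RESTART_RULES.items():
            if pattern.search(line) and service not in restarted:
                # Log the action first, so there is a record of what was done.
                log(f"matched {pattern.pattern!r} in {line.strip()!r}; restarting {service}")
                restart(service)
                restarted.append(service)
    return restarted
```

Any iterable of lines works as input, so the same function can read a finished log file or a stream being followed. Restarting each service only once per run is a deliberate choice here, to keep a spewing process from triggering a restart loop.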
(Reply) (Parent) (Thread)
[User Picture]From: dormando
2001-12-06 04:29 pm (UTC)

Re: monitor your stuff

I could write a self-healing system (we already use netsaint to harass us about things)

but that's fucking gay, after I thought about it. If something needs restarting, it's going to need it again. It's going to need to be fixed. With unix systems, there are no excuses for intermittent failures.

lecturing us just pisses us off, we're not that fucking dumb.

... and that apache log file was nulling image/userpic hits, if you bothered to read it.
(Reply) (Parent) (Thread)
From: unsuffer
2001-12-06 09:33 pm (UTC)

Re: monitor your stuff

nice attitude.

> With unix systems, there are no excuses for intermittent failures.

myself being a unix systems programmer, i disagree. linux and freebsd kernels are usually very stable and reliable, and so a kernel failure that'll screw you over is uncommon. however, 'nixes do use giant heaping pieces of crap software that cause your problems. case in point: sendmail. sendmail is a giant piece of monkey poop. the people who wrote it are self-righteous misinformed morons. given, it's unlikely that it'll hang or anything, but it sucks nonetheless. another case in point: NFS. talk about crap. if that sucker even thinks that something's wrong, it'll hang until the cows come home. yeah, unix is stable. but that's a false sense of security.

> If something needs restarting, it's going to need it again. It's going to need to be fixed.

completely true in certain cases. what about NFS? a server goes down. the client hangs. yeah, you need to know to put the server back up, but in the meantime, the client should correct itself. the point is this: if something breaks, it needs to be operational again, as soon as possible. log whatever actions were taken. you've already got log files. redirect kernel panics to the system log and anything else that you might have going to the console. log the cpu and memory utilization. after you have all of your information logged, do something to bring that sucker back up. then go through the logs and try and figure out what went wrong and fix it after you have it back up. relying on putting up backup servers, etc. is bad practice. you might have to tweak a config file or something small - the server shouldn't have to be down for the amount of time it takes for you to figure out what the real problem is - only the amount of time it takes to correct the issue by getting the process to restart or stop spewing or whatever it is you guys deal with.

> ... and that apache log file was nulling image/userpic hits, if you bothered to read it.

gee, sorry - the only thing i remembered about it was the ErrorLog thing because that struck me as very weird and odd.

> lecturing us just pisses us off, we're not that fucking dumb.

if you don't want to take my advice, that's just fine. i work with 3,000 solaris and hp-ux servers. hp-ux blows more donkey nuts than freebsd or linux. stuff breaks. for crying out loud, 'rm' gets confused if you pass it more than 1,000 files to delete. still, while you're up monkeying in the middle of the night cause you're being "awoken in the middle of the deepest part of your sleep because your fucking website broke", i'm sleeping soundly and without incident.

good night, guys ;)
(Reply) (Parent) (Thread)
[User Picture]From: brad
2001-12-06 09:40 pm (UTC)

Re: monitor your stuff

Ignore him.

Dormando gets pissed off at everything.

We don't use NFS or sendmail.

It's not rm that doesn't like over a thousand files... it's your shell. find is your friend.... find . -foo -exec rm {} \;
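
The reason `find ... -exec rm {} \;` helps is that the shell expands a glob before `rm` ever runs, and the expanded argument list can exceed the system limit. `find` hands files over one at a time instead. The same idea, sketched in Python under the assumption that the files sit in one directory and can be picked out by a name pattern (`remove_matching` is a made-up helper, not an existing tool):

```python
import fnmatch
import os


def remove_matching(directory: str, pattern: str) -> int:
    """Delete regular files in `directory` whose names match `pattern`.

    Files are unlinked one by one, so no long argument list is ever built.
    Returns the number of files removed.
    """
    # Collect the paths first so the directory is not modified mid-scan.
    with os.scandir(directory) as entries:
        targets = [
            entry.path
            for entry in entries
            if entry.is_file(follow_symlinks=False)
            and fnmatch.fnmatch(entry.name, pattern)
        ]
    for path in targets:
        os.unlink(path)
    return len(targets)
```

Unlike `find`, this sketch does not descend into subdirectories.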

Still, a "self healing system" couldn't fix last night's problem. Sure, stupid shit like things dying/hanging and needing to be restarted works... hell, we already do that. We have cron jobs which check the state of daemons and start them back up. All our daemons not only check the pid file to see if they're running, but also check the process table to see if they're really running. etc, etc, etc.
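
For readers unfamiliar with the pid file plus process table check mentioned above, a minimal sketch of what such a check might look like on a POSIX system follows. It is not LiveJournal's actual code; `read_pid`, `is_running`, and `needs_restart` are names invented for the example, and a cron job would call `needs_restart` with the daemon's pid file path.

```python
import os
from typing import Optional


def read_pid(pid_file: str) -> Optional[int]:
    """Return the pid stored in `pid_file`, or None if it is missing or invalid."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return None
    return pid if pid > 0 else None


def is_running(pid: int) -> bool:
    """Check the process table for `pid` (POSIX only).

    Signal 0 delivers nothing; it only asks whether the process exists.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def needs_restart(pid_file: str) -> bool:
    """True if the pid file is absent or invalid, or names a dead process."""
    pid = read_pid(pid_file)
    return pid is None or not is_running(pid)
```

This only confirms that some process holds the recorded pid. It does not verify that the process is the expected daemon, which a more careful check would also do.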

Every internal service (mail, DNS, etc) we address by virtual IP only and have set to failover to a backup clone.

But when some process decides to delete all the paid user records, a 10 line script isn't going to go find and fix the offending code, restore from backup, and apologize to users.

That's why we got pissed off at your suggestion. You trivialized the problem we had. But still, I apologize for getting mad. I know you're only trying to help.
(Reply) (Parent) (Thread)
From: unsuffer
2001-12-06 11:59 pm (UTC)

Re: monitor your stuff

i had no idea that that was the issue at hand, to be completely honest. i thought it was purely "database puked". yeah, in that case, there's not much you can do. if you wanted to get really involved (which i'm sure you probably don't), you can commit the data you're about to change into a table with something like:

ID | Table_Name | Column_Name | Data | Date

and then you could essentially "roll back" to a last known good date for said Table_Name. this data could be cleared out over x interval, etc.

i know this is far-fetched, but for the sake of discussion, i just felt like bringing it up.

i've been looking all over to find any sort of info on the backend you guys run - the installs you have, the cron jobs you guys run like you mentioned - but can't find much of a reference other than when installs or memory go bad. just wondering if there's a link i'm missing or something. i'm interested in how various people approach different situations and whatnot.

get some rest. you guys need to relax in light of all of the crap that happens, otherwise what's the point?
(Reply) (Parent) (Thread)