I don't know what I'm doing wrong, but I've written some python code that looks at this, and it seems like there are way more entries than LiveJournal's front page is saying: as in, about 1300 entries per minute rather than just 200 that it says right now.
[crschmidt@creusa ~]$ python test.py
Entries: 100. Entries/second: 19.2611058701. Time: 1124282733
Entries: 200. Entries/second: 22.5625200497. Time: 1124282737http://crschmidt.net/python/ljentries.py
is the code: am I really insane? did I do something wrong? I don't see obvious dupes in my code, and although I'm assuming I'd miss entries (if they broke over a 1024 barrier, since I'm not looking at the buffer) I don't think that I can think of a way I'd get extras...
Added in some extra collision checking, and found out that there are indeed clashes, but I can't seem to figure out why/where they're coming from. All my code is in that Python link up there.
*shrug* No clue what's up, but thought you might want to know. It almost seems like you're grabbing a full set of 100 new URLs from the cache every 10 seconds or so, and not checking if they're already printed out somehow... but that doesn't make any sense at all. So it's probably my code, but I can't figure out how.