Never ending feed of Atom feeds - brad's life (LiveJournal)
Brad Fitzpatrick


Never ending feed of Atom feeds [Aug. 16th, 2005|12:58 pm]

An increasing number of companies (large and small) are really insistent that we ping them with all blog updates, for reasons I won't rant about.

Just to prove a point, I flooded a couple of them and found that sure enough, nobody can really keep up. It's even more annoying when they don't even support persistent HTTP connections.

So I decided to turn things on their head and make them get the data from us. If they can't keep up, it's their loss.

Prototype: (not its final home)

$ telnet danga.com 8081
GET /atom-stream.xml HTTP/1.0<enter>
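The telnet session above can also be scripted. Here is a minimal sketch, assuming the server answers with ordinary HTTP response headers followed by an open-ended body; `read_stream` is a hypothetical helper name, not part of the prototype.

```python
import socket


def read_stream(sock, path="/atom-stream.xml", chunk_size=4096):
    """Send the same request as the telnet session and yield body chunks."""
    request = f"GET {path} HTTP/1.0\r\n\r\n"
    sock.sendall(request.encode("ascii"))

    buf = b""
    # Skip the HTTP response headers, which end at the first blank line.
    while b"\r\n\r\n" not in buf:
        data = sock.recv(chunk_size)
        if not data:
            return  # connection closed before the headers finished
        buf += data
    _headers, _, body = buf.partition(b"\r\n\r\n")
    if body:
        yield body

    # The body has no natural end; keep reading until the server hangs up.
    while True:
        data = sock.recv(chunk_size)
        if not data:
            return
        yield data


# Usage sketch (host and port are the prototype address from the post):
# with socket.create_connection(("danga.com", 8081)) as s:
#     for chunk in read_stream(s):
#         handle(chunk)  # handle() is a placeholder for your own code
```

Chunks arrive at arbitrary byte boundaries, so whatever consumes them has to cope with an element being split across two reads.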

And enjoy the never ending XML stream of Atom feeds, each containing one entry. If you fall more than 256k behind (not counting your TCP window size), we start dropping entries and you see:

<sorryTooSlow youMissed="23" />
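The post does not say how the server tracks a slow reader, so the following is only a sketch of one way the behavior could work: a per-client backlog capped at 256k that counts dropped entries and reports the count once there is room again. `ClientQueue` and its methods are hypothetical names.

```python
from collections import deque

LIMIT = 256 * 1024  # the 256k backlog figure from the post


class ClientQueue:
    """Per-client outgoing backlog that drops entries when the reader lags."""

    def __init__(self, limit=LIMIT):
        self.limit = limit
        self.pending = deque()
        self.pending_bytes = 0
        self.missed = 0

    def push(self, entry: bytes):
        # Over the limit: drop the entry and only count it.
        if self.pending_bytes + len(entry) > self.limit:
            self.missed += 1
            return
        # Room again after a gap: first tell the client how much it missed.
        if self.missed:
            self._append(b'<sorryTooSlow youMissed="%d" />' % self.missed)
            self.missed = 0
        self._append(entry)

    def _append(self, data: bytes):
        self.pending.append(data)
        self.pending_bytes += len(data)

    def pop(self):
        """Call when the socket is writable; returns the next chunk or None."""
        if not self.pending:
            return None
        data = self.pending.popleft()
        self.pending_bytes -= len(data)
        return data
```

The point of the design is that a slow consumer costs the sender a bounded amount of memory and never slows down anyone else.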

I think soon we'll get TypePad and perhaps MovableType blogs all being sent through this. The final home will probably be on a subdomain of sixapart.com somewhere, including documentation better than this blog entry.

And yes, I'm sure my Atom syntax is bogus or something. I spent a good 2 minutes on that part of it.

From: mart
2005-08-16 09:06 pm (UTC)

With a little bit of extra markup (plus ideally a little bit of support for handshaking on connection) this could become an XMPP stream and the existing XMPP client libraries would be able to suck it up, saving people from having to write new parsing code (which is a pain because many XML libraries won't play nice when there's no proper end to a document). I guess once it stops being HTTP Perlbal becomes less helpful, though. Last I checked the XMPP libraries for Perl were a little clunky as well.

From: edm
2005-08-16 09:58 pm (UTC)
The two obvious ways of parsing it would be to either use a SAX API parser (i.e., process the start/stop tags as they come in), or do some high-level parsing to spot the start/end of Atom entries and then wrap those in enough surrounding junk to make a DOM parser happy. Neither seems especially difficult to do.

I, too, love the <sorryTooSlow/> tag. And marvel at the people who ever thought their architecture could take 6+ incoming posts a second and do useful things with them, without an LJ-sized infrastructure.
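The second approach in this comment (spot where each item starts and ends, then hand each one to a tree parser) might look roughly like this. It is a sketch under assumptions: every item is a top-level <feed> element that carries its own namespace declarations, feeds are never nested, and `EntrySplitter` is a hypothetical name.

```python
import re
import xml.etree.ElementTree as ET

# One complete <feed>...</feed> element, or one self-closing notice.
ITEM = re.compile(
    rb"<feed[\s>].*?</feed>|<sorryTooSlow\b[^>]*/>",
    re.DOTALL,
)


class EntrySplitter:
    """Buffers raw stream bytes and parses each complete item on its own."""

    def __init__(self):
        self.buf = b""

    def feed(self, chunk: bytes):
        """Add raw bytes; return the elements that are now complete."""
        self.buf += chunk
        items = []
        pos = 0
        for match in ITEM.finditer(self.buf):
            # Each match is a small, well-formed document by itself.
            items.append(ET.fromstring(match.group(0)))
            pos = match.end()
        # Keep only the unfinished tail for the next call.
        self.buf = self.buf[pos:]
        return items
```

Because each item is parsed in isolation, the missing end of the outer document never matters. A real client would also want to cap the buffer size in case a closing tag never arrives.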

From: crschmidt
2005-08-17 12:45 pm (UTC)
Or some regex mastery?
From: bobwyman
2005-08-22 02:30 am (UTC)

"Atom over XMPP" (see the spec)

There is a specification for doing "Atom over XMPP"... See:

At PubSub, we use an earlier version of this protocol and will be updating to "Atom over XMPP" as specified -- sometime in the "near" future.

bob wyman