brad's life - Contributing to Open Source projects
Brad Fitzpatrick

Contributing to Open Source projects [Mar. 20th, 2010|06:12 pm]

Prior to joining Google I always joked that Google was the black hole that swallowed up open source programmers. I'd see awesome, productive hackers join Google and then hear little to nothing from them afterwards. When I joined I decided I'd solve this mystery and post about it, but it's been over 2.5 years and I've been busy and somewhat forgot about it. Fortunately a discussion at work last week reminded me, and a bunch of us got to talking about the phenomenon.

Just as there are rarely absolutes in anything, there are no absolutes about open source programmers' activities after joining Google. The main reasons for them sometimes disappearing, as far as I can tell, are:
  • Many open source programmers are just programmers. They like working on fun, hard problems, whether on open source or otherwise.
  • They're busy. Google seems to suck everybody's free time, and then some. It's not that Google is forcing them to work all the time, but they are anyway because there are so many cool things that can be done. I often joke that I have seven 20% projects.
  • The Google development environment is so nice. The source control, build system, code review tools, debuggers, profilers, submit queues, continuous builds, test bots, documentation, and all associated machinery and processes are incredibly well done. It's very easy to hack on anything, anywhere and submit patches to anybody, and notably: to find who or what list to submit patches to. Generally submitting a patch is the best way to even start a discussion about a feature, showing that you're serious, even if your patch is wrong.
Personally, my increased involvement with Google side-projects and decreased involvement with public open source projects is a bit of all three of those bullets.

Notably, though, I want to discuss the last bullet.

It's pretty difficult to figure out how to contribute in the open source community. Given some package on your system or some tarball you downloaded, it's not always obvious what the right process is for that community to get patches upstream. It's often a research project just to find the upstream version control system, or bug tracker, or the mailing list to send patches to. CONTRIBUTING files in tarballs, if present at all, are often out of date.

When you're used to this, perhaps it's not so bad, but inside a company with a very consistent and easy-to-hack-hack-hack environment, this can be daunting. I'm not just talking about Google here. I'm sure most companies have more internal consistency in tools & processes than the collective open source community.

My request:
So here's my request to the open source community: make a webpage for your project that summarizes your community's development resources & process. And then link the hell out of it. Link it from all over your project's documentation. Make sure you have a CONTRIBUTING file, but don't put the current information in the file... it'll just get stale. Instead, put your contributing documentation URL in your CONTRIBUTING file. Tools and processes change, tarballs get old, and distros are rarely bleeding edge.
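
A minimal sketch of what such a pointer-only CONTRIBUTING file could look like, generated with a small Python helper. The function name, the project name "myproject" and the example.org URL are all made up for illustration; swap in your own project's stable contributing URL.

```python
def contributing_stub(project: str, url: str) -> str:
    """Return the text of a short CONTRIBUTING file that only points at a URL.

    Both arguments are placeholders supplied by the caller; nothing here
    is tied to any particular hosting site.
    """
    return (
        f"Contributing to {project}\n"
        "\n"
        "The current development resources and process (source, bug tracker,\n"
        "code review tool, style guide, mailing list) are documented at:\n"
        "\n"
        f"    {url}\n"
        "\n"
        "This file is kept short on purpose so it cannot go stale in old tarballs.\n"
    )


if __name__ == "__main__":
    # Hypothetical project name and URL, for illustration only.
    print(contributing_stub("myproject", "http://example.org/myproject/contributing"))
```

The only thing that ships in the tarball is the URL, so the details can change as often as needed without old releases pointing people at the wrong place.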

Good examples of people doing this already (from a quick search) include Django, Mono, and MySQL.

If your project doesn't already do this, as most of mine haven't, or haven't well enough, I made a website to make this easy:

Contributing: http://contributing.appspot.com/

Anybody can (and should!) use that for their project to create a project page with a stable URL listing their project's resources and a quick summary of the project's development workflow. Where's your source, bug tracker, code review tool, style guide, mailing list, etc.?

I've been creating project pages for all projects I'd started in the past, and making sure to update all their docs and websites with links to the Contributing page.

Here are some of mine:

http://contributing.appspot.com/memcached
http://contributing.appspot.com/perlbal
http://contributing.appspot.com/sgnodemapper
http://contributing.appspot.com/contributing
http://contributing.appspot.com/djabberd
....

Still creating them, but afterwards I hope to be able to filter more of my mailing list subscriptions and not feel guilty about people having out-of-date information and emailing me directly.

From now on I will never either a) fail to document the contribution process for a new project I start, or b) document that sending me patches directly is the answer. That may be true for a bit, but projects often change hands, and stale documentation sucks.

Comments:
From: thedimka
2010-03-21 02:05 am (UTC)

If people want to help an Open Source project with graphics or UI it is even harder, because most designers are not that familiar with source control, and most of the time there are no guidelines or directions on how to contribute anything to improve the UI or usability.
Probably it would be useful to have a paragraph with instructions for people like that.
From: iamronen.myopenid.com
2010-03-21 07:22 am (UTC)

poor usability

I have experienced open source as a very closed community. It is dominated by a developer mindset (one that is impenetrable to anyone who is not a developer), and that is extremely limiting its reach.
http://www.iamronen.com/2009/10/closed-open-source/
From: iamo
2010-03-21 02:16 am (UTC)

GitHub has gone a huge way towards reshaping this scenario as a rule. My first instinct now, when I want to know more about how something was implemented in an open source project or to contribute to it, is to check if it's on GitHub. I don't even bother looking for info in the README or related ALLCAPSes until after I've checked there.

And if it is, I can pretty easily guess that it's going to be fairly easy to contribute. Of course, that doesn't mean projects shouldn't also link to the main GitHub branch or somewhere like it.
From: askbjoernhansen
2010-03-22 11:37 pm (UTC)

Yeah - I want to second that. git makes this process about a billion times easier. Even without GitHub it's trivial to make and manage your patches (maintain your fork, really) until you figure out where to send them and get them accepted "upstream".
From: bluesmoon
2010-03-21 03:55 am (UTC)

I started writing open source code in 1999, and I wrote a lot of it. I was also a very heavy contributor to various LUGs in my area and on IRC. Then in 2004 I joined Yahoo! and the bulk of my contributions stopped. Over the following months I almost completely disappeared from the mailing lists I used to frequent.

The biggest reason for this, though, was that my primary open source project competed with a Yahoo! product, so I couldn't ethically keep working on it. My participation dropped to commenting on coding style and giving people commit access. Secondly, Yahoo! has an awesome internal developer community from which I could get my daily "fix" of technical discussions, rants and flame wars. I no longer felt the need to drop in to an external mailing list, and the only times I did were to say hello to all the people I'd met over the years.

Contributing was never a problem. If something didn't work, I'd search for the source website, download the latest tarball and submit a patch on the mailing list. If they didn't want it, I didn't care, I'd just put it up on my web page and let others download it from there. As long as it solved my own problem, I was happy. In one instance though, my patch wasn't just accepted, but I was instantly made "lead volunteer" of the module I'd just patched since no one else was doing it.
From: http://www.google.com/profiles/bgreenlee
2010-03-21 04:06 am (UTC)

I haven't found figuring out *how* to contribute to be terribly difficult, especially given that most of the projects I've been interested in are on GitHub. The complaint I have is that many times patch submissions (or "pull requests" in GitHub parlance) go into a black hole, with no response from the maintainers of the project. Frustrating.
From: (Anonymous)
2010-03-21 06:05 am (UTC)

And so...

When are you going to solve the other problem? The bigger one is large companies swallowing open source contributors, who then no longer give back to the community that made them so employable.

I know it's asking a lot, and Google is probably the least of the problematic communities, but it seems like you solved the wrong problem here ;-)

(Matt Sergeant - forgot my LJ id)
From: coffeechica
2010-03-21 06:22 pm (UTC)

Re: And so...

Precisely -- it's the mindset of large companies to be instantly against open source. Why, after all, would we want to share our trade secrets with the rest of the web? They do not see what's in it for them.

Well really, is there anything profitable in it for them other than rep from a small percentage of internet users? There's my open-ended question that I'm really interested in hearing practical (not idealistic) answers to.
From: ingulf
2010-03-21 07:38 am (UTC)

"I'm sure most companies have more internal consistency in tools & processes than the collective open source community."

ROFL!
From: (Anonymous)
2010-03-21 03:10 pm (UTC)

DOAP

Great idea. I had the same thing in mind about a year ago.
You can find my ideas about a distributed development system
using already existing semantic web technologies here:
http://turbo24prg.github.com/distributed-development.html
---
TL;DR:

PLEASE use structured and linked data, so people can actually use it mechanically and build tools on top of it. Thanks!

There's the DOAP project, developing an RDF schema for describing software projects: http://trac.usefulinc.com/doap/ . You can find a good description at: http://www.oss-watch.ac.uk/resources/doap.xml

There are already many projects using it, e.g. http://pypi.python.org/pypi and launchpad.net (even including maintainers).

Like this?
http://www.openlierox.net/wiki/index.php/Development

Did that a while ago already. :)

Ack otherwise to the post, though personally I don't remember any project where I had trouble finding the most recent trunk code.
From: brad
2010-03-21 06:27 pm (UTC)

Curious: what OpenID URL did you start with such that LiveJournal or Google gave you such an ugly URL here? Do you have a public Google profile?
From: mart
2010-03-22 03:33 pm (UTC)

I think that's what you get if you use the originally-documented identifier-select (or "directed identity") flow. Presumably Google OpenID is stuck with those URLs now because if you change them then folks won't be able to access their existing accounts...

If you choose to "Sign in with Google" on TypePad we assume the identifier http://www.google.com/accounts/o8/id and get the same sort of result.

From: quadhome
2010-03-22 12:05 am (UTC)

A page that lists all of the registered projects?
From: pinterface
2010-03-22 05:21 am (UTC)

Submitting patches? In my experience, that's the easy part. "The ... code review tools, ... submit queues, continuous builds, test bots, documentation, and all associated machinery and processes", on the other hand are practically non-existent. Shoot, I'd be thrilled if most of the projects I end up patching had outdated documentation and half a test suite because that would be an improvement. (Common Lispers are particularly bad in this regard, which isn't really helping my lackluster documentation and testing habits. :/)

Speaking of all that fancy stuff you guys have to make hackin' easy, is that documented anywhere for general public consumption? Because that, I think, would be an interesting read.

From: http://www.google.com/profiles/billythekid
2010-03-22 05:53 am (UTC)

isn't this what google and a project homepage is for?

I'm curious about why you'd like for developers to place project essential information on contributing.appspot rather than the project homepage itself? Most developers interested in a project will just google the project, find the homepage and look for all of this information here. Searching for memcached, for example, lists the project page in the first 10 results, but nothing about contributing.appspot.com. Shouldn't the essential contribution information live closer to the other project documentation or code itself?
From: brad
2010-03-22 06:04 am (UTC)

Re: isn't this what google and a project homepage is for?

memcached has moved homes 4 times, and has been through 3 different version control systems so far (cvs -> svn -> git).

I'm thinking long-term here.
From: brad
2010-03-22 06:07 am (UTC)

Re: isn't this what google and a project homepage is for?

Also, when you say:

> Most developers interested in a project will just
> google the project, find the homepage and look for
> all of this information here.


That's exactly the problem. I don't want contributing to open source projects to be a research problem that involves hunting around. The information should be easy to find, like the OWNERS files inside Google.