Knight Blog

The blog of the John S. and James L. Knight Foundation

In two years, DocumentCloud becomes standard

Sept. 18, 2012, 7:44 a.m., Posted by Aron Pilhofer


Knight Foundation recently took a look at the 2009 Knight News Challenge winners, including the success of the project DocumentCloud. Here, one of the founders, Aron Pilhofer, talks about how the site became a standard tool for newsrooms in just two years.

It was four years ago when Eric Umansky, Scott Klein and I first met to discuss submitting a Knight News Challenge application to address the sorry state of document-based journalism.

Scott, who took notes at the meeting, summed it up as follows: “This project will fight the ‘dark web’ nature of source documents on the Web, in which documents are difficult to find and often disappear when a news organization is done telling a particular story.”

Eric proposed a name that everyone liked -- DocumentCloud.org -- and we bought the domain the next day.

Our goals were modest: We hoped to create a platform that would encourage news organizations -- our own if nothing else -- to be more transparent by publishing source documents in a Web-friendly format. At that time, few newsrooms thought to publish documents online, and those that did used awful, bloated proprietary formats like Flash or PDF.

None of us dreamed that in August of 2012, DocumentCloud.org would host more than 350,000 documents, comprising almost 5.5 million pages, for more than 650 organizations. We never imagined DocumentCloud.org would be serving more than a million document views per week, with peaks of more than a million per day.

Today, DocumentCloud has become almost a standard tool for newsrooms wanting to publish documents to the Web. No event demonstrated that better than the Supreme Court ruling on the Patient Protection and Affordable Care Act in June. Dozens of newsrooms published the ruling on DocumentCloud, including eight of the 10 largest newspaper websites.

It hasn’t all gone perfectly. Like many startups, DocumentCloud has experienced the strain of rapid growth, which forced us to pull back some of the core features while we re-engineer the database backend to scale. But it is safe to say that DocumentCloud’s first act has been successful beyond any of our wildest dreams.

But where do we go from here?

The first set of new features will be on the annotation side, giving readers the ability to annotate documents. We believe DocumentCloud can be a powerful tool for newsrooms wanting to crowdsource documents, and user annotations will be one step in that direction.

We’re also working on taking it international. Currently, DocumentCloud is English-only, but we have heard from dozens of newsrooms overseas wanting to use the service. By the end of the year, we should have our first Spanish-language newsrooms publishing documents, and over the next 18 months we will be adding newsrooms from around the world. If you’re interested in helping us test our service in your language when the time comes, please let us know by filling out our questionnaire.

And we’ll be moving forward with a long-term sustainability plan. From the beginning, we have said we would charge a small fee for some services, though there will always be a free option. We’ll have more information about this in the coming months.

Four years down the road, DocumentCloud has risen to heights none of us could have foreseen. We’re looking forward to seeing where you help us take it next.