Use a standard format for data discovery of diverse data sets

Rank 2376
Idea#617

Stage: Active

Campaign: Making Data More Accessible

I’ve been looking extensively at the great variety of data-oriented REST and REST-ish APIs that are appearing, especially as part of various government transparency efforts. (Examples include the Sunlight Foundation’s API for looking up information about members of Congress and the Follow The Money API for looking up information about lobbying and political contributions.)

I notice the following:

1. There are many of them, and new ones are appearing (and probably disappearing) constantly.

2. For a ‘consumer’ of this information (that is, a programmer), it’s time-consuming and error-prone to study the documentation of each of these ‘similar but different’ APIs. Most are quite well documented, but each still has to be discovered and studied separately.

3. Creating applications (either browsers, or widgets, or middleware applications) that use and combine information from more than one source is hard.

I propose that the government adopt some kind of decentralized data discovery format that would eliminate each of the above problems. It would have the following characteristics:

1. Allow a single access method to reach a very broad range of data (numerical, textual, and so on), while focusing fundamentally on tabular information, broadly speaking.

2. Be easy and cheap to implement for the data/information owners/publishers.

3. Specifically, not require any centralization. Each data owner can independently decide what data to publish with Data RSS and when. New owners can appear and old ones can disappear with no coordination.
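To make these characteristics concrete, here is a minimal sketch of how a consumer might read a self-published dataset descriptor. This is not the sample format mentioned in this post. The descriptor layout, the field names (`publisher`, `datasets`, `rows_url`, `columns`), and the example.gov URL are all assumptions invented for illustration.

```python
import json
from dataclasses import dataclass

# Hypothetical descriptor that a data owner could host on its own site.
# Every field name and URL here is an illustrative assumption.
SAMPLE_DESCRIPTOR = """
{
  "publisher": "example-agency",
  "datasets": [
    {
      "title": "Lobbying contributions",
      "rows_url": "https://data.example.gov/contributions/rows",
      "columns": [
        {"name": "recipient", "type": "text"},
        {"name": "amount", "type": "number"}
      ]
    }
  ]
}
"""


@dataclass
class Dataset:
    title: str
    rows_url: str
    columns: dict  # column name -> declared type


def parse_descriptor(text: str) -> list[Dataset]:
    """Turn one publisher's descriptor into a list of tabular datasets."""
    doc = json.loads(text)
    return [
        Dataset(
            title=d["title"],
            rows_url=d["rows_url"],
            columns={c["name"]: c["type"] for c in d["columns"]},
        )
        for d in doc["datasets"]
    ]


def numeric_columns(dataset: Dataset) -> list[str]:
    """Return the names of columns declared as numeric."""
    return [name for name, kind in dataset.columns.items() if kind == "number"]


datasets = parse_descriptor(SAMPLE_DESCRIPTOR)
```

Because each owner hosts its own descriptor, a consumer only needs the descriptor's address. No central registry is involved in this sketch.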

I have a specific sample of what this format could look like and how to design and pilot it. I've placed all that work into the public domain.

Feedback Score

14 votes

Comments

  1. Comment
    Greg Elin

    The meme of standardizing discovery and method calls to data sets seems to be spontaneously bubbling up.

    Going REST provides standardized HTTP error codes for accessing data. It makes sense to see whether there could also be a standardized set of basic methods for accessing RESTful data (e.g., search, getList, describe).
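As a rough illustration of such a shared method set, the sketch below defines the three methods named above as a Python protocol and backs them with an in-memory source. The signatures, the `InMemorySource` class, and the sample rows are assumptions made for this example, not part of any published proposal.

```python
from typing import Protocol


class DataSource(Protocol):
    """Hypothetical minimal method set every data publisher could offer."""

    def describe(self) -> dict: ...
    def get_list(self) -> list[dict]: ...
    def search(self, **criteria) -> list[dict]: ...


class InMemorySource:
    """Toy implementation backed by a list of row dictionaries."""

    def __init__(self, name: str, rows: list[dict]):
        self.name = name
        self.rows = rows

    def describe(self) -> dict:
        # Report the union of field names found in the rows.
        fields = sorted({key for row in self.rows for key in row})
        return {"name": self.name, "fields": fields}

    def get_list(self) -> list[dict]:
        return list(self.rows)

    def search(self, **criteria) -> list[dict]:
        # Keep rows whose fields equal every given criterion.
        return [
            row for row in self.rows
            if all(row.get(key) == value for key, value in criteria.items())
        ]


def field_names(source: DataSource) -> list[str]:
    """Client code that works against any source offering the protocol."""
    return source.describe()["fields"]


source = InMemorySource(
    "legislators",
    [
        {"name": "A. Example", "state": "MA"},
        {"name": "B. Sample", "state": "NY"},
    ],
)
```

A client written against `DataSource` would not need to study each publisher's documentation separately, which is the point of standardizing the methods.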

    Pito Salas has written a case study for "Data RSS" and has been thinking about a set of basic methods that everyone could offer with their data sets. See: http://www.blogbridge.com/2009/02/27/data-rss-early-ideas/

    There's also an overlap with the URL scheme idea: publish a parseable definition of one's URL structure for accessing RESTful resources, thereby effectively replacing API "methods" with simply the URL schema. Daniel Bennett has his take on it here: https://docs.google.com/Doc?docid=dfxgcdfc_10ddmrz9g4&hl=en_GB But I believe there are others exploring this notion, too.
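A minimal sketch of the URL scheme idea, assuming the published definition is a simple mapping from resource names to URL templates. The schema layout, the resource name, and the api.example.org host are hypothetical and are not taken from the linked document.

```python
from string import Formatter
from urllib.parse import quote

# Hypothetical published URL schema: resource names mapped to templates.
URL_SCHEMA = {
    "base": "https://api.example.org",
    "resources": {
        "legislators_by_state": "/legislators/{state}/{chamber}",
    },
}


def parameters(template: str) -> list[str]:
    """List the placeholder names that appear in a URL template."""
    return [field for _, field, _, _ in Formatter().parse(template) if field]


def build_url(schema: dict, resource: str, **values) -> str:
    """Build a concrete URL from the schema instead of calling an API method."""
    template = schema["resources"][resource]
    missing = [name for name in parameters(template) if name not in values]
    if missing:
        raise ValueError(f"missing URL parameters: {missing}")
    # Percent-encode each value so it is safe inside a path segment.
    safe = {key: quote(str(value), safe="") for key, value in values.items()}
    return schema["base"] + template.format(**safe)
```

With a schema like this, a generic client can discover which parameters a resource needs by parsing the template, rather than relying on per-API method documentation.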

    Greg Elin

    http://twitter.com/gregelin
  2. Comment
    schnippy

    I now wish that there were a limited number of up/down votes I could cast so I could spend them on great ideas like this one rather than playing whack-a-mole with the trolls.

    Great idea and great comments - as an API programmer I've often wished for the same.