Use a standard format for data discovery of diverse data sets


Stage: Active

Campaign: Making Data More Accessible

I’ve been looking extensively at the great variety of data-oriented REST and REST-ish APIs that are appearing, especially as part of various government transparency efforts. (Examples include the Sunlight Foundation’s API for looking up information about members of Congress and the Follow The Money API for looking up information about lobbying and political contributions.)

I notice the following:

1. There are many of them, and they are appearing (and probably disappearing) constantly.

2. For a ‘consumer’ of this information (that is, a programmer), it’s time-consuming and error-prone to study the documentation of each of these ‘similar but different’ APIs. Most are quite well documented, but each still has to be discovered and studied separately.

3. Creating applications (browsers, widgets, or middleware) that use and combine information from more than one source is hard.

I would propose that the government adopt some kind of decentralized data discovery format which would eliminate each of the above problems. It would have the following characteristics:

1. Allow a single access method to reach a very broad range of data (numerical, textual, and so on), but focused fundamentally on tabular information, broadly speaking.

2. Be easy and cheap to implement for the data/information owners/publishers.

3. Specifically, not require any centralization. Each data owner can independently decide what data to publish with Data RSS, and when. New owners can appear and old ones can disappear with no coordination.

I have a specific sample of what this format could look like and how to design and pilot it. I've placed all that work into the public domain.
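To make the three characteristics concrete, here is a minimal sketch of what a self-describing tabular document and a generic consumer might look like. It is not the sample mentioned above: the JSON layout, the field names (`columns`, `rows`, `type`) and the helpers `load_table` and `combine` are all assumptions made up for illustration.

```python
import json

# Hypothetical descriptor that a publisher might host next to its data.
# Every field name below is an assumption for illustration only.
DESCRIPTOR = """
{
  "title": "Example contributions table",
  "publisher": "example-agency",
  "columns": [
    {"name": "recipient", "type": "text"},
    {"name": "amount", "type": "number"}
  ],
  "rows": [
    ["Candidate A", "1500.00"],
    ["Candidate B", "250.50"]
  ]
}
"""

# Map the (assumed) column type names onto Python converters.
CASTS = {"text": str, "number": float}


def load_table(descriptor_text):
    """Turn one descriptor into a list of dicts keyed by column name."""
    doc = json.loads(descriptor_text)
    columns = doc["columns"]
    table = []
    for row in doc["rows"]:
        if len(row) != len(columns):
            raise ValueError("row width does not match the column list")
        table.append(
            {col["name"]: CASTS[col["type"]](value) for col, value in zip(columns, row)}
        )
    return table


def combine(*tables):
    """Concatenate tables from independent publishers, with no central registry."""
    return [record for table in tables for record in table]
```

Because the document carries its own column names and types, one reader such as `load_table` could serve every publisher, and `combine` needs nothing beyond the documents themselves.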


Feedback Score

14 votes


  1. Comment
    Greg Elin

    The meme of standardizing discovery and method calls to data sets seems to be spontaneously bubbling up.

    Going REST provides standardized HTTP status codes for accessing data. It makes sense to see whether there could also be a standardized set of basic methods for accessing RESTful data (e.g., search, getList, describe).

    Pito Salas has written a case study for "Data RSS" and has been thinking about a set of basic methods that everyone could offer with their data sets. See:

    There's also an overlap with the URL scheme idea, in which one publishes a parseable definition of one's URL structure for accessing RESTful resources, thereby effectively replacing API "methods" with simply the URL schema. Daniel Bennett has his take on it here: But I believe there are others exploring this notion, too.

    Greg Elin

  2. Comment

    I now wish that there were a limited number of up/down votes I could cast so I could spend them on great ideas like this one rather than playing whack-a-mole with the trolls.

    Great idea and great comments - as an API programmer I've often wished for the same.
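
The two ideas in the first comment, a small set of basic methods (search, getList, describe) and a published URL scheme that stands in for API "methods", can be sketched together. This is only an illustration under stated assumptions: the class `TabularSource`, the `URL_SCHEME` templates, the helper `build_url` and the `example.org` host are hypothetical and not taken from any existing API.

```python
from urllib.parse import urlencode


class TabularSource:
    """Hypothetical in-memory data set exposing three basic methods."""

    def __init__(self, name, columns, rows):
        self.name = name
        self.columns = list(columns)
        # Store each row as a dict so it can be searched by column name.
        self.rows = [dict(zip(self.columns, row)) for row in rows]

    def describe(self):
        """Report what the data set is called and which columns it has."""
        return {"name": self.name, "columns": self.columns}

    def get_list(self):
        """Return every record."""
        return list(self.rows)

    def search(self, **criteria):
        """Return the records whose columns equal all the given values."""
        return [
            row for row in self.rows
            if all(row.get(key) == value for key, value in criteria.items())
        ]


# A publisher could publish templates like these instead of documenting methods.
# The paths are assumptions chosen for this sketch.
URL_SCHEME = {
    "describe": "/{dataset}",
    "get_list": "/{dataset}/records",
    "search": "/{dataset}/records?{query}",
}


def build_url(base, method, dataset, **criteria):
    """Fill in the published template for one method."""
    path = URL_SCHEME[method].format(dataset=dataset, query=urlencode(criteria))
    return base.rstrip("/") + path
```

For example, `build_url("https://example.org", "search", "lobbying", state="MA")` yields `https://example.org/lobbying/records?state=MA`, so a client that has read the scheme needs no per-API method documentation.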