Making Data More Accessible

Use a standard format for data discovery of diverse data sets

I’ve been looking extensively at the great variety of data-oriented REST and REST-ish APIs that are appearing, especially as part of various government transparency efforts. (Examples include the Sunlight Foundation’s API for looking up information about members of Congress and the Follow The Money API for looking up information about lobbying and political contributions.)

I notice the following:

1. There are many of them, and new ones are appearing (and old ones probably disappearing) constantly.

2. For a ‘consumer’ of this information (that is, a programmer), it’s time-consuming and error-prone to study the documentation of each of these ‘similar but different’ APIs. Most are quite well documented, but each still has to be discovered and studied separately.

3. Creating applications (browsers, widgets, or middleware) that use and combine information from more than one source is hard.

I propose that the government adopt a decentralized data discovery format that would eliminate each of the above problems. It would have the following characteristics:

1. Allow a single access method to reach a very broad range of data (numerical, textual and so on), but focus fundamentally on tabular information, broadly speaking.

2. Be easy and cheap for data owners and publishers to implement.

3. Specifically not require any centralization. Each data owner can independently decide what data to publish in this format ("Data RSS") and when. New owners can appear and old ones can disappear with no coordination.
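To make these characteristics concrete, here is a minimal sketch in Python of how a consumer might use such a format. It is not the sample format mentioned in this post. The JSON layout, the field names (`datasets`, `title`, `columns`, `data_url`) and the `example` URL are all assumptions made up for illustration. The point is only that one small reader can handle any publisher's tables, and that publishers need no coordination with one another.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass
class Dataset:
    """One entry in a publisher's (hypothetical) discovery document."""
    title: str
    columns: list[str]
    data_url: str


def parse_discovery(document: str) -> list[Dataset]:
    """Parse a publisher's discovery document into dataset descriptors."""
    raw = json.loads(document)
    return [
        Dataset(d["title"], d["columns"], d["data_url"])
        for d in raw["datasets"]
    ]


def read_table(dataset: Dataset, csv_text: str) -> list[dict[str, str]]:
    """Read tabular data as rows keyed by the columns the descriptor declares."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != dataset.columns:
        raise ValueError(f"{dataset.title}: columns do not match descriptor")
    return list(reader)


# A made-up discovery document, as one independent publisher might host it.
PUBLISHER_A = json.dumps({
    "datasets": [{
        "title": "Contributions",
        "columns": ["recipient", "amount"],
        "data_url": "https://a.example/contributions.csv",
    }]
})

# Stand-in for the CSV that would be fetched from data_url.
SAMPLE_CSV = "recipient,amount\nSmith,100\nJones,250\n"

if __name__ == "__main__":
    for dataset in parse_discovery(PUBLISHER_A):
        for row in read_table(dataset, SAMPLE_CSV):
            print(dataset.title, row)
```

A real consumer would fetch the discovery document and the CSV over HTTP. The strings above stand in for those responses so the sketch runs without a network.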

I have a specific sample of what this format could look like and how to design and pilot it. I've placed all that work into the public domain.

Stage: Active

Feedback Score

14 votes
Idea#617

Comments

  1. Comment
    Greg Elin

    The meme of standardizing discovery and method calls to data sets seems to be spontaneously bubbling up.

    Going REST provides standardized HTTP error codes for accessing data. It makes sense to see if there could also be a standardized set of basic methods for accessing RESTful data (e.g., search, getList, describe, etc.).
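As a rough sketch of what a shared set of basic methods could look like to a consumer, the Python below defines a small interface with `describe`, `get_list` and `search`. The method names follow the examples above. The signatures, the `InMemorySource` class and the sample rows are assumptions for illustration only, not an existing standard.

```python
from typing import Protocol

Row = dict[str, str]


class DataSource(Protocol):
    """Hypothetical minimal interface every data set could offer."""

    def describe(self) -> dict[str, object]: ...
    def get_list(self) -> list[Row]: ...
    def search(self, term: str) -> list[Row]: ...


class InMemorySource:
    """Toy implementation backed by a list of rows."""

    def __init__(self, name: str, columns: list[str], rows: list[Row]) -> None:
        self.name = name
        self.columns = columns
        self.rows = rows

    def describe(self) -> dict[str, object]:
        # Tell the consumer what the data set is and which columns it has.
        return {"name": self.name, "columns": self.columns}

    def get_list(self) -> list[Row]:
        # Return a copy of every row.
        return list(self.rows)

    def search(self, term: str) -> list[Row]:
        # Case-insensitive substring match against any value in a row.
        needle = term.lower()
        return [
            row for row in self.rows
            if any(needle in value.lower() for value in row.values())
        ]


def count_matches(sources: list[DataSource], term: str) -> dict[str, int]:
    """Run the same search against every source, with no per-source code."""
    return {
        str(source.describe()["name"]): len(source.search(term))
        for source in sources
    }


if __name__ == "__main__":
    legislators = InMemorySource(
        "legislators",
        ["name", "state"],
        [{"name": "Ann Lee", "state": "MA"}, {"name": "Bob Ray", "state": "TX"}],
    )
    print(count_matches([legislators], "ann"))
```

Because consumer code depends only on the three methods, a new data set can be combined with existing ones without studying a new API.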

    Pito Salas has written a case study for "Data RSS" and has been thinking about a set of basic methods that everyone could offer with their data sets. See: http://www.blogbridge.com/2009/02/27/data-rss-early-ideas/

    There's also an overlap with the URL scheme idea, which is to publish a parseable definition of one's URL structure for accessing RESTful resources, effectively replacing API "methods" with the URL schema itself. Daniel Bennett has his take on it here: https://docs.google.com/Doc?docid=dfxgcdfc_10ddmrz9g4&hl=en_GB But I believe there are others exploring this notion, too.

    Greg Elin

    http://twitter.com/gregelin

  2. Comment
    schnippy

    I now wish that there were a limited number of up/down votes I could cast so I could spend them on great ideas like this one rather than playing whack-a-mole with the trolls.

    Great idea and great comments - as an API programmer I've often wished for the same.