Writing software often requires getting data from external sources, which usually means accessing an API: you send a GET or POST request to a URL with your authentication keys, then parse the XML or JSON that comes back. While this approach is fully automated, it comes with a lot of potential problems:

  • You have to build a different API integration for every source.
  • The API is periodically unavailable, so you're down as well.
  • The API could change and break your app.
  • The API could simply be taken away from you, to "upgrade to a better API", to "improve security", or for some other perfectly good reason that leaves you scrambling for an alternative.
  • You have to store a local copy of the data retrieved from the API, in case it goes away.
  • You have to work out a procedure for auditing and verifying the data obtained from the API, in case anybody thinks you might have altered it.
  • You have to worry about security, because now you have a big app with lots of data and authentication keys.
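To make the pattern concrete, here is a minimal sketch of one such integration in Python. It is an illustration only: the function names, the `Authorization` header scheme, and the `product` and `kg_co2e` fields are assumptions made for this example, not any real supplier's API. Every additional source would need its own variant of both functions.

```python
import json
import sqlite3
import urllib.request


def fetch_emissions(url, api_key, opener=urllib.request.urlopen):
    """GET a JSON document from a (hypothetical) supplier API."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    # If the API is down or its URL changes, this call fails and so do we.
    with opener(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def store_emissions(conn, source, payload):
    """Keep a local copy, since the API may not be there tomorrow."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS emissions "
        "(source TEXT, product TEXT, kg_co2e REAL)"
    )
    # The field names are assumed; a format change upstream breaks this line.
    conn.execute(
        "INSERT INTO emissions VALUES (?, ?, ?)",
        (source, payload["product"], float(payload["kg_co2e"])),
    )
    conn.commit()
```

The `opener` parameter exists only so the function can be exercised without a network connection; in normal use it defaults to `urllib.request.urlopen`.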

While a new and better JavaScript web framework seems to pop up every week, not much has changed in API integration land for years.  We're all so used to doing it this way that we think this is just our lot in life.

As a result, these API integrations are expensive and time-consuming to build and maintain, so they don't scale well in scope.  While discussing a Hyperledger-based blockchain app for the Carbon Accounting and Certification Working Group, which could require integrating data from a large number of sources, we came upon a better way to do it with the blockchain itself.

Instead of each source providing us with an API, they could put their data on the blockchain.  This could be a Hyperledger permissioned channel that is only open to their customers and other trusted parties.  The data could be the carbon emissions of a product or a particular invoice, for example.

Then, a carbon accounting calculator could traverse through multiple channels to obtain the data it needs for its calculations.

It could then open a channel of its own and put the results on this channel.  This channel could then be open to its customers, so they could in turn use the data for their own carbon emissions calculations.
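The flow described above can be sketched with a small in-memory model. This is not Hyperledger code: the `Channel` class, its membership check, and the record fields are all hypothetical stand-ins chosen for illustration, and a real permissioned channel would enforce access through the network's identity and policy mechanisms.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Stand-in for a permissioned channel: an append-only list of records."""
    name: str
    members: set
    records: list = field(default_factory=list)

    def append(self, writer, record):
        if writer not in self.members:
            raise PermissionError(f"{writer} may not write to {self.name}")
        self.records.append(dict(record))

    def read(self, reader):
        if reader not in self.members:
            raise PermissionError(f"{reader} may not read {self.name}")
        return [dict(r) for r in self.records]


def total_emissions(calculator, channels, product):
    """Traverse several source channels and sum emissions for one product."""
    total = 0.0
    for channel in channels:
        for record in channel.read(calculator):
            if record["product"] == product:
                total += record["kg_co2e"]
    return total


# Two suppliers each publish to a channel shared with the calculator.
steel = Channel("steel-supplier", {"steel-co", "calculator"})
steel.append("steel-co", {"product": "bike", "kg_co2e": 40.0})
rubber = Channel("rubber-supplier", {"rubber-co", "calculator"})
rubber.append("rubber-co", {"product": "bike", "kg_co2e": 2.5})

# The calculator publishes its result on its own channel for its customers.
results = Channel("calculator-results", {"calculator", "customer"})
bike_total = total_emissions("calculator", [steel, rubber], "bike")
results.append("calculator", {"product": "bike", "kg_co2e": bike_total})
```

The customer can read `results` but not the suppliers' channels, which mirrors the idea that each party only opens its channel to the parties it trusts.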

The advantages of this approach are:

  • The data source doesn't have to worry about keeping its API up and running.  Once the data is available, it's written to the blockchain channel and no longer requires a server to provide it, just like web content deployed on a CDN.  
  • The format is standardized by agreement between the provider and users of the data.
  • The content of the data cannot be altered once it is written to the blockchain, removing concerns about audit.
  • Even if the source changes the format or stops providing data in the future, data from the past will always be available.
  • The app consuming the data can be much smaller.  For example, instead of storing an authentication key for many sources and data from many users, it can use keys provided by a customer's wallet.  Once it's finished with the data obtained from those keys, it can put the results on its channel and then delete the data obtained from the blockchain.  
  • Both the data source and data consumer could then be run as serverless microservices.
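The claim that written data cannot be quietly altered rests in part on hash chaining, where each entry commits to the one before it. The sketch below is a simplified illustration of that single idea, using hypothetical helper names. It is not how Hyperledger is implemented, and a real network also depends on multiple peers holding copies and agreeing on the order of entries.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def block_hash(prev_hash, record):
    """Hash a record together with the hash of the entry before it."""
    body = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append_record(chain, record):
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"record": record, "prev": prev_hash, "hash": block_hash(prev_hash, record)}
    )


def verify(chain):
    """Return True only if no record or link has changed since it was written."""
    prev_hash = GENESIS
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(prev_hash, block["record"]):
            return False
        prev_hash = block["hash"]
    return True
```

Anyone holding a copy of the chain can run `verify` themselves, which is why the consumer no longer needs a separate audit procedure for the data it reads.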

The channels could be maintained by a third party entity, which charges a small fee for each access through a token.  The data source provider could give out these tokens when it allows a data consumer permission to access its channel.  In return, the third party entity serves as a neutral party tasked with keeping up the operations of the data channel.
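One way to picture the token arrangement is sketched below. The `ChannelOperator` class, single-use tokens, and a flat fee per access are assumptions made for this example; the text above does not specify how tokens would be issued, priced, or redeemed.

```python
import secrets


class ChannelOperator:
    """Hypothetical neutral third party that meters access to one channel."""

    def __init__(self, fee_per_access):
        self.fee_per_access = fee_per_access
        self.fees_collected = 0
        self._valid_tokens = set()

    def issue_tokens(self, count):
        """Create tokens for the data source to hand to approved consumers."""
        tokens = [secrets.token_hex(16) for _ in range(count)]
        self._valid_tokens.update(tokens)
        return tokens

    def access(self, token, records):
        """Redeem one token for one read and record the fee for it."""
        if token not in self._valid_tokens:
            raise PermissionError("unknown or already used token")
        self._valid_tokens.remove(token)
        self.fees_collected += self.fee_per_access
        return list(records)
```

In this sketch the data source decides who gets tokens, while the operator only checks them and collects the fee, which keeps the operator neutral about who may see the data.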

See slides illustrating this.