Multi Channel Data Architecture

Writing software often requires getting data from external sources. This usually means calling an API: sending a GET or POST request to a URL with authentication keys and a request message, then parsing the XML or JSON that comes back. While this approach is fully automated, it comes with many potential problems:

You have to build a different integration for every source's API.
The API is periodically unavailable, so you're down as well.
The API could change and break your app.
The API could simply be taken away from you, whether to "upgrade to a better API" or to "improve security", leaving you scrambling for an alternative.
You have to store a local copy of the data retrieved from the API, in case it goes away.
You have to work out a procedure for auditing and verifying the data obtained from the API.
You worry about security because now you have a big app with lots of data and authentication keys.

As a result, these API integrations are expensive and time-consuming to build and maintain, so they don't scale well in scope. While discussing a Hyperledger-based blockchain app for the Carbon Accounting and Certification Working Group, which could require integrating data from a large number of sources, we came upon a better way to do it with the blockchain itself.

Instead of each source providing us with an API, they could put their data on the blockchain. This could be a Hyperledger permissioned channel that is only open to their customers and other trusted parties. The data could be the carbon emissions of a product or a particular invoice, for example.

Then, a carbon accounting calculator could traverse multiple channels to obtain the data it needs for its calculations.

It could then open a channel of its own and put the results on this channel. This channel could then be open to its customers, so they could in turn use the data for their own carbon emissions calculations.
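
The flow described above (sources publish to permissioned channels, a calculator reads several of them, then publishes its results on a channel of its own) can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `Channel`, `total_emissions`, the member names, and the `product_id` and `kg_co2e` fields are all hypothetical, and nothing here calls a real Hyperledger API.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Hypothetical in-memory stand-in for a permissioned channel."""
    name: str
    members: set = field(default_factory=set)
    records: list = field(default_factory=list)

    def append(self, writer, record):
        """Add a record; only channel members may write."""
        if writer not in self.members:
            raise PermissionError(f"{writer} is not a member of {self.name}")
        self.records.append(dict(record))

    def read(self, reader):
        """Return copies of all records; only channel members may read."""
        if reader not in self.members:
            raise PermissionError(f"{reader} is not a member of {self.name}")
        return [dict(r) for r in self.records]


def total_emissions(reader, channels, product_id):
    """Sum the kg_co2e values for one product across several channels."""
    total = 0
    for channel in channels:
        for record in channel.read(reader):
            if record["product_id"] == product_id:
                total += record["kg_co2e"]
    return total


# Two data sources, each with its own channel that the calculator may read.
supplier = Channel("supplier", members={"supplier", "calculator"})
shipper = Channel("shipper", members={"shipper", "calculator"})
supplier.append("supplier", {"product_id": "widget", "kg_co2e": 12})
shipper.append("shipper", {"product_id": "widget", "kg_co2e": 3})

# The calculator publishes its result on its own channel for its customers.
results = Channel("calculator-results", members={"calculator", "customer"})
total = total_emissions("calculator", [supplier, shipper], "widget")
results.append("calculator", {"product_id": "widget", "kg_co2e": total})
```

In this sketch the customer can read the calculator's results channel but not the supplier's or shipper's channels, which mirrors the idea that each channel is open only to the parties its owner trusts.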

The advantages of this approach are:

The data source doesn't have to worry about keeping its API up and running. Once the data is available, it's written to the blockchain channel and no longer requires a server to provide it, just like web content deployed on a CDN.
The format is standardized by agreement between the provider and users of the data.
Data cannot be altered once it is written to the blockchain, which addresses the audit and verification concerns above.
Even if the source changes the format or stops providing data in the future, data from the past will always be available.
The app consuming the data can be much smaller. For example, instead of storing an authentication key for many sources and data from many users, it can use keys provided by a customer's wallet. Once it's finished with the data obtained from those keys, it can put the results on its channel and then delete the data obtained from the blockchain.
Both the data source and the data consumer could then be run as serverless microservices.
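
To illustrate why data written to a ledger resists later alteration, here is a minimal sketch of a hash-chained, append-only log. It is a simplified teaching model, not the ledger implementation of any particular blockchain; the function names and the record fields are assumptions made for this example.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(log, payload):
    """Append a payload, linking it to the current tail of the log."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "payload": payload,
                "hash": entry_hash(prev, payload)})


def verify(log):
    """Return True only if every entry still matches its recorded hash."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"invoice": "A-1", "kg_co2e": 12})
append(log, {"invoice": "A-2", "kg_co2e": 3})
intact = verify(log)

# Editing an earlier record breaks the chain, so the change is detectable.
log[0]["payload"]["kg_co2e"] = 1
tampered_ok = verify(log)
```

Because each entry's hash covers the previous entry's hash, changing any past record invalidates the chain from that point on, which is what lets a consumer check the data without a separate audit procedure.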

The channels could be maintained by a third-party entity, which charges a small fee, paid with a token, for each access. The data source provider could give out these tokens when it grants a data consumer permission to access its channel. In return, the third-party entity serves as a neutral party tasked with keeping the data channel in operation.
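
The token arrangement could be modeled roughly as follows. This is a hypothetical sketch: the `ChannelOperator` class, its method names, and the fee amount are invented for illustration, and how fees are actually settled is left open.

```python
from dataclasses import dataclass, field


@dataclass
class ChannelOperator:
    """Hypothetical neutral third party that meters access to a channel."""
    fee_per_access: int  # fee in the smallest currency unit, e.g. cents
    tokens: dict = field(default_factory=dict)  # consumer -> unused tokens
    fees_collected: int = 0

    def grant(self, consumer, count):
        """Hand out tokens when the data source approves a consumer."""
        self.tokens[consumer] = self.tokens.get(consumer, 0) + count

    def access(self, consumer):
        """Redeem one token for one channel access and record the fee."""
        if self.tokens.get(consumer, 0) < 1:
            raise PermissionError(f"{consumer} has no access tokens")
        self.tokens[consumer] -= 1
        self.fees_collected += self.fee_per_access


operator = ChannelOperator(fee_per_access=2)
operator.grant("calculator", 3)   # the data source approves the calculator
operator.access("calculator")     # each access redeems one token
operator.access("calculator")
```

Tying each access to a token gives the operator a simple way to meter usage while leaving the decision about who receives tokens with the data source.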

See slides illustrating this.
