A little history on JSON, etc.

I’m glad to see we’ve stirred the pot with our discussion of JSON. It’s driven us to stop and think about our decisions, their impact on our project, and how others may be able to interact with our system.

For the sake of those interested, I thought I might offer some clarity regarding our use of JSON. When OpenSRF (our core information passing framework, discussed here) was first designed, it was XML everywhere. As we were doing our benchmarking of the system, we thought things were good, but sluggish with large datasets. OpenSRF is designed as a set of independent applications which often must communicate with one or more other applications to accomplish any particular task. For example, to retrieve information on a patron, the system must verify the requesting party has such permission, then it must actually retrieve the information. With a lot of such conversation going on across the backbone, you can see there will be a great deal of data flying around the network.

When we came across JSON, we were charmed with its simplicity and economy of language. We ported almost all of the internal message passing from XML to JSON and achieved immediate speed gains. We’ve been very pleased with JSON as the backbone data encoding, and I imagine that OpenSRF 2.0 will be JSON everywhere.

Things get sticky when we start discussing how we want to publish data to the world. We could pass around zeros and ones behind the scenes and no one would really be affected by it (though I’m sure some of you would be amused). Our initial instinct was to pass JSON to the OPAC and staff client. Since both were primarily developed with JavaScript, we hardly had to do anything to start passing sensible data to the UI components. The question now is how this affects others who wish to interact with our data…

A brief discussion of how to interact:

OpenSRF was initially developed with a Jabber interface for both backend and client communication. We still use Jabber for the backbone, but we’ve added (and generally gravitated toward) a standard web interface for accessing published methods from client code. So any data you need from the system can be retrieved with a specially crafted URL. Some data is accessible by anyone, some requires a valid login session and permission, etc.

As mentioned above, our default encoding format was JSON, so we developed an Apache plugin/handler that interacts with the system and returns JSON formatted data. The plugin basically just proxies data without any data conversion.

So, after our discussion of XML and JSON in the previous post, Mike sat down and developed a REST-style XML handler for the system. You pass in the same URL parameters and receive data/objects as XML instead of JSON. Now anyone who wishes to use an XML-based client interface will have access to the same data and will retrieve it in the same way, except for the location they request in the URL. By letting Apache plugins do all of the conversion work for us (the XML plugin was about 120 lines of Perl code), we can transform our system data into just about anything that Apache can serve.
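To make the idea concrete, here is a minimal sketch in Python of one data structure being rendered as either JSON or XML depending on the handler that was asked for. This is not Mike’s Perl plugin and not the XML layout our handler actually emits: the element names (object, array, string, and so on), the key attribute, the render function, and the sample patron fields are all assumptions made up for illustration.

```python
import json
import xml.etree.ElementTree as ET


def to_xml(value):
    """Convert decoded JSON data into an XML element tree.

    The element names used here are hypothetical, chosen only for this sketch.
    """
    if isinstance(value, dict):
        elem = ET.Element("object")
        for key, item in value.items():
            child = to_xml(item)
            child.set("key", str(key))  # remember which field this child was
            elem.append(child)
    elif isinstance(value, list):
        elem = ET.Element("array")
        for item in value:
            elem.append(to_xml(item))
    elif isinstance(value, bool):  # must be checked before int: bool is an int subclass
        elem = ET.Element("boolean")
        elem.text = "true" if value else "false"
    elif value is None:
        elem = ET.Element("null")
    elif isinstance(value, (int, float)):
        elem = ET.Element("number")
        elem.text = repr(value)
    else:
        elem = ET.Element("string")
        elem.text = str(value)
    return elem


def render(data, fmt):
    """Serialize the same data as JSON or XML, depending on the requested format."""
    if fmt == "json":
        return json.dumps(data)
    if fmt == "xml":
        return ET.tostring(to_xml(data), encoding="unicode")
    raise ValueError("unsupported format: " + fmt)


# Hypothetical patron record; the field names are invented for the example.
patron = {"name": "Ann", "active": True, "card_ids": [1, 2]}
as_json = render(patron, "json")
as_xml = render(patron, "xml")
```

The point of the sketch is only that the conversion sits at the edge: the data produced behind the scenes is identical, and the choice of handler decides how it is written out.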

———————

One other point I’d like to make regarding our extension of JSON. Someone outside the crew recently suggested using JSON to handle the class hints by doing something like the following:

{ "class": "letters", "payload": ["a", "b", "c"] }

Obviously, this would allow us to encode the same information without the need to develop specialized JSON parsers. Honestly, I consider this a valuable suggestion, since one of our primary ambitions in this project is to ‘get along with the neighbors’, and requiring variant parsers (or requiring that you use one of ours) is, I would argue, not a good example of getting along. However, the beauty of our approach is that the objects are not altered to allow for class hints (i.e. you can ignore the class hints and the objects would be the same). Also, doing search/replace on class hints within the parser code makes parsing and magically casting recursive objects a snap for Perl and JavaScript. These parsers are based on regular expression matching and judicious use of ‘eval’. Using the class/payload technique would require a recursive object traversal to determine whether each object has the special class field. This may or may not cause a slowdown in the JSON handling. I say it’s worth some serious thought and possibly some testing after the alpha release…
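
For the curious, the recursive traversal that the class/payload technique would need looks roughly like the following Python sketch. It is not code from our parsers; the Letters class, the CLASS_REGISTRY table, and the cast function are hypothetical names invented for the example, and I’m assuming a wrapper is any object with exactly the two keys class and payload.

```python
import json


class Letters(list):
    """Hypothetical application class selected by the class hint 'letters'."""


# Hypothetical table mapping class hints to the types they should be cast to.
CLASS_REGISTRY = {"letters": Letters}


def cast(node):
    """Walk decoded JSON and replace class/payload wrappers with typed objects."""
    if isinstance(node, list):
        return [cast(item) for item in node]
    if isinstance(node, dict):
        # Assumption: a wrapper has exactly these two keys and a string hint.
        if set(node) == {"class", "payload"} and isinstance(node["class"], str):
            payload = cast(node["payload"])  # payloads may contain wrappers too
            cls = CLASS_REGISTRY.get(node["class"])
            # Unknown hints fall back to the plain, uncast payload.
            return cls(payload) if cls is not None else payload
        return {key: cast(value) for key, value in node.items()}
    return node  # strings, numbers, booleans and null pass through untouched


decoded = json.loads('{ "class": "letters", "payload": ["a", "b", "c"] }')
result = cast(decoded)
```

Every object and array in the message gets visited once after the standard parser has finished, which is the extra pass whose cost we would want to measure.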