a peek at implementing a feature in Evergreen


In PINES, catalogers work mostly through OCLC, so the functionality for directly creating local MARC records was always being pushed low on my priority list. But today is their lucky day, and perhaps an unlucky day for you, because I’m going to share some of the process. 😀

Warning: If you’re not a software developer, this blog entry will likely bore you to tears, and if you are a software developer, it’s still likely. Reader beware.

Disclaimer: I’m the staff client developer, so my perspective is focused on only a small part of Evergreen as a whole, and I shamelessly embrace all the abstractions Mike (database guru) and Bill (middle layer guru) afford me with their code. I also shamelessly peek behind (not break) the abstractions to see how stuff works, so that I don’t have to bug them with so many questions.

The requirements for this function are pretty simple. Take a handful of MARC files from Bin, PINES’ head cataloger, and offer them as templates to cataloging staff through the staff client program. Staff will edit a template as a new record, and when they’re finished it will receive an auto-generated title control number. A lot of the functionality needed is already in place, and the interface is stubbed out in the menu system, so hopefully this will be a short post. We’ll see. 😉

So I’m looking at a file Bin gave me for a “K-level” book template, and it’s in binary MARC21. I know Bill has created an API somewhere for retrieving MARC templates; I’m just not sure where. I suspect it’s in the OpenSRF application called “cat”, so I go to a nifty introspection interface here: http://dev.gapines.org/opac/extras/docgen.xsl

There I enter “open-ils.cat” for the Application and “template” for the method name (without the quotes), and I get a hit:

open-ils.cat.biblio.marc_template.retrieve
API Level: 1
Package: OpenILS::Application::Cat
Packaged Method: retrieve_marc_template
Required argument count: 0
Streaming method: No
Notes:
Returns a MARC 'record tree' based on a set of pre-defined templates. 	Templates include : book

Now, I suspect the documentation is off for this method, and the reference to “record tree” really scares me, as it brings back memories of an early object Mike came up with called Biblio-Record-Node, a collection of which could represent MARC XML. I used this with the original MARC Editor prototype. What I’m hoping for is MARC21Slim, which I know is what our new MARC Editor likes to eat. I also need to figure out how to get the templates into Evergreen, so I go look at the source for this method, in ILS/Open-ILS/src/perlmods/OpenILS/Application/Cat.pm

The juicy bit is here:

__PACKAGE__->register_method(
    method  => "retrieve_marc_template",
    api_name    => "open-ils.cat.biblio.marc_template.retrieve",
    notes       => <<"  NOTES");
    Returns a MARC 'record tree' based on a set of pre-defined templates.
    Templates include : book
    NOTES

sub retrieve_marc_template {
    my( $self, $client, $type ) = @_;

    return $marctemplates{$type} if defined($marctemplates{$type});
    $marctemplates{$type} = _load_marc_template($type);
    return $marctemplates{$type};
}

sub _load_marc_template {
    my $type = shift;

    if(!$conf) { $conf = OpenSRF::Utils::SettingsClient->new; }

    my $template = $conf->config_value(
        "apps", "open-ils.cat","app_settings", "marctemplates", $type );
    warn "Opening template file $template\n";

    open( F, $template ) or
        throw OpenSRF::EX::ERROR ("Unable to open MARC template file: $template : $@");

    my @xml = <F>;
    close(F);
    my $xml = join('', @xml);

    return XML::LibXML->new->parse_string($xml)->documentElement->toString;
}

The use of OpenSRF::Utils::SettingsClient tells me that the settings service is involved, so I know I have to find the configuration file for that. The XML::LibXML call tells me that the middle layer expects my template file to already be in XML.
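
The caching in retrieve_marc_template is worth a second look: each template is loaded once and afterwards served from the %marctemplates hash. Here is a minimal JavaScript sketch of that same memoization pattern. The names makeTemplateCache and loadTemplate are made up for illustration and are not Evergreen APIs, and the fake loader stands in for the real file read.

```javascript
// Sketch of the memoization pattern seen in retrieve_marc_template.
// makeTemplateCache and loadTemplate are hypothetical names, not Evergreen APIs.
function makeTemplateCache(loadTemplate) {
    const cache = new Map(); // plays the role of %marctemplates

    return function retrieve(type) {
        // Only call the (slow) loader the first time a type is requested.
        if (!cache.has(type)) {
            cache.set(type, loadTemplate(type));
        }
        return cache.get(type);
    };
}

// Usage with a fake loader that counts how often it is called.
let loads = 0;
const retrieve = makeTemplateCache(function (type) {
    loads += 1;
    return '<record><!-- ' + type + ' template --></record>';
});

retrieve('book');
retrieve('book');
console.log(loads); // 1
```

If the Perl cache really does live for the life of the process, as the code suggests, an edited template file would presumably not be picked up until the service is restarted.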

Hrmm, so I know that on our development server everything starts with /openils/conf/bootstrap.conf, and inside that file I see the line “settings_config = /openils/conf/openils.xml”. That’s what the settings service uses. I look in that file, search for “template”, and find the <open-ils.cat> element:

            <open-ils.cat>
                <keepalive>5</keepalive>
                <stateless>1</stateless>
                <language>perl</language>
                <implementation>OpenILS::Application::Cat</implementation>
                <max_requests>199</max_requests>

                <unix_config>
                    <unix_sock>open-ils.cat_unix.sock</unix_sock>
                    <unix_pid>open-ils.cat_unix.pid</unix_pid>
                    <max_requests>1000</max_requests>
                    <unix_log>open-ils.cat_unix.log</unix_log>
                    <min_children>5</min_children>
                    <max_children>25</max_children>
                    <min_spare_children>2</min_spare_children>
                    <max_spare_children>5</max_spare_children>
                </unix_config>

                <app_settings>
                    <marctemplates>
                        <book>/openils/var/templates/marc/book.xml</book>
                    </marctemplates>
                </app_settings>

            </open-ils.cat>

Awesome, so now I know where to put the template files and how to advertise them. Once I modify the openils.xml file, I could call

su - opensrf -c '/openils/bin/osrf_ctl.sh -d /tmp/ -p /openils/conf/bootstrap.conf -c /openils/conf/opensrf_core.xml -a restart_all'

to make things register.

But first I need to convert this binary MARC21 file to MARC21Slim. I suspect Mike has a nifty perl one-liner for this, and as I ask him, I notice we have a marc2xml program on the dev server. I have no idea where it came from, but being an opportunist… The resulting file has a <collection> element as the root node, but I change it such that it starts with <record>, which is what the MARC Editor expects. Now, Bin wasn’t able to give me a totally stripped record, so I throw the XML into Google Documents & Spreadsheets and ask Mike to sanity check for me. We rip out some of the control tags and some 99X tags, and there we have it.
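
For the curious, the <collection>-to-<record> step could be scripted instead of done by hand. The following is only a rough sketch under some loud assumptions: the converter output uses unprefixed element names, there is at least one record, and a regular expression is good enough for a one-off cleanup (it is not a real XML parser). The function name unwrapFirstRecord is hypothetical. It also re-declares the MARC21 slim namespace on <record> when it was declared only on the <collection> wrapper, since dropping the wrapper would otherwise drop the namespace too.

```javascript
// Rough sketch: pull the first <record> out of a <collection> wrapper.
// Assumes unprefixed MARCXML element names; regex-based, not a real XML parser.
const MARC_NS = 'http://www.loc.gov/MARC21/slim';

function unwrapFirstRecord(xml) {
    // MARC records do not nest, so a non-greedy match up to the first
    // closing tag is enough to isolate one record.
    const match = xml.match(/<record\b[^>]*>[\s\S]*?<\/record>/);
    if (!match) {
        throw new Error('No <record> element found');
    }
    let record = match[0];

    // If the namespace was declared only on <collection>, carry it over.
    if (!/^<record\b[^>]*\bxmlns=/.test(record)) {
        record = record.replace(/^<record\b/, '<record xmlns="' + MARC_NS + '"');
    }
    return record;
}

// Usage
const wrapped =
    '<collection xmlns="' + MARC_NS + '">' +
    '<record><leader>00620cam a2200205Ia 4500</leader></record>' +
    '</collection>';
console.log(unwrapFirstRecord(wrapped));
```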

I add <k_book>/openils/var/templates/marc/k_book.xml</k_book> to the openils.xml file, restart the services, and test in a command-line program called srfsh:

pines@app07:/openils/conf$ srfsh
srfsh# request open-ils.cat open-ils.cat.biblio.marc_template.retrieve "k_book"

Received Data: "<record xmlns=\"http://www.loc.gov/MARC21/slim\"><leader>00620cam a2200205Ia 4500</leader
><controlfield tag=\"008\">060202s2006    xxua          000 1 eng d</controlfield><datafield tag=\"010\"
ind1=\" \" ind2=\" \"><subfield code=\"a\">LCCN number</subfield></datafield><datafield tag=\"020\" ind1=
\" \" ind2=\" \"><subfield code=\"a\">ISBN number</subfield></datafield><datafield tag=\"100\" ind1=\"1\"
 ind2=\" \"><subfield code=\"a\">Main Entry.Personal Name</subfield></datafield><datafield tag=\"245\" in
d1=\"1\" ind2=\"0\"><subfield code=\"a\">Title Statement</subfield><subfield code=\"b\">Remainder of titl
e</subfield><subfield code=\"c\">Statement of responsibility,
 etc.</subfield></datafield><datafield tag=\"260\" ind1=\" \" ind2=\" \"><subfield code=\"a\">Place of pu
blication,
 distribution,
 etc.</subfield><subfield code=\"b\">Name of publisher,
 distributor,
 etc.</subfield><subfield code=\"c\">Date of publication,
 distribution,
 etc.</subfield></datafield><datafield tag=\"300\" ind1=\" \" ind2=\" \"><subfield code=\"a\">Physical De
scription; extent</subfield><subfield code=\"b\">Other physical details</subfield><subfield code=\"c\">Di
mensions</subfield></datafield><datafield tag=\"500\" ind1=\" \" ind2=\" \"><subfield code=\"a\">General
Note</subfield></datafield><datafield tag=\"650\" ind1=\" \" ind2=\"0\"><subfield code=\"a\">Subject Adde
d Entry.Topical term or geographic name as entry element</subfield><subfield code=\"v\">Form subdivision<
/subfield></datafield><datafield tag=\"650\" ind1=\" \" ind2=\"0\"><subfield code=\"a\">Subject Added Ent
ry.Topical term or geographic name as entry element</subfield></datafield></record>"

------------------------------------
Request Completed Successfully
Request Time in seconds: 0.012046
------------------------------------
srfsh#

Now, being a staff client developer and on the wrong side of the firewall, so to speak, my code doesn’t have access to sensitive internal applications (and their associated APIs), so I can’t directly query the open-ils.settings service for a list of retrievable MARC templates. But I don’t want to hardcode these template names into the staff client either, so I ask Bill politely and he whips up a wrapper method for me:

srfsh# request open-ils.cat open-ils.cat.marc_template.types.retrieve

Received Data: [
   "k_book",
   "book"
]

Excellent. Now I’m ready to start prototyping some XUL. First, I add these method names to ILS/Open-ILS/xul/staff_client/chrome/content/main/constants.js, giving them these mnemonic names: MARC_XML_TEMPLATE_RETRIEVE and MARC_XML_TEMPLATE_LIST. Sometimes I suspect this is needless abstraction, because 3 months down the road I’ll be debugging this and Bill will ask me what method I’m calling, and I’ll see MARC_XML_TEMPLATE_RETRIEVE, which means nothing to him, so I’ll have to look at that constants.js file. But other times it’s great, since I can try out alternate methods by changing just one file.
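
To make the trade-off concrete, here is a small sketch of the mnemonic-to-method indirection. The shape of the table (an app plus a method per entry) and the helper names resolveMethod and describeCall are assumptions for illustration, not a copy of the real constants.js. The describeCall helper is one possible answer to the debugging complaint: log the mnemonic and the real method name together.

```javascript
// Hypothetical sketch of a constants table mapping mnemonics to API methods.
// The entry shape and helper names are assumptions, not the real constants.js.
const api = {
    MARC_XML_TEMPLATE_RETRIEVE: {
        app: 'open-ils.cat',
        method: 'open-ils.cat.biblio.marc_template.retrieve'
    },
    MARC_XML_TEMPLATE_LIST: {
        app: 'open-ils.cat',
        method: 'open-ils.cat.marc_template.types.retrieve'
    }
};

// Look up a mnemonic, failing loudly on typos instead of sending a bad request.
function resolveMethod(name) {
    if (!Object.prototype.hasOwnProperty.call(api, name)) {
        throw new Error('Unknown API constant: ' + name);
    }
    return api[name];
}

// Produce a log line that shows both the mnemonic and the real method name.
function describeCall(name) {
    const entry = resolveMethod(name);
    return name + ' -> ' + entry.app + ' ' + entry.method;
}

console.log(describeCall('MARC_XML_TEMPLATE_LIST'));
```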

The XUL file I’m working with is ILS/Open-ILS/xul/staff_client/server/cat/marc_new.xul. Now, I’ve talked about XUL before, and I really like it. This site explains why you too might like XUL: http://www.xulplanet.com/tutorials/whyxul.html

Here’s what I’m starting with for my prototype:

    <vbox flex="1">
        <hbox id="actions">
            <hbox id="menu_placeholder" />
            <button id="load" label="Load" accesskey="L"/>
        </hbox>
        <iframe id="marc_editor" flex="1"/>
    </vbox>

That’s pretty straightforward. If I wanted to do things Right(tm), I would change some of those attributes for the <button> element to use DTD entities so I could localize them, but I’m lazy and still have hopes of an intern doing that for me.

The next step is to add some Javascript: pull in some needed libraries for networking, etc., retrieve the list of templates, and populate a menu for template selection.

That will look like this:

JSAN.use('util.error'); g.error = new util.error();
JSAN.use('util.network'); g.network = new util.network();
JSAN.use('util.widgets');
JSAN.use('util.functional');

var templates = g.network.simple_request('MARC_XML_TEMPLATE_LIST');
if (typeof templates.ilsevent != 'undefined') throw(templates);
var ml = util.widgets.make_menulist(
    util.functional.map_list(
        templates,
        function(el) {
            return [ el /* The menu entry label */, el /* The menu entry value */ ];
        }
    )
);
$('menu_placeholder').appendChild(ml);

When rendered, that ends up producing this:

[Screenshot: Prototype of New MARC Interface]

So far, so good. Now, let’s attach some behavior to the Load button:

$('load').addEventListener(
	'command',
	function(ev) {

		var template_name;
		try {

			template_name = $('menu_placeholder').firstChild.value;
			var marc = g.network.simple_request(
				'MARC_XML_TEMPLATE_RETRIEVE',
				[ template_name ]
			);
			if (typeof marc.ilsevent != 'undefined') throw(marc);

			var url = urls.XUL_MARC_EDIT;
			var params = {
				'record' : { 'marc' : marc },
				'save' : {
					'label' : 'Create Record',
					'func' : function(new_marcxml) {
						alert('FIXME - put in call to save the marc');
					}
				}
			};
			$('marc_editor').setAttribute('src',url);
			netscape.security.PrivilegeManager.enablePrivilege("UniversalXPConnect");
			$('marc_editor').contentWindow.xulG = params;

		} catch(E) {
			g.error.standard_unexpected_error_alert(
				'Error loading MARC template: ' + template_name,
				E
			);
		}

	},
	false
);

What we’re doing is grabbing the MARC template ourselves based on the value of the menu widget, and shoving it into the window scope of the iframe containing the Editor interface, using an expected “xulG” placeholder and data structure. We’re also defining a callback in this structure for handling the saving of the soon to be modified record. All this MARC Editor does is edit MARC.

Now, I did cheat here a bit and looked at how I invoked the MARC Editor in other interfaces, and I also got it wrong on the first go, which took a moment to debug. I had used “‘record’ : marc” instead of “‘record’ : { ‘marc’ : marc }”. Mike wrote this MARC Editor, and it can take such parameters as “‘record’ : { ‘url’ : ‘/opac/extras/supercat/retrieve/marcxml/record/’ + docid }” if we needed it to.
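
A mistake like that is easy to catch early with a tiny guard around the parameter structure. This is only a sketch: buildEditorParams is a hypothetical helper, and it knows about just the two “record” shapes mentioned in this post (“marc” and “url”); the editor may well accept others.

```javascript
// Hypothetical helper: builds the xulG structure handed to the MARC Editor
// and rejects the bare-string mistake ('record' : marc) described above.
// Only the 'marc' and 'url' shapes from this post are recognized.
function buildEditorParams(record, saveLabel, onSave) {
    const hasMarc = Boolean(record) && typeof record.marc === 'string';
    const hasUrl = Boolean(record) && typeof record.url === 'string';
    if (!hasMarc && !hasUrl) {
        throw new TypeError("'record' must be an object with a 'marc' or 'url' string");
    }
    if (typeof onSave !== 'function') {
        throw new TypeError("'save.func' must be a function");
    }
    return {
        record: record,
        save: { label: saveLabel, func: onSave }
    };
}

// Usage: the correct shape passes, the bare string would throw.
const params = buildEditorParams(
    { marc: '<record xmlns="http://www.loc.gov/MARC21/slim"/>' },
    'Create Record',
    function (new_marcxml) { /* save the record here */ }
);
console.log(params.save.label);
```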

[Screenshot: Demonstrating the MARC Template prototype]

Now, my plan is to replace the alert stub with a remote method call for saving the record. I’m just not sure if there is an appropriate method yet.

[Screenshot: new marc prototype stub]

Looking at previous uses of the MARC Editor, I have a few candidates:

MARC_XML_RECORD_UPDATE, which is open-ils.cat.biblio.record.xml.update, and takes 3 parameters: the authtoken, the bibliorecord id, and the new MARC XML for the record. So that one is probably out, since we don’t have a record id yet.

MARC_XML_RECORD_IMPORT, which is open-ils.cat.biblio.record.xml.import and takes 3 parameters: the authtoken, the new MARC XML, and something that is either a label for the “source” of the TCN (such as OCLC), or the name of a PINES-defined z39.50 service (such as OCLC); I’m not sure which yet, or if they are actually different in practice. This method is used with EG’s z39.50 client for importing new records, which makes it the most likely candidate so far.

MARC_XML_RECORD_REPLACE, which is open-ils.cat.biblio.record.marc.replace. This is also used with the z39.50 client, but for overlaying an existing record.

And now I notice one more candidate in my constants file, which I haven’t actually used before:

MARC_XML_RECORD_CREATE, which is open-ils.cat.biblio.record.xml.create. I’ll have to introspect or look at the source for that. Bill probably gave me this a long time ago and it’s also a strong candidate.

So back to the docgen interface with a looser search.

Hrmm, with .import, the documentation says “Takes a marcxml record and imports the record into the database. In this case, the marcxml record is assumed to be a complete record (i.e. valid MARC). The title control number is taken from (whichever comes first) tags 001, 039[ab], 020a, 022a, 010, 035a and whichever does not already exist in the database.”
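
One way to read that documentation is as a “first candidate that is not already taken” rule. The sketch below illustrates that reading only. It is not the actual _find_tcn_info logic; the flattened field keys (such as '020a'), the split of 039[ab] into two candidates, and the pickTcn name are all assumptions.

```javascript
// Illustration of one reading of the documented rule, NOT the real
// _find_tcn_info: walk the candidates in order and take the first value
// that is present and not already used as a TCN.
// The flattened keys and the 039[ab] split are assumptions.
const TCN_CANDIDATE_ORDER = ['001', '039a', '039b', '020a', '022a', '010', '035a'];

function pickTcn(fields, existingTcns) {
    for (const key of TCN_CANDIDATE_ORDER) {
        const value = fields[key];
        if (value && !existingTcns.has(value)) {
            return { tcn: value, source: key };
        }
    }
    return null; // nothing usable found in the record
}

// Usage: the 001 value is already taken, so the ISBN is picked instead.
const picked = pickTcn(
    { '001': 'ocm12345', '020a': '0123456789' },
    new Set(['ocm12345'])
);
console.log(picked); // { tcn: '0123456789', source: '020a' }
```

Under this reading, a record made from the K-level template above, with its control number fields stripped and a placeholder left in 020a, could end up with “ISBN number” as its TCN.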

Hrmm, so that’s not good. I don’t want to generate TCNs on the client side, and we don’t want catalogers making up TCNs either (so we need to make sure we prevent that in the long run).

Looking at the source in Cat.pm, I see this in the import method:

( $tcn, $tcn_source, $marcdoc, $evt ) = _find_tcn_info($session, $xml, $override);

Those values then go on to get stored in a Biblio Record Entry object that gets passed to the database. Now, I could muck up Bill’s code and create and publish another import function based on this one, which ignores any TCNs in the MARC and instead uses a kludgy auto-generated TCN (based on timestamps and session keys, perhaps). But I’m betting we’d be better served with the TCN being based on a sequence in the database, and that’s deeper than I ever go. So at this point I’m going to punt and pow-wow with Bill (the middle layer being his realm), and see if I can get a usable method out of him.

From IRC:

12:05 < phasefx_> berick: can I steal you for a moment (or three)?
12:05 < berick> phasefx_: what's up?
12:05 < phasefx_> this method: open-ils.cat.biblio.record.xml.import
12:05 < phasefx_> looks in the marcxml for a likely TCN value to use
12:06 < berick> right
12:06 < phasefx_> I need a method for new marc creation, that ignores any TCN values in the MARC and instead uses an
                  auto-generated TCN (perhaps from a sequence in the database)
12:07 < miker_> phasefx_/berick: the DB will supply you a tcn if you don't give it one
12:07 < phasefx_> sweet
12:07 < miker_> and leave tcn_sourc and tcn_value as NULL
12:07 < berick> ok, so we just need a method flag that forces blank tcn data on import?
12:07 < berick> method flag on open-ils.cat.biblio.record.xml.import, that is
12:07 < phasefx_> that sounds good to me
12:09 < berick> phasefx_: do you ever use open-ils.cat.biblio.record.xml.create ?
12:09 < phasefx_> berick: I don't, but it is in my constants file.  Looks to be a wrapper for import?
12:09 < berick> it is
12:09 < berick> just curious why it exists :)
12:09 < phasefx_> I have no idea :D
12:10 < berick> phasefx_: when you call open-ils.cat.biblio.record.xml.create, do you pass in a "source"  (3rd argument)?
12:11 < phasefx_> you mean .import?  I believe I'm passing in the service name we define for z39.50
12:11 < berick> yes, sorry
12:11 < miker_> source is different than tcn_source, btw
12:11 < miker_> and if you'd set it that'd be dandy :)
12:11 < berick> it's set if phasefx_ passes in a source name
12:12 < miker_> it comes from config.bib_source
12:12 < berick> that's all taken care of :
12:12 < miker_> um ... it's an integer
12:12 < berick> :)
12:12 < miker_> ok
12:12 < berick> i know
12:12 < berick> i translate
12:12 < miker_> how does phasefx_ know what the valid values are?
12:13 < phasefx_> I only use .import for the z39.50 interface, and I query the services available for that
12:13 < miker_> phasefx: that's different
12:13 < miker_> entirely
12:14 < phasefx_> I could have been misusing Bill's method all this time
12:15 < miker_> well, see, the z stuff is a (convention-based) subset of bib_source
12:15 < miker_> so, that /could/ be used (and I guess is)
12:16 < berick> phasefx_: when you call open-ils.cat.biblio.record.xml.import, do you pass in 2 or 3 args?
12:16 < miker_> the main thing is that "new marc" should use the "System Local" source
12:16 < phasefx_> var r = obj.network.simple_request('MARC_XML_RECORD_IMPORT', [ ses(), new_marcxml, obj.current_service ]);
12:16 < phasefx_> miker_: or have it default.. do you think we would ever define multiple local sources?
12:17 < phasefx_> berick: 3 args
12:17 < berick> what defines obj.current_service ?
12:17 < miker_> phasefx: certainly (maybe not pines, but ...)
12:17 < phasefx_> berick: open-ils.search.z3950.retrieve_services
12:17 < berick> I don't think there is an interface for fetching the source types... are they hard-coded in the SC?
12:17 < miker_> sources define transcendance (visible without copies) and indicate quality of the source
12:18 < miker_> so, say PINES bought an e-book subscription for the whole consortium
12:18 < miker_> we'd create a "PINES E-Book" source, and make it transcendant
12:18 < miker_> but it would probably be a local source
12:19 < miker_> aka, not oclc
12:19 < berick> phasefx_: ahh, yeah, that's wrong :)
12:19 < phasefx_> so we should probably build that in up front with New MARC.. selection of local sources
12:19 < phasefx_> berick: you gotta love wrong but works :)
12:20 < miker_> phasefx_: probably ... the convetion right now is for Z sources to match bib_sources on the name
12:20 < phasefx_> do we need to batch correct the sources for all the z39.50 imported records since go-live?
12:20 < berick> phasefx_: well, it fails gracefully.. "works".. no so much ;)
12:20 < miker_> phasefx_: it's not wrong, it's just ... fragile ;)
12:20 < berick> ahh, yeah, so for 'oclc', it does probably work by accident ;)
12:20 < miker_> we'd need to add a new bib_source if/when we add a new z server
12:20 < phasefx_> miker_: I hear you.  This is going to be some good stuff for my blog entry :)
12:21 < miker_> berick: no, it's not by accident. I lower cased it to make it work
12:21 < miker_> I think it's fine ... it's a "works if you need it to" addition
12:22 < miker_> I only hurt a little when I think about the lack of ref-integrity ;)
12:23 < berick> ok, fix : i provide api call for fetching the in-database bib sources.  phasefx_ passes a source name to me
                (from that list) depending on where the marc is coming from.  sound sane?
12:24 < berick> this means the source types will have "special" meaning in the staff client
12:24 < phasefx_> do I change anything with z39.50 for the timebeing, or should we just stick with new marc for now?
12:25 < berick> phasefx_: ideally, the third param to MARC_XML_RECORD_IMPORT will come from the bib-sources in the DB,
                otherwise it's not really "correct", but it will work with the current setup
12:25 < phasefx_> berick: alright, I'll bug the z39.50 interface in bugzilla so I won't forget
12:29 < miker_> phasefx_: we should probably add a flag for "source type" so we can specify z or local or other (oai, unapi,
                etc)
12:29 < miker_> phasefx_: so don't change the z interface just yet :)
12:30 < phasefx_> miker_: sure thing.  If it works, break it slowly. ;)
12:30 < miker_> heh.. right
12:33 < berick> phasefx_: open-ils.cat.bib_sources.retrieve.all   -- also, the 4th param to MARC_XML_RECORD_IMPORT, if true,
                will force a blank TCN/TCN-source
12:33 < berick> dev is updated
12:33 < phasefx_> berick++

Now, that was an honest peek at our development process. 🙂

So now I open a bug for the z39.50 interface, and one for what I’m about to do with the New MARC interface (I’m going to hardcode “System Local” as a bib source for now).

And this is what I do for the new save callback:

'func' : function(new_marcxml) {
    try {
        var robj = g.network.simple_request(
            'MARC_XML_RECORD_IMPORT',
            [ ses(), new_marcxml, 'System Local', 1 ]
        );
        if (typeof robj.ilsevent != 'undefined') throw(robj);
        alert('Record created.');

        /* Replace tab with OPAC-view of record */

        var opac_url = xulG.url_prefix( urls.opac_rdetail ) + '?r=' + robj.id();
        var content_params = {
            'session' : ses(),
            'authtime' : ses('authtime'),
            'opac_url' : opac_url,
        };
        xulG.set_tab(
            xulG.url_prefix(urls.XUL_OPAC_WRAPPER),
            {'tab_name':'Retrieving title...'},
            content_params
        );

    } catch(E) {
        g.error.standard_unexpected_error_alert(
            'Error creating MARC record.', E
        );
    }
}

And voilà! It’s not perfect, but it works. For perfection, we’ll need to do something to stop folks from creating records where 245a = “Title Statement”, etc. But we can admonish them for the time being and save that for another day.
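
As a starting point for that other day, one possible approach is to compare the subfield text of the edited record against the template and flag anything left untouched. This is a sketch of an idea, not something that exists in Evergreen: the function names are hypothetical, and it uses a regular expression rather than a real XML parser, so it assumes unprefixed <subfield> elements with plain text content.

```javascript
// Sketch of a placeholder check: which subfield values in the edited record
// are still identical to values from the template?
// Hypothetical helpers; regex-based, assumes unprefixed <subfield> elements.
function subfieldValues(marcxml) {
    const values = [];
    const re = /<subfield\b[^>]*>([^<]*)<\/subfield>/g;
    let m;
    while ((m = re.exec(marcxml)) !== null) {
        values.push(m[1].trim());
    }
    return values;
}

function unchangedPlaceholders(templateXml, editedXml) {
    // Ignore empty subfields so blank template slots are not reported.
    const placeholders = new Set(
        subfieldValues(templateXml).filter(function (v) { return v !== ''; })
    );
    return subfieldValues(editedXml).filter(function (v) {
        return placeholders.has(v);
    });
}

// Usage: the title was edited, the statement of responsibility was not.
const templateXml =
    '<record><datafield tag="245" ind1="1" ind2="0">' +
    '<subfield code="a">Title Statement</subfield>' +
    '<subfield code="c">Statement of responsibility, etc.</subfield>' +
    '</datafield></record>';
const editedXml = templateXml.replace('Title Statement', 'Moby Dick');
console.log(unchangedPlaceholders(templateXml, editedXml));
```

The save callback could refuse to call the import method, or at least warn, whenever that list is not empty.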

This concludes our peek at implementing a feature in Evergreen. I hope you haven’t gone blind.

[Screenshot: new marc prototype success]

— Jason