User Tools

Site Tools


documentation:search_idx_def_example

Example: Adding a Local Subjects (690) search index

First, every database script should perform its work inside a transaction. This is to protect the consistency of the data and avoid partial implementation of desired changes in the case of an error.

BEGIN;

Now we need to add the indexing definition:

INSERT INTO config.metabib_field (field_class, name, xpath, format, label, search_field, weight) VALUES (
  'subject',                          -- The class to which this definition will belong
  'local',                            -- The symbolic name of the definition within the class
  $$//marc:datafield[@tag="690"]$$,   -- The XPath expression which will extract the desired data from the MARCXML
  'marcxml',                          -- The format in which we will process the XPath for this definition
  'Local Subjects',                   -- The translatable label used for this definition in user interfaces
  TRUE,                               -- Mark this definition as a search-oriented index
  2                                   -- Increase the weight of this definition relative to others, which default to 1
);

Allowing the default id value to be set by the database gives us the ability to use the CURRVAL() function for the auto-incrementing sequence that fills in the id. We can use this when populating the normalization map:

INSERT INTO config.metabib_field_index_norm_map (FIELD, norm)
  SELECT  CURRVAL('config.metabib_field_id_seq'::regclass),  -- Returns the id of the indexing definition inserted above
          cin.id                                             -- The ids of the normalizers we care about
    FROM  config.index_normalizer AS cin  -- Table defining normalizer functions
    WHERE cin.name = 'NACO Normalize';    -- The normalizers we care about for this field

In order to avoid reindexing the entire bibliographic data set for all index definitions when simply adding a new definition, it is best to backfill the index table for the new definition. This is done by pulling data from the existing, flattened MARC data:

INSERT INTO metabib.subject_field_entry (FIELD, SOURCE, VALUE)
  SELECT  CURRVAL('config.metabib_field_id_seq'::regclass),  -- Returns the id of the indexing definition inserted above
          mfr.record,                                        -- The source MARC record from which the data will be extracted
          ARRAY_TO_STRING(ARRAY_ACCUM(VALUE),' ')            -- Coalesce all 690 data into a single space-separated string for initial index population
    FROM  metabib.real_full_rec AS mfr  -- The internal table holding a flattened version of the bibliographic record
    WHERE mfr.tag = '690'               -- Include only 690 tags
    GROUP BY 1, 2;

If an alias is desired for targeting exactly this field in SRU and Z39.50 searches, that can be added as well:

INSERT INTO config.metabib_search_alias (alias, field_class, FIELD) VALUES (
  'eg.subjectlocal',                                -- Alias to accept in SRU/CQL queries, and to map in Z39.50 settings
  'subject'                                         -- The search class to which the definition belongs
  CURRVAL('config.metabib_field_id_seq'::regclass)  -- Returns the id of the indexing definition inserted above
);

The only thing left to do is have the database commit the transaction wrapping all of these commands. The final script thus ends up as:

BEGIN;
 
INSERT INTO config.metabib_field (field_class, name, xpath, format, label, search_field, weight) VALUES (
  'subject',                          -- The class to which this definition will belong
  'local',                            -- The symbolic name of the definition within the class
  $$//marc:datafield[@tag="690"]$$,   -- The XPath expression which will extract the desired data from the MARCXML
  'marcxml',                          -- The format in which we will process the XPath for this definition
  'Local Subjects',                   -- The translatable label used for this definition in user interfaces
  TRUE,                               -- Mark this definition as a search-oriented index
  2                                   -- Increase the weight of this definition relative to others, which default to 1
);
 
INSERT INTO config.metabib_field_index_norm_map (FIELD, norm)
  SELECT  CURRVAL('config.metabib_field_id_seq'::regclass),  -- Returns the id of the indexing definition inserted above
          cin.id                                             -- The ids of the normalizers we care about
    FROM  config.index_normalizer AS cin  -- Table defining normalizer functions
    WHERE cin.name = 'NACO Normalize';    -- The normalizers we care about for this field
 
 
INSERT INTO metabib.subject_field_entry (FIELD, SOURCE, VALUE)
  SELECT  CURRVAL('config.metabib_field_id_seq'::regclass),  -- Returns the id of the indexing definition inserted above
          mfr.record,                                        -- The source MARC record from which the data will be extracted
          ARRAY_TO_STRING(ARRAY_ACCUM(VALUE),' ')            -- Coalesce all 690 data into a single space-separated string for initial index population
    FROM  metabib.real_full_rec AS mfr  -- The internal table holding a flattened version of the bibliographic record
    WHERE mfr.tag = '690'               -- Include only 690 tags
    GROUP BY 1, 2;
 
INSERT INTO config.metabib_search_alias (alias, field_class, FIELD) VALUES (
  'eg.subjectlocal',                                -- Alias to accept in SRU/CQL queries, and to map in Z39.50 settings
  'subject',                                        -- The search class to which the definition belongs
  CURRVAL('config.metabib_field_id_seq'::regclass)  -- Returns the id of the indexing definition inserted above
);
 
COMMIT;
documentation/search_idx_def_example.txt · Last modified: 2022/02/10 13:34 by 127.0.0.1

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International
CC Attribution-Share Alike 4.0 International Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki

© 2008-2022 GPLS and others. Evergreen is open source software, freely licensed under GNU GPLv2 or later.
The Evergreen Project is a U.S. 501(c)3 non-profit organization.