Scholarly Societies Project


Editorial, 2006, October 5:
Using JSTOR as a Source of Journal-Title Abbreviations
Background

The Inventory of the Oldest Scholarly Societies
In 1999 the Editor of the Scholarly Societies Project created a sub-project entitled the Inventory of the Oldest Scholarly Societies (Repertorium Veterrimarum Societatum Litterariarum). The purpose of the Inventory is to gather information pertaining to scholarly societies that is likely to be of special interest to historians. Pages have therefore been created for each of several hundred scholarly societies founded prior to 1850. Each page gives basic historical information, as well as an enumeration of the important journals of the society, along with abbreviations used for the journal titles.

Abbreviations Found in Journal Indexes
For the first several years of the existence of the Inventory, the abbreviations were drawn exclusively from indexes of the journal literature, such as the Reuss Repertorium and the Royal Society of London Catalogue of Scientific Papers. The abbreviations were included on the appropriate history pages, but also in a composite alphabetical index in an area entitled Abbreviations Used for the Journal Titles of Scholarly Societies.

Abbreviations Found in the Journals Themselves
In the summer of 2005, it occurred to the Editor that it might be possible to document the journal-title abbreviations used in the literature itself by using a large searchable archive like JSTOR.

Further background information may be found in The Problem of Early Journal-Title Abbreviations.

The JSTOR Archive
Scope
The JSTOR Archive is a rich repository, containing the full text of lengthy runs of hundreds of journals. The emphasis is on English-language journals, but some journals in other languages are included. Most of the runs are within the 20th century, but some runs extend into the 19th century and earlier.

Search Engine
The search engine of the JSTOR archive allows one to perform a single search over the entire set of journal articles in this immense corpus of scholarly literature. It is provided with many useful searching features, including proximity searching, which allows one to specify how close to one another the search terms are to appear.

Limitations
Perhaps the most evident limitation of this archive is that it is a fee-based service, so that users must be affiliated with an institution that has a subscription to the resource.

That said, it must be acknowledged that the JSTOR archive is truly extraordinary, its only other limitations being rather minor in nature. Among the remaining limitations:

  • The pages are not scalable; the size displayed is the largest available. This can be a problem with some journals notorious for their use of truly minuscule fonts.
  • In the results of a search, the search terms are not highlighted; hence a careful visual scan of the results is required in order to determine why the pages were retrieved.
  • The program that converted the digital images to searchable text sometimes does not recognize multiple columns of text. Hence text strings that should be contiguous are sometimes broken.
  • Text employing diacritical marks (usually accent marks) is not always well handled. For example, the é character appears to be converted to a simple e over 80% of the time, but is treated as garbage the rest of the time. At the other extreme, the ü character appears to be converted to a simple u less than 10% of the time, and is treated as garbage the rest of the time.
  • Characters in italic are sometimes poorly handled. A good example is the rather ornate capital J with a top banner to the left that is found in some fonts; this is quite often treated as garbage.
  • There are problems in distinguishing among similar characters, for example among lowercase l, uppercase I and the numeral 1.
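
Since accented characters are often reduced to their unaccented base letters in the searchable text, one might try both forms of a search term. The following is only a minimal sketch of that idea, not a description of the Editor's procedure; the function names `strip_diacritics` and `search_variants` are hypothetical, and the sketch does nothing about characters that are turned into garbage.

```python
import unicodedata


def strip_diacritics(term: str) -> str:
    """Return the term with accents removed (for example, é becomes e)."""
    # NFD decomposition splits an accented letter into a base letter
    # followed by one or more combining marks, which are then dropped.
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def search_variants(term: str) -> list[str]:
    """Return the term as written, plus its unaccented form if different."""
    plain = strip_diacritics(term)
    return [term] if plain == term else [term, plain]


if __name__ == "__main__":
    for word in ["Société", "Göttingen", "Newcastle"]:
        print(word, "->", search_variants(word))
```

A term without diacritics, such as Newcastle, yields a single variant, so no redundant searches are produced.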
Searching the JSTOR Archive for Abbreviations
Searching for Abbreviations of a Specific Journal
In order to use the JSTOR archive to locate journal-title abbreviations used for a specific journal of a scholarly society, the Editor first creates a sequence of search expressions that appear likely to result in a relatively complete search of the archive. The steps involved in this process are described below.

Steps in Constructing the Search Expressions
There are four steps in creating the sequence of search expressions: (1) identifying a minimal set of critical words in the journal title; (2) coming up with abbreviated versions of those words; (3) setting an adjacency value, that is, determining how closely these word fragments need to occur together; and (4) coming up with the final sequence of search expressions by taking all reasonable combinations of the abbreviated words and applying suitable adjacency operators to specify how closely together the terms should occur.

An Example
Here is an example of a sequence of searches appropriate for the journals of the Society of Antiquaries of Newcastle upon Tyne, with the number of hits found on 2006, July 9:
"Soc Antiquaries Newcastle" ~2 = 10
"Soc Antiquar Newcastle" ~2 = 0
"Soc Antiqu Newcastle" ~2 = 2
"Soc Antiq Newcastle" ~2 = 10
"Soc Antt Newcastle" ~2 = 0
"Soc Ant Newcastle" ~2 = 8
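
The fourth step, combining the abbreviated words and applying an adjacency operator, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `build_search_expressions` is hypothetical, the `~N` suffix simply mirrors the notation of the searches listed above, and the sketch only builds the search strings; it does not contact JSTOR or count hits.

```python
from itertools import product


def build_search_expressions(word_variants, adjacency):
    """Build one proximity search expression per combination of variants.

    word_variants is a list with one entry per critical word in the
    journal title; each entry lists the abbreviated forms to try.
    adjacency is the value written after the tilde.
    """
    expressions = []
    for combination in product(*word_variants):
        phrase = " ".join(combination)
        expressions.append(f'"{phrase}" ~{adjacency}')
    return expressions


if __name__ == "__main__":
    # Critical words and their abbreviated forms, taken from the example above.
    variants = [
        ["Soc"],
        ["Antiquaries", "Antiquar", "Antiqu", "Antiq", "Antt", "Ant"],
        ["Newcastle"],
    ]
    for expression in build_search_expressions(variants, adjacency=2):
        print(expression)
```

With the variants shown, the sketch reproduces the six search expressions of the example. When several words each have several abbreviated forms, the number of combinations grows multiplicatively, which is why the steps above call for a minimal set of critical words and only reasonable combinations.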
Processing the JSTOR Search Results
Assessing the Search Results
The results of a search consist of a set of pages that match the search criteria. One must take each page in turn and examine it to locate the matching passage. Once the match is found, one needs to ask: Is this a journal-title abbreviation? One also needs to ask: Does this correspond to one of the society's journals? For example, one needs to check whether the volume and year designations are consistent with the cataloguing data.

Recording and Verifying Data
Once one is certain that an abbreviation corresponds to a particular journal, one must document the information. Since JSTOR journal-page images do not support a copy function, the abbreviation and the year and volume cited must be manually transcribed. The bibliographic information about the citing source, however, does support a copy function, so that data may be recorded reliably.

Because the transcription is a manual operation, it must be verified; this is done by copying the transcribed string and sending it back through the search engine as an exact phrase.

Installing the Data
Once the correctness of the data has been verified, it is then made publicly available. At the time of writing, this is done by adding the abbreviation data to the society's history page in the area reserved for the journal in question. Taking our example above, the abbreviation data is given in the Society of Antiquaries of Newcastle upon Tyne history page. The abbreviation data is also added to the composite index of journal-title abbreviations (maintained manually) in the appropriate file. In the future it is expected that the composite index will be generated from the data in the history pages.
Assessing JSTOR as a Source of Journal-Title Abbreviations
The JSTOR Search Engine is Less than Perfect
As noted above, the JSTOR search engine is not without certain problems. It should especially be noted that its poor handling of diacritical marks and its frequent misinterpretation of italic text conspire to reduce the effectiveness of the search engine.

JSTOR is Nonetheless Stunningly Useful
The usefulness of JSTOR in this application, and its potential for usefulness in other applications, stems from two factors. First, it is a very large archive that is both broad, in that it covers many academic subjects, and deep, in that it includes journal runs that cover a considerable time period.

Second, it has a search engine that allows one to search the entire archive in one search. Although the search engine has a few problems, it is nonetheless a very powerful tool, with many useful search options. Taking all things into consideration, the problems with the search engine pale by comparison with the merits of both the richness of the archive and the considerable power of the search engine.

It needs to be emphasized that without such a tool, it would have been impossible to assemble so large a collection of journal-title abbreviations directly from the journal literature itself.

Similar Sources of Journal-Title Abbreviations
Why a Tool like JSTOR is Ideal for the Task
As mentioned above, what makes JSTOR so fruitful a source of journal-title abbreviations is that it satisfies two criteria. First, it is a very large archive. Second, it has a search engine that allows one to search the entire archive in one search.

And, Sadly, Some Others are Not

Two candidates that satisfy the first criterion, namely being very large archives, are Gallica and the Göttinger DigitalisierungsZentrum. Either of them would be an ideal source to search, were it not for the fact that neither of them has an associated search engine.


Published 2006, October 5
Jim Parrott, Editor
Scholarly Societies Project, and
Repertorium Veterrimarum Societatum Litterariarum