Last Updated: 2008, December 4

Scholarly Societies Project

The URL-Stability Index

Introduction

During 1996 we noticed that a significant number of the URLs in the Project had changed. By the end of March 1997 we had completed a review of every site in the Scholarly Societies Project, correcting more than 200 URLs in the process.

One encouraging trend is that, in the latter part of 1996, a significant number of the URL changes of which we are aware were to a format that might be called canonical domain-name format. We describe this concept in the section below.

Domain Names & Standard Formats for URLs
Domain Name
A domain name is a "permanent" address assigned to an institution or group. For example, the domain name for the IEEE is ieee.org. The institution or group must apply to an appropriate domain-name registrar to obtain a domain name. As part of the application, it specifies the Internet Protocol (IP) addresses that handle email for the domain name and that serve the website at the corresponding URL. If an IP address later changes, the URL will still work correctly, provided the institution or group ensures that the address is updated in the appropriate Domain Name Servers. The system administrator for the site will know how to do this. For more information on domain names, see the Technical Considerations section of the Recommendations.

Standard Domain-Name Format for the URL
There is a standard, or canonical, way of forming the URL for the website of a group that has already obtained a domain name. The format is: http://www.domain_name/. For example, the canonical domain-name format for the IEEE website would be http://www.ieee.org/, and that is indeed what the IEEE has chosen as the URL for its site. Because the IEEE follows this canonical format, its URL can be considered stable for the time being. Here are some non-canonical ways in which the IEEE could have (but did not) formed its site's URL:
  • http://www.ieee.org/welcome.html
  • http://www.ieee.org/ieee/
  • http://ieee.org/
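
The distinction between canonical and non-canonical forms can be checked mechanically. The following Python sketch is an illustration only, not part of the Project's own tooling. The function name is_canonical is hypothetical, and accepting a URL with no trailing slash as equivalent to the canonical form is an assumption.

```python
from urllib.parse import urlsplit


def is_canonical(url: str, domain_name: str) -> bool:
    """Return True if url has the form http://www.<domain_name>/.

    Assumption: a missing trailing slash is treated as canonical too.
    """
    parts = urlsplit(url)
    return (
        parts.scheme == "http"
        # The host must be exactly "www." followed by the domain name.
        and parts.netloc.lower() == "www." + domain_name.lower()
        # No file name or directory may follow the host.
        and parts.path in ("", "/")
        and not parts.query
        and not parts.fragment
    )
```

Under this sketch, http://www.ieee.org/ is accepted for the domain name ieee.org, while each of the three non-canonical examples listed above is rejected.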

Stability Index For URLs

We have defined a measure of the stability of URLs in the Scholarly Societies Project using the technique described below. We assign to each site the following stability index:

1.0 if the URL of the website is in canonical domain-name format.
We assign the value 1.0, since there is a reasonably good chance that the URL will remain the same.
0.5 if the URL of the website is in non-canonical domain-name format.
We assign a value of less than 1.0, since the URL is very likely to change to canonical form. We assign a value significantly above 0, since we can predict the final URL.
0.2 if the email address includes the domain name, but the domain name is not part of the URL.
We assign a value somewhat above 0, since the eventual URL may be predictable from the domain name in the email address.
0 if the URL has no domain name as part of it.
We assign the value 0 here, since the URL is very likely to change over the next few years: either the host machine or the pathway on the host machine is likely to change with time.
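
The four rules above can be sketched as a small Python function. This is a minimal illustration under stated assumptions, not the procedure the Project actually used: the function and parameter names are hypothetical, the example.* addresses in the usage note are placeholders, and the rule that a host "contains" a domain name when it equals the name or ends with a dot followed by the name is our assumption.

```python
from typing import Optional
from urllib.parse import urlsplit


def _host_matches(host: str, domain: str) -> bool:
    # Assumption: "www.ieee.org" and "ieee.org" both contain "ieee.org".
    host, domain = host.lower(), domain.lower()
    return host == domain or host.endswith("." + domain)


def stability_index(url: str,
                    domain_name: Optional[str] = None,
                    email: Optional[str] = None) -> float:
    """Assign 1.0, 0.5, 0.2 or 0.0 following the rules in the text.

    domain_name is the society's own domain name, if it is known to have one.
    """
    if domain_name is None:
        return 0.0
    parts = urlsplit(url)
    host = parts.hostname or ""
    if _host_matches(host, domain_name):
        canonical = (
            parts.scheme == "http"
            and parts.netloc.lower() == "www." + domain_name.lower()
            and parts.path in ("", "/")
            and not parts.query
            and not parts.fragment
        )
        return 1.0 if canonical else 0.5
    # The domain name is not in the URL; check the email address instead.
    if email and "@" in email:
        email_host = email.rsplit("@", 1)[1]
        if _host_matches(email_host, domain_name):
            return 0.2
    return 0.0
```

For example, stability_index("http://ieee.org/", "ieee.org") falls under the second rule, while a site at the placeholder address http://www.example.edu/~society/ whose email address is info@example.org falls under the third rule when its domain name is given as example.org.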

We then add these values together, divide by the total number of sites in the Project, and express the result as a percentage to obtain a composite URL-stability index for the Project.
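
The composite index is simply the mean of the individual values, expressed as a percentage. A minimal Python sketch follows; the function name composite_index is hypothetical, and the sample values in the usage note are invented for illustration, not taken from the Project's database.

```python
from typing import Sequence


def composite_index(indices: Sequence[float]) -> float:
    """Return the mean of the per-site stability indices as a percentage."""
    if not indices:
        raise ValueError("at least one site is required")
    return 100.0 * sum(indices) / len(indices)
```

For five hypothetical sites with values 1.0, 1.0, 0.5, 0.2 and 0, the sum is 2.7, and dividing by 5 gives a composite index of 54%.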

As of 2003, April 1, the composite URL-stability index for the Project is about 91.1%.

We anticipate that this percentage will increase considerably over the next few years, since scholarly societies are beginning to realize the benefits of having a stable URL for their websites.

Using the Stability Index for a URL

We have been using the individual stability index values for the sites in the Project to direct the testing of URLs. These values have been entered into the database of the Project (which is distinct from the public HTML files, and which is accessible to the public via the Search Engine). Note that the database entries are assigned accession numbers based on the date added, with the earlier entries having the lower accession numbers.

Here is the protocol for testing URLs in the Scholarly Societies Project:

  • Ignore all sites that have a stability index of 1, since their URLs are considerably less likely to change than the others.
  • Begin by examining each site with a stability index of 0.5 to see whether the site has yet enabled the canonical form of the domain-name format.
  • Examine each site with a stability index of 0.2 to see whether the site has incorporated the domain name that occurred in the email address into the URL.
  • Finally, examine all remaining URLs in accession-number order from oldest to newest, on the theory that, among sites that had no domain name when last examined, the older the site, the more likely it is to have acquired a domain name since then.

This protocol has become more efficient as more sites have acquired and used domain names, since fewer sites need to be tested for URL changes on an ongoing basis.

Tracking the Composite URL-Stability Index of the Project
Below we track the increase in the composite URL-stability index of the Scholarly Societies Project.
1997 Jan.31	18.0%
1997 Feb.4	19.0%
1997 Feb.10	20.2%
1997 Feb.14	21.2%
1997 Feb.18	22.0%
1997 Feb.20	24.1%
1997 Feb.24	25.5%
1997 Feb.26	26.6%
1997 Feb.28	28.0%
1997 Mar.6	29.1%
1997 Mar.11	30.0%
1997 Apr.14	31.8%
1997 Apr.24	32.1%
1997 May 1	32.4%
1997 May 22	32.8%
1997 June 4	33.4%
1997 June 13	34.1%
1997 July 2	34.6%
1997 July 11	35.1%
1997 Aug.8	35.6%
1997 Sept.18	36.0%
1997 Oct.10	36.5%
1997 Nov.3	37.0%
1997 Dec.1	38.4%
1998 Jan.2	40.8%
1998 Feb.2	43.5%
1998 Mar.2	44.6%
1998 Apr.2	45.1%
1998 May 1	46.2%
1998 June 1	47.2%
1998 July 1	48.1%
1998 Aug.1	49.4%
1998 Sept.1	50.6%
1998 Oct.1	52.7%
1998 Nov.1	54.3%
1998 Dec.1	55.3%
1999 Jan.4	55.8%
1999 Feb.1	57.4%
1999 Mar.1	58.5%
1999 Apr.1	60.8%
1999 May 3	62.3%
1999 June 1	63.5%
1999 July 1	63.8%
2000 Feb.9	67.5%
2000 Mar.1	69.3%
2000 Mar.19	70.1%
2000 May 2	71.5%
2000 Sept.5	73.1%
2000 Oct.6	73.6%
2000 Nov.1	74.8%
2000 Dec.1	75.1%
2001 Jan.2	77.6%
2001 Feb.1	78.2%
2001 Mar.1	79.5%
2001 Apr.2	80.3%
2001 May 1	81.5%
2001 Jun.1	82.2%
2001 Jul.1	82.9%
2001 Aug.1	83.2%
2001 Sept.1	83.5%
2001 Oct.1	84.0%
2001 Nov.1	84.5%
[a 5 month gap in regular processing,
due to re-working of subject classification]
2002 May 1	85.9%
2002 June 1	87.2%
2002 July 1	88.0%
2002 Aug.1	88.8%
2002 Sept.1	89.1%
2002 Oct.1	89.4%
2002 Nov.1	89.7%
2002 Dec.1	90.1%
2003 Feb.1	90.8%
2003 Apr.1	91.0%
2003 May 1	91.1%

Subject Disciplines Ranked by URL-Stability Index

Below, the subject disciplines covered by the Scholarly Societies Project are ranked by their URL-Stability Indices in descending order.

Speaking very roughly, it appears that the disciplines associated with professions are most likely to have permanent URLs, followed by the sciences and social sciences, and then by the humanities.

Subject Area	Number of Societies	URL Stability Index
Actuarial Science	35	100.0%
Allergology & Immunology	30	100.0%
Atmospheric & Meteorological Sciences	24	100.0%
Biochemistry	35	100.0%
Cardiovascular Medicine	41	100.0%
Chemical Engineering	103	100.0%
Communication & Media Studies	37	100.0%
Dental Science	49	100.0%
Drama	3	100.0%
Environmental Engineering	30	100.0%
Environmental / Occupational Health & Safety	34	100.0%
Finance	14	100.0%
Food Science & Nutrition	38	100.0%
Gender Studies	30	100.0%
Material, Metal & Mineral Sciences	66	100.0%
Microbiology, Bacteriology and Virology	37	100.0%
Pharmacology	90	100.0%
Public Health	47	100.0%
Publishing, Scholarly	21	100.0%
Reproduction, Obstetrics & Gynaecology	37	100.0%
Soil Science	20	100.0%
Standards	59	100.0%
Toxicology	29	100.0%
Medical & Health Specialities	230	99.1%
Agriculture	104	99.0%
Management Science & Operations Research	73	98.6%
Neurosciences & Psychiatry	106	98.1%
Optometry & Ophthalmology	53	98.1%
Veterinary Medicine	52	98.1%
General Engineering	295	98.0%
Medical Physics & Technology	99	98.0%
Surgery	77	97.4%
General Health & Medicine	215	97.2%
Anatomy & Physiology	35	97.1%
Civil Engineering	134	97.0%
Genetics & Evolution	32	96.9%
Cell Biology, Cytology & Histology	31	96.8%
Botany	86	96.5%
Architecture	122	95.9%
Ecology, Biodiversity & Conservation Biology	24	95.8%
Physics	138	95.7%
Parasitology, Tropical Medicine & International Health	22	95.5%
Library & Information Science	100	95.0%
Mechanical Engineering	120	95.0%
Law	59	94.9%
Instruments, Measurement & Control	19	94.7%
Music	75	94.7%
Chemistry	105	94.3%
Molecular Biology	17	94.1%
University Matters	16	93.8%
Electrical & Computer Engineering	174	93.7%
Psychology	191	93.5%
Computer Science	235	93.4%
Water Science, Hydrology & Oceanology	30	93.3%
Geography	84	92.9%
General Biology & Environment	194	92.8%
Education	154	92.2%
Earth Sciences	173	91.9%
Zoology	47	91.5%
Linguistics	35	91.4%
Dance	11	90.9%
Environmental Sciences	87	90.8%
General Business, Economics & Mathematics	43	90.7%
Recreation & Leisure Studies	19	89.5%
Bibliography & History of the Book	9	88.9%
General Social Sciences	9	88.9%
Archaeology	59	88.1%
Statistics	33	87.9%
Economics	78	86.5%
Fine Arts	55	85.5%
General Science	324	85.3%
Astronomy	40	85.0%
Political Science	62	84.7%
Sociology	78	84.6%
History	155	84.5%
Languages	102	84.3%
General Arts & Humanities	51	82.4%
Literature	80	81.3%
Area Studies & Time-Period Studies	86	79.1%
Mathematics	83	78.3%
Classical Studies	22	77.3%
Philosophy	55	76.4%
Religious Studies	33	75.8%
Anthropology	51	70.6%
