EPSILON

ABOUT THE WALLACE CORRESPONDENCE PROJECT'S METADATA AND TRANSCRIPTS IN EPSILON

The Wallace Correspondence Project (WCP) is now using the online Epsilon database to host its archive of transcripts of Alfred Russel Wallace's correspondence. We do not recommend using the WCP's original internet portal Wallace Letters Online for your research, as the information it contains was last updated in 2015 and is very out of date, is replete with errors, and is missing some 1,400 documents. The data in http://nialloleary.ie and https://www.omnia.ie are similarly out of date, as too are the transcripts in JSTOR.

Epsilon was developed by Cambridge University Library's Darwin Correspondence Project in order to bring together the letters of 19th century scientists (including those of Darwin) in a cross-searchable digital platform. Combining correspondence data in this way helps recreate the vital communication networks that sustained scientific development, and opens up new research opportunities. Epsilon is as 'future proof' as possible. Letter transcriptions and metadata are held in TEI XML, an established standard markup language for historical texts, and funding is available to secure the archive's long-term future.

The notes below give users of the WCP's Epsilon archive a brief introduction to using it. They also provide an overview of the project's metadata and protocols, since the WCP's protocols may well differ from those of the other correspondence projects also using the Epsilon system.

How to use Epsilon

Epsilon has a powerful search system which should make it easy to find letters you are interested in. Here is a brief guide:

First, go to Epsilon. If you know the WCP number of a letter you wish to see (e.g. "WCP4102"), simply enter it in the Simple Search box in the upper right corner and click Search. If you would like to search Wallace's correspondence for a particular word or phrase (e.g. "I'm afraid the ship's on fire"), then click the Advanced Search link in the upper right corner and enter your search terms there.

Be sure to click the "Limit by collection" drop-down box and select "Alfred Russel Wallace", otherwise you will be searching the correspondence of many other people in addition to Wallace's. When you have entered your search criteria, click the "Search" button and you will get a list of all the letters that match your criteria. You can sort this list in a variety of ways by selecting a sort order (e.g., by date) from the "Sorted by" drop-down box in the top right hand corner.

What we Catalogue & Transcribe

The WCP aims to locate, digitise, catalogue, transcribe, interpret, and publish Wallace's surviving correspondence, other manuscripts (e.g., notebooks), and sketches, but not his marginalia in printed works. Epsilon contains only his correspondence, so it is that which we will discuss below.

The WCP catalogues all known surviving letters (including published excerpts and author's drafts) sent to or written by Wallace - including the original envelopes and any enclosures. We also catalogue selected letters between others which pertain to Wallace (e.g., a letter from Charles Darwin to Thomas Henry Huxley which discusses Wallace), and letters written by Wallace's close relatives (parents and children) which contain information useful to scholars studying Wallace's life.

In addition to letters which we have scans of, we also catalogue letters which we have some information about and are reasonably sure still exist, even if we do not know their contents. Examples are letters sold at auction in roughly the last 20 years, for which we know only the sender, recipient and date.

We transcribe all textual documents. Some enclosures are not textual (e.g., photographs), so we describe, rather than transcribe, them.

What is a Letter?

For the purposes of our project a letter is defined as a manuscript communication which has been posted or telegraphed by one person to another. This therefore excludes items such as books or magazines posted by or to Wallace without an enclosed manuscript message.

Most letters were sent enclosed in an envelope, but some were not, namely postcards and some early letters (lettersheets) in which the writing paper bearing the handwritten text is folded in such a way that one side bears the postal address of the intended recipient. All items enclosed in an envelope together with a manuscript letter, and the envelope itself, are regarded as being part of the letter (more precisely the ‘letter packet’). Each of these items is assigned a unique number (see below).

In addition to posted letters, we catalogue the following items:

1) Author's drafts of letters to or from Wallace, even when the text of the final ‘posted’ version of the letter is known.

2) Carbon copies of letters (these lack printed or embossed letterheads and usually the signature of the sender).

3) Old handwritten/typed transcripts of letters to or from Wallace, especially where the original version is not known.

4) Published letters (or excerpts) to or from Wallace, largely excluding letters which were specifically written by Wallace for publication (i.e. ‘letters to the editor’ (LTTEs), which have all been catalogued by Charles Smith and are reproduced on his Wallace Page website). Only the earliest published version of a letter will usually be catalogued, unless later versions are significantly different, in which case they too will be included. Published excerpts of letters are catalogued only in cases where a more complete version of the letter is not known.

All items associated with a letter (e.g., the items which make up the letter packet, plus an author's draft, published transcript of the letter, etc.) are assigned a master WCP cataloguing number (e.g., "WCP788") to unite them. Each document is also assigned a unique item number, e.g., an enclosure to letter WCP788 may have the item number "98" and its complete WCP cataloguing number would therefore be "WCP788.98".
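As an illustration of the numbering scheme just described, the following minimal Python sketch splits a cataloguing number into its master and item numbers. The pattern is inferred only from the examples on this page ("WCP788", "WCP788.98"), and the names WcpNumber and parse_wcp_number are hypothetical; they are not part of Epsilon or of any WCP software.

```python
import re
from typing import NamedTuple, Optional

# Pattern assumed from the examples in the text: "WCP788" or "WCP788.98".
_WCP_RE = re.compile(r"WCP(\d+)(?:\.(\d+))?")


class WcpNumber(NamedTuple):
    master: int          # master number shared by all items of one letter
    item: Optional[int]  # unique item number, or None if only the master is given


def parse_wcp_number(text: str) -> WcpNumber:
    """Split a WCP cataloguing number into its master and item numbers."""
    match = _WCP_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a WCP cataloguing number: {text!r}")
    master, item = match.groups()
    return WcpNumber(int(master), int(item) if item is not None else None)


number = parse_wcp_number("WCP788.98")
print(number.master, number.item)
```

Keeping the item number optional reflects the fact that a master number on its own (e.g. "WCP788") is also a valid way to refer to a letter.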

Current Coverage

The WCP is a work in progress which will take many years to complete. Our metadata and transcripts are therefore constantly being added to and edited.

As of January 2022, the WCP has found letters and other documents in 245 repositories worldwide and in 245 articles and books. We currently have records and transcripts of 5688 letters, of which 2748 were written by ARW and 2159 were sent to him. Most of the other 781 are third party letters which pertain to ARW. The metadata for all of these (and more) are presented in our Catalogue of the Correspondence of Alfred Russel Wallace, published in March 2023. This catalogue gives a different view of the data to Epsilon. Note that we also have records of 487 other manuscripts such as notebooks, but these are not present in Epsilon.

We have reason to believe that hundreds of letters to and from Wallace remain to be discovered in archives and private collections worldwide.

Notes on Protocols Used for Metadata and Transcripts

A. Metadata in the right sidebar of the Epsilon interface

1) Summary: Alfred Russel Wallace is referred to as "ARW".

2) Names of author/addressee: The full name used by that person at the time of their death. Any commonly used nicknames or pseudonyms are noted in brackets and quotes (" ") after the forename(s) and any earlier surname(s) is/are noted in brackets ( ) after the surname, using "née" to indicate a maiden name, "formerly" to indicate a change for a reason other than marriage, and "then" to indicate any earlier married names.

Some examples:

Alexander, Patrick Proctor ("Pat")
Allingham (née Paterson), Helen Mary Elizabeth
Sims (née Wallace), Frances ("Fanny")
Comerford-Casey (formerly Casey), George Edward
de Grey (née Withers then Gwytherne-Williams), Marion
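For users who wish to process these names computationally, the convention can be sketched in Python as follows. This is only an illustration based on the five examples above: the regular expression, the PersonName class and the parse_name function are hypothetical, and names which depart from the pattern shown here are not handled.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumed layout: Surname (earlier surnames), Forenames ("Nickname")
_NAME_RE = re.compile(
    r'^(?P<surname>[^(,]+?)\s*'
    r'(?:\((?P<earlier>[^)]*)\))?\s*,\s*'
    r'(?P<forenames>[^(]+?)\s*'
    r'(?:\((?P<nickname>[^)]*)\))?\s*$'
)


@dataclass
class PersonName:
    surname: str
    forenames: str
    nickname: Optional[str] = None
    # Pairs such as ("née", "Withers") or ("then", "Gwytherne-Williams")
    earlier_surnames: List[Tuple[str, str]] = field(default_factory=list)


def parse_name(text: str) -> PersonName:
    """Parse a name written in the style of the examples above."""
    m = _NAME_RE.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised name format: {text!r}")
    earlier = []
    if m["earlier"]:
        # Split before each keyword: "née", "formerly" or "then".
        parts = re.split(r'\s+(?=(?:née|formerly|then)\s)', m["earlier"].strip())
        for part in parts:
            kind, surname = part.split(None, 1)
            earlier.append((kind, surname))
    nickname = m["nickname"].strip('"“” ') if m["nickname"] else None
    return PersonName(m["surname"], m["forenames"], nickname, earlier)


print(parse_name('de Grey (née Withers then Gwytherne-Williams), Marion'))
```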

3) Date of the letter: Any inferred information is enclosed in square brackets “[ ]” e.g., if the month was inferred the date might read “26 [April] 1856”. Question marks are used to indicate uncertainty about the date or part of the date e.g., “26 April? 1856” if the month is uncertain; or “26? April? 1856?” if the entire date is uncertain. The reasons why the date was inferred or questioned are usually given in the general notes field below the physical description. If a letter was written on multiple dates, we take the latest date as the date of the letter, and give the earliest and latest dates as a date range.
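The bracket and question mark conventions for dates can be read mechanically. The short Python sketch below marks each part of a date as inferred and/or uncertain. It assumes dates written as space-separated parts in the style of the examples above; DatePart and parse_date_parts are hypothetical names, and date ranges are not handled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DatePart:
    text: str        # the part itself, e.g. "26", "April" or "1856"
    inferred: bool   # True if the part was enclosed in square brackets
    uncertain: bool  # True if the part was marked with a question mark


def parse_date_parts(date: str) -> List[DatePart]:
    """Flag each part of a date string as inferred and/or uncertain."""
    parts = []
    in_brackets = False
    for token in date.split():
        if token.startswith("["):
            in_brackets = True
        parts.append(DatePart(token.strip("[]?"), in_brackets, "?" in token))
        if "]" in token:
            in_brackets = False
    return parts


for part in parse_date_parts("26 [April] 1856"):
    print(part)
```

Tracking whether the current part lies inside brackets means that a date which has been inferred as a whole, such as "[26 April 1856]", is also flagged correctly.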

4) Addresses of author & addressee: The full contemporary address is given, with modern versions of place names in "[ ]". These are not necessarily the addresses given on the envelope or on the letter. For example, the address on the letter may simply read “Old Orchard” but in the right sidebar the complete address will be given i.e. “Old Orchard, Broadstone, Dorset, England”. If the whole address is enclosed in square brackets (“[ ]”) it means that it has been inferred for that letter. If it is followed by a question mark (“?”) this means we are uncertain whether the letter was posted from/to that address.

5) Physical description: This figure is the number of pages with original text, i.e., excluding blank pages and pages which only have text written/typed on to them after the letter was posted (e.g. notes made by the recipient on a blank sheet of the letter).

6) General notes: These notes may relate to any aspect of the letter. They are informally written and largely intended for use by project staff, but they have been displayed since the information they contain is often important for understanding the letter.

B. Transcripts

The text of the letters has been transcribed following the Wallace Correspondence Project's transcription protocol. This is a method of transcription which aims to preserve much of the layout of the original text, but imposes some formatting rules to standardise it in order to make the text easier to read and understand. The aim is to capture those aspects of the layout which are necessary to make sense of the text, rather than record the exact position of every word on the original page. Editorial comments in the text and endnotes are added to further assist the reader in interpreting the text.

1) Numbers in square brackets: These are ‘reading order numbers’ rather than page numbers, and denote the order in which the text pages of a manuscript should be read. Blank pages are ignored, so if a letter has four pages but page 2 is blank, then the text pages of the transcript would be numbered [1], [2] and [3]. Note that the reading order of a letter may sometimes be different to the physical order of its pages e.g., if one page has several layers of writing on it (see the WCP’s transcription protocol for more details).

In the case of transcripts of published letters, the page numbers printed in the publication are given after the reading order number (e.g., "[1] [p. 262]", where “262” is the number of the published page).
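The numbering rule can be illustrated with a small Python sketch. The first function assigns reading order numbers to the pages of a letter while skipping blank pages; it assumes, for simplicity, that the reading order follows the physical order of the pages, which as noted above is not always the case. The second function reads a marker such as "[1] [p. 262]". Both function names and the marker pattern are hypothetical and are inferred only from the examples given here.

```python
import re
from typing import List, NamedTuple, Optional


def reading_order_numbers(page_has_text: List[bool]) -> List[Optional[int]]:
    """Number the text pages in order, giving None for blank pages."""
    numbers: List[Optional[int]] = []
    counter = 0
    for has_text in page_has_text:
        if has_text:
            counter += 1
            numbers.append(counter)
        else:
            numbers.append(None)
    return numbers


class PageMarker(NamedTuple):
    reading_order: int
    published_page: Optional[str]  # kept as text; only present for published letters


# Assumed marker layout: "[1]" or "[1] [p. 262]".
_MARKER_RE = re.compile(r"\[(\d+)\](?:\s*\[p\.\s*([^\]]+)\])?")


def parse_page_marker(marker: str) -> PageMarker:
    """Read a reading order marker, with its published page number if given."""
    m = _MARKER_RE.fullmatch(marker.strip())
    if m is None:
        raise ValueError(f"not a page marker: {marker!r}")
    page = m.group(2).strip() if m.group(2) else None
    return PageMarker(int(m.group(1)), page)


# A four-page letter whose second page is blank.
print(reading_order_numbers([True, False, True, True]))
print(parse_page_marker("[1] [p. 262]"))
```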

2) Sender's address: In the case of posted manuscript letters (i.e., not author's drafts or old transcripts) it is aligned right, and for handwritten/typed transcripts, drafts and published letters, it is aligned left. If the address of a manuscript letter is printed or embossed then it is given in italics.

3) Common conventions:

  • Words inserted into the text by the author are shown in superscript
  • Where the author of a manuscript letter has indicated greater emphasis by underlining a word or passage two or more times, the text is formatted as bold underlined text
  • Dashes are transcribed as two hyphens "--"
  • Editorial insertions are enclosed in non-italicised square brackets, "[ ]" e.g., "In [18]98 I visited"
  • Editorial comments are enclosed in italicised square brackets "[ ]", e.g., "[2 words illeg.]"
  • Surmised text is given in angle brackets "< >", e.g., "My son <William> wrote to you"
  • In endnotes Alfred Russel Wallace is referred to as "ARW"
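Some of the conventions listed above can be picked out of a transcript once it has been copied as plain text. The Python sketch below is a rough illustration only: it finds surmised text in angle brackets and any text in square brackets. Because italics are lost in plain text, it cannot tell editorial insertions from editorial comments, and it does not reflect how Epsilon itself stores transcripts. The function names are hypothetical.

```python
import re
from typing import List


def surmised_passages(text: str) -> List[str]:
    """Return the passages given in angle brackets, i.e. surmised text."""
    return re.findall(r"<([^<>]+)>", text)


def bracketed_passages(text: str) -> List[str]:
    """Return all passages in square brackets (insertions and comments alike)."""
    return re.findall(r"\[([^\[\]]+)\]", text)


sample = "In [18]98 I visited -- my son <William> wrote to you [2 words illeg.]"
print(surmised_passages(sample))
print(bracketed_passages(sample))
```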

4) Scientific names of plants & animals: Our policy is to correct the original spelling if necessary, but we do not attempt to determine the current valid names of organisms. This would be a task for specialists on the taxonomy of the organisms in question and is beyond the scope of our project.

C. Envelopes

Handwritten and typewritten text on envelopes is transcribed and the earliest postmark (usually on a stamp) is always transcribed. Any other postmarks on the front or back of the envelope are noted but are not transcribed. The only exception is when the earliest postmark is completely illegible or cut out of the envelope, in which case we transcribe the next earliest, if there is one.

Accuracy of our Metadata & Transcripts

The process of editing our transcripts and associated metadata is a work in progress which will take many years to complete, and currently there are many errors, especially in the transcripts. Our project’s policy is, however, to make the information we have available to users at the earliest possible opportunity, even if it is incomplete and/or imperfect. The accuracy of our transcripts is indicated by their 'editorial status', which is shown below the transcript in Epsilon:

  • Draft transcripts: These are likely to contain many transcription and formatting errors. If endnotes are present they should not be relied on. Currently (August 2020) 86% of letters are at this stage of the editorial process.
  • Edited (but not proofed) transcripts: These have been edited by experienced researchers and there should be few if any errors in the text. Endnotes may sometimes contain errors. Currently (August 2020) 14% of letters are at this stage of the editorial process.
  • Fully edited and proofed transcripts: These have been edited by a Wallace specialist and are deemed to be of publishable quality. Currently (August 2020) 0% of letters are at this stage of the editorial process.

In addition, the number of times a transcript has been edited by project staff (its "Revision history") gives a further indication of how accurate it is likely to be.

If you find a mistake in our metadata, especially with the names of the sender or recipient, with the sender's address, or the date of the letter, then we would be very grateful if you could inform George Beccaloni (g.beccaloni@wallaceletters.org). Errors in our draft transcripts are numerous, but we do not need to be informed of them as we will correct them when they are edited in due course.

Note that the WCP's fully edited and proofed transcripts will be made available in Epsilon two years after the date of their publication in our proposed series The Correspondence of Alfred Russel Wallace.

Database Right

The metadata in Epsilon which the WCP has meticulously compiled, arranged and edited are protected under UK law by The Copyright and Rights in Databases Regulations 1997, which extended existing copyright law to databases, to the extent that they constitute "the author's own intellectual creation". In addition, regulations 13 and 14 create a database right. Database rights automatically subsist if there has been a "substantial investment in obtaining, verifying or presenting the contents" of the database.

Such rights remain in force under regulation 17(2) until the end of the 15th calendar year from the date on which the database was first made available to the public. During that period, database right will be infringed by any person who, without consent, "extracts or re-uses all or a substantial part of the contents of the database", whether all at once or by repeated extractions of "insubstantial" parts.

The Alfred Russel Wallace Trust has granted permission for Cambridge University Library to supply copies of the WCP's metadata, excluding letter summaries and notes, to researchers for the purpose of computational analysis. The data may only be used for the purposes of such studies and substantial parts of the supplied metadata may not be published in electronic or other form. The WCP's Wallace Collection in Epsilon should be credited in publications which result from such studies.

Copyright of Metadata and Transcripts

The copyright of certain metadata associated with the letters (e.g., letter summaries, endnotes etc.) is owned by the Alfred Russel Wallace Trust.

Under UK copyright law any literary work created by an author who died before 1969 and which had not been published by 1 August 1989 will be protected by copyright until 31 December 2039. If you wish to publish a transcript of a copyrighted manuscript letter you will need to seek permission from the literary estate of the deceased author. For information about the copyright of Alfred Russel Wallace’s unpublished literary works see https://www.wallaceletters.myspecies.info/content/wallace-literary-estate

It is the custom and practice in academic publishing that the reproduction of short extracts of text may be permitted on a limited basis for the purposes of criticism and review without securing formal permission, on the basis that:

  • the purpose of quotation or use is objective and evidenced scholarly criticism or review (not merely illustration)
  • a quotation is reproduced accurately, either within quotation marks or as displayed text
  • full attribution is given

How to Cite Transcripts

If you publish an excerpt from a letter we have transcribed, we would be grateful if you could cite its WCP cataloguing number. This is good scholarly practice, as it shows that the source of the transcript was the WCP and enables errors to be traced by future scholars. For example, you may publish an excerpt from a transcript which is then altered by us in the future because we find and correct an error in it.

When citing a WCP cataloguing number, please (ideally) include the number of the item as well e.g., “WCP56.78”, where “56” is the master number and “78” the number of a particular document (e.g., an enclosure) associated with the letter. This allows the exact item to be identified. However, it is usually sufficient just to give the master number e.g., "WCP56".

If possible please cite the transcript in a similar format to the following:

Wallace, Alfred Russel. 1846. [WCP340.340: Letter to Henry Walter Bates, dated 11 April 1846]. In: Beccaloni, G. W. (Ed.). 2021. Epsilon: The Alfred Russel Wallace Collection. World Wide Web electronic publication. <https://epsilon.ac.uk/view/wallace/letters/WCP340> [accessed 23 November 2020]
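For those preparing many citations, the format above can be assembled programmatically. The Python sketch below is a minimal illustration: format_citation is a hypothetical helper, and the URL pattern is inferred solely from the single example above, so it should be checked against the address actually shown in Epsilon for each letter.

```python
def format_citation(author: str, year: str, wcp_number: str, addressee: str,
                    letter_date: str, accessed: str, edition_year: str = "2021") -> str:
    """Build a citation in the style of the example given on this page."""
    # The URL in the example uses the master number only (assumed pattern).
    master = wcp_number.split(".")[0]
    url = f"https://epsilon.ac.uk/view/wallace/letters/{master}"
    return (
        f"{author}. {year}. [{wcp_number}: Letter to {addressee}, "
        f"dated {letter_date}]. In: Beccaloni, G. W. (Ed.). {edition_year}. "
        "Epsilon: The Alfred Russel Wallace Collection. "
        "World Wide Web electronic publication. "
        f"<{url}> [accessed {accessed}]"
    )


print(format_citation(
    author="Wallace, Alfred Russel",
    year="1846",
    wcp_number="WCP340.340",
    addressee="Henry Walter Bates",
    letter_date="11 April 1846",
    accessed="23 November 2020",
))
```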

If you want to cite the url of the homepage of WCP's Wallace Collection in Epsilon please use https://tinyurl.com/WallaceInEpsilon

Takedown Policy

The WCP's IPR takedown policy can be read HERE
