Some datasets were created with the involvement of the Link-Lives team, while others were provided to the project by external sources. Depending on how the data was originally created, among other factors, it had to be processed before it could be used for linking. Below are examples of what was processed and how.

New structure: parish registers transcribed by Ancestry

Link-Lives received the transcription of parish registers from 1814–1917 from Ancestry as a single large spreadsheet, which first had to be split by event type (births, confirmations, marriages, deaths, arrivals, departures). In addition, all individuals involved in an event (e.g. father, mother, and child at a birth) were recorded in separate columns on the same row. To fit the Link-Lives structure, where each individual person must be linked, the data had to be transformed so that each person had their own row.
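The row-per-person transformation described above can be sketched roughly as follows. This is only an illustration: the column names (event_id, event_type, child_name, father_name, mother_name) and the sample record are made up for the example and do not reflect the actual layout of the Ancestry spreadsheet or the Link-Lives code.

```python
# Hypothetical roles and column names, chosen only for illustration.
ROLES = ("child", "father", "mother")


def split_event_into_persons(event_row):
    """Turn one event row holding several people into one row per person."""
    persons = []
    for role in ROLES:
        name = event_row.get(f"{role}_name")
        if not name:
            # Skip roles that are not filled in for this event.
            continue
        persons.append(
            {
                "event_id": event_row["event_id"],
                "event_type": event_row["event_type"],
                "role": role,
                "name": name,
            }
        )
    return persons


# Made-up sample event: a birth where the mother was not recorded.
birth = {
    "event_id": 1,
    "event_type": "birth",
    "child_name": "Anna Larsdatter",
    "father_name": "Lars Jensen",
    "mother_name": "",
}
person_rows = split_event_into_persons(birth)
```

Each resulting row keeps the event identifier and the person's role, so the people recorded together at one event can still be traced back to it after the split.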

Standardization of name spellings

Spellings, especially of names and place names, can vary greatly from one personal record to another. The older the sources, the greater the variation. For example, Ane Laursdatter in the 1850 census may very well be the same person as Anna Larsdatter in the 1845 census. Linking records requires knowing which spellings can refer to the same name. In the Link-Lives project, different spellings have therefore been standardized using so-called synonym catalogs, which connect the various spellings. These catalogs have been developed by subject-matter experts.
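As a minimal sketch of how a synonym catalog could be applied, the example below maps variant spellings to one standardized form before names are compared. The catalog entries and the choice of standardized forms are assumptions made for this illustration. The actual Link-Lives catalogs are far larger and are curated by subject-matter experts.

```python
# Hypothetical catalog entries: variant spelling -> standardized form.
FIRST_NAME_SYNONYMS = {"ane": "anna", "anna": "anna"}
PATRONYM_SYNONYMS = {"laursdatter": "larsdatter", "larsdatter": "larsdatter"}


def standardize(token, catalog):
    """Look up a spelling in a catalog; unknown spellings are kept as-is (lowercased)."""
    key = token.strip().lower()
    return catalog.get(key, key)


def standardize_name(full_name):
    """Standardize the first name and the remainder of the name separately."""
    parts = full_name.split()
    if not parts:
        return ("", "")
    first = standardize(parts[0], FIRST_NAME_SYNONYMS)
    rest = " ".join(standardize(p, PATRONYM_SYNONYMS) for p in parts[1:])
    return (first, rest)


name_1850 = standardize_name("Ane Laursdatter")
name_1845 = standardize_name("Anna Larsdatter")
same_standard_form = name_1850 == name_1845
```

Matching standardized forms only indicate that two records may refer to the same person. Deciding whether they actually do would still depend on other information, such as age and place.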

Data enrichment

In some datasets, certain pieces of information are missing but can be added relatively easily. One example is gender, which is not always present in the original source but can be inferred from other information, such as a person’s first name. This method is, of course, not 100% reliable: Bodil, for example, can be both a male and a female name, and first names are sometimes abbreviated (B. Jensen), which obscures the gender. The project documentation therefore states when data has been created based on other fields.
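The kind of inference described above could look roughly like the sketch below, which derives gender from a first name and records whether the value was inferred. The lookup table, the "f"/"m" codes, and the "inferred" flag are hypothetical and serve only to illustrate the idea. They are not the Link-Lives implementation.

```python
# Hypothetical lookup table; None marks a name used for both genders.
NAME_GENDER = {"anna": "f", "lars": "m", "bodil": None}


def infer_gender(first_name):
    """Infer gender from a first name and flag whether a value was derived."""
    token = first_name.strip().rstrip(".").lower()
    if len(token) <= 1:
        # An abbreviated first name such as "B." gives no usable information.
        return {"gender": None, "inferred": False}
    gender = NAME_GENDER.get(token)  # unknown names also yield None
    return {"gender": gender, "inferred": gender is not None}


clear_case = infer_gender("Anna")
ambiguous_case = infer_gender("Bodil")
abbreviated_case = infer_gender("B.")
```

Keeping the flag alongside the value makes it possible to tell derived information apart from information transcribed directly from the source.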

Read more

Learn more about the creation, transformation, and processing of data in Link-Lives in the Link-Lives Release 2 Guide, which can be downloaded under “Documentation” here.