The life courses are built from individual sources, which are linked in sequence wherever the Link-Lives team assessed that there was sufficient information to do so. The links are made using various automatic methods.
The life courses you can search are created automatically. They are an informed estimate of which personal records across different sources refer to the same individual.
Some life courses contain errors and many are incomplete. Errors may be due to inaccuracies in the original source, mistakes or misunderstandings during data entry, or because the methods used to link the data were unable to distinguish incorrect links from correct ones.
Missing or incomplete life courses will often be due to the sources simply not containing enough information to determine whether two personal records refer to the same individual. Therefore, you must take into account the risk of errors and omissions when using the life courses, and not view the suggested life courses as a definitive answer.
If you are doing genealogical research, you can use the life courses as suggestions for how your ancestors’ lives may have unfolded, or as a treasure map with clues pointing you towards original sources in the archives. The life courses may also inspire you to double-check the archives. For example, Link-Lives may have assigned a great-great-grandmother a slightly different life course from the one in your family tree, prompting you to check whether she remarried in Copenhagen even though she was later buried in Hillerød with her first husband.
Students and researchers who wish to use the Link-Lives dataset can access documentation of the methods alongside the dataset, so that bias and other challenges can be taken into account in the research design. This documentation can be found in the Link-Lives Release 2 Guide, which you can download under “Documentation” here.
The life courses you see in the search are the best estimates at the moment. The database will occasionally be updated when new data becomes available or when data is linked using improved methods. When this happens, new life courses will be added, and those that are no longer the best will move into the background.
In the presentation of a life course, you can see for each individual link which method was used to create it. The methods are applied to two sources at a time, which are then linked. Link-Lives has most often linked backward; for example, the 1850 census is linked to the 1845 census. This is because the newer source is expected to contain all individuals from the older source, except those who have died in the meantime. Individuals who have been added since will have an age lower than the number of years between the two sources and can therefore be excluded from the comparison relatively easily. However, other types of sources have been linked forward, while some have been linked both forward and backward, depending on what was assessed to give the best results.
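The age-based exclusion described above can be illustrated with a small sketch. It is not the Link-Lives implementation: the `Record` type, the `backward_candidates` function and the example persons are all made up for illustration, and it assumes that each record states an age in whole years.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical, simplified personal record from one census."""
    record_id: str
    name: str
    age: int  # age in whole years, as stated in the source


def backward_candidates(newer, newer_year, older_year):
    """Keep only records from the newer source that could appear in the older one.

    Anyone younger than the gap between the two sources was not yet born
    when the older source was recorded, so they are left out of the comparison.
    """
    gap = newer_year - older_year
    return [record for record in newer if record.age >= gap]


# Made-up example records, not taken from any real census.
census_1850 = [
    Record("a1", "Ane Jensdatter", 34),
    Record("a2", "Jens Hansen", 3),
    Record("a3", "Maren Nielsdatter", 5),
]

candidates = backward_candidates(census_1850, newer_year=1850, older_year=1845)
print([record.record_id for record in candidates])
```

In this sketch the three-year-old is dropped before any comparison with the 1845 census takes place, which reduces the number of possible pairs to consider.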
Link-Lives has worked with three different methods to arrive at the assumption that two personal records refer to the same individual. These are briefly presented here, but you can read more about the methods in section 6 of the Link-Lives Release 2 Guide, which you can download under “Documentation” here.
Trained historians and genealogists can create highly reliable links between two sources. However, this is very time-consuming if the goal is to link entire populations. Therefore, manual links have been created for only a smaller part of the population, for two purposes: as training data for the automated linking models, and as a reference against which the reliability of the automated links is evaluated.
Rule-based linking means that the computer compares personal records in two sources. If the comparison satisfies a set of rules defined by the project team, it results in a link. This was the first beta method used in the Link-Lives project.
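As a rough illustration of the idea, the sketch below links two records only when they satisfy a small set of rules. The rules shown (same name, same birthplace, age consistent with the time between the sources, and exactly one matching candidate) are assumptions chosen for the example, not the rules defined by the project team, and the function names and records are made up.

```python
def satisfies_rules(newer, older, gap, age_tolerance=1):
    """Return True if a pair of records passes every (hypothetical) rule."""
    same_name = newer["name"].strip().lower() == older["name"].strip().lower()
    same_birthplace = (
        newer["birthplace"].strip().lower() == older["birthplace"].strip().lower()
    )
    # The person should have aged by roughly the gap between the sources.
    expected_age = older["age"] + gap
    age_ok = abs(newer["age"] - expected_age) <= age_tolerance
    return same_name and same_birthplace and age_ok


def rule_based_links(newer_records, older_records, gap):
    """Link each newer record to an older one only if exactly one candidate fits."""
    links = []
    for newer in newer_records:
        matches = [o for o in older_records if satisfies_rules(newer, o, gap)]
        if len(matches) == 1:  # ambiguous or missing matches produce no link
            links.append((newer["id"], matches[0]["id"]))
    return links


# Made-up example records, not taken from any real source.
newer_source = [
    {"id": "n1", "name": "Ane Jensdatter", "birthplace": "Hillerød", "age": 34},
    {"id": "n2", "name": "Jens Hansen", "birthplace": "København", "age": 40},
]
older_source = [
    {"id": "o1", "name": "ane jensdatter ", "birthplace": "Hillerød", "age": 29},
    {"id": "o2", "name": "Jens Hansen", "birthplace": "København", "age": 35},
    {"id": "o3", "name": "Jens Hansen", "birthplace": "København", "age": 36},
]

links = rule_based_links(newer_source, older_source, gap=5)
print(links)
```

Here the second person gets no link because two older records fit the rules equally well. This mirrors the point made earlier that a missing link often means the sources do not contain enough information to decide.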
Rule-based links and life courses are no longer shown by default on the site. If you want to include them in your search, you need to uncheck the box “Include only the latest methods”.
The manually created training data has been used to train various models that link large datasets. To assess the quality of this automated linking, a portion of the manually created links is used to evaluate the reliability of the model’s output.
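A minimal sketch of such an evaluation is shown below. It assumes that reliability is summarised with two common measures, precision (the share of the model's links that are correct) and recall (the share of the manual links that the model found); the source does not state which measures Link-Lives uses. It also assumes that the model's links and the manual links cover the same set of records. The function name and the example links are made up.

```python
def evaluate_links(predicted, manual):
    """Compare automatically created links with held-out manual links.

    Both arguments are collections of (newer_id, older_id) pairs.
    Returns (precision, recall).
    """
    predicted, manual = set(predicted), set(manual)
    true_positives = len(predicted & manual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(manual) if manual else 0.0
    return precision, recall


# Made-up example: four manual links held back for evaluation.
manual_links = {("n1", "o1"), ("n2", "o2"), ("n3", "o3"), ("n4", "o4")}
# Links produced by a hypothetical model: two correct, one wrong, one missing.
model_links = {("n1", "o1"), ("n2", "o2"), ("n3", "o9")}

precision, recall = evaluate_links(model_links, manual_links)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Keeping the evaluation links separate from the links used for training matters here: a model checked against the same links it was trained on would appear more reliable than it is.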
The newest links available on the website are all based on machine learning. If you want to see the older links, you need to uncheck the box “Include only the latest methods”.