
Lumped Source Splitter

Background and Application

Source records in Family Historian can be structured as either “split” or “lumped”. With split sources, all the relevant source information, such as a text transcript and image file, is stored in the Source record, and citations serve only to link the source to the fact that it supports. Split source records tend to be very specific in scope, such as a particular household in a census or a single record in a parish register.

By contrast, a lumped source treats the parent document or item as a single source (for example, the 1911 Census), with each citation containing all the detail that supports a particular fact.

The optimum structure for Family Historian is to use split sources for material that supports multiple facts or individuals (such as a household census entry), and restrict the use of lumped sources to those that generally support just a single individual or fact (such as a Civil Registration index entry).

Some other popular genealogy applications, such as Family Tree Maker or RootsMagic, make much more extensive use of lumped sources than is usual in Family Historian. This can result in untidy and inefficient duplication of citations when data from these applications are imported into Family Historian. The problem is particularly pronounced for census citations, because of the large number of facts that can be cited. For example, a typical family of 6 individuals, each with birth, residence and occupation cited to the census, would result in 18 separate but identical citations, each containing full details of the relevant entry (text transcript, image, etc).

This plugin removes this duplication by splitting a lumped source into a series of individual sources, based on equivalent citations.

Determining Citation Equivalence

Citations to a lumped source are regarded as equivalent if all of the following match (where present):

  • Where within Source
  • Text From Source (citation)
  • Attached media records (citation)
  • Citation level data fields (templated sources only)
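The matching rule above can be pictured as grouping citations by a composite key. The following is a minimal sketch in Python for illustration only, not the plugin's actual code: the `Citation` class and its field names are hypothetical stand-ins for the four items listed, and it assumes that a missing field is treated as an empty value and that the order of attached media does not matter.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Citation:
    """Hypothetical stand-in for one citation to a lumped source."""
    fact: str                      # the fact this citation supports
    where_within_source: str = ""
    text_from_source: str = ""
    media: tuple = ()              # identifiers of attached media records
    template_fields: dict = field(default_factory=dict)  # templated sources only


def equivalence_key(c):
    """Build a hashable key from the fields that decide equivalence."""
    return (
        c.where_within_source,
        c.text_from_source,
        tuple(sorted(c.media)),                   # assumption: order is ignored
        tuple(sorted(c.template_fields.items())),
    )


def group_equivalent(citations):
    """Group citations; each group would become one new split source."""
    groups = defaultdict(list)
    for c in citations:
        groups[equivalence_key(c)].append(c)
    return list(groups.values())


citations = [
    Citation("Birth of John", "1856, John Smith", "Baptised 3 May"),
    Citation("Baptism of John", "1856, John Smith", "Baptised 3 May"),
    Citation("Birth of Mary", "1858, Mary Smith", "Baptised 9 June"),
]
groups = group_equivalent(citations)
print(len(groups))  # 2
```

The fact itself is deliberately left out of the key, so two citations that support different facts but carry identical detail fall into the same group.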

Split Source Structure

The plugin splits each lumped source into a series of distinct new sources. Source level data such as notes, source images, repository, etc, are copied to each new source. Citation level data are processed slightly differently according to whether the source is generic or templated:

Generic sources – Each new split source inherits its Title and Short Title from the original lumped source. The Where within Source field is converted to Publication Info for the new source, and appended to both the Title and Short Title. For example, if the lumped source is “Baptisms, Chelsea St Luke”, and the individual citation is to “1856, John Smith”, the new source is called “Baptisms, Chelsea St Luke: 1856, John Smith”.
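The naming rule for generic sources can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the plugin's own code: the function name and the dictionary layout are hypothetical, and the ": " separator is inferred from the example above.

```python
def split_generic_source(title, short_title, where_within_source):
    """Return the key fields of a new split source (hypothetical layout)."""
    suffix = ": " + where_within_source  # separator inferred from the example
    return {
        "title": title + suffix,
        "short_title": short_title + suffix,
        # Where within Source becomes Publication Info on the new source.
        "publication_info": where_within_source,
    }


new_source = split_generic_source(
    "Baptisms, Chelsea St Luke",
    "Baptisms, Chelsea St Luke",
    "1856, John Smith",
)
print(new_source["title"])  # Baptisms, Chelsea St Luke: 1856, John Smith
```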

Templated sources – A copy of the source template is created, where each citation level field is moved to source level, and each new split source is linked to the new template. If there is an existing template with this structure, it is used automatically. Titles for the new split sources are determined entirely by the template (as intended within the FH7 design), rather than created individually for each source. If the original lumped source does not contain a Record Title Format (as would be the case with sources imported from RootsMagic, which does not support this feature), a new value is created for the copy comprising the name of the template plus the source level data fields. Finally, the citation level fields are added to the Record Title Format.
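The template handling described above can be sketched as follows. This is a simplified Python model, not the plugin's implementation: the `Template` class, the " (split)" name suffix, the space separator and the `{Field}` placeholder notation are all assumptions made for illustration, and the reuse of an existing matching template is omitted.

```python
from dataclasses import dataclass


@dataclass
class Template:
    """Hypothetical model of a source template."""
    name: str
    source_fields: list
    citation_fields: list
    record_title_format: str = ""


def make_split_template(t):
    """Copy a template, moving every citation level field to source level."""
    fmt = t.record_title_format
    if not fmt:
        # No Record Title Format (e.g. a RootsMagic import): build one from
        # the template name plus the source level fields.
        fmt = " ".join([t.name] + ["{%s}" % f for f in t.source_fields])
    # Finally, add the citation level fields to the format.
    fmt = " ".join([fmt] + ["{%s}" % f for f in t.citation_fields])
    return Template(
        name=t.name + " (split)",  # hypothetical naming for the copy
        source_fields=t.source_fields + t.citation_fields,
        citation_fields=[],
        record_title_format=fmt,
    )


lumped = Template("Census", ["Year", "Place"], ["Household"])
split = make_split_template(lumped)
print(split.record_title_format)  # Census {Year} {Place} {Household}
```

Because the title is derived from the format held on the template, changing that one format later changes how every split source linked to it is titled once titles are refreshed.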

By defining the source titles within the template, it becomes a simple matter to modify the format after splitting as required, for example by limiting the number of fields reported. Note that Family Historian does not update source titles automatically if the template is changed. To update titles, run the Refresh Source Record Auto Titles plugin, which was added to the store by Calico Pie themselves to plug this gap in functionality.

It is important to note that when a source is split, only data that describe what is being cited are used to determine equivalence. Data relating to citation entry or evaluation (including citation level notes from version 1.4 onwards) are kept distinct, and are copied individually to each citation for the new split source.

Source Selection and Processing

An initial menu enables either a single source or multiple sources to be selected for processing, or all sources linked to a specific template. Once this selection is made, the plugin scans the project data to collate all source citations in preparation for subsequent processing. This step may take a few seconds on particularly large projects.

Source splitting then proceeds automatically without further user input. The plugin gives a warning if it detects any citations to the target source that do not contain any detailed data, as these cannot be split.

Once the selected sources have been processed, the plugin returns to the main menu, where the user can either select further sources or close the plugin.

Result Reporting

When the plugin closes, it generates a table summarising the results of the processing (limitations in the way Family Historian plugins work do not permit this table to be displayed while the plugin is still active).

The following columns are displayed for each source selected for processing:

  • A link to the original lumped source
  • A link to its template (if relevant)
  • The total number of citations to that source prior to splitting
  • The number of distinct lumped citations (so the number of new sources created by the plugin)
  • The number of citations to the selected source after processing
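The three counts in the table can be illustrated with a small Python sketch. It is not the plugin's code: the function and key names are hypothetical, each citation is reduced to a single string of detail, and it assumes that the only citations left on the original source are those with no detailed data.

```python
def summarise(citation_details):
    """Compute the counts for one results row.

    citation_details holds one string per citation to the lumped source;
    an empty string stands for a citation with no detailed data.
    """
    detailed = [d for d in citation_details if d]
    return {
        "citations_before": len(citation_details),
        "new_sources": len(set(detailed)),  # distinct lumped citations
        "citations_after": len(citation_details) - len(detailed),
    }


# A family of 6 with birth, residence and occupation cited to one census
# household, plus one citation that carries no detail.
row = summarise(["1911, Smith household"] * 18 + [""])
print(row)
```

In this example the 18 identical citations collapse into one new source, and the single citation without detail remains on the original lumped source.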

The plugin does not delete the original lumped source, even if it has no remaining citations, as this would compromise the ability to display a link to the source in the results table. However, such sources can easily be deleted by the user if required. If there are a large number of them, the simplest techniques are either to copy them to a named list and delete the list, or to select them with a query and delete them from the query results.

Compatibility With Previous Version

The original store version of this plugin processed templated sources in a slightly different way to that currently used. They were still split based on distinct citation data, but each new source was linked to the original template (so citation level fields were still at citation level), and a new title was assigned based on up to three user-selected citation fields.

If required, sources generated by the original plugin can be processed by the current version to convert them into the newer format, which better exploits the properties of templated sources within Family Historian.



Help content on this page is owned and provided by Mark Draper, the plugin's author. Calico Pie takes no responsibility for its content.