Lumped Source Splitter

Background and Application

Source Records in Family Historian can be structured as either “split” or “lumped”. With split sources, all the relevant source information, such as a text transcript and image file, is stored in the Source record, and Citations serve only to link the source to the fact that it supports. Split source records tend to be very specific in their scope, such as a particular household in a census, or a single record in a Parish Register.

By contrast, a lumped source treats the parent document or event as a single source (for example, the 1911 Census), with each citation containing all the detail that supports a particular fact.

The optimum structure for Family Historian is to use split sources for material that supports multiple facts or individuals (such as a household census entry), and restrict the use of lumped sources to those that generally support just a single individual or fact (such as a Civil Registration index entry).

Some other genealogy applications make much more extensive use of lumped sources than is normally applied in Family Historian. This can result in untidy and inefficient duplication of citations when these GEDCOM files are imported.

This plugin removes this duplication by splitting a lumped source into a series of individual sources, based on equivalent citations.

Determining Citation Equivalence

Citations to a lumped source are regarded as equivalent if all of the following match (where present):

  • Where within Source
  • Text From Source (citation)
  • Attached media records (citation)
  • Notes (local and note records)

This is typical of how a lumped citation would be characterised in products such as Family Tree Maker or RootsMagic, with one common citation irrespective of how many records and facts it is copied to. When imported to Family Historian, each individual citation is duplicated, making future revisions much more complicated. This problem is particularly pronounced for census citations, due to the large number of facts that could be cited. For example, a typical family of 6 individuals, each with birth, residence and occupation cited to the census, would result in 18 separate but identical citations, each containing full details of the relevant entry (text transcript, image, etc).
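The equivalence test can be pictured as grouping citations by a key built from the four items listed above. The following is only a minimal Python sketch of that idea, not the plugin's actual code: the `Citation` class, its field names and the sample values are all hypothetical, and ignoring the order of media and notes is an assumption.

```python
from collections import defaultdict
from typing import NamedTuple


class Citation(NamedTuple):
    """Hypothetical stand-in for one citation of a lumped source."""
    cited_by: str                # the individual and fact carrying the citation
    where_within_source: str
    text_from_source: str
    media: tuple = ()            # attached media records
    notes: tuple = ()            # local notes and note records


def equivalence_key(c):
    # Assumption: the order of media and notes does not matter.
    return (c.where_within_source, c.text_from_source,
            frozenset(c.media), frozenset(c.notes))


def group_equivalent(citations):
    """Group citations that share the same equivalence key."""
    groups = defaultdict(list)
    for c in citations:
        groups[equivalence_key(c)].append(c)
    return dict(groups)


# The census example: 6 individuals x 3 facts, all citing the same entry.
people = ["Head", "Wife", "Son 1", "Son 2", "Daughter 1", "Daughter 2"]
facts = ["Birth", "Residence", "Occupation"]
citations = [
    Citation(f"{p} / {f}", "Schedule 45", "Household transcript",
             ("census_page.jpg",))
    for p in people for f in facts
]
groups = group_equivalent(citations)
print(len(citations), "citations ->", len(groups), "split source(s)")
```

In this sketch the 18 identical citations fall into a single group, which corresponds to one new split source.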

Each new split source inherits its Title, Short Title, Repository, etc from the original lumped source. Where within Source is converted to Publication Info for the new source, and appended to both the Title and Short Title. For example, if the lumped source is “Baptisms, Chelsea St Luke”, and the individual citation is to “1856, John Smith”, the new source is called “Baptisms, Chelsea St Luke: 1856, John Smith”.
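As an illustration of the naming rule, here is a small Python sketch. The `Source` class and `split_source` function are hypothetical names, and the “: ” separator is inferred from the example above rather than taken from the plugin itself.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Source:
    """Hypothetical, simplified source record."""
    title: str
    short_title: str
    repository: str = ""
    publication_info: str = ""


def split_source(lumped, where_within_source):
    """Create a split source that inherits from the lumped source."""
    return replace(
        lumped,
        # Where within Source is appended to both titles ...
        title=f"{lumped.title}: {where_within_source}",
        short_title=f"{lumped.short_title}: {where_within_source}",
        # ... and becomes the Publication Info of the new source.
        publication_info=where_within_source,
    )


lumped = Source("Baptisms, Chelsea St Luke", "Chelsea Baptisms", "Example Archive")
new_source = split_source(lumped, "1856, John Smith")
print(new_source.title)
```

Fields that are not overridden, such as the repository, carry over unchanged from the lumped source.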

All of the common items listed above are moved from the old lumped citations to the new split source, so that they are recorded only once. Other fields, such as Date Entered, Rating and Event Responsible (FH7 only), are kept separate for each citation, and each citation is moved to the new split source.

Source Selection

An initial menu enables either a single source or multiple sources to be selected for processing. When a source is selected, the plugin scans the project data and determines the number of unique citations based on the criteria described above. This is reported back to the user, who then makes a final yes/no decision on whether to proceed with splitting.

Once splitting has been completed, there is an option to delete the original lumped source if it has no remaining citations or links.

Once all selected sources have been processed, the plugin returns to the main menu for further selection or closing, as appropriate.

Use with Templated Sources

It is anticipated that this plugin will be used primarily for generic sources, and these are split as described above.  However, it is also compatible with templated sources.  These are most likely to occur in a database that originated in RootsMagic where the source templates have been imported via plugin.  See this FHUG forum thread for a fuller discussion of this topic and the plugin download.

A templated source typically has no Where Within Source field. Instead of using this single value, all the citation-level data fields must be identical for citations to be regarded as equivalent. The new source titles and short titles are determined from a menu that permits up to three source-level data fields to be used, rather than the single Where Within Source for a generic source.
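The templated case can be sketched in the same way. In this hedged Python example the function names and field names are hypothetical, and the way the chosen fields are joined into the title (“: ” followed by a comma-separated list) is an assumption made for illustration only.

```python
def templated_key(fields):
    """Equivalence key: every data field must match exactly."""
    return tuple(sorted(fields.items()))


def templated_title(base_title, fields, chosen):
    """Build a title from up to three chosen data fields."""
    if len(chosen) > 3:
        raise ValueError("at most three fields may be used for the title")
    # Skip chosen fields that are missing or empty.
    parts = [fields[name] for name in chosen if fields.get(name)]
    # Assumption: the separator and join style are illustrative only.
    return f"{base_title}: {', '.join(parts)}" if parts else base_title


fields_a = {"Name": "John Smith", "Year": "1856", "Page": "12"}
fields_b = {"Page": "12", "Year": "1856", "Name": "John Smith"}

same = templated_key(fields_a) == templated_key(fields_b)
title = templated_title("Baptisms, Chelsea St Luke", fields_a, ["Year", "Name"])
print(same, title)
```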

When splitting a templated source, the source template is applied to each newly created split source, meaning that the overall structure of the sources is maintained.

Treatment of Undefined Data Fields (UDFs)

Sources imported from some applications may contain residual Undefined Data Fields that Family Historian has imported but does not recognise.  If the plugin detects the presence of an undefined RootsMagic source template, splitting that source is blocked as this is likely to result in data loss.  To resolve this, import the source template definitions from the original RootsMagic GEDCOM file to recreate the original source structure as described above, and then proceed to split as required.

For other miscellaneous UDF fields, the plugin warns about their presence, but allows splitting to go ahead after confirmation. The UDF fields are discarded and not copied to the new split sources.
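The two UDF rules amount to a simple decision: an undefined RootsMagic source template always blocks the split, while other UDFs only require confirmation. The Python sketch below illustrates that logic under stated assumptions; the `Decision` enum, the function name and its parameters are hypothetical and do not describe how the plugin actually detects UDFs.

```python
from enum import Enum


class Decision(Enum):
    SPLIT = "split"          # go ahead (any other UDFs are discarded)
    BLOCKED = "blocked"      # undefined source template present
    CANCELLED = "cancelled"  # user declined after the UDF warning


def decide_split(has_undefined_template, other_udfs, user_confirms):
    """Decide whether a source may be split, given its UDFs."""
    if has_undefined_template:
        # Splitting would risk data loss, so it is never allowed.
        return Decision.BLOCKED
    if other_udfs and not user_confirms:
        return Decision.CANCELLED
    return Decision.SPLIT


print(decide_split(True, [], True))
print(decide_split(False, ["_EXAMPLE"], True))
```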

Help content on this page is owned and provided by Mark Draper, the plugin's author. Calico Pie takes no responsibility for its content.