Statistics and the International Genealogical Index (IGI)

The only statistical treatment of the IGI that I have seen is Martin Ecclestone’s “The diffusion of English surnames” (Local Historian, 1989), which examines the 1988 edition of the IGI. The following is a paraphrase of his illuminating article. This was groundbreaking material at the time. Now, in this age of the IGI on CD-ROM and the IGI on-line, are these latest versions able to deliver similar or indeed enhanced statistical information?

All statistics in the following are copyright © 1989 Martin Ecclestone, who has graciously approved their reproduction here.

Mr Ecclestone wrote this article before the advent of the CD-ROM version of the IGI. A great advantage of the previous fiche version was its amenability to statistical analysis. The numbered frames of the fiche made it straightforward to count the number of entries per surname and the proportion of a county’s entries that any given surname constitutes. Repeating the exercise for each of the 39 English counties allows the geographical distribution of that surname’s frequency to be tabulated.

An obvious drawback is the wide range of dates in the IGI, from 1538 to about 1900, together with the well-known inconsistency of geographical coverage. Do these minuses nullify any findings? Mr Ecclestone attempts to address these issues.

He considers the dates objection by constructing a histogram of 2760 dates randomly selected from the Index. The resulting graph reveals a steady growth in entries from 1538, peaking in 1837, when there is a dramatic drop. This occurs because many parish record transcriptions stop in 1837, when the St Catherine’s House records begin. The histogram and the following table reveal that the 1988 IGI entries are chiefly representative of eighteenth-century England.

County                    No. of frames   1801 Pop/Frames   Median Date
Bedford                   14849           4.27              1754
Berkshire                 12206           8.95              1754
Bucks                     12470           8.62              1754
Cambridge                 9179            9.73              1746
Cheshire                  12352           15.52             1754
Cornwall                  26719           7.05              1789
Cumberland                18022           6.50              1803
Derby                     21343           7.55              1796
Devon                     42081           8.15              1726
Dorset                    3987            28.92             1774
Durham                    19975           8.03              1758
Essex                     12044           18.80             1734
Gloucester                25905           9.68              1762
Hants/IOW                 21411           10.26             1808
Hereford                  6115            14.59             1777
Hertford                  18015           5.42              1741
Hunts                     991             37.91             1772
Kent                      24875           12.37             1773
Lancashire                83541           8.05              1814
Leicester                 17348           7.50              1774
Lincoln                   37412           5.57              1711
London                    154724          5.29              1770
Norfolk                   20116           13.59             1735
Northants                 5407            24.37             1746
Northumberland            21707           7.24              1767
Nottingham                17267           8.13              1801
Oxford                    8649            12.67             1798
Rutland                   1900            8.61              1741
Shropshire                23931           7.01              1773
Somerset                  8820            31.04             1752
Stafford                  33846           7.07              1780
Suffolk                   21171           9.94              1779
Surrey                    21121           12.74             1810
Sussex                    20815           7.65              1748
Warwick                   39389           5.29              1818
Westmorland               5705            7.29              1770
Wiltshire                 11613           15.94             1772
Worcester                 23546           5.92              1806
Yorkshire                 102989          8.34              1783
ENGLAND                   987574          8.39              1772
WALES + MONMOUTH (1984)   32589           18.01             1820

Note: The number of frames excludes frames with no surname.
Note: The median date is the date for which there are as many earlier dates as later dates in the sample.

For England as a whole, the median date is 1772, the lower quartile date is 1693, and the upper quartile date is 1820. Thus 50% of the IGI entries fall in the inter-quartile range of 127 years. The earliest median date for a county is that of Lincolnshire (1711), whilst the latest is that of Warwickshire (1818). The interquartile range for any county can be estimated as the difference between its median date and 1888.

IGI County Coverage

Column 3 of the above table is the ratio between the 1801 county populations and the number of IGI frames for each county. This ratio is 8.39 for England as a whole, but varies between 4.27 (Bedfordshire) and 37.91 (Huntingdonshire).
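
Because column 3 is population divided by frames, it can be read as "people per frame": multiplying a frame count by a county's ratio gives back an estimated 1801 headcount. The following is a minimal Python sketch of that arithmetic, using a handful of rows copied from the table. The names COUNTIES and estimate_1801_population are illustrative only and do not come from the article.

```python
# (number of frames, 1801 population per frame) for a few counties,
# copied from the table above. This is a subset, not the full table.
COUNTIES = {
    "Bedford": (14849, 4.27),
    "Dorset": (3987, 28.92),
    "Hunts": (991, 37.91),
    "Lancashire": (83541, 8.05),
    "Worcester": (23546, 5.92),
}


def estimate_1801_population(frames, ratio):
    """Estimate an 1801 headcount from a number of IGI frames.

    `ratio` is the county's 1801 Pop/Frames value (column 3).
    """
    return round(frames * ratio)


# Implied 1801 population for each county in the subset.
for county, (frames, ratio) in COUNTIES.items():
    print(county, estimate_1801_population(frames, ratio))

# Order the subset by ratio, lowest first.
ranked = sorted(COUNTIES, key=lambda name: COUNTIES[name][1])
print("lowest ratio:", ranked[0], "highest ratio:", ranked[-1])
```

The same multiplication should work for any smaller frame count within a county, such as the frames occupied by a single surname.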
High values represent counties that are under-represented in the IGI in relation to their 1801 population, whilst conversely low values show the counties whose registers are the most complete or have been most fully transcribed.

Back Projection

“The tabulated ratios may be used to convert the number of frames containing a particular surname into an estimate of the 1801 population of that surname.”

Mr Ecclestone cites the example of the surname Fuller. There are 7.5 frames of Fullers in the Bedford county index. Thus he estimates there were 32 (7.5 x 4.27) Fullers alive in Bedfordshire in 1801. Applying this method to the remaining counties gives an estimate of 4275 Fullers for England as a whole.

With my own name, Dance, there are 22 frames for the county of Worcester, which equates to a population of 130 people (22 x 5.92) in 1801. I know from the censuses that the actual population in 1851 was 144, so the estimate of 130 is a reasonable one. It is, however, important to cleanse the IGI data of any duplicates or patron submissions first.

IGI Births/Marriages/Deaths Coverage

Martin Ecclestone says that “a measure of completeness of the English index is the proportion of births and marriages that are recorded as IGI entries at different periods.” He gives the proportion of marriages (derived from random sampling) as:

1540-1599          40%
1600-1699          39%
1700-1799          34%
1800-early 1800s   29%

This is then compared with an independent estimate of the number of marriages that actually occurred during the same decades. The same procedure is used to compare IGI baptismal records with total births.

Decade   IGI bapt   IGI marr   Total births   Total marr   % bapt IGI   % marr IGI
1570-9   0.16       0.12       1.135          0.333        14%          36%
1620-9   0.56       0.18       1.517          0.372        37%          48%
1650-9   0.37       0.11       1.445          0.452        26%          24%
1670-9   0.55       0.18       1.471          0.354        37%          51%
1720-9   0.71       0.22       1.754          0.480        40%          46%
1770-9   1.16       0.30       2.409          0.589        48%          51%
1820-9   1.96       0.40       4.770          0.980        41%          41%

The above table summarises his results from seven selected decades.
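
The two percentage columns follow from dividing each IGI count by the corresponding estimated total. As a check on the arithmetic, here is a minimal Python sketch with the figures copied from the table. The names ROWS, percent and completeness are illustrative and are not taken from the article, and the counts are used in whatever units the article gives them.

```python
# (decade start year, IGI baptisms, IGI marriages, total births, total marriages)
# Figures copied from the table above.
ROWS = [
    (1570, 0.16, 0.12, 1.135, 0.333),
    (1620, 0.56, 0.18, 1.517, 0.372),
    (1650, 0.37, 0.11, 1.445, 0.452),
    (1670, 0.55, 0.18, 1.471, 0.354),
    (1720, 0.71, 0.22, 1.754, 0.480),
    (1770, 1.16, 0.30, 2.409, 0.589),
    (1820, 1.96, 0.40, 4.770, 0.980),
]


def percent(part, whole):
    """Return part/whole as a percentage rounded to a whole number."""
    return round(100 * part / whole)


def completeness(rows):
    """Return (decade, % of births in IGI, % of marriages in IGI) per row."""
    return [
        (decade, percent(bapt, births), percent(marr, marriages))
        for decade, bapt, marr, births, marriages in rows
    ]


for decade, pct_bapt, pct_marr in completeness(ROWS):
    print(f"{decade}s: baptisms {pct_bapt}%, marriages {pct_marr}%")
```

Rounded to whole numbers, the results should agree with the last two columns of the table.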
“It demonstrates that births and marriages are more or less equally recorded except during the sixteenth century” and “apart from the Commonwealth period… the IGI is 40% to 50% complete between 1600 and 1837.”

Mr Ecclestone concludes that, for the 18th century, the IGI contains almost half of the records that could possibly exist. (This figure needs to be adjusted for individual counties, as shown in the first table.) Although the median date varies for each county, “since surname distributions change rather slowly, it is felt that those which are obtained from the IGI data are probably fair descriptions of the mid-eighteenth century situation.”

The article then proceeds to give some case studies from actual surname examples, and shows how their diffusion can be measured. Overall, it is a fascinating article. If you are interested in the possibilities of the IGI, then seek out a copy. The Local Historian is published by the British Association for Local History (BALH).