Thursday 19 July 2012

How many of us are there?

As you turn back from researching a blind alley after finding that yet another person is not the ancestor you are seeking, you might be inclined to say "Well at least we don't have any Smiths in our tree." Which prompts the question: how common are the surnames of our family?

There are a number of websites that enable you to compare the occurrence of a particular name across countries and (in some places) within a single country. The World Names Profiler enabled me to look for BURTONS across the world and within the United Kingdom.

This site reports a measure called FPM (frequency per million people), and other mapping tools provide something similar. But it is rare to find one that exposes the underlying raw data.
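For anyone who does get hold of raw counts, FPM is simple arithmetic: the number of people with the name, divided by the population, scaled to a million. Here is a minimal sketch in Python. The function name and its arguments are my own invention, not anything the World Names Profiler publishes, and the site may well calculate its figure differently.

```python
def frequency_per_million(count: int, population: int) -> float:
    """Return how many people in every million bear a given surname.

    `count` is the number of people with the surname and `population`
    is the total population they were counted in. Both are assumed
    inputs that you supply from whatever raw data you have.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing so round figures come out exact.
    return count * 1_000_000 / population
```

So a name held by 50 people in a population of 2 million works out at 25 per million.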

For the United States, FindTheData does just that, by drawing upon the Social Security Administration database. At first glance, you appear to be limited to searching on one name at a time. However, the "Compare" button in the results allows you to collect the data on up to ten names and view them as a table or in various charts.

With a little copying and pasting into a spreadsheet, I was able to extract data on each of the 30 unique surnames that occur in 6 generations of our tree and create a column chart showing how many US citizens share them with us.
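If you would rather not do the sorting by hand in a spreadsheet, the same step can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name is made up, and the names and counts in the example are placeholders, not figures from FindTheData.

```python
def rank_surnames(counts):
    """Split surname counts into a ranked list and a list of absentees.

    `counts` maps each surname to the number of people recorded with it.
    Returns (ranked, absent): `ranked` is a list of (name, count) pairs,
    most common first; `absent` lists names with a count of zero.
    """
    present = [(name, n) for name, n in counts.items() if n > 0]
    # Sort by count, largest first; ties are broken alphabetically.
    ranked = sorted(present, key=lambda pair: (-pair[1], pair[0]))
    absent = sorted(name for name, n in counts.items() if n == 0)
    return ranked, absent


# Placeholder data only: these names and numbers are invented.
example = {"NAME_A": 1200, "NAME_B": 0, "NAME_C": 300}
ranked, absent = rank_surnames(example)
```

The ranked list is what feeds a column chart, while the absent list picks out names that would otherwise be hidden by the scale of the chart.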

While the ANDERSON name clearly dominates our chart, there are 11 surnames in the database that are more popular still. (For every Anderson, you can find 30 American Smiths. "Well at least we don't have any Smiths ...")

Considering our geographical origins, we find a top ten made up of the following (rank in brackets):

  • a Scottish group: Anderson [1], Cameron [6]
  • an English group: Cook [2], Lloyd [4], Wilkins [5]
  • an Irish group: Burton [3], McAllister [10]
  • a German group: Kuhn [8], Cramer [9]
  • and one Welsh name: Davies [7]

Restricting the data to the 8 surnames found in 4 recent generations, the chart looks like this.

I was surprised that there is not a single entry for SUDDABY in the US database. Hidden in the scale of the first chart is the fact that there are no MEDWELL, HEATHWOOD or BARGH either.

On the other hand, the raw numbers for even the low frequency names are huge. More than 9000 NOYES would create quite a few blind alleys to explore. In moments of frustration, perhaps we should say "Well at least we aren't working on the USA."

The ready availability of all these American data highlights the apparent difficulty of obtaining comparable information on surname frequency in Australia. The site British Surnames has a list of the 100 Top Australian Surnames, which suggests that someone sufficiently determined might be able to extract the raw data. But that may have to be a task for another day.
