Blog Note:

The latest installment will always show up at the top of this blog, but to read the entries sequentially (recommended), start with the introduction just below the latest post and read down from there.

Real Experimental Error Isn't Hard To Find, Doug

In Installment #6, titled "Experimental Error Does Not Mean 'Mistake' Doug", we touched on the scientific usage of the term "error".  As we discovered, it is widely used to mean the difference between a measured value and an actual value, and thus cannot be taken to automatically imply 'mistake', as Doug tried to do against Willard Libby, the inventor of carbon dating.  Since 'error rates', 'error ranges' and 'experimental errors' are such a critical part of science (and life), let's dig a bit deeper into them.


Remember the frozen mammoth dating fiasco that Batchelor stepped into with his false claims in Installment #8, titled "The Bigger They're Told, The Harder They Fail"?  Near the end of that installment I explained that I was mystified as to why Doug felt the need to resort to made-up lies when he could easily have found a real example of a frozen carcass dating to two different ages.  Let's use just such a real example to start our discussion.

Let's again look at that 1975 USGS Professional Paper #862 by Troy Pewe, titled "Quaternary stratigraphic nomenclature in unglaciated central Alaska" - the same paper we visited back in Installment #8.  Go to page 30 (the page number is printed in the top left).  There you will find a matrix of radiocarbon-dated ages, samples, locations, etc. from central Alaska.


Here is an image containing the relevant portion of that matrix.  To save screen space and make it easier to view, I have separated out the two samples we wish to compare.  They were not contiguous in the original matrix, so don't expect to find them that way.  I assure you that if you search that original page you will find both of the samples I am using.

One note of explanation regarding the column headings in the referenced image: in the column headed "Lab. No.", the letter designation denotes the lab that tested the particular sample, while the number is a unique sample ID.  In this case "SI" stands for the Smithsonian Institution, which from 1962 until 1986 had its own radiocarbon lab.  Just think of that column as the sample ID and you'll be fine.  Also, using the information in the "Reference" column, one can use the "Reference" section of the paper to access more information.  This particular reference tells us that R. Stuckenrath was responsible for the testing of the sample and that it happened prior to 1973 (the publication date of the reference) and after 1968 (when Stuckenrath took the position of SI Radiocarbon Lab Director).

So our frozen Alaskan Ovibos is a Musk Ox, and two samples were taken - hair from the hind limb and muscle from the scalp.  The hair dated to 17,210 +/- 500 years while the muscle dated to 24,140 +/- 2,200 years.  After integrating the error ranges, the gap between the two dates could be as narrow as 17,710 - 21,940 years or as wide as 16,710 - 26,340 years.  That's a best-case spread of ~24% and a worst-case spread of ~58%.  While neither of those spreads is insignificant, they are nothing like the 100% error that Doug was bragging about in his imaginary case.
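For anyone who wants to check that arithmetic themselves, here is a minimal Python sketch.  It is my own illustration - the only inputs are the two dates and errors straight from Pewe's table:

```python
# Reported dates (years before present) and quoted errors from the table
hair, hair_err = 17_210, 500        # hair from the hind limb
muscle, muscle_err = 24_140, 2_200  # muscle from the scalp

# Best case: the two ranges approach each other as closely as the errors allow
narrow_low, narrow_high = hair + hair_err, muscle - muscle_err
# Worst case: the two ranges spread as far apart as the errors allow
wide_low, wide_high = hair - hair_err, muscle + muscle_err

print(f"Narrowest span: {narrow_low} - {narrow_high} "
      f"(~{(narrow_high - narrow_low) / narrow_low:.0%})")
print(f"Widest span:    {wide_low} - {wide_high} "
      f"(~{(wide_high - wide_low) / wide_low:.0%})")
# Narrowest span: 17710 - 21940 (~24%)
# Widest span:    16710 - 26340 (~58%)
```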


So why such a broad range of dates on this animal?  It seems rather likely that both the front and the back of this Musk Ox died at the same time, so why aren't the dates exactly the same?  One would think that if the dating method was any good, both results would at a minimum fall within each other's error ranges, right?  Before I give away the most likely answer, I'm going to let you consider it on your own for a bit.  With what we learned about the development of radiocarbon dating in Installment #6, can anyone spot the information on this chart that would be our biggest clue to this little mystery?  We'll come back to it later in this installment after you've had time to think about it.
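(If you'd like a more formal way of asking "do these two dates agree within their stated errors?", here is a minimal sketch of the standard approach - treating each date as a measurement with a standard deviation and counting how many combined standard deviations separate them.  This is my illustration, not anything from Pewe's paper:)

```python
import math

d1, s1 = 17_210, 500    # hair date and its quoted error
d2, s2 = 24_140, 2_200  # muscle date and its quoted error

# Independent measurement errors add in quadrature for a difference
z = abs(d2 - d1) / math.sqrt(s1**2 + s2**2)
print(f"The dates differ by {z:.1f} combined standard deviations")
# ~3.1 - well beyond the ~2 sigma agreement we'd expect if only
# measurement scatter were at work, so something else is going on.
```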

Let's talk about "outliers" for a bit.  The relevant Wikipedia description reads as follows:

"In statistics, an outlier is an observation point that is distant from other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set."

Let's look at a simple, real-world example of an outlier.

I happen to be a trail runner.  Compared to road running, I much prefer the varied terrain, which keeps my mind involved and my coordination current.  I also generally run with a GPS- and barometric-altimeter-equipped watch, along with a wireless heart rate monitor feeding into the watch.  Thus I can later plot and review that data in chart form.



Here are two charts from a recent one-hour out-and-back trail run.  The upper chart is an overlay of heart rate (red) over speed (light blue).  The second is the same heart rate data over a route-driven, topo-based altitude plot (dark blue).  Let's treat this data as if we were scientists who don't know the runner.

Notice the spike in the heart rate data near the end of the run.  For almost the entire run my heart rate hovered near 160 bpm, and suddenly it skyrockets to nearly 235.  How would we deal with this outlier if this were an important data set?

Well, through much previous testing and experience with physiology, we know the typical external causes of a suddenly increasing heart rate: increased load/exertion and increased fear/anxiety.  Let's look for evidence of those in our other plots.  Did the trail suddenly veer steeply uphill at that point in the run, or did I suddenly speed up, indicating an increase in exertion?  No - in fact, we can tell the speed is slowly dropping and the trail is continuing a steady downhill trend.  A sudden increase in exertion is out as a cause.  What about fear?  If fear were involved - say, an angry bear encounter or a man pointing a gun at the runner - one would certainly be surprised if there were not a reaction in the speed plot: either a dramatic increase in speed to escape or a sudden stop to acquiesce and follow instructions.  Neither is indicated.


But lastly (and mostly), we can be quite well assured the data point is spurious because 235 beats per minute, without artificial chemical or electrical inducement of the heart, is so rare as to border on impossible.  Knowing the limitations of inexpensive heart rate sensors, and then considering other possibilities such as momentary wireless interference, it is overwhelmingly more likely that this was merely an outlier of the experimental error variety - a difference between a measured value and an actual value.

So here are our two choices as scientists: we can assume the data point is real against all odds, or we can assume it is spurious and the result of known, testable and repeatable weaknesses in our measurement system.  Heck, we know from experience that we can create such errors at will, simply by letting the sensor sit ill-fitted for a moment (perhaps the runner moved it aside to scratch an itch) or by introducing a splash of signal interference.  After all, our testing has shown this to be easily possible and not particularly rare.

Now, as good scientists - after reviewing both this data and perhaps thousands of hours more from other runners, and after exploring the sensors and other possible sources of such an anomalous data point - we conclude that this is mere experimental error.  It happens.  We see that one can record such data for hours upon hours - tens of thousands, even hundreds of thousands of accurate points between such outliers.  It's a limitation with very few consequences, so we filter (remove) the point from the data set (more on filtering of data in another installment).
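As a sketch of what that filtering step might look like in practice, here is a simple plausibility filter in Python.  The data, the 220 bpm ceiling and the 30 bpm jump-per-sample threshold are all illustrative values of my own choosing, not anything from a real monitor's firmware:

```python
# Drop heart rate samples that are physiologically implausible, either
# because they exceed a hard ceiling or jump too far from the last good
# reading. Dropped samples become None so the timeline stays intact.
def filter_hr(samples, max_bpm=220, max_jump=30):
    cleaned, last_good = [], None
    for bpm in samples:
        implausible = bpm > max_bpm or (
            last_good is not None and abs(bpm - last_good) > max_jump)
        if implausible:
            cleaned.append(None)   # filter the outlier out
        else:
            cleaned.append(bpm)
            last_good = bpm
    return cleaned

print(filter_hr([158, 161, 160, 235, 159, 162]))
# [158, 161, 160, None, 159, 162] - the 235 bpm spike is removed
```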


What now?  Do we declare the measurement system worthless?  Do we produce videos and web pages and travel the world gathering people together, showing this 235 bpm data point without considering the context and using it to describe just how awful, unreliable and stupid wireless heart rate monitors are?  "THE MONITORISTS ARE MISLEADING YOU ALL!" we might scream.  That's what Batchelor does.  Think about it for a bit.  We'll revisit that accusation in the summary of this installment.

Back to our frozen carcass dating dilemma:



Have you figured out what the issue might be here?  Look at the red highlight that I've added.  See the 30+ year span?  These samples were taken from the frozen ground in 1940 - a time when no one knew of, nor could possibly have anticipated, their usefulness as candidates for radiocarbon dating.

Since the method wasn't fully developed or widely available until the '60s, we know these samples spent the majority of those years being viewed, handled and stored without a single thought given to their radiocarbon testing value.  As we will learn in a later installment, sample contamination is the enemy of reliable radiocarbon results (the enemy of any scientific test, actually).  Today, to prevent this, there are well-established protocols for sample protection - not unlike those for reliable forensic work in the criminal fields.  In 1940, however, no one would have thought to consider such precautions.

The early years of radiocarbon dating were filled with dates derived from samples of objects discovered, handled and cataloged before the method was invented.  We simply have to accept that for those items, significant outliers will be more common and view the data from that period with that in mind.

Given that Batchelor's specific mammoth dating accusations were proven totally false in Installment #8 by his own references, it's clear Doug has no regard for the truth - he just repeats stuff when he likes how it sounds.  But even if Doug had found actual occurrences to use, just as I did, do you think he would stop and consider whether the data point he is highlighting actually demonstrates a fundamental flaw in the method, or is merely an outlier that deserves to be filtered out?  The answer is clear from his behavior: don't confuse Doug with facts, his mind is already made up.


